
Incorrect use of labels for GPT2? #41

Closed
soumyasanyal opened this issue Jun 2, 2022 · 2 comments

Comments

@soumyasanyal

Hi,

As far as I understand the model and the usage of GPT2, shouldn't the get_dummy_token function return torch.ones() * -100 instead of torch.zeros()? We should be ignoring GPT2's outputs for these prefix inputs, but currently the zero labels force the model to predict token 0, which is the exclamation mark ("!").

Reference lines: https://github.com/rmokady/CLIP_prefix_caption/blob/main/train.py#L222-L223

Thanks!
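
For context, the referenced lines build the dummy prefix labels roughly as follows (a paraphrased sketch for illustration, not a verbatim quote from the repo):

```python
import torch

# Paraphrased sketch of get_dummy_token from train.py: the labels for the
# prefix positions are all zeros, i.e. token id 0 in the GPT2 vocabulary.
# These dummy labels are concatenated in front of the caption tokens.
def get_dummy_token(batch_size: int, prefix_length: int, device: torch.device) -> torch.Tensor:
    return torch.zeros(batch_size, prefix_length, dtype=torch.int64, device=device)
```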

@robertodessi

This line sets the loss to ignore 0-index tokens, which is equivalent to the default -100 (ignore_index) behavior of PyTorch's cross entropy: https://github.com/rmokady/CLIP_prefix_caption/blob/main/train.py#L316
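
For what it's worth, here is a minimal self-contained sketch of that equivalence (illustrative names and shapes, assuming the real caption token ids are always greater than 0, so that ignoring id 0 masks exactly the prefix positions):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, prefix_length, caption_length = 50257, 4, 6
seq_length = prefix_length + caption_length

logits = torch.randn(1, seq_length, vocab_size)
captions = torch.randint(1, vocab_size, (1, caption_length))  # ids > 0 by construction

# Repo convention: dummy zeros for the prefix, then ignore_index=0 in the loss.
labels_zeros = torch.cat(
    [torch.zeros(1, prefix_length, dtype=torch.int64), captions], dim=1
)
loss_zeros = F.cross_entropy(
    logits.reshape(-1, vocab_size), labels_zeros.reshape(-1), ignore_index=0
)

# Suggested convention: -100 for the prefix, PyTorch's default ignore_index=-100.
labels_minus100 = torch.cat(
    [torch.full((1, prefix_length), -100, dtype=torch.int64), captions], dim=1
)
loss_minus100 = F.cross_entropy(
    logits.reshape(-1, vocab_size), labels_minus100.reshape(-1)
)

assert torch.allclose(loss_zeros, loss_minus100)  # same positions masked, same loss
```

The one caveat is that ignore_index=0 also drops any genuine occurrences of token 0 (and 0-padding) from the loss, so the two conventions only coincide when id 0 never appears as a real target.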

@soumyasanyal
Author

I see. Forgot to check there. Thanks!
