
Incorrect use of labels for GPT2? #41

Closed
soumyasanyal opened this issue Jun 2, 2022 · 2 comments

Comments

@soumyasanyal

Hi,

As far as I understand the model and the usage of GPT2, shouldn't the get_dummy_token function return torch.ones() * -100 instead of torch.zeros()? We should be ignoring GPT2's outputs for these prefix inputs, but currently the zero labels force the model to predict token 0, which is the exclamation mark ("!").

Reference lines: https://github.com/rmokady/CLIP_prefix_caption/blob/main/train.py#L222-L223

Thanks!
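
For context, the referenced lines build the dummy prefix labels roughly as follows (a paraphrased sketch for illustration, not a verbatim quote from the repo):

```python
import torch

# Paraphrased sketch of get_dummy_token from train.py: the labels for the
# prefix positions are all zeros, i.e. token id 0 in the GPT2 vocabulary.
# These dummy labels are concatenated in front of the caption tokens.
def get_dummy_token(batch_size: int, prefix_length: int, device: torch.device) -> torch.Tensor:
    return torch.zeros(batch_size, prefix_length, dtype=torch.int64, device=device)
```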

@robertodessi

This line sets the loss to ignore 0-index tokens, which is equivalent to the default -100 (ignore_index) behavior of PyTorch's cross entropy: https://github.com/rmokady/CLIP_prefix_caption/blob/main/train.py#L316
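
For what it's worth, here is a minimal self-contained sketch of that equivalence (illustrative names and shapes, assuming the real caption token ids are always greater than 0, so that ignoring id 0 masks exactly the prefix positions):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, prefix_length, caption_length = 50257, 4, 6
seq_length = prefix_length + caption_length

logits = torch.randn(1, seq_length, vocab_size)
captions = torch.randint(1, vocab_size, (1, caption_length))  # ids > 0 by construction

# Repo convention: dummy zeros for the prefix, then ignore_index=0 in the loss.
labels_zeros = torch.cat(
    [torch.zeros(1, prefix_length, dtype=torch.int64), captions], dim=1
)
loss_zeros = F.cross_entropy(
    logits.reshape(-1, vocab_size), labels_zeros.reshape(-1), ignore_index=0
)

# Suggested convention: -100 for the prefix, PyTorch's default ignore_index=-100.
labels_minus100 = torch.cat(
    [torch.full((1, prefix_length), -100, dtype=torch.int64), captions], dim=1
)
loss_minus100 = F.cross_entropy(
    logits.reshape(-1, vocab_size), labels_minus100.reshape(-1)
)

assert torch.allclose(loss_zeros, loss_minus100)  # same positions masked, same loss
```

The one caveat is that ignore_index=0 also drops any genuine occurrences of token 0 (and 0-padding) from the loss, so the two conventions only coincide when id 0 never appears as a real target.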

@soumyasanyal
Author

I see. Forgot to check there. Thanks!
