Labels for CLS, SEP and PAD as well as X #30

Closed
namiyousef opened this issue Mar 29, 2022 · 0 comments

namiyousef commented Mar 29, 2022

  • Do CLS and SEP need labels that are separate from the training labels?
    Yes. Label them as -100 so that they are excluded from the loss.

  • Does PAD need its own separate label?
    No. PAD gets the same -100 label as CLS and SEP.

  • Are positions with attention mask 0 effectively ignored, or do they still matter?
    Yes, they are ignored: positions with attention mask 0 are not attended to. Set the attention mask to 0 for PAD, but keep it at 1 for CLS and SEP, because they can carry important information about the training items. The idea is:
    for PAD: don't attend and don't compute loss
    for CLS/SEP: attend but don't compute loss
    There is a thread on the Hugging Face forums about this.
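The rules above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the helper name `build_labels_and_mask`, the token strings `[CLS]`, `[SEP]`, `[PAD]`, and the example labels are hypothetical and not taken from this repository.

```python
IGNORE_INDEX = -100  # label for positions that should not contribute to the loss


def build_labels_and_mask(tokens, token_labels, max_len,
                          cls="[CLS]", sep="[SEP]", pad="[PAD]"):
    """Wrap tokens with CLS/SEP, pad to max_len, and build labels and attention mask.

    Hypothetical helper for illustration only.
    """
    if len(tokens) != len(token_labels):
        raise ValueError("tokens and token_labels must have the same length")
    if len(tokens) + 2 > max_len:
        raise ValueError("max_len is too small for tokens plus CLS and SEP")

    n_pad = max_len - len(tokens) - 2
    seq = [cls] + list(tokens) + [sep] + [pad] * n_pad
    # CLS, SEP and PAD all get the ignore label, so none of them adds to the loss.
    labels = [IGNORE_INDEX] + list(token_labels) + [IGNORE_INDEX] + [IGNORE_INDEX] * n_pad
    # CLS and SEP are attended to (mask 1); PAD is not (mask 0).
    attention_mask = [1] * (len(tokens) + 2) + [0] * n_pad
    return seq, labels, attention_mask


seq, labels, attention_mask = build_labels_and_mask(
    ["the", "cat", "sat"], [0, 1, 2], max_len=7
)
print(seq)
print(labels)
print(attention_mask)
```

Note that the labels and the attention mask are independent: the mask controls what the model attends to, while the -100 label controls what enters the loss.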

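The value -100 is used for the CLS and SEP tokens because it is the default ignore_index of the cross-entropy loss in PyTorch (torch.nn.CrossEntropyLoss): targets equal to that value do not contribute to the loss. You can give subtokens the same label if you want to ignore them as well.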
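To make the effect of the -100 label concrete, here is a dependency-free sketch of a mean cross-entropy that skips ignored targets, which mirrors what `torch.nn.CrossEntropyLoss` does with its default `ignore_index=-100` and mean reduction. The function name `masked_cross_entropy` and the example logits are hypothetical; in practice you would call the PyTorch loss directly.

```python
import math

IGNORE_INDEX = -100


def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over the positions whose label is not ignore_index."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # e.g. CLS, SEP, PAD: no contribution to the loss
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[label]
        count += 1
    if count == 0:
        # Design choice of this sketch: refuse to average over zero positions.
        raise ValueError("all labels are ignored")
    return total / count


# Three positions, two classes; only the middle position has a real label.
logits = [[2.0, 0.0], [0.0, 2.0], [5.0, -5.0]]
labels = [IGNORE_INDEX, 1, IGNORE_INDEX]
loss = masked_cross_entropy(logits, labels)
print(loss)
```

Changing the logits at the ignored positions leaves the loss unchanged, which is the behaviour wanted for CLS, SEP and PAD.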

namiyousef added the "bug" label (Something isn't working) on Apr 1, 2022