Using different loss #19

Open
vinevix opened this issue Feb 8, 2023 · 1 comment

Comments


vinevix commented Feb 8, 2023

Hi, I'm running experiments with your code and need to use a loss other than CrossEntropy. Since the targets are padded with -1, you pass the ignore_index=-1 flag to the loss. How can I avoid computing gradients for those -1 positions when using other losses, which don't offer an ignore_index flag?

buxiangzhiren (Owner) commented

You can find the indices of the tokens whose target is not -1 (the ignore_index value) and keep only the corresponding losses. In other words, you don't need to sum the losses of all tokens in a sentence; sum only the losses of the tokens whose target != -1.
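A minimal sketch of that masking idea, assuming PyTorch, logits flattened to shape (N, C), and targets of shape (N,) padded with -1. The names `masked_loss` and `focal_loss` are hypothetical and not part of this repository; focal loss is only a stand-in for any per-token loss that lacks ignore_index.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -1  # padding value used for the targets, as described in the issue


def masked_loss(logits, targets, per_token_loss):
    """Average a per-token loss over the non-padded positions only.

    logits: (N, C) float tensor; targets: (N,) long tensor with -1 at padding.
    per_token_loss: callable returning an unreduced loss of shape (M,).
    """
    mask = targets != IGNORE_INDEX
    # Select valid positions first, so padded targets never reach the loss.
    return per_token_loss(logits[mask], targets[mask]).mean()


def focal_loss(logits, targets, gamma=2.0):
    """Stand-in per-token loss with no ignore_index option (unreduced)."""
    log_p = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
    return -((1.0 - log_p.exp()) ** gamma) * log_p


torch.manual_seed(0)
logits = torch.randn(5, 4, requires_grad=True)
targets = torch.tensor([2, 0, -1, 3, -1])  # positions 2 and 4 are padding

loss = masked_loss(logits, targets, focal_loss)
loss.backward()
# Rows of logits.grad at the padded positions stay zero.
```

If the model outputs logits of shape (B, T, C), flatten them with `logits.view(-1, C)` and `targets.view(-1)` before calling this. Note that a batch containing only padding would give an empty selection and a NaN mean, so that case needs a guard.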
