
why specify ignore_index=0 in the NLLLoss function in BERTTrainer? #98

Open · Jasmine969 opened this issue Jul 7, 2022 · 1 comment

Jasmine969 commented Jul 7, 2022

trainer/pretrain.py

class BERTTrainer:
    def __init__(self, ...):
        ... 
        # Using Negative Log Likelihood Loss function for predicting the masked_token
        self.criterion = nn.NLLLoss(ignore_index=0)
        ...

I cannot understand why ignore_index=0 is specified when calculating NLLLoss. For the NSP task, if the ground-truth is_next label is False (label = 0) but BERT predicts True, that sample is excluded from the loss entirely (and the batch loss becomes nan if every label is 0). So what is the purpose of ignore_index=0?

====================

Well, I've found that ignore_index=0 is useful for the MLM task, but I still don't think the NSP task should share the same NLLLoss instance with MLM.
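To make the concern concrete, here is a small sketch (mine, not code from this repo) showing that with ignore_index=0, an NSP sample whose true label is 0 contributes nothing to the loss:

```python
import torch
import torch.nn as nn

# Two NSP samples: log-probabilities over the 2 classes (is_next / not).
log_probs = torch.log_softmax(torch.tensor([[2.0, 0.5],
                                            [0.3, 1.7]]), dim=-1)
labels = torch.tensor([0, 1])  # first sample: is_next = False (label 0)

shared = nn.NLLLoss(ignore_index=0)  # as in BERTTrainer
plain = nn.NLLLoss()

# The shared criterion skips every target equal to 0, so only the
# label-1 sample is averaged; the label-0 sample gives no gradient.
print(shared(log_probs, labels))  # equals the loss of sample 2 alone
print(plain(log_probs, labels))   # averages over both samples

# If every label in the batch is 0, all targets are ignored and the
# mean reduction divides 0 by 0, yielding nan.
print(shared(log_probs, torch.tensor([0, 0])))
```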

MingchangLi commented Jan 10, 2023

See #32. Change

self.criterion = nn.NLLLoss(ignore_index=0)

to

self.criterion = nn.NLLLoss()
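Another option (a sketch of my own with made-up shapes, not this repo's actual code) is to keep ignore_index=0 for MLM, where index 0 marks padding/unmasked positions, and give NSP its own criterion, since 0 is a real class there:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

mlm_criterion = nn.NLLLoss(ignore_index=0)  # 0 = pad/unmasked, skipped
nsp_criterion = nn.NLLLoss()                # 0 = "not next", a real class

# Dummy shapes: batch 2, seq_len 4, vocab 10, 2 NSP classes.
mlm_log_probs = torch.log_softmax(torch.randn(2, 4, 10), dim=-1)
mlm_labels = torch.tensor([[0, 5, 0, 3],
                           [7, 0, 0, 0]])   # 0 = do not score this position
nsp_log_probs = torch.log_softmax(torch.randn(2, 2), dim=-1)
nsp_labels = torch.tensor([0, 1])           # both samples must count

# NLLLoss over sequences expects (N, C, L), so move the class dim.
loss = mlm_criterion(mlm_log_probs.transpose(1, 2), mlm_labels) \
     + nsp_criterion(nsp_log_probs, nsp_labels)
print(loss.item())
```

This keeps the padding-skipping behavior the MLM task needs without silently discarding half of the NSP supervision.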
