Closed
Description
Hi,
Thank you for your clean BERT code. I have a question about the masked LM loss after reading it. Your program computes the masked language model loss on both positive and negative sentence pairs.
Does it make sense to compute the masked LM loss on negative sentence pairs? I am not sure how Google computes this loss.
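To make the question concrete, here is a minimal sketch of the two options in plain Python. It is not taken from the repository: the batch layout, the `masked_lm_loss` helper, and the `positives_only` flag are all hypothetical names I made up for illustration.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target index under a probability list."""
    return -math.log(probs[target])


def masked_lm_loss(batch, positives_only=False):
    """Mean masked LM loss over the masked positions in a batch.

    Hypothetical batch layout (an assumption, not the repository's format):
      'is_next': True for a positive sentence pair, False for a negative one
      'masked' : list of (predicted_probs, target_index) per masked token
    """
    losses = []
    for example in batch:
        if positives_only and not example["is_next"]:
            continue  # option B: leave negative pairs out of the masked LM loss
        for probs, target in example["masked"]:
            losses.append(cross_entropy(probs, target))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: one positive pair and one negative pair.
batch = [
    {"is_next": True, "masked": [([0.7, 0.2, 0.1], 0)]},
    {"is_next": False, "masked": [([0.1, 0.8, 0.1], 1), ([0.25, 0.25, 0.5], 2)]},
]

loss_all = masked_lm_loss(batch)                       # option A: every pair
loss_pos = masked_lm_loss(batch, positives_only=True)  # option B: positives only
```

Option A is what I understand the current code to do; option B is the alternative I am asking about.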