
Conversation

@mattdangerw (Member):

We were calling cumsum with the wrong axis, meaning we were not correctly masking all positions after an end token.
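The axis bug is easiest to see in a small NumPy sketch. The token ids, the `END_TOKEN_ID` value, and the convention that the mask keeps the end token itself are illustrative assumptions, not the PR's actual code, which operates on `[batch, sequence]` tensors:

```python
import numpy as np

END_TOKEN_ID = 2  # hypothetical end-of-sequence token id

# A [batch, sequence] batch of token ids; row 0 ends early.
token_ids = np.array([
    [5, 7, 2, 9, 9],
    [5, 7, 9, 9, 2],
])

end_locations = (token_ids == END_TOKEN_ID).astype("int32")
# The cumsum must run along the sequence axis (the last axis): a running
# count, per row, of end tokens seen so far. Running it along the batch
# axis instead mixes counts across unrelated sequences, so positions
# after an end token in one row never get masked -- the bug fixed here.
cumulative = np.cumsum(end_locations, axis=-1)
# Valid positions are those at or before the first end token.
padding_mask = (cumulative - end_locations) == 0
# padding_mask[0] -> [ True, True, True, False, False]
# padding_mask[1] -> [ True, True, True, True, True]
```

Subtracting `end_locations` back out keeps the end token itself unmasked while everything after it is masked.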

@chenmoneygithub (Contributor) left a comment:
Thanks! I am surprised we did not catch this...

self.preprocessed_batch["padding_mask"][:, :5],
)

def test_early_stopping(self):

Thanks for the test case, this is very cool
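The body of `test_early_stopping` is not shown above; as a hedged sketch only, an early-stopping check built on the same cumsum-based mask might look like this (the helper name `all_rows_ended` and the token ids are made up for illustration):

```python
import numpy as np

END_TOKEN_ID = 2  # hypothetical end token id

def all_rows_ended(token_ids, end_token_id=END_TOKEN_ID):
    """True once every sequence in the batch contains an end token."""
    # Counting end tokens along the sequence axis (the corrected axis)
    # tells us, per row, whether an end token has appeared yet; the last
    # column of the cumsum holds each row's total count.
    ended = np.cumsum(token_ids == end_token_id, axis=-1)[:, -1] > 0
    return bool(ended.all())

# Row 1 has no end token yet -> generation must continue.
assert not all_rows_ended(np.array([[5, 2, 0], [7, 8, 9]]))
# Both rows contain an end token -> the decode loop can stop early.
assert all_rows_ended(np.array([[5, 2, 0], [7, 8, 2]]))
```

With the cumsum on the wrong axis, a row's "ended" state would leak across the batch, so a loop gated on this check could stop too early or too late.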

@mattdangerw force-pushed the fix-update-mask-computation branch from a27e239 to 675fba6 on May 11, 2023 at 00:17.
@mattdangerw (Member, Author):

/gcbrun
