This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Division by Zero bug in MLM Metric Reporter #968

Closed


kartikayk
Contributor

Summary:
When MLM training runs with fp16, metric reporting can fail with a division-by-zero error while computing the training speed (an example is below). This change prevents the failure by adding a small value to the denominator.

Example: f136667996

Reviewed By: chenyangyu1988

Differential Revision: D17293910
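
For readers without access to the diff, the guard described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `tokens_per_second`, the constant `EPSILON`, and its value `1e-8` are assumptions made for the example (the summary says only "a small value").

```python
# Hypothetical sketch of the fix described above; names and the epsilon
# value are illustrative assumptions, not taken from the PyText source.
EPSILON = 1e-8  # assumed magnitude of the "small value"


def tokens_per_second(
    num_tokens: int, elapsed_seconds: float, eps: float = EPSILON
) -> float:
    """Return training speed, guarding against a zero elapsed time.

    Adding eps to the denominator keeps the division defined even when
    elapsed_seconds is exactly 0.0.
    """
    return num_tokens / (elapsed_seconds + eps)


if __name__ == "__main__":
    # Without the guard, 1000 / 0.0 would raise ZeroDivisionError.
    print(tokens_per_second(1000, 0.0))
    print(tokens_per_second(1000, 2.0))
```

With this approach a zero denominator yields a very large but finite speed instead of an exception, so the reported number for that interval is not meaningful; the trade-off is that metric reporting no longer aborts training.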

@facebook-github-bot added the CLA Signed label on Sep 10, 2019
Summary:
Pull Request resolved: facebookresearch#968

fbshipit-source-id: 2243e5fecb05ac666c3e084900e95d428ec55428
@facebook-github-bot
Contributor

This pull request has been merged in f64cb10.
