Regression: MNIST example produces NaN for loss #208

Closed
antimora opened this issue Mar 7, 2023 · 0 comments · Fixed by #209

Comments

antimora (Collaborator) commented Mar 7, 2023

Describe the bug
Training the MNIST example reports NaN for the loss:


[Metrics]
  - Train Accuracy: epoch 0.00 % - batch 0.00 %
  - Train Loss: epoch NaN - batch NaN
  - Valid Accuracy: epoch 0.00 % - batch 0.00 %
  - Valid Loss: epoch NaN - batch NaN

[Progress]
  - Iteration 1422 Epoch 2/6
  - iteration [############################>-----------------------------------------------------------------------] (84s)
  - epoch     [#################################>-------------------------------------------------------------------] (4m)

To Reproduce

  1. Check out the latest commit: be96160
  2. cd into examples/mnist
  3. Run: cargo run --example mnist --release --features tch-gpu

Expected behavior

Desktop:

[mnist]$ uname -a
Darwin MacBook-Pro.local 22.3.0 Darwin Kernel Version 22.3.0: Mon Jan 30 20:38:37 PST 2023; root:xnu-8792.81.3~2/RELEASE_ARM64_T6000 arm64

Additional context
It worked on a commit from a few days ago.
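
To narrow down where the loss first turns into NaN, a small guard on the per-batch loss values may help. The following is only a sketch: `first_non_finite` is a hypothetical helper, not part of the project's API, and it assumes the losses are available as plain `f32` values (the numbers below are made up).

```rust
/// Hypothetical helper (not part of any library): returns the index of the
/// first loss value that is NaN or infinite, if there is one.
fn first_non_finite(losses: &[f32]) -> Option<usize> {
    losses.iter().position(|loss| !loss.is_finite())
}

fn main() {
    // Made-up per-batch losses, for illustration only.
    let losses = [2.30, 1.85, f32::NAN, f32::NAN];
    match first_non_finite(&losses) {
        Some(i) => println!("loss became non-finite at batch {i}"),
        None => println!("all losses are finite"),
    }
}
```

Running a check like this on a known-good commit and on the failing one could show whether the loss is NaN from the first batch or only diverges later.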
