
Loss returns NaN #6

Open
terencenwz opened this issue Apr 8, 2021 · 3 comments

Comments

@terencenwz

Loss returns NaN

Some of my settings:
causal=true
blindspot_size=1
n_local_attn_heads
ff_chunks=1
reversible=false
use_axial_pos_emb=false

@lucidrains
Owner

Hi Terence! Do you want to give the latest version a try?

@terencenwz
Author

> Hi Terence! Do you want to give the latest version a try?

Loss goes to NaN after 50k updates on 0.18.1
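For what it's worth, a generic stopgap for intermittent NaN losses (my own sketch, not something from this library) is to skip the optimizer step whenever the loss comes back non-finite, so a single bad batch doesn't poison the weights while you hunt for the root cause:

```python
import math

def should_skip_step(loss_value: float) -> bool:
    """Return True when the loss is NaN or inf and the update should be skipped."""
    return not math.isfinite(loss_value)

# Hypothetical training loop: `losses` stands in for per-batch loss values.
losses = [0.9, 0.7, float("nan"), 0.6]
applied, skipped = 0, 0
for loss in losses:
    if should_skip_step(loss):
        skipped += 1   # drop the batch instead of backpropagating a NaN
        continue
    applied += 1       # loss.backward() / optimizer.step() would go here
print(applied, skipped)  # → 3 1
```

This only masks the symptom; gradient clipping or a lower learning rate is usually the longer-term fix.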

@ShivanshuPurohit

How do you train that far? I'm using the deepspeed example and it terminates after 3k steps with seq_len 256, but at least until then the loss doesn't go to NaN.
