Loss returns Nan #6

terencenwz · 2021-04-08T13:06:59Z

The loss returns NaN.

Some of my settings:
causal=true
blindspot_size=1
n_local_attn_heads
ff_chunks=1
reversible=false
use_axial_pos_emb=false

lucidrains · 2021-04-14T03:32:55Z

Hi Terence! do you want to give the latest version a try?

terencenwz · 2021-04-14T08:39:04Z

> Hi Terence! do you want to give the latest version a try?

With version 0.18.1, the loss goes to NaN after 50k updates.
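For anyone trying to pin down exactly which update the loss first breaks on, here is a minimal sketch in plain Python. The helper name `first_non_finite_step` and the sample loss history are hypothetical and not part of the library discussed here; in a real run the values would come from something like `loss.item()` logged at each step.

```python
import math


def first_non_finite_step(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Hypothetical loss history, made up for illustration only.
history = [2.31, 1.87, 1.42, float("nan"), float("nan")]
bad_step = first_non_finite_step(history)
```

Knowing the first bad step makes it easier to reload a nearby checkpoint and inspect the batch, gradients, and learning rate at that point.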

ShivanshuPurohit · 2021-05-13T07:32:37Z

How do you train that far? I'm using the DeepSpeed example and it terminates after 3k steps with seq_len 256, but at least up to that point the loss doesn't go to NaN.

tatp22 mentioned this issue Apr 9, 2021

Loss goes to 0 when using LinformerLM tatp22/linformer-pytorch#25

Closed
