RM training loss becomes NaN after the first training step. #288

lixsh6 · 2024-05-11T07:31:45Z

I am using a large model (> 170B) as my reward model. At the very beginning the loss is normal, but after one training step it becomes NaN. This did not happen when I trained the RM from a smaller base model (e.g., 30B). Do you have any suggestions?

hijkzzz · 2024-05-11T14:17:06Z

We don't have a model that large to test with. Maybe it is related to DeepSpeed?
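One possible way to narrow this down is to check where the first non-finite value shows up on the second step. The snippet below is only a sketch in plain Python, not something run at this scale. It assumes the RM is trained with the common pairwise loss `-log(sigmoid(chosen - rejected))`; the function names `pairwise_rm_loss` and `first_non_finite` are hypothetical and not taken from any library.

```python
import math


def pairwise_rm_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Assumed pairwise RM loss: -log(sigmoid(chosen - rejected)).

    Computed as softplus(-margin) so that a large negative margin
    does not go through log(0).
    """
    margin = chosen_reward - rejected_reward
    # softplus(-m) = max(-m, 0) + log1p(exp(-|m|)); exp() only ever sees
    # a non-positive argument, so it cannot overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def first_non_finite(named_values):
    """Return the name of the first NaN/inf value, or None if all are finite."""
    for name, value in named_values:
        if not math.isfinite(value):
            return name
    return None


# Hypothetical values logged on the step where the loss turns NaN.
chosen, rejected = 1.3, float("nan")
loss = pairwise_rm_loss(chosen, rejected)
culprit = first_non_finite(
    [("chosen_reward", chosen), ("rejected_reward", rejected), ("loss", loss)]
)
print(culprit)
```

If the rewards themselves are already non-finite on the second forward pass, the loss formula is probably not the cause, and the weights after the first optimizer update (for example, an overflow in reduced precision) would be the next thing to inspect.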
