
Fix convergence issues and switch to LLaMA in the SST-2 example #343

Merged (2 commits into main) on Jul 12, 2023

Conversation

@mryab (Member) commented on Jul 12, 2023

This PR switches the model in the SST-2 example notebook to LLaMA and fixes the NaNs that appeared during training by using proper mixed precision with loss scaling. For the sake of simplicity, I've removed deep p-tuning for now; I will restore it when I fix the PersonaChat notebook as well.
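As a side note on why loss scaling fixes these NaNs: small backward-pass gradients underflow to zero in fp16, which destabilizes training. The sketch below (not the notebook's actual code; all names and values are illustrative) models fp16 storage by flushing magnitudes below the smallest positive fp16 subnormal, 2**-24, to zero, and shows how scaling the loss keeps such gradients representable:

```python
# Illustrative sketch of loss scaling in mixed-precision training.
# fp16's smallest positive (subnormal) value is 2**-24 ~= 5.96e-8;
# we crudely model fp16 storage as flushing smaller magnitudes to zero.

FP16_MIN = 2.0 ** -24

def to_fp16(x: float) -> float:
    """Crude fp16 model: flush magnitudes below the subnormal minimum to 0."""
    return 0.0 if abs(x) < FP16_MIN else x

grad = 1e-8            # a small gradient produced during the backward pass
scale = 2.0 ** 16      # a typical initial loss scale

without_scaling = to_fp16(grad)           # underflows: gradient is lost
with_scaling = to_fp16(grad * scale)      # representable in fp16
recovered = with_scaling / scale          # unscaled in fp32 before the update

print(without_scaling)  # 0.0 -> lost gradients can stall or destabilize training
print(recovered)        # the original gradient survives the fp16 round-trip
```

In practice the same effect is obtained with a framework-provided gradient scaler (e.g. PyTorch's `torch.cuda.amp.GradScaler`), which also skips optimizer steps when scaled gradients overflow to inf/NaN.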

The example run is here: https://wandb.ai/mryab/bloom-sst-2/runs/3e0suayd?workspace=user-mryab. As we can see, the loss is still a bit unstable, but at least we have no outright divergence:

[screenshot: W&B training loss curve]

When merged, this should fix #327, #283 and #241.

@justheuristic (Collaborator) left a comment

LGTM

@mryab mryab merged commit 13f4e3a into main Jul 12, 2023
6 of 7 checks passed
@mryab mryab deleted the fix_sst2_example branch July 12, 2023 12:50
@poedator (Collaborator) commented on Jul 21, 2023

> The example run is here: https://wandb.ai/mryab/bloom-sst-2/runs/3e0suayd?workspace=user-mryab

@mryab, this link is not valid (404).

@mryab (Member, Author) commented on Jul 21, 2023

Just made it public, thanks for noticing!
