Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert notebook 05 #80

Merged
merged 7 commits into from
Jan 25, 2023
Merged

Convert notebook 05 #80

merged 7 commits into from
Jan 25, 2023

Conversation

edbeeching
Copy link
Collaborator

@edbeeching edbeeching commented Jan 7, 2023

This PR converts notebook 5, gpt2 sentiment control to work with new API.

I benchmarked for 2 hours - 200 iterations, the wandb report is here, here is the original one for comparison. (currently private, owned by @lvwerra )

Resolves issues #71 and #79

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jan 7, 2023

The documentation is not available anymore as the PR was closed or merged.

@huggingface huggingface deleted a comment from leandro Jan 9, 2023
@lvwerra
Copy link
Member

lvwerra commented Jan 13, 2023

Hi @edbeeching, thanks for updating the notebook! Looks really good, here a few points:

  • similar to the main sentiment notebook we could just use the pipeline for the classification and get rid of build_bert_batch_from_text
  • we can indeed use accelerate for the device placement. ppo_trainer.accelerator.device should be where you get the device.
  • if we replace sentiment_model.to(device) with sentiment_model.to(device); we can avoid printing the whole model graph
  • I wonder if we should remove most of the logging from the loop for simplicity, what do you think?
  • All the controlled continuation could be done much easier with the text-generation pipeline. I was young and didn't know any better :)

@lvwerra lvwerra mentioned this pull request Jan 16, 2023
26 tasks
Copy link
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very good to me! Thanks for working on this !

@lvwerra
Copy link
Member

lvwerra commented Jan 25, 2023

There are still some issues regarding loss spikes (see #101). Will merge for now and investigate further. @natolambert you can use this notebook to see investigate the spikes. Full logs of a run can be found here.

@lvwerra lvwerra merged commit 6b37618 into main Jan 25, 2023
@lvwerra lvwerra deleted the convert-notebook-05 branch January 25, 2023 12:19
@edbeeching
Copy link
Collaborator Author

@lvwerra and @younesbelkada , thanks for looking at, fixing and merging this. I have gone a bit silent due to pat leave, looking forward to getting back to work :)

@natolambert
Copy link
Contributor

natolambert commented Jan 30, 2023

I am going to make another PR where this notebook is in example form -- much easier for doing multiple jobs and wider scale experimentation. It's also interesting that @edbeeching 's example didn't have the reward spike. I keep finding things to play with, so that's good for now.

@lvwerra lvwerra mentioned this pull request Feb 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants