Convert notebook 05 #80

edbeeching · 2023-01-07T20:22:00Z

This PR converts notebook 5 (GPT-2 sentiment control) to work with the new API.

I benchmarked for 2 hours (200 iterations). The wandb report is here, and here is the original one for comparison (currently private, owned by @lvwerra).

Resolves issues #71 and #79

…batches

HuggingFaceDocBuilderDev · 2023-01-07T20:31:47Z

The documentation is not available anymore as the PR was closed or merged.

lvwerra · 2023-01-13T15:10:49Z

Hi @edbeeching, thanks for updating the notebook! Looks really good, here are a few points:

Similar to the main sentiment notebook, we could just use the pipeline for the classification and get rid of build_bert_batch_from_text.
We can indeed use accelerate for the device placement. ppo_trainer.accelerator.device should be where you get the device.
If we replace sentiment_model.to(device) with sentiment_model.to(device); (note the trailing semicolon), we can avoid printing the whole model graph.
I wonder if we should remove most of the logging from the loop for simplicity, what do you think?
All the controlled continuation could be done much more easily with the text-generation pipeline. I was young and didn't know any better :)
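To make the first two points concrete, here is a minimal sketch of what replacing build_bert_batch_from_text with a pipeline might look like. It is not the notebook's code: `build_sentiment_pipe` and `scores_for_label` are hypothetical helper names, the model name is left as a parameter, and the label strings and scores in the example data are made up for illustration. The device argument is where ppo_trainer.accelerator.device would be passed.

```python
from typing import Dict, List, Sequence

# One entry per label for a single text: {"label": ..., "score": ...}
PipeOutput = List[Dict[str, object]]


def build_sentiment_pipe(model_name: str, device):
    """Create a sentiment-analysis pipeline on the given device.

    transformers is imported lazily so the helper below works without it.
    `device` is assumed to come from ppo_trainer.accelerator.device.
    """
    from transformers import pipeline

    return pipeline("sentiment-analysis", model=model_name, device=device)


def scores_for_label(pipe_outputs: Sequence[PipeOutput], label: str) -> List[float]:
    """Pick the score of `label` from each per-text list of label/score dicts."""
    scores = []
    for per_text in pipe_outputs:
        # Look the label up by name instead of relying on list position.
        by_label = {entry["label"]: float(entry["score"]) for entry in per_text}
        scores.append(by_label[label])
    return scores


# Hand-written stand-in for pipeline output when all label scores are requested.
example_outputs = [
    [{"label": "NEGATIVE", "score": -2.0}, {"label": "POSITIVE", "score": 2.5}],
    [{"label": "POSITIVE", "score": -1.0}, {"label": "NEGATIVE", "score": 1.5}],
]
positive_scores = scores_for_label(example_outputs, "POSITIVE")
```

With a real pipeline, the per-text lists would come from calling it on a batch of strings while asking for all label scores (depending on the transformers version, via `return_all_scores=True` or `top_k=None`). Looking scores up by label name rather than by index is a deliberate choice here, since the order of the returned entries is not guaranteed to be the same across versions.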

younesbelkada

This looks very good to me! Thanks for working on this!

lvwerra · 2023-01-25T12:03:11Z

There are still some issues regarding loss spikes (see #101). Will merge for now and investigate further. @natolambert you can use this notebook to investigate the spikes. Full logs of a run can be found here.

edbeeching · 2023-01-25T14:46:21Z

@lvwerra and @younesbelkada, thanks for looking at, fixing and merging this. I have gone a bit silent due to paternity leave, looking forward to getting back to work :)

natolambert · 2023-01-30T21:48:56Z

I am going to make another PR where this notebook is in example form -- much easier for doing multiple jobs and wider-scale experimentation. It's also interesting that @edbeeching's example didn't have the reward spike. I keep finding things to play with, so that's good for now.

edbeeching added 5 commits January 7, 2023 21:24

converting notebook

6c6ba2d

converts sentiment control notebook to new API

4f1e689

adds drop last to PPOTrainer for consistent batch size

977cb08

delete copy of original notebook

85b0db7

Removes change to ppo_trainer.py, uses next(iter(dataloader)) to get …

320826c

…batches
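Two of the commits above deal with keeping batch sizes consistent. As a rough illustration of why dropping the last batch helps, here is a plain-Python sketch of the idea behind the `drop_last` option of PyTorch's `DataLoader`. It is not the code from either commit, and `batches` is a hypothetical helper written only for this example.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batches(items: Sequence[T], batch_size: int, drop_last: bool) -> Iterator[List[T]]:
    """Yield consecutive batches; optionally skip a final short batch."""
    for start in range(0, len(items), batch_size):
        batch = list(items[start:start + batch_size])
        # Only the final batch can be short, so stopping here is safe.
        if drop_last and len(batch) < batch_size:
            return
        yield batch


samples = list(range(10))
kept = list(batches(samples, batch_size=4, drop_last=False))
dropped = list(batches(samples, batch_size=4, drop_last=True))
```

With 10 samples and a batch size of 4, keeping the last batch leaves a trailing batch of 2, while dropping it means every batch has exactly 4 items, which is what code that assumes a fixed batch size needs.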

edbeeching force-pushed the convert-notebook-05 branch from 0cbeb81 to 320826c Compare January 7, 2023 20:28

huggingface deleted a comment from leandro Jan 9, 2023

edbeeching requested a review from lvwerra January 10, 2023 10:52

lvwerra mentioned this pull request Jan 16, 2023

Roadmap - trl 0.2 #64

Closed

26 tasks

leandro added 2 commits January 25, 2023 12:44

Merge branch 'main' into convert-notebook-05

4ec3c11

update notebook

1c36cf5

lvwerra requested a review from younesbelkada January 25, 2023 11:52

younesbelkada approved these changes Jan 25, 2023


lvwerra merged commit 6b37618 into main Jan 25, 2023

lvwerra deleted the convert-notebook-05 branch January 25, 2023 12:19

natolambert mentioned this pull request Jan 28, 2023

Error in gpt2-sentiment-control.ipynb #79

Closed

lvwerra mentioned this pull request Jan 30, 2023

05-gpt2-sentiment-control.ipynb notebook, gpt2_tokeinizer is missing #48

Closed

lvwerra mentioned this pull request Feb 3, 2023

Spikes in PPO policy loss #101

Closed
