
[bug] support for large forward_batch_size in seq2seq models #100

Closed

Conversation

@younesbelkada (Contributor) commented on Jan 23, 2023

What does this PR do?

First of all, please welcome PR number 100!

Before this PR, it was not possible to run the T5 example with forward_batch_size > 1. We suspected that something was wrong with the computation of the loss function when padding tokens were present.

This PR fixes a bug we had for seq2seq models (and, I believe, for causal LM models too). It seems we forgot to mask out the logits/hidden states that correspond to the pad tokens when computing the loss function.

See a similar implementation here: https://github.com/CarperAI/trlx/blob/main/trlx/trainer/nn/ppo_models.py#L166-L191 where the loss takes care of ignoring the terms corresponding to pad tokens.
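
To make the idea concrete, here is a minimal sketch (not the code in this PR, just an illustration with made-up values) of building the mask from the pad token and keeping pad positions out of the loss:

```python
import torch

# Hypothetical toy values for illustration; in practice the pad id comes from
# the tokenizer and the per-token terms from the PPO loss computation.
pad_token_id = 0

# Toy batch of right-padded token ids, shape (batch, seq_len).
input_ids = torch.tensor([[5, 8, 3, 0, 0],
                          [7, 2, 9, 4, 0]])

# Attention mask built from the pad token: 1 for real tokens, 0 for padding.
mask = (input_ids != pad_token_id).float()

# Per-token loss terms (e.g. the PPO policy/value terms), same shape.
per_token_loss = torch.randn(input_ids.shape)

# Zero out the pad positions and average only over real tokens,
# so padding no longer contributes to the loss.
loss = (per_token_loss * mask).sum() / mask.sum()
```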

Can also add tests!

GPT2 run: https://wandb.ai/distill-bloom/trl/runs/4cs5z6j3?workspace=user-younesbelkada
T5 run: https://wandb.ai/distill-bloom/trl/runs/lxgi5ae9?workspace=user-younesbelkada
It seems that the reward_std is now higher, leading to less smooth reward_mean curves, so I am putting this PR in draft.

cc @lvwerra

- fix attention mask & loss issue
- fix example
- add attention mask construction in model
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

- comment calls to `_return_slice_index` function
- add `offset` for `padding_side=left` support
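
For context on the `offset` commit above: with `padding_side=left` the pad tokens sit at the start of each sequence, so a per-example offset to the first real token can be derived from the pad positions. A minimal sketch (hypothetical values, not the PR's actual code):

```python
import torch

pad_token_id = 0  # e.g. tokenizer.pad_token_id

# Left-padded toy batch: pad tokens come first, real tokens at the end.
input_ids = torch.tensor([[0, 0, 5, 8, 3],
                          [0, 7, 2, 9, 4]])

# Number of leading pad tokens per row = index of the first real token.
offsets = (input_ids == pad_token_id).sum(dim=1)  # tensor([2, 1])
```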
@lvwerra (Member) commented on Feb 7, 2023

Closing in favour of #133.

@lvwerra closed this on Feb 7, 2023