fix bug in batched_forward_pass#144

Merged
lvwerra merged 3 commits into huggingface:main from ArvinZhuang:fix-batch-forward
Feb 14, 2023
Conversation

@ArvinZhuang
Contributor

Fix a bug that causes a device-mismatch error in the `batched_forward_pass` method.

The reason:
Tensors returned by `self.data_collator` are on the CPU, so the later `self.model(**input_kwargs)` call raises an error because the model is on the GPU.

The proposed solution:
Call `.to(self.accelerator.device)` on the output of `self.data_collator`.

My transformers version: 4.26.1
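
The fix can be sketched as follows. This is a hedged, illustrative stand-in, not the actual TRL code: the names `data_collator` and `device` mirror the PR description, and the toy collator simply stacks tensors.

```python
import torch

# A typical collator stacks per-example tensors; the result lives on the CPU.
def data_collator(features):
    return {key: torch.stack([f[key] for f in features]) for key in features[0]}

device = "cuda" if torch.cuda.is_available() else "cpu"

features = [
    {"input_ids": torch.tensor([1, 2, 3])},
    {"input_ids": torch.tensor([4, 5, 6])},
]

batch = data_collator(features)  # all tensors are on the CPU here
# The fix: move the collated batch to the accelerator device before the
# forward pass, so a GPU model sees inputs on a matching device.
batch = {key: value.to(device) for key, value in batch.items()}
```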

Contributor

@younesbelkada younesbelkada left a comment


Wow, thanks a lot for fixing the bug!
I am OK with this fix in principle: it acts as a safety check to make sure all the returned tensors end up on the correct device (regardless of which device the dataloader sends them to).
Let's run the tests and see!
Wdyt @lvwerra ?

@ArvinZhuang can you run the styling and quality checks? (`make style && make quality`) Thanks!

[{"input_ids": q, "attention_mask": torch.ones_like(q)} for q in queries]
).to(self.accelerator.device)

attention_mask = [torch.ones_like(r) for r in responses]
Contributor


Hmm, why has this been removed? 🤔

Contributor Author


Because it seems the `attention_mask` variable is never used.

Contributor


Perfect, can you run the styling checks so that the testing suite will be executed?

Contributor Author


Just did.

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Feb 11, 2023

The documentation is not available anymore as the PR was closed or merged.

Contributor

@younesbelkada younesbelkada left a comment


Thanks for fixing! Could you revert your changes in all the scripts inside examples/? After that we should be good to merge!

@ArvinZhuang
Contributor Author

ArvinZhuang commented Feb 11, 2023

> Thanks for fixing! Could you revert your changes in all the scripts inside examples/ after that we should be good to merge!

Hi,
these changes were automatically generated by `make style && make quality`, but I reverted them anyway.

Contributor

@younesbelkada younesbelkada left a comment


Thanks for iterating! 🚀

@ArvinZhuang
Contributor Author

Also, I noticed this: "Forward batch size > 1 is not well supported yet for encoder-decoder models."
Could you please give me more hints on why this is the case? It looks like a forward batch size of 1 greatly slows down inference; I could probably give fixing this a shot.
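
For context, a likely reason batched forward is tricky here is that variable-length sequences cannot be stacked directly; they must be padded to a common length, with an attention mask flagging the padding. A minimal sketch (illustrative only, not TRL code; `pad_id = 0` is an assumption):

```python
import torch

# Two queries of different lengths cannot be stacked into one tensor as-is.
queries = [torch.tensor([1, 2, 3]), torch.tensor([4, 5])]

pad_id = 0  # assumed padding token id for this sketch
max_len = max(q.size(0) for q in queries)

# Right-pad each query to max_len, then stack into a single batch tensor.
input_ids = torch.stack(
    [torch.nn.functional.pad(q, (0, max_len - q.size(0)), value=pad_id) for q in queries]
)
# Mask out the padding positions so the model ignores them.
attention_mask = (input_ids != pad_id).long()
```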

@Rebecca-Qian

Had just prepared a PR to fix this 😄 thanks @ArvinZhuang

@ArvinZhuang
Contributor Author

ArvinZhuang commented Feb 14, 2023

> Had just prepared a PR to fix this 😄 thanks @ArvinZhuang

@Rebecca-Qian You are welcome :) I'm glad to contribute as well.

Member

@lvwerra lvwerra left a comment


Thanks for the fix!

@lvwerra lvwerra merged commit 00aa31e into huggingface:main Feb 14, 2023
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025
* fix bug in batched_forward_pass

* style and quality

* revert examples
