Fix Trainer in DataParallel setting #5685

Merged: sgugger merged 2 commits into master from fix_dp_model_output on Jul 13, 2020

@sgugger (Collaborator) commented Jul 11, 2020

FYI, the new output types seem to break DataParallel; see the comment on #5671. This is because of the line

return type(out)(map(gather_map, zip(*outputs)))

in PyTorch's scatter_gather, which tries to reconstruct an output of the same type as ours by passing it a single iterable (and fails, since that call does not provide the arguments our constructor needs). AFAICT there is no way to fix our ModelOutput to work with this.

However, we have the return_tuple argument to fix the issue :-)
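
To make the failure concrete, here is a minimal pure-Python sketch of the gather logic described above. It is not PyTorch's actual code: `FakeOutput` is a hypothetical stand-in for a ModelOutput-style class (assumed here to have two required fields), lists stand in for tensors, and `gather_map` only imitates the shape of the real function around the quoted line.

```python
from dataclasses import dataclass


@dataclass
class FakeOutput:
    # Hypothetical stand-in for a ModelOutput-style class. Making both fields
    # required is an assumption for illustration only.
    loss: list
    logits: list

    def __iter__(self):
        # Lets zip(*outputs) walk the fields, as it would for a tuple.
        return iter((self.loss, self.logits))


def gather_map(outputs):
    # Simplified imitation of the recursive gather; lists play the role of tensors.
    out = outputs[0]
    if isinstance(out, list):
        # Leaf case: concatenate the per-replica results.
        return [x for replica in outputs for x in replica]
    # The quoted line: rebuild the container from ONE iterable argument.
    return type(out)(map(gather_map, zip(*outputs)))


# Plain tuples survive the round trip, because tuple(iterable) is a valid call.
tuple_outputs = [([0.5], [1, 2]), ([0.7], [3, 4])]
gathered = gather_map(tuple_outputs)

# A class whose __init__ expects one argument per field does not.
dataclass_outputs = [FakeOutput([0.5], [1, 2]), FakeOutput([0.7], [3, 4])]
try:
    gather_map(dataclass_outputs)
    error = None
except TypeError as exc:
    error = exc  # the constructor received a single map object
```

Under these assumptions the tuple case gathers cleanly while the dataclass case raises a `TypeError`, which is presumably why forcing tuple outputs via return_tuple sidesteps the problem when the model is wrapped for data parallelism.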

codecov bot commented Jul 11, 2020

Codecov Report

Merging #5685 into master will decrease coverage by 0.20%.
The diff coverage is 25.00%.

@@            Coverage Diff             @@
##           master    #5685      +/-   ##
==========================================
- Coverage   78.11%   77.91%   -0.21%     
==========================================
  Files         146      146              
  Lines       25983    25987       +4     
==========================================
- Hits        20297    20247      -50     
- Misses       5686     5740      +54     
Impacted Files                             Coverage Δ
src/transformers/trainer.py                37.84% <25.00%> (-0.12%) ⬇️
src/transformers/modeling_tf_t5.py         44.56% <0.00%> (-46.35%) ⬇️
src/transformers/modeling_tf_gpt2.py       63.55% <0.00%> (-31.78%) ⬇️
src/transformers/generation_tf_utils.py    79.94% <0.00%> (-6.02%) ⬇️
src/transformers/modeling_tf_utils.py      86.92% <0.00%> (-1.97%) ⬇️
src/transformers/modeling_openai.py        82.31% <0.00%> (+1.28%) ⬆️
src/transformers/modeling_tf_roberta.py    93.36% <0.00%> (+49.37%) ⬆️
src/transformers/modeling_tf_openai.py     95.18% <0.00%> (+74.91%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7fad617...05ec8f6.

@thomwolf (Member) left a comment

LGTM

@sgugger merged commit ce374ba into master on Jul 13, 2020
@sgugger deleted the fix_dp_model_output branch on July 13, 2020 12:37
@stas00 mentioned this pull request on Jul 14, 2020