
[s2s] fix --gpus clarg collision #6358

Merged — 3 commits merged into huggingface:master from the fix-clarg branch on Aug 9, 2020
Conversation

sshleifer (Contributor) commented Aug 8, 2020

Problem

finetune.py adds all the default PL args with this line

parser = pl.Trainer.add_argparse_args(parser)

and also all the generic args from add_generic_args, so the overlapping --gpus argument gets registered twice.
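
As a hedged illustration (this is not the actual finetune.py code), registering the same option twice trips argparse's default conflict handling:

# Minimal sketch of the collision; the flag's type and default here are illustrative.
import argparse

parser = argparse.ArgumentParser()

# stands in for pl.Trainer.add_argparse_args(parser), which registers --gpus
parser.add_argument("--gpus", type=int, default=0)

# stands in for add_generic_args(parser), which also used to register --gpus
try:
    parser.add_argument("--gpus", type=int, default=0)
except argparse.ArgumentError as err:
    print(err)  # argument --gpus: conflicting option string: --gpus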

Solution

This PR moves the overlapping arg from lightning_base.py to the two PL examples that need it.
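
A rough sketch of the shape of the fix (function names and defaults below are illustrative, not the exact diff): --gpus is no longer registered by the shared add_generic_args in lightning_base.py and is instead registered directly by the examples that need it.

# lightning_base.py (sketch): the shared helper no longer defines --gpus
def add_generic_args(parser, root_dir):
    # ... other shared args stay here, but not --gpus ...
    return parser

# an example that does not call pl.Trainer.add_argparse_args (sketch):
def add_model_specific_args(parser, root_dir):
    parser.add_argument("--gpus", type=int, default=1,
                        help="number of GPUs to train on")
    return parser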

CC @stas00 @patil-suraj

codecov bot commented Aug 9, 2020

Codecov Report

Merging #6358 into master will increase coverage by 0.17%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #6358      +/-   ##
==========================================
+ Coverage   78.20%   78.38%   +0.17%     
==========================================
  Files         148      148              
  Lines       27196    27196              
==========================================
+ Hits        21269    21317      +48     
+ Misses       5927     5879      -48     
Impacted Files Coverage Δ
src/transformers/modeling_tf_distilbert.py 64.47% <0.00%> (-32.95%) ⬇️
src/transformers/file_utils.py 82.44% <0.00%> (+0.25%) ⬆️
src/transformers/tokenization_utils_base.py 93.84% <0.00%> (+1.11%) ⬆️
src/transformers/tokenization_transfo_xl.py 42.48% <0.00%> (+8.92%) ⬆️
src/transformers/generation_tf_utils.py 86.46% <0.00%> (+9.27%) ⬆️
src/transformers/modeling_tf_gpt2.py 94.72% <0.00%> (+22.87%) ⬆️
src/transformers/tokenization_xlnet.py 90.09% <0.00%> (+23.42%) ⬆️

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1aec991...b466e25.

@sshleifer changed the title from "fix clarg collision" to "[s2s] fix --gpus clarg collision" on Aug 9, 2020
stas00 (Contributor) commented Aug 9, 2020

This is the issue I opened about it: #6310

It's more than just --gpus.

sshleifer (Contributor, Author) commented:

Which other ones besides --gpus?

stas00 (Contributor) commented Aug 9, 2020

Anything else that is defined both in lightning_base.add_generic_args and in PL's pl.Trainer.add_argparse_args(parser), if both get called.

With your PR nothing collides at the moment.

If we go in the direction of each module (and the base) defining its own args, finetune.py most likely needs to do the same and not use pl.Trainer.add_argparse_args(parser).

On the other hand, copying the same common args into every module is less than optimal. If transformers supports --gpus, it shouldn't be too difficult to make all examples support it, or to fail if it's passed and the example can't support it. Then these common args can go into lightning_base and not be redefined by each module.

Additionally, we can make any of these args optional, as was done recently in #6149, so that the code doesn't fail when an example doesn't register the arg.
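
A minimal sketch of that optional-arg idea, assuming shared code reads the value off an argparse Namespace (this is not the code from #6149):

from argparse import Namespace

def get_gpus(hparams: Namespace) -> int:
    # If an example never registered --gpus, fall back to a safe default
    # instead of raising AttributeError when shared code looks it up.
    return getattr(hparams, "gpus", 0)  # 0 => CPU-only; illustrative default

print(get_gpus(Namespace(gpus=2)))  # 2
print(get_gpus(Namespace()))        # 0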

sshleifer (Contributor, Author) commented Aug 9, 2020

I don't think I understand exactly what you're proposing. This is just meant to fix a bug.
I agree that the current setup, where only finetune.py uses Trainer.from_argparse_args, is suboptimal, but I don't really want to mess with it since it's working, and our test coverage isn't good enough to tell whether we've broken anything.

@sshleifer merged commit 9a5ef83 into huggingface:master on Aug 9, 2020
@sshleifer deleted the fix-clarg branch on August 9, 2020 at 01:51
stas00 (Contributor) commented Aug 9, 2020

I'm trying to communicate that adding new args is currently difficult because they are scattered across various places. It's not easy to tell when to put them in lightning_base and when inside an example class, and issue #6310 points to further collisions with the pl.Trainer.add_argparse_args(parser) call in finetune.py.

This PR duplicated a command-line arg, --gpus, that IMHO should ideally be registered only once, in lightning_base, and not repeated in every example. You had to do it because finetune.py does things differently than the rest of the examples and so it can't use lightning_base normally. And it's not over, since other examples will want --gpus too.

Replacing pl.Trainer.add_argparse_args(parser) in finetune.py with the approach all the other examples use will quickly uncover any missing command-line args that it needs to register, and a quick grep shows them all:

perl -lne 'm|hparams\.(\w+)| && print $1' finetune.py | sort | uniq
accumulate_grad_batches
data_dir
eval_batch_size
freeze_embeds
freeze_encoder
git_sha
gpus
label_smoothing
max_epochs
max_source_length
max_target_length
n_test
n_train
num_workers
n_val
output_dir
pkl
sortish_sampler
src_lang
test_checkpoint
test_max_target_length
tgt_lang
train_batch_size
val_max_target_length
warmup_steps
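
For illustration only, explicit registration of a few of the hparams uncovered by the grep could look like the sketch below if finetune.py stopped relying on pl.Trainer.add_argparse_args; the types and defaults are assumptions, not the values finetune.py actually uses:

import argparse

def add_finetune_args(parser: argparse.ArgumentParser) -> argparse.ArgumentParser:
    # A handful of the args found by the grep above; illustrative types and defaults.
    parser.add_argument("--gpus", type=int, default=1)
    parser.add_argument("--max_epochs", type=int, default=3)
    parser.add_argument("--accumulate_grad_batches", type=int, default=1)
    parser.add_argument("--train_batch_size", type=int, default=8)
    parser.add_argument("--max_source_length", type=int, default=1024)
    parser.add_argument("--max_target_length", type=int, default=128)
    parser.add_argument("--warmup_steps", type=int, default=0)
    return parser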

@stas00 mentioned this pull request Aug 9, 2020