quick-fix for --gpus flag bug #2674

ananyahjha93 · 2020-07-22T19:15:33Z

What does this PR do?

Prevents self.on_gpu from being set to True when the user does not pass the --gpus flag and the trainer is created as follows:

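from argparse import ArgumentParser
from pytorch_lightning import Trainer

parser = ArgumentParser()
parser = Trainer.add_argparse_args(parser)
args = parser.parse_args()

trainer = Trainer.from_argparse_args(args)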

Fixes #2669

Before submitting

Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?
Did you verify new and existing tests pass locally with your changes?
If you made a notable change (that affects users), did you update the CHANGELOG?

codecov · 2020-07-22T19:33:55Z

Codecov Report

Merging #2674 into master will increase coverage by 0%.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #2674   +/-   ##
======================================
  Coverage      92%     92%           
======================================
  Files          74      74           
  Lines        6381    6382    +1     
======================================
+ Hits         5850    5851    +1     
  Misses        531     531
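The "increase coverage by 0%" line follows from rounding. As a rough check, assuming coverage is simply hits divided by lines (Codecov's exact formula may differ), the figures in the table can be recomputed like this:

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, assuming coverage = hits / lines."""
    return 100.0 * hits / lines


# Figures taken from the Codecov table above.
before = coverage_percent(5850, 6381)  # master
after = coverage_percent(5851, 6382)   # this PR

print(f"before: {before:.4f}%  after: {after:.4f}%")
print(f"rounded: {round(before)}% -> {round(after)}%")
```

One extra covered line moves the ratio by far less than a percentage point, so both values are reported as 92%.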

awaelchli · 2020-07-22T20:37:12Z

I think the true fix should be to parse the gpus flag first and then set the on_gpu property, right?

ananyahjha93 · 2020-07-22T20:53:45Z

@awaelchli In this case, the argparse type is a callable of the form Trainer._arg_default, which evaluates to the value of the gpus flag if the user passes one to the script. If the user does not include the --gpus flag, args.gpus refers to the Trainer._arg_default callable itself.

https://github.com/PyTorchLightning/pytorch-lightning/blob/bugfix/gpus-flag/pytorch_lightning/trainer/trainer.py#L786
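The behaviour described here can be reproduced with argparse alone. The following is a minimal sketch, not the Lightning code: arg_default is a hypothetical stand-in for Trainer._arg_default, and using the same callable as both type and default is an assumption made to mirror the description above.

```python
import argparse


def arg_default(x):
    """Hypothetical stand-in for Trainer._arg_default: convert a --gpus string."""
    return str(x) if "," in x else int(x)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Assumption for this sketch: the callable serves as converter and as default.
    # argparse only runs `type` on string defaults, so the function object
    # itself ends up in args.gpus when the flag is omitted.
    parser.add_argument("--gpus", type=arg_default, default=arg_default)
    return parser


with_flag = build_parser().parse_args(["--gpus", "2"])
without_flag = build_parser().parse_args([])

print(with_flag.gpus)               # converted value from the command line
print(callable(without_flag.gpus))  # the default is still the function
print(bool(without_flag.gpus))      # functions are truthy in Python
```

Because a function object is truthy, a check such as `gpus and torch.cuda.is_available()` passes on a CUDA machine even though the user never asked for GPUs.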

awaelchli · 2020-07-22T20:59:44Z

Yes, I understand that. We also have a function that takes the user input and converts it to a list; see distrib_parts.py::_parse_gpu_ids.
There, a callable results in None, which is what we want, right?
That parsing happens later in the trainer init, so if you just move the on_gpu assignment down there, after gpus has been parsed, it should be fine, no?
https://github.com/PyTorchLightning/pytorch-lightning/blob/62ce00f96c09de6d137c810921a6cd9e7b60aff5/pytorch_lightning/trainer/trainer.py#L531
Basically after this line.
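To illustrate the parsing step mentioned in this comment, here is a simplified sketch. parse_gpu_ids_sketch is a hypothetical stand-in, not the real _parse_gpu_ids: it skips validation, availability checks, and special values such as -1, and only shows that a callable maps to None while real requests become a list of device ids.

```python
from typing import List, Optional


def parse_gpu_ids_sketch(gpus) -> Optional[List[int]]:
    """Simplified, hypothetical stand-in for _parse_gpu_ids."""
    # A callable means the --gpus flag was never supplied.
    if callable(gpus):
        return None
    # No GPUs requested.
    if gpus is None or gpus == 0:
        return None
    # An integer n is read here as "the first n devices".
    if isinstance(gpus, int):
        return list(range(gpus))
    # A string such as "0,1" lists explicit device ids.
    if isinstance(gpus, str):
        return [int(tok) for tok in gpus.split(",") if tok.strip()]
    # Anything else is assumed to be an iterable of ids.
    return list(gpus)


print(parse_gpu_ids_sketch(lambda x: x))  # flag omitted
print(parse_gpu_ids_sketch(2))
print(parse_gpu_ids_sketch("0,1"))
```

Deriving on_gpu from this parsed result, rather than from the raw argument, means the omitted-flag case is handled without a special branch.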

Borda · 2020-07-22T21:07:03Z

yes, I would not pass the function at all, it makes things unnecessarily complex...

ananyahjha93 · 2020-07-22T21:12:02Z

@awaelchli yeah, you are right, thanks for pointing this out.

@Borda what's your suggestion here, if not passing a function?

williamFalcon · 2020-07-23T11:08:23Z

pytorch_lightning/trainer/trainer.py

@@ -532,6 +532,10 @@ def __init__(
        self.root_gpu = determine_root_gpu_device(self.data_parallel_device_ids)
        self.root_device = torch.device("cpu")

+        # self.data_parallel_device_ids is None if gpus is callable


i don't understand what this means

ananyahjha93 · 2020-07-23

self.data_parallel_device_ids = _parse_gpu_ids(self.gpus)

_parse_gpu_ids returns None if self.gpus is callable. If you don't pass the --gpus flag, self.gpus points to Trainer._arg_default.

awaelchli · 2020-07-23

We could move the line self.on_gpu = True if (gpus and torch.cuda.is_available()) else False, found a few lines above, down here and simply write

self.on_gpu = True if (self.data_parallel_device_ids and torch.cuda.is_available()) else False

then there is no need for an extra if clause or a comment :)
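The difference between the two expressions can be seen in a small side-by-side sketch. The function names and the cuda_available parameter are hypothetical: the parameter stands in for torch.cuda.is_available() so the example runs without PyTorch.

```python
def on_gpu_from_raw_arg(gpus, cuda_available: bool) -> bool:
    """Old check: uses the raw gpus argument, which may be a callable default."""
    return True if (gpus and cuda_available) else False


def on_gpu_from_device_ids(device_ids, cuda_available: bool) -> bool:
    """Suggested check: uses the parsed device ids (None when no GPUs requested)."""
    return True if (device_ids and cuda_available) else False


def unset_default(x):
    """Hypothetical placeholder for the callable left in args.gpus."""
    return x


# Flag omitted on a CUDA machine: the raw check wrongly reports a GPU run.
print(on_gpu_from_raw_arg(unset_default, cuda_available=True))
# After parsing, the callable has become None, so the check is False.
print(on_gpu_from_device_ids(None, cuda_available=True))
# An explicit request still enables the GPU path when CUDA is available.
print(on_gpu_from_device_ids([0], cuda_available=True))
```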

awaelchli left a comment

really nice this is fixed, it was annoying with the default arg not properly set.

mergify · 2020-07-24T08:10:47Z

Great job! =)

mergify · 2020-07-24T08:26:09Z

Great job! =)

Borda · 2020-07-24T08:34:23Z

we should re-count the number of tests we are running, as this was merged automatically even though one test was failing...

quick-fix for --gpus flag bug

7b3f505

mergify bot requested a review from a team July 22, 2020 19:16

warning added

62311ba

warning added

16e926c

ananyahjha93 changed the title from "[wip] quick-fix for --gpus flag bug" to "quick-fix for --gpus flag bug" Jul 22, 2020

ananyahjha93 requested review from Borda, williamFalcon and justusschock July 22, 2020 19:53

Borda approved these changes Jul 22, 2020

mergify bot requested a review from a team July 22, 2020 20:06

Borda added the allowed_pre_1.0 and bug (Something isn't working) labels Jul 22, 2020

Borda added this to the 0.8.x milestone Jul 22, 2020

This was referenced Jul 22, 2020

SimCLR doesn't seem to utilize GPU for forward/backward passes Lightning-Universe/lightning-bolts#35

Closed

GPUs are not utilized in any self_supervised models without the --gpus flag Lightning-Universe/lightning-bolts#124

Closed

set on_gpu using data_parallel_device_ids

238270b

williamFalcon reviewed Jul 23, 2020

mergify bot requested a review from a team July 23, 2020 11:10

self.on_gpu repositioned

3fc8cd9

awaelchli approved these changes Jul 24, 2020

mergify bot requested a review from a team July 24, 2020 07:14

justusschock approved these changes Jul 24, 2020

Merge branch 'master' into bugfix/gpus-flag

515fc40

mergify bot merged commit 6780214 into master Jul 24, 2020

Borda deleted the bugfix/gpus-flag branch July 24, 2020 08:33

Borda mentioned this pull request Jul 24, 2020

fix nb tests for auto-merge #2686

Merged
