quick-fix for --gpus flag bug #2674
Conversation
Codecov Report

@@           Coverage Diff           @@
##           master    #2674   +/- ##
======================================
  Coverage      92%      92%
======================================
  Files          74       74
  Lines       6381     6382     +1
======================================
+ Hits        5850     5851     +1
  Misses       531      531
I think the true fix should be to parse the gpus flag first and then set the on_gpu property, right?
@awaelchli In this case, type takes a callable as a value in the form of Trainer._arg_default, which then evaluates to the gpus flag value if the user passes something to the script. If the user does not include the --gpus flag, args.gpus refers to the Trainer._arg_default callable.
Yes, I understand that. We also have a function that takes the user input and converts it to a list.
Yes, I would not pass the function at all; it makes things unnecessarily complex...
@awaelchli yeah, you are right, thanks for pointing this out. @Borda what's your suggestion here, if not pass a function? |
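For context, the pitfall discussed above can be reproduced with plain argparse. This is a minimal sketch; _arg_default below is a hypothetical stand-in for Lightning's Trainer._arg_default sentinel, not the real implementation:

```python
import argparse

def _arg_default():
    """Hypothetical stand-in for Trainer._arg_default (a callable sentinel)."""

parser = argparse.ArgumentParser()
# `type` is only applied to user-supplied strings; the default is stored
# verbatim, so omitting --gpus leaves the callable sentinel itself in the
# parsed namespace.
parser.add_argument("--gpus", type=str, default=_arg_default)

args = parser.parse_args([])   # user did not pass --gpus
print(callable(args.gpus))     # -> True: the sentinel leaked through unchanged

args = parser.parse_args(["--gpus", "2"])
print(args.gpus)               # -> 2: `type` was applied to the user input
```

Because the sentinel is a truthy object, a check like `if gpus:` run before parsing would incorrectly treat "flag not passed" as "GPUs requested", which is the bug this PR works around.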
pytorch_lightning/trainer/trainer.py
Outdated
@@ -532,6 +532,10 @@ def __init__(
        self.root_gpu = determine_root_gpu_device(self.data_parallel_device_ids)
        self.root_device = torch.device("cpu")

        # self.data_parallel_device_ids is None if gpus is callable
I don't understand what this means.
self.data_parallel_device_ids = _parse_gpu_ids(self.gpus)
_parse_gpu_ids returns None if self.gpus is callable; if you don't pass the --gpus flag, then self.gpus points to Trainer._arg_default.
We could move the
self.on_gpu = True if (gpus and torch.cuda.is_available()) else False
found a few lines above down here and simply write
self.on_gpu = True if (self.data_parallel_device_ids and torch.cuda.is_available()) else False
Then there is no need for an extra if clause or a comment :)
Really nice that this is fixed; it was annoying with the default arg not properly set.
Great job! =)
We should re-check the number of tests we are running, as this was merged automatically even though one test was failing...
What does this PR do?
Prevents self.on_gpu from being set to True when the user does not pass the --gpus flag, in the case where the trainer is created the following way:
Fixes #2669
Before submitting