Add torch_utils class, auto-detect CUDA availability #4403
Conversation
```
# Add files or directories to the ignore list. They should be base names, not
# paths.
ignore=CVS
generated-members=torch.*
```
I'm pretty sure this isn't the best way to do this, but without it, pylint will complain about torch not having the right members everywhere. I believe this is because torch could be None, though it never actually happens.
setup.cfg (Outdated)
```
banned-modules = tensorflow = use mlagents.tf_utils instead (it handles tf2 compat).
                 logging = use mlagents_envs.logging_util instead
                 torch = use mlagents.torch_utils istead (handles GPU detection).
```
Suggested change:
```diff
- torch = use mlagents.torch_utils istead (handles GPU detection).
+ torch = use mlagents.torch_utils instead (handles GPU detection).
```
vincentpierre left a comment
Looks good to me, but I would like ml-agents/mlagents/torch_utils/torch.py to contain a comment stating that the file is temporary and will be removed once torch is required. (To avoid adding functionality to this file in the meantime.)
The file isn't temporary (it will still be needed for GPU detection), but yeah, the try/except is. Added a comment.
Only tensors are created on the GPU, but the models are still on the CPU.
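For illustration, one way to check where tensors and model parameters actually end up (a minimal sketch, not code from this PR):

```python
import torch
import torch.nn as nn

x = torch.zeros(3)                      # follows the current default tensor type
model = nn.Linear(3, 1)
print(x.device)                         # where new tensors land
print(next(model.parameters()).device)  # where the model's weights actually live
# model.to("cuda") moves the parameters explicitly if they are still on the CPU
```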
Another point missing here is that when running Torch with cuDNN, if we want completely reproducible results, we'll also need to set the flags below in addition to seeding torch and numpy, though these could potentially hurt running performance.
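The flags in question are presumably the standard cuDNN determinism settings; a minimal sketch (not part of this PR):

```python
import torch

torch.backends.cudnn.deterministic = True  # force deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False     # disable cuDNN's kernel auto-tuning
```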
How much of a performance hit do we expect from setting these flags?
Not sure; it could vary case by case, depending on how much the model changes during training and how much optimization cuDNN does.
I tested removing that line. It works well.
Proposed change(s)
This PR adds a torch_utils class (similar to tf_utils) and requires importing `torch` from there. This lets us do a couple of things, the first being to detect whether CUDA is available and set the default tensor type appropriately. Requiring that `torch` be imported from here ensures that this is set before any torch functions are used at all. The PR also adds an `is_available()` method to `torch_utils`, and throws a nicer error if torch isn't available when you passed `--torch`. In the future we can also set the number of torch threads, etc. from here.
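As a rough illustration of the mechanism described above, a wrapper module along these lines could set the default tensor type at import time (a minimal sketch; the actual ml-agents implementation may differ):

```python
# Sketch of mlagents/torch_utils/torch.py (hypothetical; actual file may differ).
try:
    import torch

    # Runs once, on first import, before any other torch code executes.
    if torch.cuda.is_available():
        torch.set_default_tensor_type(torch.cuda.FloatTensor)
    else:
        torch.set_default_tensor_type(torch.FloatTensor)
except ImportError:
    # The try/except is temporary and goes away once torch becomes a hard
    # requirement; the CUDA detection stays.
    torch = None  # type: ignore


def is_available() -> bool:
    """Return whether torch was successfully imported."""
    return torch is not None
```

Callers would then do `from mlagents.torch_utils import torch` rather than importing torch directly, and the trainer could check `is_available()` to raise a friendlier error when `--torch` is passed without torch installed.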
Types of change(s)
Checklist
Other comments