Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[refactor] Make PyTorch the default and TensorFlow optional #4517

Merged
merged 23 commits into from
Oct 21, 2020

Conversation

ervteng
Copy link
Contributor

@ervteng ervteng commented Sep 26, 2020

Proposed change(s)

This PR does a couple things:

  • Makes TensorFlow optional (i.e., code can be run without TF installed) and PyTorch required
  • Adds PyTorch to the setup.py, throwing an error in Windows telling the user to install PyTorch from the website (see:Pytorchvision 0.5.0 not published to pypi or Conda for Windows pytorch/vision#1774).
  • Reports PyTorch version in TrainingStatus and Timers. For now, TrainingStatus reports both TF and PyTorch versions, with -1 set if TF is not installed.
  • Switch TensorboardWriter to use torch.utils.tensorboard
  • Install documentation
  • Modify Yamato test to look for ONNX files only

Furthermore, we need to verify Barracuda compatibility and package versioning before merging

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

Requires ML-Agents Cloud PR after merging

@vincentpierre
Copy link
Contributor

vincentpierre commented Oct 1, 2020

Leaving this here for posterity: This will not raise an error correctly on windows when installing via pip. That is because the cmdclass will NOT work when installing from whl

Edit : Removed cmdclass, the user will see an error when trying to use mlagents-learn without having torch installed. cannot directly install torch on windows via pip.

* Torch not imported error to raise at first usage

* Torch not imported error to raise at first usage
@ervteng ervteng marked this pull request as ready for review October 19, 2020 18:25
Ervin Teng and others added 8 commits October 19, 2020 11:38
* Convert stats writer to use PyTorch TB support

* Use common function to print params

* Update test

* Bump tensorboard to 1.15 to fix the tests

* putting tensorboard 1.15.0 as min version requirement

Co-authored-by: vincentpierre <vincentpierre@unity3d.com>
)

* Initial commit

* Forgotten doc

* Removing the `Installation-Anaconda-Windows.md` as it is deprecated

* Readding the depreacted Installation-Anaconda-Windows.md but leaving it unchanged

* more references to tensorflow removed

* Update README.md

Co-authored-by: Ervin T. <ervin@unity3d.com>

* Change references to .nn to .onnx in docs (#4583)

Co-authored-by: Ervin T. <ervin@unity3d.com>
* Add --tensorflow option

* Switch framework to Pytorch default

* Update changelog

* Re-add --torch

* Edit warning
self.stats_reporter.add_stat(
optimizer.reward_signals[name].stat_name,
f"Policy/{optimizer.reward_signals[name].name.capitalize()} Reward",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to a field/property on BaseRewardProvider?

Copy link
Contributor Author

@ervteng ervteng Oct 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be cleaner. @vincentpierre what do you think about this? (probably a separate PR).

@@ -27,6 +27,7 @@ def __init__(
init_path: str = None,
multi_gpu: bool = False,
force_torch: bool = False,
force_tensorflow: bool = False,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are force_torch and force_tensorflow both needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We left force_torch in case users were still using --torch in places (discussion here: #4582)

It's behavior is pretty subtle, it will only take effect if the user specifies framework: tensorflow in the YAML and then adds --torch.

@@ -5,3 +5,4 @@ Pillow==4.2.1
protobuf==3.6
tensorflow==1.14.0
h5py==2.9.0
tensorboard==1.15.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this here? should be handled by setup.py.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do not add it here, tensorboard version 1.14.0 is installed (because tensorflow 1.14.0 is installed here) and we do not support tensorboard 1.14.0 with torch

@@ -128,6 +123,7 @@ To install the `mlagents` Python package, activate your virtual environment and
run from the command line:

```sh
pip3 install torch -f https://download.pytorch.org/whl/torch_stable.html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we mention this is only for windows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't hurt on the other OS's, but I agree it's clearer if we separate windows. Updated the instructions - let me know what you think

@dongruoping
Copy link
Contributor

Since PyTorch is default now, we should also update all .nn model files to .onnx files for the example scenes (or in separate PR?)

@chriselion
Copy link
Contributor

Since PyTorch is default now, we should also update all .nn model files to .onnx files for the example scenes (or in separate PR?)

@dongruoping - I have that logged as a followup in https://jira.unity3d.com/browse/MLA-1435

@chriselion
Copy link
Contributor

@ervteng is there a separate PR to update the CI configs to use --tensorflow?

@ervteng
Copy link
Contributor Author

ervteng commented Oct 21, 2020

@ervteng is there a separate PR to update the CI configs to use --tensorflow?

Yep - https://github.com/Unity-Technologies/ml-agents-cloud-internal/pull/223. It only affects the old CI, as the new CI doesn't use --torch but instead overrides the YAML/RunOptions and won't require changes.

@ervteng ervteng merged commit 0782df1 into master Oct 21, 2020
@delete-merged-branch delete-merged-branch bot deleted the develop-torchdefault branch October 21, 2020 19:06
@chriselion chriselion restored the develop-torchdefault branch October 26, 2020 17:38
@chriselion chriselion deleted the develop-torchdefault branch October 26, 2020 17:48
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 26, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants