Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pytorch] Change Torchvision version from 0.6.0 to 0.6.1 for PyTorch 1.5.1 #650

Merged
merged 14 commits into from
Oct 8, 2020

Conversation

choidongyeon
Copy link
Contributor

@choidongyeon choidongyeon commented Oct 6, 2020

Issue #, if available:

PR Checklist

  • I've prepended PR tag with frameworks/job this applies to : [mxnet, tensorflow, pytorch] | [ei/neuron] | [build] | [test] | [benchmark] | [ec2, ecs, eks, sagemaker]
  • (If applicable) I've documented below the DLC image/dockerfile this relates to
  • (If applicable) I've documented below the tests I've run on the DLC image
  • (If applicable) I've reviewed the licenses of updated and new binaries and their dependencies to make sure all licenses are on the Apache Software Foundation Third Party License Policy Category A or Category B license list. See https://www.apache.org/legal/resolved.html.
  • (If applicable) I've scanned the updated and new binaries to make sure they do not have vulnerabilities associated with them.

Pytest Marker Checklist

  • (If applicable) I have added the marker @pytest.mark.model("<model-type>") to the new tests which I have added, to specify the Deep Learning model that is used in the test (use "N/A" if the test doesn't use a model)
  • (If applicable) I have added the marker @pytest.mark.integration("<feature-being-tested>") to the new tests which I have added, to specify the feature that will be tested
  • (If applicable) I have added the marker @pytest.mark.multinode(<integer-num-nodes>) to the new tests which I have added, to specify the number of nodes used on a multi-node test
  • (If applicable) I have added the marker @pytest.mark.processor(<"cpu"/"gpu"/"eia"/"neuron">) to the new tests which I have added, if a test is specifically applicable to only one processor type

EIA/NEURON Checklist

  • When creating a PR:
  • I've modified src/config/build_config.py in my PR branch by setting ENABLE_EI_MODE = True or ENABLE_NEURON_MODE = True
  • When PR is reviewed and ready to be merged:
  • I've reverted the code change on the config file mentioned above

Benchmark Checklist

  • When creating a PR:
  • I've modified src/config/test_config.py in my PR branch by setting ENABLE_BENCHMARK_DEV_MODE = True
  • When PR is reviewed and ready to be merged:
  • I've reverted the code change on the config file mentioned above

Reviewer Checklist

  • For reviewer, before merging, please cross-check:
  • I've verified the code change on the config file mentioned above has already been reverted

Description: Fix #629 by updating the PyTorch 1.5.1 Dockerfiles with the S3 path to torchvision 0.6.1 binaries.

Tests run:

DLC image/dockerfile: PyTorch 1.5.1 images (CPU/GPU, training/inference).

Additional context:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@choidongyeon choidongyeon changed the title [pytorch] Change Torchvision version from 0.6.0 to 0.6.1 [pytorch] Change Torchvision version from 0.6.0 to 0.6.1 for PyTorch 1.5.1 Oct 6, 2020
@choidongyeon
Copy link
Contributor Author

@SergTogul @saimidu Tagging because I can't seem to add reviewers. Please take a look at this PR so we can resolve the mentioned issue!

The EC2 test is failing because of the AMP test (introduced in a later version of PyTorch 1.6). I added an if-statement in the test as a workaround, per @saimidu 's suggestion but a more universal solution for disabling tests depending on framework version would be great.

@@ -123,6 +123,9 @@ def test_nvapex(pytorch_training, ec2_connection, gpu_only):
@pytest.mark.parametrize("ec2_instance_type", PT_EC2_GPU_INSTANCE_TYPE, indirect=True)
@pytest.mark.skipif(PT_EC2_GPU_INSTANCE_TYPE == ["g3.4xlarge"], reason="Skipping AMP DDP test on single gpu instance")
def test_pytorch_amp(pytorch_training, ec2_connection, gpu_only):
from packaging import Version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the CI tests on this PR are completed successfully, please move this import line to the top of the pytest script along with all the other standard module imports (i.e., below import os for standard module os, and above import pytest for 3P module pytest).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this should be from packaging.version import Version.

@choidongyeon
Copy link
Contributor Author

As discussed offline with @saimidu , I have reverted the buildspec file back to the current (1.6) file. For test results of code changes relevant to this PR, please refer to commit d9f09ce.

I have also tested the container produced by the PR for the above commit by installing detectron2-0.1.3 and do not run into the import error that is mentioned in the issue.

@SergTogul SergTogul merged commit 720f336 into aws:master Oct 8, 2020
@choidongyeon choidongyeon deleted the pt1.5.1-torchvision branch October 8, 2020 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Versioning of PyTorch and Torchvision
3 participants