[pytorch] Change Torchvision version from 0.6.0 to 0.6.1 for PyTorch 1.5.1 #650

choidongyeon · 2020-10-06T17:08:37Z

Issue #, if available:

PR Checklist

I've prepended PR tag with frameworks/job this applies to : [mxnet, tensorflow, pytorch] | [ei/neuron] | [build] | [test] | [benchmark] | [ec2, ecs, eks, sagemaker]
(If applicable) I've documented below the DLC image/dockerfile this relates to
(If applicable) I've documented below the tests I've run on the DLC image
(If applicable) I've reviewed the licenses of updated and new binaries and their dependencies to make sure all licenses are on the Apache Software Foundation Third Party License Policy Category A or Category B license list. See https://www.apache.org/legal/resolved.html.
(If applicable) I've scanned the updated and new binaries to make sure they do not have vulnerabilities associated with them.

Pytest Marker Checklist

(If applicable) I have added the marker @pytest.mark.model("<model-type>") to the new tests which I have added, to specify the Deep Learning model that is used in the test (use "N/A" if the test doesn't use a model)
(If applicable) I have added the marker @pytest.mark.integration("<feature-being-tested>") to the new tests which I have added, to specify the feature that will be tested
(If applicable) I have added the marker @pytest.mark.multinode(<integer-num-nodes>) to the new tests which I have added, to specify the number of nodes used on a multi-node test
(If applicable) I have added the marker @pytest.mark.processor(<"cpu"/"gpu"/"eia"/"neuron">) to the new tests which I have added, if a test is specifically applicable to only one processor type

EIA/NEURON Checklist

When creating a PR:

I've modified src/config/build_config.py in my PR branch by setting ENABLE_EI_MODE = True or ENABLE_NEURON_MODE = True

When PR is reviewed and ready to be merged:

I've reverted the code change on the config file mentioned above

Benchmark Checklist

When creating a PR:

I've modified src/config/test_config.py in my PR branch by setting ENABLE_BENCHMARK_DEV_MODE = True

When PR is reviewed and ready to be merged:

I've reverted the code change on the config file mentioned above

Reviewer Checklist

For reviewer, before merging, please cross-check:

I've verified the code change on the config file mentioned above has already been reverted

Description: Fix #629 by updating the PyTorch 1.5.1 Dockerfiles with the S3 path to torchvision 0.6.1 binaries.

Tests run:

DLC image/dockerfile: PyTorch 1.5.1 images (CPU/GPU, training/inference).

Additional context:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

choidongyeon · 2020-10-07T00:14:25Z

@SergTogul @saimidu Tagging because I can't seem to add reviewers. Please take a look at this PR so we can resolve the mentioned issue!

The EC2 test is failing because of the AMP test (AMP was introduced later, in PyTorch 1.6). I added an if-statement in the test as a workaround, per @saimidu's suggestion, but a more universal solution for disabling tests depending on framework version would be great.
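As a rough illustration of what such a shared mechanism might look like, here is a minimal sketch of a version predicate that any test could call before running. The helper name `framework_version_below` and the version strings are assumptions made for this example, not code from this repository; the only real API used is `packaging.version.Version`.

```python
from packaging.version import Version


def framework_version_below(installed: str, minimum: str) -> bool:
    """Return True when `installed` is older than `minimum`.

    Hypothetical helper: a test could call this and skip itself when the
    framework in the image predates the feature under test.
    """
    return Version(installed) < Version(minimum)


# AMP arrived in PyTorch 1.6, so a 1.5.1 image would skip the AMP test.
if framework_version_below("1.5.1", "1.6"):
    print("skip AMP test")
```

A test could pass the framework version of the image under test as `installed` and return early (or call `pytest.skip`) when the predicate is true; how that version is obtained is left open here.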

saimidu · 2020-10-07T01:14:48Z

test/dlc_tests/ec2/pytorch/training/test_pytorch_training.py

@@ -123,6 +123,9 @@ def test_nvapex(pytorch_training, ec2_connection, gpu_only):
 @pytest.mark.parametrize("ec2_instance_type", PT_EC2_GPU_INSTANCE_TYPE, indirect=True)
 @pytest.mark.skipif(PT_EC2_GPU_INSTANCE_TYPE == ["g3.4xlarge"], reason="Skipping AMP DDP test on single gpu instance")
 def test_pytorch_amp(pytorch_training, ec2_connection, gpu_only):
+    from packaging import Version


When the CI tests on this PR have completed successfully, please move this import line to the top of the pytest script along with all the other module imports (i.e., below import os for the standard module os, and above import pytest for the third-party module pytest).

Also, this should be from packaging.version import Version.
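To show why the import path and the `Version` class matter here, a small sketch follows. The version strings are picked for illustration only. Comparing them as plain strings gives the wrong ordering once a minor version reaches two digits, while `Version` compares them numerically.

```python
from packaging.version import Version  # Version lives in packaging.version

# Plain string comparison is lexicographic: "1.10.0" sorts before "1.6.0".
string_says_older = "1.10.0" < "1.6.0"

# Version compares release segments numerically: 1.10.0 is newer than 1.6.0.
version_says_older = Version("1.10.0") < Version("1.6.0")

print(string_says_older)   # True
print(version_says_older)  # False
```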

choidongyeon · 2020-10-08T00:46:58Z

As discussed offline with @saimidu , I have reverted the buildspec file back to the current (1.6) file. For test results of code changes relevant to this PR, please refer to commit d9f09ce.

I have also tested the container produced by this PR at the above commit by installing detectron2-0.1.3, and I did not run into the import error mentioned in the issue.

choidongyeon changed the title ~~[pytorch] Change Torchvision version from 0.6.0 to 0.6.1~~ [pytorch] Change Torchvision version from 0.6.0 to 0.6.1 for PyTorch 1.5.1 Oct 6, 2020

Change Torchvision version from 0.6.0 to 0.6.1

04530b6

choidongyeon force-pushed the pt1.5.1-torchvision branch from 7525a65 to 04530b6 Compare October 6, 2020 17:15

Donna Choi added 4 commits October 6, 2020 12:00

Revert version to 1.5.1 in buildspec

40d1945

Change address to binary for inference images

27ef561

Empty commit to retrigger

607b801

Modifications to the buildspec file to use MMS

900771b

choidongyeon mentioned this pull request Oct 6, 2020

[feature] Mechanism for skipping tests for specific versions of a framework #652

Closed

5 tasks

Donna Choi added 2 commits October 6, 2020 17:01

Add old (PyTorch 1.6) buildspec

16891d9

Disable AMP test for PyTorch 1.5.1 and lower

44217c2

anankira requested review from saimidu and SergTogul October 7, 2020 00:26

saimidu reviewed Oct 7, 2020

saimidu and others added 3 commits October 6, 2020 18:17

Fix Version import

d44368b

Merge branch 'master' into pt1.5.1-torchvision

098f763

Modify AMP test

b7edc35

choidongyeon force-pushed the pt1.5.1-torchvision branch from ee9040f to b7edc35 Compare October 7, 2020 21:15

Donna Choi added 2 commits October 7, 2020 15:31

Remove reason

d9f09ce

Revert buildspec back to 1.6

7978054

Add empty line at end of buildspec

bf715e6

SergTogul approved these changes Oct 8, 2020

Merge branch 'master' into pt1.5.1-torchvision

421130f

SergTogul merged commit 720f336 into aws:master Oct 8, 2020

choidongyeon deleted the pt1.5.1-torchvision branch October 8, 2020 20:10
