
[huggingface_pytorch] upgrade PyTorch to 1.7.1 #1025

Merged: 5 commits merged into aws:master on Apr 22, 2021

Conversation

philschmid
Contributor

What is the PR doing?

This PR upgrades the PyTorch version from 1.6.0 to 1.7.1.

Issue #, if available:

PR Checklist

  • I've prepended the PR tag with the frameworks/jobs this applies to: [mxnet, tensorflow, pytorch] | [ei/neuron] | [build] | [test] | [benchmark] | [ec2, ecs, eks, sagemaker]
  • (If applicable) I've documented below the DLC image/dockerfile this relates to
  • (If applicable) I've documented below the tests I've run on the DLC image
  • (If applicable) I've reviewed the licenses of updated and new binaries and their dependencies to make sure all licenses are on the Apache Software Foundation Third Party License Policy Category A or Category B license list. See https://www.apache.org/legal/resolved.html.
  • (If applicable) I've scanned the updated and new binaries to make sure they do not have vulnerabilities associated with them.

Pytest Marker Checklist

  • (If applicable) I have added the marker @pytest.mark.model("<model-type>") to the new tests which I have added, to specify the Deep Learning model that is used in the test (use "N/A" if the test doesn't use a model); a combined usage sketch of these markers follows this list
  • (If applicable) I have added the marker @pytest.mark.integration("<feature-being-tested>") to the new tests which I have added, to specify the feature that will be tested
  • (If applicable) I have added the marker @pytest.mark.multinode(<integer-num-nodes>) to the new tests which I have added, to specify the number of nodes used on a multi-node test
  • (If applicable) I have added the marker @pytest.mark.processor(<"cpu"/"gpu"/"eia"/"neuron">) to the new tests which I have added, if a test is specifically applicable to only one processor type

EIA/NEURON Checklist

  • When creating a PR:
  • I've modified src/config/build_config.py in my PR branch by setting ENABLE_EI_MODE = True or ENABLE_NEURON_MODE = True (a sketch of this change follows the list)
  • When PR is reviewed and ready to be merged:
  • I've reverted the code change on the config file mentioned above

Benchmark Checklist

  • When creating a PR:
  • I've modified src/config/test_config.py in my PR branch by setting ENABLE_BENCHMARK_DEV_MODE = True (a sketch of this change follows the list)
  • When PR is reviewed and ready to be merged:
  • I've reverted the code change on the config file mentioned above

Reviewer Checklist

  • For reviewer, before merging, please cross-check:
  • I've verified the code change on the config file mentioned above has already been reverted

Description:

This PR upgrades PyTorch from 1.6.0 to 1.7.1. To do so, I created a new buildspec.yaml and renamed the old buildspec.yaml to buildspec-1-6.yml.

Tests run:

https://github.com/huggingface/transformers/tree/master/tests/sagemaker

| ID | Description | Platform | #GPUs | Collected & evaluated metrics |
| --- | --- | --- | --- | --- |
| pytorch-transfromers-test-single | test bert finetuning using BERT from transformer lib + PT | SageMaker createTrainingJob | 1 | train_runtime, eval_accuracy & eval_loss |
| pytorch-transfromers-test-2-ddp | test bert finetuning using BERT from transformer lib + PT DDP | SageMaker createTrainingJob | 16 | train_runtime, eval_accuracy & eval_loss |
| pytorch-transfromers-test-2-smd | test bert finetuning using BERT from transformer lib + PT SM DDP | SageMaker createTrainingJob | 16 | train_runtime, eval_accuracy & eval_loss |
| pytorch-transfromers-test-1-smp | test roberta finetuning using BERT from transformer lib + PT SM MP | SageMaker createTrainingJob | 8 | train_runtime, eval_accuracy & eval_loss |

DLC image/dockerfile:

The Hugging Face DLCs for PyTorch

Test JSON files:

pytorch-transfromers-test-1-smp-smtrain-2021-04-12-14-11-39-093.txt
pytorch-transfromers-test-1-smp-trainer-2021-04-12-13-35-45-822.txt
pytorch-transfromers-test-2-ddp-2021-04-12-13-22-48-512.txt
pytorch-transfromers-test-2-smd-2021-04-12-14-11-39-093.txt
pytorch-transfromers-test-single-2021-04-12-13-35-45-177.txt

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@SergTogul SergTogul merged commit 73215f0 into aws:master Apr 22, 2021
jeet4320 added a commit to jeet4320/deep-learning-containers that referenced this pull request Apr 23, 2021
* aws/master:
  [doc] Minor corrections on available_images (aws#1061)
  [huggingface_pytorch] upgrade PyTorch to 1.7.1 (aws#1025)
  Fix for hf canary tests (aws#1057)
  [test][huggingface_tensorflow, huggingface_pytorch] SM local and remote tests (aws#1021)
  hf transformer version update (aws#1060)
  [build] update EIA buildspec and upgrade ruamel_yaml package (aws#1051)
  bump transformer version (aws#1056)
  Including hf images into canary tests (aws#1050)
  fix tf1 neuron buildspec (aws#1048)
  [build,test] Disable dedicated telemetry tests and tags (aws#1045)
  [test][sagemaker] Execute SM local tests in parallel (aws#1027)
jeet4320 added a commit to jeet4320/deep-learning-containers that referenced this pull request Apr 27, 2021
* aws/master:
  [doc] Minor corrections on available_images (aws#1061)
  [huggingface_pytorch] upgrade PyTorch to 1.7.1 (aws#1025)
  Fix for hf canary tests (aws#1057)
  [test][huggingface_tensorflow, huggingface_pytorch] SM local and remote tests (aws#1021)
  hf transformer version update (aws#1060)
  [build] update EIA buildspec and upgrade ruamel_yaml package (aws#1051)
  bump transformer version (aws#1056)
  Including hf images into canary tests (aws#1050)
  fix tf1 neuron buildspec (aws#1048)
  [build,test] Disable dedicated telemetry tests and tags (aws#1045)
  [test][sagemaker] Execute SM local tests in parallel (aws#1027)
  Temporaru disabling sagemaker tests for HF containers. (aws#1042)
  Add automatic yes to prompts for apt (aws#1043)