[pytorch][PR][Gradient Compression] Reduce the peak memory of fp16 compression provided by ddp comm hook #46078
Conversation
…mpression provided by ddp comm hook The peak memory usage of the DDP comm hook has increased due to an extra copy of the gradient tensors. To reduce the memory usage, decompress the fp16 tensor in place, into the tensor stored in the gradient bucket. Closes #45968. Differential Revision: [D24178118](https://our.internmc.facebook.com/intern/diff/D24178118/) [ghstack-poisoned]
💊 CI failures summary: as of commit 1150400, Dr. CI reports no failures — all checks look good so far.
Codecov Report
    @@            Coverage Diff             @@
    ##   gh/SciPioneer/12/base   #46078   +/- ##
    =============================================
      Coverage      68.27%      68.28%
    =============================================
      Files            410         410
      Lines          53306       53306
    =============================================
    + Hits           36397       36398        +1
    + Misses         16909       16908        -1
Continue to review full report at Codecov.
This pull request has been merged in ee3d3e6.
Stack from ghstack:
The peak memory usage of the DDP comm hook has increased due to an extra copy of the gradient tensors. To reduce the memory usage, decompress the fp16 tensor in place, into the tensor stored in the gradient bucket.
Closes #45968
Differential Revision: D24178118
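The core idea can be sketched with plain tensors. A minimal illustration (not the PR's actual hook, which also performs an async allreduce on the compressed tensor): after the fp16 allreduce result comes back, `copy_` writes the decompressed values directly into the bucket's existing fp32 tensor, rather than materializing a second fp32 tensor via `.float()`, so peak memory holds only the fp16 copy, not an extra fp32 one. The function name and the skipped-allreduce placeholder are illustrative assumptions.

```python
import torch

def fp16_compress_and_decompress_inplace(bucket_tensor: torch.Tensor) -> torch.Tensor:
    # Compress: cast the bucket's fp32 gradients to fp16. This half-precision
    # copy is needed anyway, since communication happens in fp16.
    compressed = bucket_tensor.to(torch.float16)

    # ... in the real hook, an async allreduce on `compressed` happens here ...

    # Decompress IN PLACE: copy the fp16 values back into the bucket's own
    # fp32 tensor, instead of allocating a new fp32 tensor with
    # `compressed.float()`. This avoids the extra full-size copy that raised
    # peak memory.
    bucket_tensor.copy_(compressed)
    return bucket_tensor
```

Because the returned tensor is the bucket's own storage, DDP sees the reduced gradients without any additional allocation at decompression time.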