memory efficient fq: use it everywhere, delete the old version #51159

vkuzo · 2021-01-27T00:13:41Z

Stack from ghstack:

- fake_quant: more memory efficient per-channel backward #51255
- memory efficient fq: use it everywhere, delete the old version #51159
- fake_quant: add a more memory efficient backward #50561

Summary:

This PR is the cleanup after #50561. At a high level, we make the new
definition of fake_quant the one used by autograd, but keep the old
function around as a thin wrapper so that the user-facing API stays the same.

In detail:

1. point the implementation of `fake_quantize_per_tensor_affine` to `fake_quantize_per_tensor_affine_cachemask`
2. delete the `fake_quantize_per_tensor_affine` backward; autograd will automatically use the cachemask backward
3. delete all the `fake_quantize_per_tensor_affine` kernels, since they are no longer used by anything
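
For readers unfamiliar with the pattern, here is a rough pure-Python sketch of the wrapper arrangement described above. It is not the actual ATen code: the function names, the list-based tensors, and the exact rounding and clamping are illustrative assumptions. The only ideas it tries to convey are that the cachemask forward returns a mask alongside the output, the public function is a thin wrapper that drops the mask, and the backward only needs the mask.

```python
from typing import List, Tuple


def fake_quantize_cachemask(
    xs: List[float], scale: float, zero_point: int, quant_min: int, quant_max: int
) -> Tuple[List[float], List[bool]]:
    """Hypothetical stand-in for the cachemask forward: returns (output, mask)."""
    out: List[float] = []
    mask: List[bool] = []
    for x in xs:
        # Quantize: scale, round to nearest, shift by the zero point.
        q = round(x / scale) + zero_point
        # Remember whether the value landed inside the quantized range.
        mask.append(quant_min <= q <= quant_max)
        # Clamp, then dequantize back to floating point.
        q = min(max(q, quant_min), quant_max)
        out.append((q - zero_point) * scale)
    return out, mask


def fake_quantize(
    xs: List[float], scale: float, zero_point: int, quant_min: int, quant_max: int
) -> List[float]:
    """Thin wrapper that keeps the old single-output signature."""
    out, _mask = fake_quantize_cachemask(xs, scale, zero_point, quant_min, quant_max)
    return out


def fake_quantize_cachemask_backward(
    grad_out: List[float], mask: List[bool]
) -> List[float]:
    """Backward needs only the cached mask: pass gradients through where in range."""
    return [g if m else 0.0 for g, m in zip(grad_out, mask)]


xs = [0.0, 0.6, 100.0, -100.0]
out, mask = fake_quantize_cachemask(xs, scale=0.5, zero_point=0, quant_min=-128, quant_max=127)
grad_in = fake_quantize_cachemask_backward([1.0] * len(xs), mask)
```

In this sketch the old backward and the old kernels have nothing left to do, which is the motivation for steps 2 and 3: only the mask-producing forward and the mask-consuming backward remain.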

Test Plan:

python test/test_quantization.py TestFakeQuantize

Performance testing was done in the previous PR (#50561).

Differential Revision: D26090869

facebook-github-bot · 2021-01-27T00:13:56Z

💊 CI failures summary and remediations

As of commit 6f97b53 (more details on the Dr. CI page):

1/1 failures possibly* introduced in this PR
- 1/1 non-CircleCI failure(s)

codecov · 2021-01-27T23:10:04Z

Codecov Report

Merging #51159 (6f97b53) into gh/vkuzo/214/base (194be25) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@                Coverage Diff                 @@
##           gh/vkuzo/214/base   #51159   +/-   ##
==================================================
  Coverage              80.88%   80.88%           
==================================================
  Files                   1931     1931           
  Lines                 210604   210587   -17     
==================================================
- Hits                  170343   170339    -4     
+ Misses                 40261    40248   -13

facebook-github-bot · 2021-01-28T03:39:47Z

This pull request has been merged in 0335222.

#51265) Summary: Pull Request resolved: #51265

This PR is the cleanup after #51159. High level, we make the new definition of fake_quant per channel be the definition used by autograd, but keep the old function around as a thin wrapper to keep the user facing API the same.

In detail:

1. point fake_quantize_per_channel_affine's implementation to be fake_quantize_per_channel_affine_cachemask
2. delete the fake_quantize_per_channel_affine backward, autograd will automatically use the cachemask backward
3. delete all the fake_quantize_per_channel_affine kernels, since they are no longer used by anything

Test Plan:

```
python test/test_quantization.py TestFakeQuantize
```

Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26120957

fbshipit-source-id: 264426435fabd925decf6d1f0aa79275977ea29b

vkuzo mentioned this pull request Jan 27, 2021

fake_quant: add a more memory efficient backward #50561

Closed

facebook-github-bot added the cla signed label Jan 27, 2021

vkuzo requested review from raghuramank100 and jerryzh168 January 27, 2021 00:14

jerryzh168 approved these changes Jan 27, 2021

View reviewed changes

vkuzo mentioned this pull request Jan 28, 2021

fake_quant: more memory efficient per-channel backward #51255

Closed

facebook-github-bot closed this in 0335222 Jan 28, 2021

facebook-github-bot added the Merged label Jan 28, 2021

vkuzo mentioned this pull request Jan 28, 2021

memory efficient per-channel fq: use it everywhere, delete old version #51265

Closed

facebook-github-bot deleted the gh/vkuzo/214/head branch January 31, 2021 15:18
