
memory efficient fq: use it everywhere, delete the old version #51159

Closed
wants to merge 2 commits

Conversation

@vkuzo (Contributor) commented Jan 27, 2021

Stack from ghstack:

Summary:

This PR is the cleanup after #50561. At a high level, we make the new definition of fake_quant the one used by autograd, but keep the old function around as a thin wrapper so the user-facing API stays the same.

In detail:

1. point `fake_quantize_per_tensor_affine`'s implementation at `fake_quantize_per_tensor_affine_cachemask`
2. delete the `fake_quantize_per_tensor_affine` backward; autograd will automatically use the cachemask backward
3. delete all the `fake_quantize_per_tensor_affine` kernels, since they are no longer used by anything

Test Plan:

```
python test/test_quantization.py TestFakeQuantize
```

Performance testing was done in the previous PR.

Differential Revision: D26090869
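The cachemask idea can be sketched in a few lines of Python: the forward pass saves only a boolean mask recording which elements landed inside the quantized range, and the backward pass multiplies the incoming gradient by that mask. This is an illustrative sketch of the pattern, not PyTorch's actual C++ implementation; the class name and signature below are hypothetical.

```python
import torch

class FakeQuantCachemask(torch.autograd.Function):
    """Illustrative sketch of the cachemask fake-quant pattern (hypothetical,
    not PyTorch's real kernel). Forward caches a boolean mask; backward is a
    single masked multiply instead of recomputing the quantization bounds."""

    @staticmethod
    def forward(ctx, x, scale, zero_point, quant_min, quant_max):
        q = torch.round(x / scale) + zero_point
        # In-range elements pass gradients through; clamped ones get zero grad.
        mask = (q >= quant_min) & (q <= quant_max)
        y = (torch.clamp(q, quant_min, quant_max) - zero_point) * scale
        ctx.save_for_backward(mask)
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (mask,) = ctx.saved_tensors
        # Straight-through estimator: grad flows only where the value
        # was not clamped; scale/zero_point/bounds get no gradient here.
        return grad_out * mask, None, None, None, None

x = torch.tensor([0.5, 100.0], requires_grad=True)
y = FakeQuantCachemask.apply(x, 0.1, 0, -128, 127)
y.sum().backward()
# 0.5 quantizes in range (grad passes); 100.0 clamps at quant_max (grad is 0).
```

The memory win is that only a boolean mask is kept alive for the backward, rather than the inputs needed to re-derive the clamping decision.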

vkuzo added a commit that referenced this pull request Jan 27, 2021
ghstack-source-id: 1adb5b962a25dd5f89c035e7855fb8eb28bb1706
Pull Request resolved: #51159
@facebook-github-bot (Contributor) commented Jan 27, 2021

💊 CI failures summary and remediations

As of commit 6f97b53 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)


vkuzo added a commit that referenced this pull request Jan 27, 2021
ghstack-source-id: de70258e9950f1e7a401c52b7fa2082390319690
Pull Request resolved: #51159
@codecov bot commented Jan 27, 2021

Codecov Report

Merging #51159 (6f97b53) into gh/vkuzo/214/base (194be25) will increase coverage by 0.00%.
The diff coverage is 100.00%.

```
@@                Coverage Diff                 @@
##           gh/vkuzo/214/base   #51159   +/-   ##
==================================================
  Coverage              80.88%   80.88%
==================================================
  Files                   1931     1931
  Lines                 210604   210587   -17
==================================================
- Hits                  170343   170339    -4
+ Misses                 40261    40248   -13
```

@facebook-github-bot (Contributor)

This pull request has been merged in 0335222.

vkuzo added a commit that referenced this pull request Jan 28, 2021
Summary:

This PR is the cleanup after #51159. At a high level, we make the new per-channel definition of fake_quant the one used by autograd, but keep the old function around as a thin wrapper so the user-facing API stays the same.

In detail:

1. point `fake_quantize_per_channel_affine`'s implementation at `fake_quantize_per_channel_affine_cachemask`
2. delete the `fake_quantize_per_channel_affine` backward; autograd will automatically use the cachemask backward
3. delete all the `fake_quantize_per_channel_affine` kernels, since they are no longer used by anything

Test Plan:

```
python test/test_quantization.py TestFakeQuantize
```
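The per-channel variant follows the same cachemask pattern, except that scale and zero_point are per-channel tensors broadcast along the channel axis. A minimal illustrative sketch follows; the function name and signature are hypothetical, not PyTorch's actual kernel.

```python
import torch

def fake_quant_per_channel_cachemask(x, scales, zero_points, axis, qmin, qmax):
    """Hypothetical sketch of per-channel fake quant with a cached mask.
    Each slice along `axis` uses its own scale/zero_point; the returned
    mask is all the backward pass would need."""
    # Reshape per-channel parameters so they broadcast along `axis`.
    shape = [1] * x.dim()
    shape[axis] = -1
    s = scales.reshape(shape)
    zp = zero_points.reshape(shape)
    q = torch.round(x / s) + zp
    # Record which elements fell inside the quantized range.
    mask = (q >= qmin) & (q <= qmax)
    y = (torch.clamp(q, qmin, qmax) - zp) * s
    return y, mask

# Channel 0 value is in range; channel 1 value clamps at qmax.
x = torch.tensor([[0.5], [100.0]])
scales = torch.tensor([0.1, 0.5])
zero_points = torch.tensor([0.0, 0.0])
y, mask = fake_quant_per_channel_cachemask(x, scales, zero_points, 0, -128, 127)
```

As in the per-tensor case, a custom backward would simply multiply the incoming gradient by `mask`, so nothing heavier than a boolean tensor has to be saved between the passes.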
vkuzo added a commit that referenced this pull request Jan 28, 2021
vkuzo added a commit that referenced this pull request Jan 28, 2021
ghstack-source-id: 098b9cbf00efcd7d911d3d1df008e3be720429a8
Pull Request resolved: #51265
facebook-github-bot pushed a commit that referenced this pull request Jan 29, 2021
Pull Request resolved: #51265
Imported from OSS

Reviewed By: jerryzh168

Differential Revision: D26120957

fbshipit-source-id: 264426435fabd925decf6d1f0aa79275977ea29b
@facebook-github-bot deleted the gh/vkuzo/214/head branch January 31, 2021 15:18
3 participants