[Quant][PT2E] Enable qconv for quantization 2.0 export #104580

leslie-fang-intel
Collaborator

@leslie-fang-intel leslie-fang-intel commented Jul 4, 2023

Stack from ghstack (oldest at bottom):

Summary
Enable the qconv1d/2d/3d, qconv2d_relu, qconv2d_add, and qconv2d_add_relu operators for quantization 2.0 export with the oneDNN library.

Test Plan

python -u -m pytest -s -v test_quantized_op.py -k test_qconv1d_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv3d_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_relu_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_add_pt2e
python -u -m pytest -s -v test_quantized_op.py -k test_qconv2d_add_relu_pt2e

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

@pytorch-bot

pytorch-bot bot commented Jul 4, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/104580

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e67e9bb with merge base 97a291f (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions github-actions bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Jul 4, 2023
leslie-fang-intel added a commit that referenced this pull request Jul 4, 2023
ghstack-source-id: e61ce96dea3554d80afba1785a11634c036482b7
Pull Request resolved: #104580
@leslie-fang-intel leslie-fang-intel marked this pull request as draft July 4, 2023 06:51
@leslie-fang-intel leslie-fang-intel changed the title Enable qconv for quantization 2.0 export [Quant][PT2E] Enable qconv for quantization 2.0 export Jul 4, 2023
leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jul 4, 2023
ghstack-source-id: 3b60dda2f7fdcf0fbda54936efbb615dab5e212f
Pull Request resolved: pytorch#104580
aten/src/ATen/native/quantized/library.cpp (outdated, resolved)
aten/src/ATen/native/quantized/cpu/qconv_prepack.cpp (outdated, resolved)
aten/src/ATen/native/quantized/cpu/qconv.cpp (outdated, resolved)
aten/src/ATen/native/quantized/cpu/qconv.cpp (outdated, resolved)
aten/src/ATen/native/quantized/cpu/qconv.cpp (outdated, resolved)
@leslie-fang-intel leslie-fang-intel added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 5, 2023
@kimishpatel
Contributor

A few questions.

  • If this is specific to oneDNN transforms, does it have to be an op in the quantized ATen lib? I can imagine it living in a oneDNN-specific quantized op lib instead.
  • I don't see any tests.
  • Is this mostly a copy-paste of another implementation? If so, it is probably better to refactor.
  • I imagine you would have some pass that converts a PT2E quantized model by replacing ops with quantized ops?

In general though, I think we should aim to make this a custom op rather than an op in the quantized ATen lib. cc: @jerryzh168

leslie-fang-intel added a commit to leslie-fang-intel/pytorch that referenced this pull request Jul 6, 2023
ghstack-source-id: 3b60dda2f7fdcf0fbda54936efbb615dab5e212f
Pull Request resolved: pytorch#104580
@leslie-fang-intel
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
…r constant folding (#104581)

**Summary**
Enable quantization conv weight prepack inside inductor constant folding.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_unary
```

Pull Request resolved: #104581
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580
pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
…ctor (#104588)

**Summary**
Enable the `dequant-conv2d-quant` pattern fusion and lowering inside inductor.

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_unary
```

Pull Request resolved: #104588
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581
pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
**Summary**
Enable the `dequant pattern` promotion pass in inductor. The qconv weight prepack pass matches the `dequant->conv2d` pattern, and that match fails if the `dequant pattern` has multiple user nodes.
Take the following graph as an example:
```
        conv1
       /     \
   conv2    conv3
```
After the quantization flow, the generated pattern is:
```
      dequant1
          |
        conv1
          |
        quant2
          |
       dequant2
       /     \
   conv2    conv3
```
We need to duplicate `dequant2` into `dequant2` and `dequant3` so that the `dequant2->conv2` and `dequant3->conv3` patterns can each be matched.
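
The duplication step can be illustrated with a toy graph. This is only a sketch in plain Python, not the inductor pass itself: the dict-based graph format, the `promote_dequant` helper, and the `_copy1` naming are all hypothetical and chosen for illustration (the copy plays the role of `dequant3` above).

```python
def promote_dequant(graph):
    """Return a copy of `graph` in which every dequant node has at most one user.

    `graph` maps node name -> (op, [input names]) and is assumed to be listed
    in topological order (producers before consumers).
    """
    # Record the distinct users of every node.
    users = {name: [] for name in graph}
    for name, (_, inputs) in graph.items():
        for inp in inputs:
            if name not in users[inp]:
                users[inp].append(name)

    promoted = {}
    rewired = {}  # (user, original dequant) -> private copy of that dequant
    for name, (op, inputs) in graph.items():
        # Point this node at its private dequant copy, if one was made.
        new_inputs = [rewired.get((name, inp), inp) for inp in inputs]
        promoted[name] = (op, new_inputs)
        if op == "dequant" and len(users[name]) > 1:
            # The first user keeps the original; every other user gets a copy.
            for i, user in enumerate(users[name][1:], start=1):
                copy_name = f"{name}_copy{i}"
                promoted[copy_name] = (op, list(new_inputs))
                rewired[(user, name)] = copy_name
    return promoted


# The quantized graph from the example above (hypothetical encoding).
graph = {
    "x": ("input", []),
    "dequant1": ("dequant", ["x"]),
    "conv1": ("conv2d", ["dequant1"]),
    "quant2": ("quant", ["conv1"]),
    "dequant2": ("dequant", ["quant2"]),
    "conv2": ("conv2d", ["dequant2"]),
    "conv3": ("conv2d", ["dequant2"]),
}

promoted = promote_dequant(graph)
print(promoted["conv2"])  # ('conv2d', ['dequant2'])
print(promoted["conv3"])  # ('conv2d', ['dequant2_copy1'])
```

After the rewrite each `dequant` feeds exactly one `conv2d`, so a single-user `dequant->conv2d` matcher can fire once per convolution.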

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_dequant_promotion
```

Pull Request resolved: #104590
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588
pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
… inside inductor (#105455)

**Summary**
Enable the `dequant-conv2d-unary_postop(relu)-quant` pattern fusion and lowering inside inductor.
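
As a rough illustration of what fusing such a pattern means, the sketch below replaces a matching run of ops in a flat op list with one fused op. It is a toy under stated assumptions: the real pass works on graphs through inductor's pattern matcher, and the `fuse_chain` helper and the list-of-strings encoding are hypothetical. The fused name `qconv2d_relu` is taken from the operator list in #104580.

```python
def fuse_chain(ops, pattern, fused_name):
    """Replace every occurrence of `pattern` in the linear op list `ops`."""
    out = []
    i = 0
    while i < len(ops):
        if ops[i:i + len(pattern)] == pattern:
            # The whole matched run collapses into a single fused op.
            out.append(fused_name)
            i += len(pattern)
        else:
            out.append(ops[i])
            i += 1
    return out


ops = ["quant", "dequant", "conv2d", "relu", "quant", "dequant"]
fused = fuse_chain(ops, ["dequant", "conv2d", "relu", "quant"], "qconv2d_relu")
print(fused)  # ['quant', 'qconv2d_relu', 'dequant']
```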

**Test Plan**
```
clear && python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_unary
```

Pull Request resolved: #105455
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590
pytorchmergebot pushed a commit that referenced this pull request Aug 25, 2023
…rn fusion inside inductor (#105456)

**Summary**
Enable the `dequant-conv2d-binary_postop(add)-unary_postop(relu)-quant` pattern fusion and lowering inside inductor.

**Test Plan**
```
clear && python -m pytest test_mkldnn_pattern_matcher.py -k test_qconv2d_binary
```

Pull Request resolved: #105456
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
…ol2d) (#105639)

**Summary**
This PR mainly enables two things:

- Enable the skeleton of the quantization recipe for single quantizable operators in `X86InductorQuantizer`.
- Add the quantization recipe for `maxpool2d` and annotate it so that its input and output share an observer.

**Test Plan**
```
python -m pytest test_x86inductor_quantizer.py -k test_maxpool2d_recipe
```

Pull Request resolved: #105639
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
**Summary**
Enable matching of the `dq-maxpool2d-q` pattern and lower it into `torch.ops.quantized.max_pool2d`.
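
One way to see why this lowering is numerically reasonable: when the input and output share the same scale and zero point (as the recipe in #105639 annotates), dequantization with a positive scale is monotonic, so pooling the integer values gives the same result as `dq -> maxpool2d -> q`. The sketch below checks this on a small example in plain Python. The helper names, the non-overlapping pooling, and the uint8 range are assumptions made for illustration, not the PyTorch kernels.

```python
def quantize(x, scale, zero_point, qmin=0, qmax=255):
    # Affine quantization with clamping to the assumed uint8 range.
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale


def max_pool2d(img, k):
    """Non-overlapping k x k max pooling over a list-of-lists image."""
    rows, cols = len(img), len(img[0])
    return [
        [max(img[r + dr][c + dc] for dr in range(k) for dc in range(k))
         for c in range(0, cols - k + 1, k)]
        for r in range(0, rows - k + 1, k)
    ]


scale, zero_point = 0.1, 128
q_in = [[120, 131, 140, 90],
        [128, 135, 100, 255],
        [0, 7, 200, 201],
        [64, 3, 199, 150]]

# Reference path: dq -> maxpool2d -> q, with shared scale and zero point.
fp = [[dequantize(q, scale, zero_point) for q in row] for row in q_in]
reference = [[quantize(v, scale, zero_point) for v in row]
             for row in max_pool2d(fp, 2)]

# Lowered path: pool the integer values directly.
direct = max_pool2d(q_in, 2)

print(direct)               # [[135, 255], [64, 201]]
print(reference == direct)  # True
```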

**Test Plan**
```
python -m pytest test_mkldnn_pattern_matcher.py -k test_qmaxpool2d
python -m pytest test_quantized_op.py -k test_max_pool2d_pt2e
```

Pull Request resolved: #105906
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
**Summary**
After the oneDNN 3.1 upgrade, the weight scale reciprocal calculation is no longer needed. This PR removes the redundant reciprocal calculation to optimize QConv performance, using the IDeep version API to implement the change:

- This QConv implementation is expected to work functionally with both the current IDeep version and the upcoming IDeep upgrade in PR #107565.
- With the IDeep upgrade in PR #107565, QConv performs better because the redundant reciprocal calculation is removed.
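
The version-gated behaviour can be sketched as follows. This is a hypothetical Python illustration, not the C++ code in this PR: the function name and the `ideep_revision` parameter are invented, and which revision expects pre-computed reciprocals is an assumption inferred from the description, not something the PR text states.

```python
def weight_scales_for_qconv(weight_scales, ideep_revision):
    """Return weight scales in the form the backend is assumed to expect."""
    if ideep_revision >= 1:
        # Assumed upgraded behaviour: scales are passed through unchanged.
        return list(weight_scales)
    # Assumed legacy behaviour: the caller pre-computes the reciprocals.
    return [1.0 / s for s in weight_scales]


legacy = weight_scales_for_qconv([0.5, 0.25], ideep_revision=0)
upgraded = weight_scales_for_qconv([0.5, 0.25], ideep_revision=1)
print(legacy)    # [2.0, 4.0]
print(upgraded)  # [0.5, 0.25]
```

Gating on the library revision in this way is what would let one code path stay functionally correct before and after the IDeep upgrade.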

Pull Request resolved: #105996
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906
pytorchmergebot pushed a commit that referenced this pull request Aug 26, 2023
…ht scale reciprocal calculation (#107565)

**Summary**
Upgrade IDeep to include one IDeep change, IDeep PR intel/ideep#226.

- IDeep PR intel/ideep#226 does two things:

  - Remove the redundant QConv weight scale reciprocal calculation.
  - Bump `IDEEP_VERSION_REVISION` from 0 to 1.

  So only QConv-related calculations are affected, and #105996 already uses the IDeep version API to make the corresponding change in PyTorch.

Pull Request resolved: #107565
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906, #105996
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
Pull Request resolved: #104580
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
…r constant folding (#104581)

Pull Request resolved: #104581
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
…ctor (#104588)

Pull Request resolved: #104588
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
Pull Request resolved: #104590
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
… inside inductor (#105455)

Pull Request resolved: #105455
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
…rn fusion inside inductor (#105456)

Pull Request resolved: #105456
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
…ol2d) (#105639)

Pull Request resolved: #105639
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
Pull Request resolved: #105906
Approved by: https://github.com/jgong5, https://github.com/eellison
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
Pull Request resolved: #105996
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906
voznesenskym pushed a commit that referenced this pull request Aug 27, 2023
…ht scale reciprocal calculation (#107565)

Pull Request resolved: #107565
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
ghstack dependencies: #104580, #104581, #104588, #104590, #105455, #105456, #105639, #105906, #105996
@facebook-github-bot facebook-github-bot deleted the gh/leslie-fang-intel/51/head branch August 29, 2023 14:16
Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged module: cpu CPU specific problem (e.g., perf, algorithm) open source release notes: quantization release notes category

6 participants