This repository was archived by the owner on Jun 3, 2025. It is now read-only.

@anmarques
Member

  1. Remove quantization of the identity branch in BERT models.
  2. Replace NumPy-based array quantization with torch.quantize_per_tensor.

NOTE: This PR does NOT remove quantization of the identity branch for ResNet models. That change also requires fixes on the model side and will be addressed in a future PR.
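
For readers unfamiliar with the second item, here is a minimal sketch of what swapping a NumPy quantization routine for torch.quantize_per_tensor can look like. It is not the code from this PR: the helper names `quantize_numpy` and `quantize_torch`, the uint8 target range, and the sample scale and zero point are all assumptions chosen for illustration.

```python
import numpy as np
import torch


def quantize_numpy(arr, scale, zero_point):
    """Hypothetical NumPy-only quantization to uint8 (illustration only)."""
    # Scale, round to the nearest integer, shift by the zero point,
    # then clip to the representable uint8 range.
    q = np.round(arr / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)


def quantize_torch(arr, scale, zero_point):
    """Hypothetical equivalent that delegates to torch.quantize_per_tensor."""
    tensor = torch.from_numpy(arr.astype(np.float32))
    # quint8 is PyTorch's unsigned 8-bit quantized dtype.
    quantized = torch.quantize_per_tensor(tensor, scale, zero_point, torch.quint8)
    # int_repr() exposes the underlying uint8 values of the quantized tensor.
    return quantized.int_repr().numpy()


# Sample values chosen for illustration; two of them fall outside the
# representable range to exercise saturation at 0 and 255.
weights = np.array([-2.0, -1.0, -0.5, 0.0, 0.26, 1.0, 2.0], dtype=np.float32)
scale, zero_point = 0.01, 128

numpy_result = quantize_numpy(weights, scale, zero_point)
torch_result = quantize_torch(weights, scale, zero_point)
print(numpy_result)
print(torch_result)
```

One possible motivation for delegating to PyTorch is that the exported values then follow the same rounding and saturation behavior as the quantized tensors PyTorch itself produces, rather than a separately maintained reimplementation.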

@github-actions

github-actions bot commented Jul 8, 2022

@kylesayrs assigned for review

@anmarques anmarques requested review from a team, bfineran and natuan July 8, 2022 21:39
Contributor

@bfineran bfineran left a comment

LGTM pending comments

…export.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
natuan previously approved these changes Jul 8, 2022
Contributor

@natuan natuan left a comment

LGTM, pending Ben's suggestion to delete the unused func

@anmarques anmarques requested a review from natuan July 8, 2022 21:51
@natuan natuan merged commit adb8429 into main Jul 8, 2022
@natuan natuan deleted the fix_onnx_export_bert branch July 8, 2022 22:05
anmarques added a commit that referenced this pull request Jul 11, 2022
* Remove quantization of identity branch on BERT models

* Style and quality fixes.

* Update src/sparseml/pytorch/sparsification/quantization/quantize_qat_export.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* Removed unused function

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>
markurtz pushed a commit that referenced this pull request Jul 12, 2022
* Bump up version id

* Fix for ONNX export for quantized BERT models (#935)

* Remove quantization of identity branch on BERT models

* Style and quality fixes.

* Update src/sparseml/pytorch/sparsification/quantization/quantize_qat_export.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* Removed unused function

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

3 participants