
Conversation

jerryzh168
Contributor

Fixes for the int4 quantizers (GPTQ and non-GPTQ) and int8 weight-only quantization

HDCharles and others added 3 commits April 3, 2024 22:09
Summary: int4weightlinear had a bug that caused it not to pad when it should have.

Test Plan: python test/quantization/test_quant_api.py -k "int4wo"
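
To make the padding condition concrete, here is a minimal sketch of the intended behavior, assuming alignment constraints like those used by the int4 kernels; `maybe_pad_weight`, the default `groupsize`/`inner_k_tiles` values, and the exact alignment check are illustrative, not the literal torchao code.

```python
import torch
import torch.nn.functional as F

def find_multiple(n: int, k: int) -> int:
    # Smallest multiple of k that is >= n.
    return n if n % k == 0 else n + k - (n % k)

def maybe_pad_weight(weight: torch.Tensor,
                     groupsize: int = 128,
                     inner_k_tiles: int = 8) -> torch.Tensor:
    # Hypothetical helper: pad in_features (the last dim) only when the
    # int4 kernel's alignment constraints are not already satisfied.
    in_features = weight.shape[-1]
    if in_features % groupsize == 0 and in_features % (inner_k_tiles * 16) == 0:
        return weight  # already aligned, no padding needed
    padded = find_multiple(in_features, 1024)
    return F.pad(weight, (0, padded - in_features))
```

The bug fixed here was the opposite failure mode: the early-return path was taken even when the shape was misaligned, so the kernel received an unpadded weight.
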
* fixing bug in GPTQ

Summary: the shape was always padded, even when padding was not needed.

Test Plan: python test/quantization/test_quant_api.py -k "test_gptq_quantizer_int4wo"
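
This is the mirror image of the previous fix: the GPTQ path padded unconditionally instead of checking first. A sketch of the fixed decision, under the same illustrative assumptions as above:

```python
def find_multiple(n: int, k: int) -> int:
    # Same helper as in the sketch above: smallest multiple of k >= n.
    return n if n % k == 0 else n + k - (n % k)

def padded_in_features(in_features: int,
                       groupsize: int = 128,
                       inner_k_tiles: int = 8) -> int:
    # Hypothetical helper. Before the fix, the GPTQ path effectively did
    #     return find_multiple(in_features, 1024)   # pads unconditionally
    # The fix pads only when the alignment check actually fails.
    if in_features % groupsize != 0 or in_features % (inner_k_tiles * 16) != 0:
        return find_multiple(in_features, 1024)
    return in_features
```
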

* removing extra spaces

Summary: registering fields as buffers so they get picked up by `model.to`.

Test Plan: python test/quantization/test_quant_api.py -k test_int8_wo_quant_save_load
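
For context, registering the quantization tensors as buffers (rather than plain attributes) is what makes `model.to(...)` and `state_dict()` see them, which is exactly what the save/load test exercises. A minimal sketch; the class name and tensor layout are illustrative, not the exact torchao module:

```python
import torch
import torch.nn.functional as F

class Int8WeightOnlyLinear(torch.nn.Module):  # hypothetical name
    def __init__(self, int_weight: torch.Tensor, scales: torch.Tensor):
        super().__init__()
        # Buffers move with model.to(device) and are saved/loaded via
        # state_dict; plain attributes (self.int_weight = int_weight)
        # would be silently left behind on device/dtype moves and
        # dropped from checkpoints.
        self.register_buffer("int_weight", int_weight)  # int8, (out, in)
        self.register_buffer("scales", scales)          # per-row, (out, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Weight-only quantization: dequantize the weight, leave activations as-is.
        return F.linear(x, self.int_weight.to(x.dtype) * self.scales)

lin = Int8WeightOnlyLinear(
    torch.randint(-128, 127, (4, 8), dtype=torch.int8),
    torch.rand(4, 1),
)
print(list(lin.state_dict().keys()))  # ['int_weight', 'scales']
lin.to(torch.float16)  # scales cast to fp16; the int8 weight buffer stays int8
```
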
jerryzh168 merged commit e25c79a into release/v0.1 on Apr 4, 2024
yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request on Dec 9, 2024:
Add information about params-path to README, update spelling of torchat