
Conversation

@Jack-Khuu
Contributor

As titled: originally the a8wxdq load error was shown regardless of whether the quantization scheme was actually used. Tested on Mac (ARM).


Not Using Quant

> python3 torchchat.py generate llama3.1 --device cpu
Using device=cpu Apple M1 Max
Loading model...
Time to load model: 2.24 seconds
-----------------------------------------------------------

Using Quant

> OMP_NUM_THREADS=6 python3 torchchat.py generate llama3.1 --device cpu --dtype float32 --quantize '{"linear:a8wxdq": {"bitwidth": 4, "groupsize": 256, "has_weight_zeros": false}}'

w/o Loading

Using device=cpu Apple M1 Max
Loading model...
Time to load model: 2.74 seconds
Quantizing the model with: {'linear:a8wxdq': {'bitwidth': 4, 'groupsize': 256, 'has_weight_zeros': False}}
Time to quantize model: 0.00 seconds
Traceback (most recent call last):
  File "/Users/jackkhuu/Desktop/oss/torchchat/torchchat.py", line 83, in <module>
    generate_main(args)
  File "/Users/jackkhuu/Desktop/oss/torchchat/torchchat/generate.py", line 1093, in main
    gen = Generator(
          ^^^^^^^^^^
  File "/Users/jackkhuu/Desktop/oss/torchchat/torchchat/generate.py", line 284, in __init__
    self.model = _initialize_model(self.builder_args, self.quantize, self.tokenizer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jackkhuu/Desktop/oss/torchchat/torchchat/cli/builder.py", line 568, in _initialize_model
    quantize_model(
  File "/Users/jackkhuu/Desktop/oss/torchchat/torchchat/utils/quantize.py", line 84, in quantize_model
    raise Exception(f"Note: Failed to load torchao experimental a8wxdq quantizer with error: {a8wxdq_load_error}")
Exception: Note: Failed to load torchao experimental a8wxdq quantizer with error: [Errno 2] No such file or directory: '/Users/jackkhuu/Desktop/oss/torchchat/torchao-build/src/ao/torchao/experimental/quant_api.py'

w/ Loading

Using device=cpu Apple M1 Max
Loading model...
Time to load model: 2.98 seconds
Quantizing the model with: {'linear:a8wxdq': {'bitwidth': 4, 'groupsize': 256, 'has_weight_zeros': False}}
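For reviewers skimming the diff, the behavior above boils down to the usual defer-the-import-error pattern: capture the exception when the optional a8wxdq quantizer fails to import, and re-raise it inside quantize_model only when `linear:a8wxdq` appears in the `--quantize` config. A minimal sketch of that pattern (the import path and the `quantize_model` signature here are assumptions for illustration, not the verbatim torchchat source):

```python
# Illustrative sketch, not the exact torchchat code: record the optional
# import failure up front instead of raising immediately.
a8wxdq_load_error = None
try:
    # torchchat loads this quantizer from a torchao source checkout; the
    # exact import path/symbol is an assumption for this sketch.
    from torchao.experimental.quant_api import Int8DynActIntxWeightLinearQuantizer
except Exception as e:
    a8wxdq_load_error = e


def quantize_model(model, device, quantize_options: dict):
    """Apply each quantization scheme requested via --quantize."""
    for scheme, q_kwargs in quantize_options.items():
        if scheme == "linear:a8wxdq" and a8wxdq_load_error is not None:
            # Surface the load error only now, when the scheme is used.
            raise Exception(
                "Note: Failed to load torchao experimental a8wxdq "
                f"quantizer with error: {a8wxdq_load_error}"
            )
        # ... dispatch to the scheme's quantizer with q_kwargs
        #     (e.g. bitwidth, groupsize, has_weight_zeros)
```

With this structure, a plain `python3 torchchat.py generate llama3.1 --device cpu` run never sees the message, matching the "Not Using Quant" output above, while the quantized invocation still fails loudly when the torchao experimental build is missing.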


pytorch-bot bot commented Sep 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1231

Note: Links to docs will display an error until the doc builds have completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

✅ No Failures

As of commit 6e1cdd5 with merge base 8278aa2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Sep 29, 2024
@Jack-Khuu requested a review from byjlw on September 29, 2024 22:23
@Jack-Khuu merged commit 9bbbc87 into main on Sep 30, 2024
52 checks passed
metascroy pushed a commit that referenced this pull request Sep 30, 2024
* Show a8wxdq load error only when the quant is used

* Update Error check
@Jack-Khuu deleted the a8wx-message branch on October 5, 2024 02:37