@namgyu-youn namgyu-youn commented Nov 26, 2025

Summary:
Introduce a new tensor subclass API. The main features are

  • Int8Tensor: Main API, which handles quantization and dequantization operations
  • Utility operation functions: Tensor slice, index selection

This API is integrated into the existing configs (Int8WeightOnlyConfig, Int8DynamicActivationInt8WeightConfig) and is selected through their version setting. It is opt-in and is not the default.
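To make the summary concrete, here is a minimal sketch of what symmetric per-row int8 quantization, dequantization, and row slicing look like. It is written in pure Python on nested lists and is not the Int8Tensor implementation from this PR; the helper names (`quantize_per_row`, `dequantize_per_row`, `slice_rows`) and the per-row symmetric scheme are assumptions chosen for illustration.

```python
from typing import List, Tuple

INT8_MIN, INT8_MAX = -128, 127


def quantize_per_row(weight: List[List[float]]) -> Tuple[List[List[int]], List[float]]:
    """Hypothetical helper: symmetric per-row int8 quantization."""
    qdata, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # Guard against an all-zero row so the division below is safe.
        scale = max_abs / INT8_MAX if max_abs > 0 else 1.0
        # Round to the nearest integer, then clamp into the int8 range.
        qdata.append([max(INT8_MIN, min(INT8_MAX, round(v / scale))) for v in row])
        scales.append(scale)
    return qdata, scales


def dequantize_per_row(qdata: List[List[int]], scales: List[float]) -> List[List[float]]:
    """Hypothetical helper: map int8 values back to floats with the row scale."""
    return [[q * s for q in row] for row, s in zip(qdata, scales)]


def slice_rows(qdata: List[List[int]], scales: List[float], start: int, stop: int):
    """Slicing along rows has to slice the int8 data and the scales together."""
    return qdata[start:stop], scales[start:stop]


weight = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
qdata, scales = quantize_per_row(weight)
restored = dequantize_per_row(qdata, scales)
sliced_q, sliced_s = slice_rows(qdata, scales, 1, 2)
```

The point of the slicing helper is that a quantized tensor carries more than one buffer: any slice or index selection along the quantized dimension has to be applied to the scales as well, otherwise dequantization of the result uses the wrong scale.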

Related Issue/PR: #3241 (reland)

Test plan: pytest -sv test/quantization/quantize_/workflows/int8/test_int8_tensor.py

PERF Test:
https://github.com/pytorch/ao/blob/main/tutorials/quantize_vit/run_vit_b_quant.py with a batch size of 32:

| API | With torch.compile | Without torch.compile |
| --- | --- | --- |
| Old | 65.47 ms | 234.39 ms |
| New | 63.30 ms | 239.30 ms |
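For readers who want the relative differences rather than raw latencies, the short calculation below derives them from the PERF figures above. The variable and function names are illustrative only; the numbers are copied from the PR description and nothing is re-measured.

```python
# Latencies in milliseconds, copied from the PERF figures in the PR description.
latency_ms = {
    "with_compile": {"old": 65.47, "new": 63.30},
    "without_compile": {"old": 234.39, "new": 239.30},
}


def percent_change(old: float, new: float) -> float:
    """Relative change of new vs. old in percent; negative means new is faster."""
    return (new - old) / old * 100.0


changes = {mode: percent_change(v["old"], v["new"]) for mode, v in latency_ms.items()}
for mode, pct in changes.items():
    print(f"{mode}: {pct:+.2f}%")
# prints:
# with_compile: -3.31%
# without_compile: +2.09%
```

In other words, the new path is about 3.3% faster with torch.compile and about 2.1% slower without it on this single benchmark. The PR does not report run-to-run variance, so it is unclear whether either difference is significant.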

Future Plan: #3241 (review)

@pytorch-bot pytorch-bot bot added the ci-no-td label Nov 26, 2025

pytorch-bot bot commented Nov 26, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3391

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e32508c with merge base d355d1f:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Nov 26, 2025
@namgyu-youn
Contributor Author

@pytorchbot label "quantize_"

@pytorch-bot pytorch-bot bot added the quantize_ label Nov 26, 2025
jerryzh168 previously approved these changes Nov 26, 2025
@jerryzh168 jerryzh168 changed the title from "[Reland] Introduce int8 quantization api (version 2)" to "Introduce int8 quantization api (version 2)" Nov 26, 2025
@jerryzh168
Contributor

@namgyu-youn I have confirmed internally that there are some infra issues right now, which is why the CI jobs didn't show up. Let's just wait for that to be resolved.

@jcaip jcaip closed this Dec 1, 2025
@jcaip jcaip reopened this Dec 1, 2025
@pytorch-bot pytorch-bot bot dismissed jerryzh168’s stale review December 1, 2025 22:05

This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.

@jcaip jcaip added the topic: improvement label Dec 2, 2025

namgyu-youn commented Dec 2, 2025

@jerryzh168 Finally! CI started to run, but it is failing even though the tests pass locally on an NVIDIA A100. I assume the CI instance calls different kernels than the ones used on Ampere, but I'm not sure what I should do... can you please help with this? Actually, I didn't understand why compiler, instead of profiler.

@jcaip jcaip merged commit 3c3515a into pytorch:main Dec 2, 2025
23 checks passed

jcaip commented Dec 2, 2025

Thanks for working on this @namgyu-youn!

@namgyu-youn namgyu-youn deleted the int8-reland branch December 3, 2025 02:59