Update quantization overview for XPU #40331

jiqing-feng · 2025-08-21T02:37:10Z

Update quantization overview for XPU.
Keep in draft until optimum-quanto PR 395 is merged.

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

jiqing-feng · 2025-08-21T02:45:02Z

run-slow: aqlm_integration

Rocketknight1 · 2025-08-21T11:13:44Z

cc @IlyasMoutawwakil @MekkCyber

IlyasMoutawwakil · 2025-08-22T08:19:55Z

tests/quantization/ggml/test_ggml.py

        out = model.generate(**text, max_new_tokens=10)

-        EXPECTED_TEXT = "Hello, I am a 20 year old male"
+        EXPECTED_TEXT = "Helloab, I am a 1000"


Is this the expected output on CPU?

I got this output on an Intel Xeon CPU. The new ground truth does not look very reasonable, but other ground truths have the same issue, like here

I don't think it's reasonable; we need to take a look.

Very weird indeed.

OK, I've reverted this change. Will track it in a separate issue.

MekkCyber

LGTM, thanks for the update! Only a small question.

MekkCyber · 2025-08-26T14:57:33Z

tests/quantization/quanto_integration/test_quanto.py

+    EXPECTED_OUTPUTS = [
+        "Hello my name is John, I am a professional photographer, I",  # CUDA output
+        "Hello my name is Nils, I am a student of the University",  # XPU output
+    ]


Is this the only test where outputs differ?

Yes, that's mainly because XPU takes a new path enabled by optimum-quanto PR 395.
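
For readers following the diff above, the "one accepted string per backend" pattern can be checked with a plain membership test. This is only a minimal sketch: `check_generation` is a hypothetical helper, not part of this PR or of the transformers test suite, and the decoded string is hard-coded here instead of coming from a model.

```python
# Accepted generations copied from the diff above; either backend's output passes.
EXPECTED_OUTPUTS = [
    "Hello my name is John, I am a professional photographer, I",  # CUDA output
    "Hello my name is Nils, I am a student of the University",  # XPU output
]


def check_generation(decoded: str, expected_outputs: list[str]) -> bool:
    """Hypothetical helper: True if the decoded text matches any accepted output."""
    return decoded.strip() in expected_outputs


# Stand-in for the decoded model output of an XPU run (assumption: no model is loaded here).
decoded_text = "Hello my name is Nils, I am a student of the University"
assert check_generation(decoded_text, EXPECTED_OUTPUTS)

# The GGUF string questioned earlier in this thread is not in the accepted list.
assert not check_generation("Helloab, I am a 1000", EXPECTED_OUTPUTS)
```

A list keeps the test independent of which device runs it, at the cost of not verifying that each device produces its own expected string.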

github-actions · 2025-08-28T09:42:28Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: aqlm_integration, ggml, quanto_integration
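
To reproduce these jobs locally, the line above can be turned into pytest commands. The sketch below is illustrative only: `slow_test_commands` is a hypothetical helper, and it assumes each job name is a directory under `tests/quantization`. That matches the two test paths quoted in this thread but is not confirmed here for `aqlm_integration`.

```python
import shlex

# Line posted by the bot in this thread.
RUN_SLOW_LINE = "run-slow: aqlm_integration, ggml, quanto_integration"


def slow_test_commands(line: str, root: str = "tests/quantization") -> list[str]:
    """Hypothetical helper: turn a 'run-slow:' line into local pytest commands."""
    prefix = "run-slow:"
    if not line.startswith(prefix):
        raise ValueError(f"expected a line starting with {prefix!r}")
    names = [name.strip() for name in line[len(prefix):].split(",") if name.strip()]
    # RUN_SLOW=1 enables tests marked as slow in the transformers test suite.
    # Assumption: each job name maps to a directory under `root`.
    return [f"RUN_SLOW=1 python -m pytest {shlex.quote(f'{root}/{name}')}" for name in names]


for command in slow_test_commands(RUN_SLOW_LINE):
    print(command)
```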

HuggingFaceDocBuilderDev · 2025-08-28T09:53:36Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

jiqing-feng added 2 commits August 20, 2025 15:47

update xpu quantization overview

c42e0f4

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix aqlm tests

f712d73

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

jiqing-feng added 2 commits August 21, 2025 10:32

fix format

7234431

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

Merge branch 'main' into quant_overview

701dd25

jiqing-feng added 4 commits August 21, 2025 16:13

update gguf support

7ac36b9

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix gguf tests

667f159

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix xpu gguf precision error

cc92fe9

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

Merge branch 'main' into quant_overview

f92e188

jiqing-feng mentioned this pull request Aug 22, 2025

repetitio/flan-t5-small is not longer exists #40335

Closed

4 tasks

IlyasMoutawwakil reviewed Aug 22, 2025

jiqing-feng added 8 commits August 22, 2025 09:01

replace deprecated models

75f62f5

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix import org

251a573

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

update xpu ggml tests

0f1cc66

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

Merge branch 'main' into quant_overview

d7a0ebb

revert wrong change

3f1d33c

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix xpu tests

ceb3d37

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

xpu optimum-quanto goes green

799b19a

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix format

7500c0d

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

MekkCyber approved these changes Aug 26, 2025

Merge branch 'main' into quant_overview

bb67f93

jiqing-feng marked this pull request as ready for review August 27, 2025 01:18

Merge branch 'main' into quant_overview

df9788e

MekkCyber enabled auto-merge (squash) August 28, 2025 09:04

Merge branch 'main' into quant_overview

046f480

MekkCyber merged commit f9b9a5e into huggingface:main Aug 28, 2025
14 checks passed

jiqing-feng deleted the quant_overview branch December 15, 2025 02:11
