
[Quantization] Add metal quantization for MPS devices! #43934

Merged
SunMarc merged 23 commits into main from add-mlx-quantization
Feb 27, 2026
Conversation

@MekkCyber
Contributor

What does this PR do?

Adds mlx quantization for MPS devices, leveraging the kernels library for pre-built kernels!
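For context, the scheme discussed in this PR is group-wise affine quantization: each group of weights gets a float scale and bias (zero offset) plus low-bit integer codes. The sketch below is a minimal pure-Python illustration of that idea only; the function names, group size, and packing are hypothetical and are not the kernels-library or Metal API.

```python
# Illustrative sketch of group-wise affine quantization (hypothetical names,
# not the kernels-library API). Real Metal kernels also bit-pack the codes.

def affine_quantize(weights, group_size=4, bits=4):
    """Quantize a flat list of floats, one (scale, bias) pair per group."""
    levels = (1 << bits) - 1  # e.g. 15 representable steps for 4-bit
    qweights, scales, biases = [], [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / levels if hi > lo else 1.0
        qweights.extend(round((x - lo) / scale) for x in group)
        scales.append(scale)
        biases.append(lo)  # the "bias" is the per-group zero offset
    return qweights, scales, biases

def affine_dequantize(qweights, scales, biases, group_size=4):
    """Reconstruct approximate floats: x ~ q * scale + bias."""
    return [q * scales[i // group_size] + biases[i // group_size]
            for i, q in enumerate(qweights)]

w = [0.1, -0.4, 0.25, 0.9, -1.2, 0.0, 0.33, 0.7]
q, s, b = affine_quantize(w)
w_hat = affine_dequantize(q, s, b)
max_err = max(abs(a - c) for a, c in zip(w, w_hat))  # bounded by max(scale)/2
```

The reconstruction error is bounded by half a quantization step per group, which is why per-group (rather than per-tensor) scales are used.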

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker ArthurZucker left a comment


Super nice, missing some tests tho!!

@SunMarc SunMarc self-requested a review February 17, 2026 12:37
Member

@SunMarc SunMarc left a comment


Thanks! Maybe we can change the name to Metal instead of Mlx, as it can create confusion? In the future, we might use mlx if we add compatibility with mlx models. Please add some e2e tests, plus tests that check we have the right dtype after quantization and dequantization.

Comment on lines +294 to +298
orig_dtype = value.dtype # e.g. bfloat16 for Llama
return {
target_key: w_packed,
scale_key: scales.to(orig_dtype),
bias_key: biases.to(orig_dtype),
Member


Fine, but I think _affine_quantize_tensor should return them in the right dtype already.

Contributor Author


Not sure about this: when we quantize, the scales keep the same dtype as the weight before quantization, which is float32.
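The point under discussion can be shown in a small sketch: the quantization math runs in float32, so the scales come out as float32 even when the original weight was half precision, and the caller casts them back to the weight's dtype afterwards (as the diff snippet above does with scales.to(orig_dtype)). This is a hedged illustration only; numpy stands in for torch, and quantize_scales is a hypothetical helper, not the PR's _affine_quantize_tensor.

```python
# Hypothetical sketch of the dtype round-trip discussed in this thread.
import numpy as np

def quantize_scales(weight):
    # Compute in float32 regardless of the weight dtype, as the thread notes.
    w32 = weight.astype(np.float32)
    scale = (w32.max() - w32.min()) / 15.0  # 4-bit: 15 steps
    return np.asarray([scale], dtype=np.float32)

weight = np.linspace(-1.0, 1.0, 8, dtype=np.float16)  # e.g. a half-precision weight
orig_dtype = weight.dtype
scales = quantize_scales(weight)
assert scales.dtype == np.float32     # produced in float32 by the quantizer...
scales = scales.astype(orig_dtype)    # ...then cast back to the weight's dtype
assert scales.dtype == np.float16
```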

@MekkCyber force-pushed the add-mlx-quantization branch from 212b192 to 6ac192b on February 26, 2026 at 15:31
Member

@SunMarc SunMarc left a comment


Thanks a lot! Can you just update the overview docs to add this quantization method?

@SunMarc SunMarc changed the title [Quantization] Add mlx quantization for MPS devices! [Quantization] Add metal quantization for MPS devices! Feb 27, 2026
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: metal

@SunMarc SunMarc merged commit 9dd9076 into main Feb 27, 2026
26 checks passed
@SunMarc SunMarc deleted the add-mlx-quantization branch February 27, 2026 13:28
zvik pushed a commit to zvik/transformers that referenced this pull request Mar 1, 2026
…3934)

* first commit

* style

* fix

* fix

* mlx -> metal

* other fixes

* add tests

* fixes

* weight -> qweight

* fix

* tests

* fix style

* fix

* toctree

* some docs

* qweight -> weight

* fix dtype

* rm print

* overview

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>