Voxtral Realtime: enable bf16 for Metal backend with quantization by pytorchbot · Pull Request #18358 · pytorch/executorch

pytorchbot · 2026-03-20T01:54:40Z

The Metal AOTI backend already handles bf16 correctly (fp32 attention
masks, fp32 RoPE upcast, dtype-agnostic KV caches and SDPA). Enable
--dtype bf16 as the default recipe for Metal CI and update all
documentation to recommend bf16 with fpa4w quantization.
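
To illustrate why the description calls out an fp32 RoPE upcast, here is a minimal sketch that simulates bf16 rounding in pure Python and applies it to a rotary angle. It is not the ExecuTorch or Metal AOTI implementation: `to_bf16`, `rope_angle`, the base of 10000, and the head dimension of 64 are all illustrative assumptions, and Python floats stand in for the higher-precision path.

```python
import math
import struct


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    bfloat16 keeps the top 16 bits of the float32 encoding, so it has
    only 8 significant bits. NaN and infinity are not handled here.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a bias so that truncating the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def rope_angle(position, pair_index, head_dim, base=10000.0, low_precision=False):
    """Rotary angle position * base^(-2i/d) for one dimension pair.

    `base` and `head_dim` are hypothetical values chosen for the example.
    """
    inv_freq = base ** (-2.0 * pair_index / head_dim)
    if low_precision:
        # Everything kept in bf16: position, frequency, and product.
        return to_bf16(to_bf16(float(position)) * to_bf16(inv_freq))
    # Upcast path: compute the angle at higher precision.
    return position * inv_freq


if __name__ == "__main__":
    # Between 512 and 1024, bf16 values are spaced 4 apart,
    # so position 1001 collapses onto 1000.
    print(to_bf16(1001.0))                               # 1000.0
    print(rope_angle(1001, 0, 64))                       # 1001.0
    print(rope_angle(1001, 0, 64, low_precision=True))   # 1000.0
```

In this sketch the bf16 angle at position 1001 is off by a full radian for the highest-frequency pair, which is the kind of error an fp32 upcast avoids while the rest of the model stays in bf16.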

(cherry picked from commit 202c6af)

pytorch-bot · 2026-03-20T01:54:44Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18358

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 143 Pending

As of commit aadf80e with merge base 8c0a60b:

NEW FAILURE - The following job has failed:

trunk / test-models-linux-aarch64 (vit, portable, linux.arm64.2xlarge) / linux-job (gh)
RuntimeError: Command docker exec -t db36f190250f5fbd28ba239233c49e88a9c7822c882b63aa3417c42d5c986b37 /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorchbot requested a review from lucylq as a code owner March 20, 2026 01:54

This was referenced Mar 20, 2026

[v1.2.0] Release Schedule and Tracker #17016 (Open)

Voxtral Realtime: enable bf16 for Metal backend with quantization #17845 (Merged)

meta-cla bot added the CLA Signed label Mar 20, 2026

manuelcandales approved these changes Mar 20, 2026

manuelcandales merged commit 6b7579b into release/1.2 Mar 20, 2026
374 of 379 checks passed

manuelcandales deleted the cherry-pick-17845-by-pytorch_bot_bot_ branch March 20, 2026 02:34

pytorchbot temporarily deployed to upload-benchmark-results March 20, 2026 02:59 — with GitHub Actions Inactive
