Add Qwen 3.6 MoE model and switch CI to Qwen3.6-35B-A3B-HQQ-INT4 #18955
mergennachin merged 1 commit into main
Conversation
Qwen 3.6 MoE shares architecture and runner with Qwen 3.5 MoE. Add a stub README pointing to the existing qwen3_5_moe example. Update CI scripts and cuda.yml to use the Qwen 3.6 prequantized checkpoint. Improve qwen3_5_moe README: add quick-start section for prequantized weights, list available prequantized checkpoints, and clean up terminology.
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18955

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV: there is 1 currently active SEV. If your PR is affected, please view it below.

❌ 5 New Failures, 1 Cancelled Job, 3 Unrelated Failures as of commit 655fa02 with merge base 75ba558:

NEW FAILURES - The following jobs have failed:
CANCELLED JOB - The following job was cancelled. Please retry:
BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
Adds Qwen 3.6 MoE documentation and switches CUDA CI/export scripts to use the Qwen3.6-35B-A3B-HQQ-INT4 prequantized checkpoint, leveraging the existing Qwen 3.5 MoE runner/export pipeline.
Changes:
- Add a stub `qwen3_6_moe` README that points to the `qwen3_5_moe` example and links the Qwen 3.6 prequantized INT4 checkpoint.
- Update the CUDA workflow and CI scripts to use `SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4`.
- Improve the `qwen3_5_moe` README with a prequantized quick-start and clearer "prequantized checkpoint" terminology.
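The CI-script change in the list above is a model-ID mapping update. As a rough illustration only (the function and variable names below are hypothetical, not the actual contents of `.ci/scripts/export_model_artifact.sh`), the mapping from a supported model name to its Hugging Face checkpoint might look like:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the HF model-ID mapping the CI scripts update.
# resolve_checkpoint is an illustrative name; the real script may differ.
set -euo pipefail

resolve_checkpoint() {
  case "$1" in
    qwen3_6_moe)
      # New prequantized checkpoint used by CUDA CI after this PR
      echo "SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4"
      ;;
    *)
      echo "unsupported model: $1" >&2
      return 1
      ;;
  esac
}

resolve_checkpoint qwen3_6_moe
```

The point of keeping the mapping in one place is that switching CI from the Qwen 3.5 to the Qwen 3.6 checkpoint is a one-line change per script.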
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| examples/models/qwen3_6_moe/README.md | Adds a minimal pointer README for Qwen 3.6 MoE and links prequantized weights. |
| examples/models/qwen3_5_moe/README.md | Adds quick-start + clarifies prequantized export terminology and checkpoints list. |
| .github/workflows/cuda.yml | Switches CUDA CI model matrix and runner selection to Qwen 3.6 INT4 checkpoint. |
| .ci/scripts/test_model_e2e.sh | Updates supported HF model ID mapping to Qwen 3.6 checkpoint. |
| .ci/scripts/export_model_artifact.sh | Updates supported HF model ID mapping to Qwen 3.6 checkpoint for export. |
Diff context (`examples/models/qwen3_5_moe/README.md`):

- [SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4](https://huggingface.co/SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4)

```bash
python export.py --prequantized <path-to-bundle>
```
In the quick-start snippet, the placeholder still says <path-to-bundle>, but the README now consistently describes --prequantized as a prequantized checkpoint directory. Update the placeholder wording to match (e.g., <path-to-prequantized-checkpoint>), to avoid confusion about what path is expected.
Suggested change:
- python export.py --prequantized <path-to-bundle>
+ python export.py --prequantized <path-to-prequantized-checkpoint>
@@ -194,11 +194,11 @@ jobs:
# Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed
This comment still refers to “Qwen3.5 MoE”, but the matrix entry below is now Qwen3.6-35B-A3B-HQQ-INT4. Update the comment to reflect Qwen 3.6 (or both 3.5/3.6) so the exclusion rationale stays accurate.
Suggested change:
- # Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed
+ # Qwen3.6 MoE uses a prequantized checkpoint, only tile-packed
@@ -324,11 +324,11 @@ jobs:
# Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed
This exclusion-block comment still says “Qwen3.5 MoE uses a prequantized checkpoint”, but the excluded model is now Qwen3.6-35B-A3B-HQQ-INT4. Update the comment to match the new model name (or make it version-agnostic) to keep the workflow self-explanatory.
Suggested change:
- # Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed
+ # Qwen3.6-35B-A3B-HQQ-INT4 uses a prequantized checkpoint, only tile-packed