Add Qwen 3.6 MoE model and switch CI to Qwen3.6-35B-A3B-HQQ-INT4 #18955

Merged
mergennachin merged 1 commit into main from qwen3_6
Apr 17, 2026

Conversation

@mergennachin
Contributor

Qwen 3.6 MoE shares architecture and runner with Qwen 3.5 MoE. Add a stub README pointing to the existing qwen3_5_moe example. Update CI scripts and cuda.yml to use the Qwen 3.6 prequantized checkpoint. Improve qwen3_5_moe README: add quick-start section for prequantized weights, list available prequantized checkpoints, and clean up terminology.

Copilot AI review requested due to automatic review settings April 16, 2026 21:21
@mergennachin mergennachin requested a review from lucylq as a code owner April 16, 2026 21:21
@pytorch-bot

pytorch-bot Bot commented Apr 16, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18955

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 5 New Failures, 1 Cancelled Job, 3 Unrelated Failures

As of commit 655fa02 with merge base 75ba558 (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label Apr 16, 2026
@github-actions

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with `release notes:`. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor

Copilot AI left a comment

Pull request overview

Adds Qwen 3.6 MoE documentation and switches CUDA CI/export scripts to use the Qwen3.6-35B-A3B-HQQ-INT4 prequantized checkpoint, leveraging the existing Qwen 3.5 MoE runner/export pipeline.

Changes:

  • Add a stub qwen3_6_moe README that points to the qwen3_5_moe example and links the Qwen 3.6 prequantized INT4 checkpoint.
  • Update CUDA workflow + CI scripts to use SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4.
  • Improve qwen3_5_moe README with a prequantized quick-start and clearer “prequantized checkpoint” terminology.
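
For context on the last bullet, the prequantized quick-start comes down to pointing `export.py --prequantized` at a local checkpoint directory. Below is a minimal sketch of a guard around that call. Only the `export.py --prequantized <dir>` form comes from the README snippet under review; the helper name `build_export_cmd` and the directory check are assumptions for illustration, and the helper prints the command rather than running it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): verify that the argument
# is an existing directory, then print the quick-start export command for it.
build_export_cmd() {
  ckpt_dir="$1"
  if [ ! -d "$ckpt_dir" ]; then
    # The README describes --prequantized as taking a checkpoint directory,
    # so anything else is rejected before the export step would start.
    echo "Not a directory: $ckpt_dir" >&2
    return 1
  fi
  printf 'python export.py --prequantized %s\n' "$ckpt_dir"
}
```

Checking the path up front is a cheap way to surface a wrong or missing checkpoint location before a long export begins.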

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| examples/models/qwen3_6_moe/README.md | Adds a minimal pointer README for Qwen 3.6 MoE and links prequantized weights. |
| examples/models/qwen3_5_moe/README.md | Adds quick-start + clarifies prequantized export terminology and checkpoints list. |
| .github/workflows/cuda.yml | Switches CUDA CI model matrix and runner selection to Qwen 3.6 INT4 checkpoint. |
| .ci/scripts/test_model_e2e.sh | Updates supported HF model ID mapping to Qwen 3.6 checkpoint. |
| .ci/scripts/export_model_artifact.sh | Updates supported HF model ID mapping to Qwen 3.6 checkpoint for export. |
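
The two CI script rows describe updating a lookup from a Hugging Face model ID to the example that handles it. Below is a minimal sketch of that pattern, assuming a `case`-based mapping. The function name `resolve_example_dir` and the error handling are hypothetical and not taken from the actual scripts; only the model ID and the reuse of the qwen3_5_moe example come from this PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a Hugging Face model ID to the example directory
# whose runner/export pipeline handles it. Names are illustrative only.
resolve_example_dir() {
  hf_model="$1"
  case "$hf_model" in
    SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4)
      # Qwen 3.6 MoE shares architecture and runner with Qwen 3.5 MoE,
      # so it resolves to the existing qwen3_5_moe example.
      echo "examples/models/qwen3_5_moe"
      ;;
    *)
      echo "Unsupported model: $hf_model" >&2
      return 1
      ;;
  esac
}
```

Keeping the mapping in one function means a future checkpoint swap touches a single `case` arm per script.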

- [SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4](https://huggingface.co/SocialLocalMobile/Qwen3.6-35B-A3B-HQQ-INT4)

```bash
python export.py --prequantized <path-to-bundle>
```

Copilot AI Apr 16, 2026

In the quick-start snippet, the placeholder still says <path-to-bundle>, but the README now consistently describes --prequantized as a prequantized checkpoint directory. Update the placeholder wording to match (e.g., <path-to-prequantized-checkpoint>), to avoid confusion about what path is expected.

Suggested change
- python export.py --prequantized <path-to-bundle>
+ python export.py --prequantized <path-to-prequantized-checkpoint>

@@ -194,11 +194,11 @@ jobs:
# Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed

Copilot AI Apr 16, 2026

This comment still refers to “Qwen3.5 MoE”, but the matrix entry below is now Qwen3.6-35B-A3B-HQQ-INT4. Update the comment to reflect Qwen 3.6 (or both 3.5/3.6) so the exclusion rationale stays accurate.

Suggested change
- # Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed
+ # Qwen3.6 MoE uses a prequantized checkpoint, only tile-packed

@@ -324,11 +324,11 @@ jobs:
# Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed

Copilot AI Apr 16, 2026

This exclusion-block comment still says “Qwen3.5 MoE uses a prequantized checkpoint”, but the excluded model is now Qwen3.6-35B-A3B-HQQ-INT4. Update the comment to match the new model name (or make it version-agnostic) to keep the workflow self-explanatory.

Suggested change
- # Qwen3.5 MoE uses a prequantized checkpoint, only tile-packed
+ # Qwen3.6-35B-A3B-HQQ-INT4 uses a prequantized checkpoint, only tile-packed

@mergennachin mergennachin temporarily deployed to upload-benchmark-results April 16, 2026 22:26 — with GitHub Actions Inactive
@mergennachin mergennachin merged commit 7fdd306 into main Apr 17, 2026
380 of 391 checks passed
@mergennachin mergennachin deleted the qwen3_6 branch April 17, 2026 00:02
Labels

CLA Signed

3 participants