Document Qwen3.5 FLA patch for NPU support #9237
Merged
Conversation
Added detailed explanation of the Qwen3.5 FLA patch for NPU support, including its functionality and impact on the transformers library.
Contributor
Code Review
This pull request updates the documentation to include a detailed explanation of the Qwen3.5 linear attention patch for NPU support, outlining how it redirects Triton operators to MindSpeed implementations. A review comment identified a version mismatch for the 'flash-linear-attention' package in the documentation, which has been addressed with a correction.
- The patch mainly covers the **gated-delta-rule path of Qwen3.5 linear attention**;
- It is not equivalent to replacing the entire fla package with MindSpeed;
- For this path to take effect, make sure MindSpeed can be imported correctly in the current environment.
- Precision-alignment verified versions: torch 2.7.1 + MindSpeed 0.12.1 + flash-linear-attention 4.1.0 + triton-ascend 3.2.0 + transformers 5.2.0
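For illustration, the redirection described in the review summary can be pictured as an import-time monkeypatch that swaps only the gated-delta-rule kernel. The sketch below is not the actual patch from this PR; the MindSpeed entry point (`npu_chunk_gated_delta_rule`) and the exact fla symbol being replaced are assumptions used only to show the pattern.

```python
# Illustrative sketch only -- the real patch in this PR may differ.
# Assumptions: fla exposes a Triton-based chunked gated-delta-rule kernel, and
# MindSpeed provides an NPU equivalent with a compatible signature.
import importlib


def apply_qwen35_fla_npu_patch() -> bool:
    """Redirect the fla gated-delta-rule Triton kernel to a MindSpeed NPU kernel."""
    try:
        # Hypothetical MindSpeed entry point; the actual symbol name is not
        # given in this PR conversation.
        from mindspeed.ops import npu_chunk_gated_delta_rule
    except ImportError:
        # Without MindSpeed the patch is a no-op and fla keeps its Triton path.
        return False

    # Only the gated-delta-rule path is swapped; the rest of fla stays untouched.
    fla_ops = importlib.import_module("fla.ops.gated_delta_rule")
    fla_ops.chunk_gated_delta_rule = npu_chunk_gated_delta_rule
    return True
```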
Contributor
The version for flash-linear-attention is listed as 4.1.0, but the pull request description specifies 0.4.1. This discrepancy should be corrected to ensure accuracy for users setting up the environment.
Suggested change
- Precision-alignment verified versions: torch 2.7.1 + MindSpeed 0.12.1 + flash-linear-attention 4.1.0 + triton-ascend 3.2.0 + transformers 5.2.0
+ Precision-alignment verified versions: torch 2.7.1 + MindSpeed 0.12.1 + flash-linear-attention 0.4.1 + triton-ascend 3.2.0 + transformers 5.2.0
Collaborator
Please also add this to the English documentation, thanks.
Jintao-Huang
approved these changes
Apr 29, 2026
PR type
PR information
Verified versions: transformers==5.2.0, triton-ascend==3.2.0, flash-linear-attention==0.4.1, torch==2.7.1
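As a quick sanity check against the versions verified above, a small helper like the following can report whether the installed packages match. The version pins come from this PR; the check itself is just an illustrative snippet, and the distribution names (e.g. `triton-ascend`) are assumed to match the pip package names.

```python
# Check installed packages against the versions verified in this PR.
from importlib.metadata import version, PackageNotFoundError

EXPECTED = {
    "transformers": "5.2.0",
    "triton-ascend": "3.2.0",
    "flash-linear-attention": "0.4.1",
    "torch": "2.7.1",
}

for pkg, want in EXPECTED.items():
    try:
        got = version(pkg)
    except PackageNotFoundError:
        got = "not installed"
    status = "OK" if got == want else "MISMATCH"
    print(f"{pkg}: expected {want}, found {got} [{status}]")
```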