Generalizes Qwen3's partial RoPE implementation into a reusable PartialRotaryEmbedding layer. #3282

Merged
copybara-service[bot] merged 1 commit into main from agagik-partial-rope
Mar 4, 2026

Collaborator

@gagika gagika commented Mar 1, 2026

Description

Generalizes Qwen3's partial RoPE implementation into a reusable PartialRotaryEmbedding layer.

Previously, partial RoPE was implemented specifically for Qwen3 as Qwen3NextRotaryEmbedding. This PR refactors it to be model-agnostic so that other architectures can adopt partial RoPE.

Tests

  • Unit Tests: Added a dedicated unit test suite at tests/unit/partial_rotary_embedding_test.py that verifies:
    • partial_rotary_factor=0.5 correctly rotates the first half of the hidden dimension and passes the second half through unmodified.
    • partial_rotary_factor=1.0 behaves identically to the base RotaryEmbedding.
    • Shift invariance is maintained.
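
The three properties above can be illustrated with a minimal pure-Python sketch. This is not the MaxText implementation: the function names `rotate`, `partial_rotate`, and `dot`, the half-split pairing of dimensions, and the default base of 10000 are all assumptions made for illustration, and the real layer operates on batched arrays rather than single vectors.

```python
import math


def rotate(vec, position, base=10000.0):
    """Apply RoPE to one even-length vector (hypothetical helper).

    Assumes the half-split pairing: element i is rotated together
    with element i + dim/2. The real layer may pair dimensions differently.
    """
    dim = len(vec)
    if dim % 2:
        raise ValueError("rotary dimension must be even")
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        # Each pair gets its own frequency, scaled by the token position.
        theta = position * base ** (-2.0 * i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * cos_t - x2 * sin_t
        out[i + half] = x2 * cos_t + x1 * sin_t
    return out


def partial_rotate(vec, position, partial_rotary_factor=1.0, base=10000.0):
    """Rotate only the leading fraction of the vector; pass the rest through."""
    rotary_dim = int(len(vec) * partial_rotary_factor)
    return rotate(vec[:rotary_dim], position, base) + list(vec[rotary_dim:])


def dot(a, b):
    """Plain dot product, used to check shift invariance of attention scores."""
    return sum(x * y for x, y in zip(a, b))


q = [0.1 * (i + 1) for i in range(8)]
k = [0.2 * (i + 1) for i in range(8)]

# With a factor of 0.5, only the first four of eight elements are rotated.
half_rotated = partial_rotate(q, position=3, partial_rotary_factor=0.5)

# The query-key score should depend only on the position difference (3 here).
score_a = dot(partial_rotate(q, 5, 0.5), partial_rotate(k, 2, 0.5))
score_b = dot(partial_rotate(q, 13, 0.5), partial_rotate(k, 10, 0.5))
```

In this sketch the pass-through tail contributes a position-independent term to the score, which is why shift invariance still holds when only part of the vector is rotated.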

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@gagika gagika force-pushed the agagik-partial-rope branch from 8e267f0 to 5d3a1f8 March 1, 2026 06:11

codecov Bot commented Mar 1, 2026

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/maxtext/layers/attentions.py | 0.00% | 1 Missing ⚠️ |

@gagika gagika changed the title from "Rename Qwen3NextRotaryEmbedding to PartialRotaryEmbedding more reusable" to "Generalizes Qwen3's partial RoPE implementation into a reusable PartialRotaryEmbedding layer." Mar 1, 2026
@gagika gagika marked this pull request as ready for review March 1, 2026 06:45
github-actions Bot commented Mar 1, 2026

🤖 Hi @gagika, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions Bot commented Mar 1, 2026

🤖 I'm sorry @gagika, but I was unable to process your request. Please see the logs for more details.

@gagika gagika force-pushed the agagik-partial-rope branch from 5d3a1f8 to 174c655 March 3, 2026 03:06
Collaborator

@shuningjin shuningjin left a comment

LGTM

Collaborator

@parambole parambole left a comment

LGTM

@gagika gagika force-pushed the agagik-partial-rope branch from 174c655 to 9fc7f95 March 4, 2026 05:57
@copybara-service copybara-service Bot merged commit 441bc95 into main Mar 4, 2026
37 of 39 checks passed
@copybara-service copybara-service Bot deleted the agagik-partial-rope branch March 4, 2026 06:59
4 participants