Allow specifying trainer tensor parallelism in multislice RL#3067

Merged
copybara-service[bot] merged 1 commit into main from xfgu-rl-sharding on Feb 3, 2026

Conversation

@xuefgu (Collaborator) commented Feb 2, 2026

Description

Allow specifying trainer tensor parallelism in multislice RL

FIXES: b/480979614
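
To illustrate the idea behind this change: in multislice training, each slice's chips can be split between data-parallel and tensor-parallel axes, and this PR lets the RL trainer's tensor parallelism be set explicitly rather than inherited. The sketch below is hypothetical and not the actual MaxText code; the function name and parameters (`trainer_tensor_parallelism`, `chips_per_slice`) are illustrative only.

```python
# Hypothetical sketch of how a trainer-specific tensor-parallelism setting
# might be folded into a device mesh shape for multislice training.
# This is NOT the MaxText implementation; names are made up for illustration.

def trainer_mesh_shape(num_slices: int, chips_per_slice: int,
                       trainer_tensor_parallelism: int) -> tuple[int, int, int]:
  """Split each slice's chips into data-parallel and tensor-parallel axes."""
  if chips_per_slice % trainer_tensor_parallelism != 0:
    raise ValueError("tensor parallelism must evenly divide chips per slice")
  data_parallel = chips_per_slice // trainer_tensor_parallelism
  # (data parallelism across slices, data parallelism within a slice,
  #  tensor parallelism within a slice)
  return (num_slices, data_parallel, trainer_tensor_parallelism)

print(trainer_mesh_shape(2, 8, 4))  # → (2, 2, 4)
```

With 2 slices of 8 chips each and tensor parallelism set to 4, each slice keeps 2-way data parallelism within the slice, on top of 2-way data parallelism across slices.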

Tests

Manually tested.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@xuefgu force-pushed the xfgu-rl-sharding branch 3 times, most recently from 00678c6 to 744d788, on February 2, 2026 at 22:46

codecov Bot commented Feb 3, 2026

Codecov Report

❌ Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.

File with missing lines: src/MaxText/rl/train_rl.py — patch coverage 0.00%, 9 lines missing ⚠️


A review comment thread was opened on src/MaxText/rl/train_rl.py.
copybara-service Bot merged commit 1a44692 into main on Feb 3, 2026
28 of 30 checks passed
copybara-service Bot deleted the xfgu-rl-sharding branch on February 3, 2026 at 20:13

4 participants