DeepSeek V3.2 user guide update #3565

Merged
copybara-service[bot] merged 1 commit into main from ds3.2-xlml-tests
Apr 11, 2026

Conversation

@snehalv2002
Collaborator

@snehalv2002 snehalv2002 commented Apr 3, 2026

Description

Updates the user guide for DeepSeek-V3.2: explains the new features and revises the instructions for multi-stage lightning indexer training and checkpoint conversion.

Tests

Verified maxtext/tests/end_to_end/tpu/deepseek/v3.2-671b/2_test_deepseek.sh before adding it to the XLML DAG.
Command: https://paste.googleplex.com/5274574397767680
Logs: https://cloudlogging.app.goo.gl/d9HUWQLzQa74HzfB7

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@github-actions

github-actions bot commented Apr 3, 2026

🤖 Hi @RissyRan, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

@github-actions github-actions bot left a comment

## 📋 Review Summary

This pull request updates the DeepSeek user guide to include instructions for the new DeepSeek-V3.2 model, specifically focusing on indexer training and checkpoint conversion. The updates are timely and provide clear steps for users to leverage the latest sparse attention features.

🔍 General Feedback

  • Consistency: Ensure that the model names (deepseek3.2-671b) and tokenizer paths (deepseek-ai/DeepSeek-V3.2) are consistent across all stages of the guide.
  • Syntax: Be careful with trailing backslashes in shell command examples, as they can cause errors if users copy-paste the last line.
  • Clarity: Using concrete example values (like 0.1 for scaling factors) is generally more user-friendly than placeholders in curly braces.
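
To illustrate the syntax point above, here is a minimal sketch, assuming a POSIX `sh`. The `echo` commands and `next-command` are placeholders chosen for the demonstration, not commands or flags from the guide. It shows how a trailing backslash on the last line of a copied command joins it with the line that follows.

```shell
# Placeholder commands only: a trailing backslash continues the line,
# so the next line is swallowed as extra arguments to the first command.
joined=$(sh -c 'echo model_name=deepseek3.2-671b \
echo next-command')

# Without the trailing backslash, the two lines run as separate commands.
separate=$(sh -c 'echo model_name=deepseek3.2-671b
echo next-command')

printf 'with trailing backslash:\n%s\n' "$joined"
printf 'without trailing backslash:\n%s\n' "$separate"
```

In the first case the second `echo` never runs as its own command, which is the kind of copy-paste failure the feedback warns about.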

Collaborator

@RissyRan RissyRan left a comment

Thanks for your 1st PR!!!

One more thing, could you update the PR description to follow our default template? One example: here

Collaborator

@shuningjin shuningjin left a comment

Thanks for the update! Might be good to organize deepseek3.2 into a self-contained section for clarity, and add more explanation on continued pre-training for indexer.

@snehalv2002 force-pushed the ds3.2-xlml-tests branch 3 times, most recently from 42d4e55 to 931dd56, on April 10, 2026 17:45
Rohan-Bierneni previously approved these changes Apr 10, 2026
Collaborator

@Rohan-Bierneni Rohan-Bierneni left a comment

LGTM. Thanks for fixing this issue!

@Rohan-Bierneni Rohan-Bierneni dismissed their stale review April 10, 2026 18:05

Approved the wrong PR by accident. Sorry for the confusion!

Collaborator

@RissyRan RissyRan left a comment

Please address minor comments.

I think the decoding case is clear. Shall we add an announcement here too?

cc @shuningjin @Rohan-Bierneni

@snehalv2002 force-pushed the ds3.2-xlml-tests branch 5 times, most recently from 0fc326c to 6913d72, on April 10, 2026 22:26
Collaborator

@shuningjin shuningjin left a comment

Thanks for the update! Left some minor comments.

@snehalv2002 force-pushed the ds3.2-xlml-tests branch 2 times, most recently from 419ab01 to 5cebdab, on April 10, 2026 23:55
Collaborator

@shuningjin shuningjin left a comment

Great work!!

@copybara-service copybara-service bot merged commit 2fcaeb8 into main Apr 11, 2026
35 checks passed
@copybara-service copybara-service bot deleted the ds3.2-xlml-tests branch April 11, 2026 01:08

5 participants