Enable mixed scan strategy for models with Engram #3255

Merged
copybara-service[bot] merged 1 commit into main from engram_scan_clean on Mar 5, 2026

@RissyRan RissyRan (Collaborator) commented Feb 26, 2026

Description

Refactor and enable scan for the DeepSeek custom model with Engram layers (config link).

Because Engram layers can sit at arbitrary layer IDs, a single scan block over the whole model is hard to support, so the model is currently partitioned into segments, in layer order. For instance, suppose a model has 10 layers, where the first 5 are Dense, the last 5 are MoE, and Engram is at the 3rd and 9th layers (0-indexed layers 2 and 8). The segments are:

* Scanned - Dense layers 0-1
* Unscanned - Engram layer 2
* Scanned - Dense layers 3-4
* Scanned - MoE layers 5-7
* Unscanned - Engram layer 8
* Scanned - MoE layer 9
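
The partitioning above can be illustrated with a minimal, standalone sketch. This is not the code in `decoders.py`; `Segment`, `partition_layers`, and the parameter names are hypothetical and exist only to show how contiguous Dense or MoE runs could be grouped into scanned segments while each Engram layer stays as its own unscanned segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical description of one contiguous block of layers."""
    kind: str      # "dense", "moe", or "engram"
    start: int     # first layer index (inclusive)
    end: int       # last layer index (inclusive)
    scanned: bool  # True if the block would be run under a scan


def partition_layers(num_layers, num_dense_layers, engram_layer_ids):
    """Split layer indices 0..num_layers-1 into ordered segments.

    Assumption for this sketch: layers below num_dense_layers are Dense,
    the rest are MoE, and any index in engram_layer_ids is an Engram
    layer that must stay outside of a scan.
    """
    engram = set(engram_layer_ids)
    segments = []
    for idx in range(num_layers):
        if idx in engram:
            # Engram layers are emitted one by one and never merged.
            segments.append(Segment("engram", idx, idx, scanned=False))
            continue
        kind = "dense" if idx < num_dense_layers else "moe"
        prev = segments[-1] if segments else None
        if prev is not None and prev.scanned and prev.kind == kind:
            # Extend the current scanned run of the same layer type.
            segments[-1] = Segment(kind, prev.start, idx, scanned=True)
        else:
            # Start a new scanned run after an Engram layer or a type change.
            segments.append(Segment(kind, idx, idx, scanned=True))
    return segments


if __name__ == "__main__":
    for seg in partition_layers(10, 5, [2, 8]):
        mode = "Scanned" if seg.scanned else "Unscanned"
        print(f"{mode} - {seg.kind} layers {seg.start}-{seg.end}")
```

For the 10-layer example, this yields six segments in the same order as the list above. A run is only ever extended when the previous segment is a scanned one of the same type, so an Engram layer or a Dense-to-MoE boundary always starts a new segment.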

Tests

  • Added a unit test, deepseek_scan_engram_test.py, which is expected to pass.
  • Unscanned vs. scanned (deepseek-custom, max_target_length=4096, per_device_batch_size=16 on v5p; link): clear performance gains and a reduced memory footprint.
  • No impact on existing DeepSeek-related models: DeepSeek v2 with scan shows no change before vs. after this PR (link).

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@RissyRan RissyRan changed the title [Draft] Engram scan clean [Draft] Support scan for Engram layers Feb 26, 2026
@RissyRan RissyRan changed the title [Draft] Support scan for Engram layers [Draft] Enable mixed scan strategies for models using Engram layers Feb 26, 2026
@RissyRan RissyRan changed the title [Draft] Enable mixed scan strategies for models using Engram layers [Draft] Enable mixed scan strategy for models using Engram layers Feb 26, 2026
@RissyRan RissyRan changed the title [Draft] Enable mixed scan strategy for models using Engram layers [Draft] Enable mixed scan strategy for models with Engram Feb 26, 2026

codecov Bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 92.30769% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/maxtext/layers/decoders.py | 92.30% | 1 Missing and 2 partials ⚠️ |

@RissyRan RissyRan force-pushed the engram_scan_clean branch 3 times, most recently from 3c2e251 to 3654d6d Compare February 26, 2026 23:41
@RissyRan RissyRan marked this pull request as ready for review February 26, 2026 23:41
@RissyRan RissyRan changed the title [Draft] Enable mixed scan strategy for models with Engram Enable mixed scan strategy for models with Engram Feb 26, 2026
Comment thread src/maxtext/layers/decoders.py Outdated
@AI-Hypercomputer AI-Hypercomputer deleted a comment from github-actions Bot Mar 3, 2026
@AI-Hypercomputer AI-Hypercomputer deleted a comment from github-actions Bot Mar 3, 2026
@AI-Hypercomputer AI-Hypercomputer deleted a comment from github-actions Bot Mar 3, 2026
@AI-Hypercomputer AI-Hypercomputer deleted a comment from github-actions Bot Mar 3, 2026
@RissyRan RissyRan force-pushed the engram_scan_clean branch from b7d17e0 to c53574e Compare March 3, 2026 23:04
github-actions Bot commented Mar 3, 2026

🤖 Hi @RissyRan, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

github-actions Bot commented Mar 3, 2026

🤖 I'm sorry @RissyRan, but I was unable to process your request. Please see the logs for more details.

@RissyRan RissyRan force-pushed the engram_scan_clean branch 2 times, most recently from c8c3ba0 to 9cfd551 Compare March 3, 2026 23:47
@gagika gagika (Collaborator) left a comment

The logic looks good to me.
Could you try doing this through a helper function, with some code reuse?

Comment thread src/maxtext/layers/decoders.py
Comment thread src/maxtext/layers/decoders.py
@RissyRan RissyRan force-pushed the engram_scan_clean branch 3 times, most recently from 3da6bda to 33d4d22 Compare March 5, 2026 04:57
@gobbleturk gobbleturk (Collaborator) left a comment

Thanks Ran!

Comment thread src/maxtext/layers/decoders.py
@RissyRan RissyRan force-pushed the engram_scan_clean branch from 33d4d22 to 6c94aa2 Compare March 5, 2026 18:14
@RissyRan RissyRan force-pushed the engram_scan_clean branch from 6c94aa2 to 88afe99 Compare March 5, 2026 19:05
@copybara-service copybara-service Bot merged commit d14f70d into main Mar 5, 2026
54 checks passed
@copybara-service copybara-service Bot deleted the engram_scan_clean branch March 5, 2026 23:24
@shuningjin shuningjin mentioned this pull request Mar 6, 2026
4 tasks