
Fix caching allocator warmup byte estimation for EP model loading #46149

Merged
ArthurZucker merged 1 commit into huggingface:main from sywangyi:ep_load_oom
May 25, 2026


@sywangyi
Contributor

@ArthurZucker @Cyrilvallez

In EP runs, the effective distributed plan is exposed through model.tp_plan, which switches to the EP plan when distributed_config.enable_expert_parallel is set. Because warmup bypassed that property and read model._tp_plan directly, it could overestimate local device memory and try to preallocate as if expert weights were not sharded.
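The mismatch described above can be sketched with a toy model. This is only an illustration, not the transformers implementation: `ToyModel`, `ToyDistributedConfig`, `_ep_plan`, `estimate_local_bytes`, the plan contents, and the byte sizes are all hypothetical. Only the names `tp_plan`, `_tp_plan`, and `distributed_config.enable_expert_parallel` come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDistributedConfig:
    # Hypothetical stand-in for the real distributed config object.
    enable_expert_parallel: bool = False


@dataclass
class ToyModel:
    # Hypothetical plans: keys are parameter-name prefixes that get sharded.
    _tp_plan: dict = field(default_factory=dict)
    _ep_plan: dict = field(default_factory=dict)
    distributed_config: ToyDistributedConfig = field(default_factory=ToyDistributedConfig)

    @property
    def tp_plan(self) -> dict:
        # The property returns the EP plan when expert parallelism is enabled.
        if self.distributed_config.enable_expert_parallel:
            return self._ep_plan
        return self._tp_plan


def estimate_local_bytes(param_bytes: dict, plan: dict, world_size: int) -> int:
    """Toy estimate: parameters matched by the plan are split across ranks."""
    total = 0
    for name, nbytes in param_bytes.items():
        sharded = any(name.startswith(prefix) for prefix in plan)
        total += nbytes // world_size if sharded else nbytes
    return total


model = ToyModel(
    _tp_plan={"attn": "colwise"},
    _ep_plan={"experts": "sharded"},
    distributed_config=ToyDistributedConfig(enable_expert_parallel=True),
)
param_bytes = {"experts.w": 800, "attn.q": 200}

# Reading the private attribute ignores the EP plan, so experts count in full.
bypassed = estimate_local_bytes(param_bytes, model._tp_plan, world_size=4)
# Going through the property picks up the EP plan, so experts count as sharded.
effective = estimate_local_bytes(param_bytes, model.tp_plan, world_size=4)
```

In this toy setup the estimate based on `_tp_plan` is larger than the one based on `tp_plan`, because the expert weights are treated as unsharded.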

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
@sywangyi
Contributor Author

Found this in a deepseek v4 flash EP run.

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=46149&sha=4b2923

@sywangyi
Contributor Author

@IlyasMoutawwakil

Collaborator

@ArthurZucker ArthurZucker left a comment


Nice catch! 🤗

@ArthurZucker ArthurZucker merged commit 7f2c8c9 into huggingface:main May 25, 2026
26 of 30 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

kaixuanliu pushed a commit to kaixuanliu/transformers that referenced this pull request May 28, 2026
yuchenxie4645 pushed a commit to yuchenxie4645/transformers that referenced this pull request May 28, 2026
kashif pushed a commit to kashif/transformers that referenced this pull request Jun 1, 2026