[None][fix] Warn on experimental DeepSeek-V4 base checkpoints#14299

Merged
lfr-0531 merged 2 commits into NVIDIA:feat/deepseek_v4 from lfr-0531:user/fanrongl/warn-dsv4-base
May 20, 2026

Conversation

@lfr-0531
Collaborator

@lfr-0531 lfr-0531 commented May 19, 2026

@coderabbitai summary

Description

DeepSeek-V4 Base checkpoints currently go through the same config path as the better-supported DeepSeek-V4 Instruct checkpoints, but Base support is still experimental. The Base checkpoints in /raid/model can be distinguished from Instruct checkpoints by the safetensors metadata of their routed MoE expert tensors: Instruct expert tensors use the existing MXFP4/NVFP4 layouts, while Base expert tensors use neither.

This PR adds a small DeepSeek-V4 Base checkpoint detector that reuses the routed MoE tensor metadata check and emits a warning when users load a Base checkpoint. The warning tells users that Base support is experimental and recommends using a DeepSeek-V4 Instruct checkpoint.
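The detection idea described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the tensor-name constants (`EXPERT_MARKER`, `WEIGHT_SUFFIX`, `SCALE_SUFFIX`), the function names, the warning text, and the layout rule are all assumptions made for the example. The rule assumes FP4 weights are packed two values per `U8` byte, that MXFP4 uses 32-element blocks with one-byte scales stored as `U8`, and that NVFP4 uses 16-element blocks with `F8_E4M3` scales. Only the safetensors header layout (an 8-byte little-endian length followed by a JSON header) is taken from the file format itself.

```python
import json
import struct
import warnings

# Hypothetical naming convention; real checkpoints may name tensors differently.
EXPERT_MARKER = ".experts."
WEIGHT_SUFFIX = ".weight"
SCALE_SUFFIX = ".weight_scale"


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The file starts with an unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def classify_expert(weight, scale):
    """Classify one routed expert weight from its header entries.

    Assumed rule (not taken from the PR): see the lead-in paragraph.
    """
    if scale is None or weight["dtype"] != "U8":
        return "other"
    in_features = weight["shape"][-1] * 2  # two FP4 values per packed byte
    blocks = scale["shape"][-1]
    if blocks == 0 or in_features % blocks:
        return "other"
    block_size = in_features // blocks
    if block_size == 32 and scale["dtype"] == "U8":
        return "mxfp4"
    if block_size == 16 and scale["dtype"] == "F8_E4M3":
        return "nvfp4"
    return "other"


def is_base_checkpoint(header):
    """True if routed expert tensors exist and none use an FP4 layout."""
    layouts = set()
    for name, meta in header.items():
        if name == "__metadata__":
            continue  # free-form string metadata, not a tensor entry
        if EXPERT_MARKER not in name or not name.endswith(WEIGHT_SUFFIX):
            continue
        scale = header.get(name[: -len(WEIGHT_SUFFIX)] + SCALE_SUFFIX)
        layouts.add(classify_expert(meta, scale))
    return bool(layouts) and layouts.isdisjoint({"mxfp4", "nvfp4"})


def warn_if_base(header):
    """Emit a warning for Base checkpoints; return whether one was detected."""
    if is_base_checkpoint(header):
        # Illustrative wording only; the PR's actual message may differ.
        warnings.warn(
            "DeepSeek-V4 Base checkpoint support is experimental; "
            "consider using a DeepSeek-V4 Instruct checkpoint."
        )
        return True
    return False
```

Under these assumptions, a caller would pass the result of `read_safetensors_header(shard_path)` for a shard that contains routed expert tensors to `warn_if_base`. Reading only the header keeps the check cheap, since no tensor data is loaded.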

Test Coverage

Added unit coverage for DeepSeek-V4 routed expert metadata detection:

  • MXFP4 metadata is detected as a non-Base checkpoint.
  • NVFP4 metadata is detected as a non-Base checkpoint.
  • FP8 routed expert metadata is detected as a Base checkpoint.

Validation run locally:

  • PATH=/usr/bin:$PATH pre-commit run --files tensorrt_llm/_torch/model_config.py tests/unittest/_torch/test_model_config.py
  • /usr/bin/python3.12 -m py_compile tensorrt_llm/_torch/model_config.py tests/unittest/_torch/test_model_config.py

Not run successfully:

  • PYTHONPATH=$PWD pytest tests/unittest/_torch/test_model_config.py -k deepseek_v4_base_checkpoint_detection fails during test bootstrap because the local environment is missing nvtx.

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Fanrong Li <lfr-0531@users.noreply.github.com>
@lfr-0531 lfr-0531 requested a review from a team as a code owner May 19, 2026 08:32
@lfr-0531 lfr-0531 requested review from HuiGao-NV and leslie-fang25 and removed request for a team May 19, 2026 08:32
@lfr-0531 lfr-0531 requested review from Barry-Delaney and baize97 and removed request for HuiGao-NV and leslie-fang25 May 19, 2026 08:33
@lfr-0531
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #49153 [ run ] triggered by Bot. Commit: 83d0dd9 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #49153 [ run ] completed with state SUCCESS. Commit: 83d0dd9
/LLM/main/L0_MergeRequest_PR pipeline #38834 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis


@lfr-0531
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #49296 [ run ] triggered by Bot. Commit: aaadd6f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #49296 [ run ] completed with state SUCCESS. Commit: aaadd6f
/LLM/main/L0_MergeRequest_PR pipeline #38960 completed with status: 'SUCCESS'

CI Report


@lfr-0531 lfr-0531 merged commit 5d0a30e into NVIDIA:feat/deepseek_v4 May 20, 2026
6 checks passed
lfr-0531 added a commit to lfr-0531/TensorRT-LLM that referenced this pull request May 29, 2026
…#14299)

Signed-off-by: Fanrong Li <lfr-0531@users.noreply.github.com>
Co-authored-by: Fanrong Li <lfr-0531@users.noreply.github.com>
(cherry picked from commit 5d0a30e)
Signed-off-by: Fanrong Li <lfr-0531@users.noreply.github.com>

3 participants