
Conversation


@yongming-qin yongming-qin commented Nov 18, 2025

Purpose

Fix an inconsistency in _CONFIG_REGISTRY:
deepseek_v32 was the only entry using an eager class import:

deepseek_v32 = DeepseekV3Config

while all other entries use string-based lazy loading. This caused an unnecessary eager import and broke environments with older Transformers versions.

Fix:

  • Change to "DeepseekV3Config"
  • Move the import to vllm/transformers_utils/configs/__init__.py so LazyConfigDict can resolve it
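The lazy-resolution mechanism this fix relies on can be sketched with a minimal stand-in. Note this LazyConfigDict is a simplified illustration of the behavior described above, not vLLM's actual class, and the collections/OrderedDict names are placeholders for the configs module and config class:

```python
import importlib

class LazyConfigDict(dict):
    """Simplified stand-in: string values are resolved to classes on first
    access instead of being imported eagerly when the registry is defined."""

    def __init__(self, module_name, **entries):
        super().__init__(entries)
        self._module_name = module_name

    def __getitem__(self, key):
        value = super().__getitem__(key)
        if isinstance(value, str):
            # Import happens only here, on first lookup.
            module = importlib.import_module(self._module_name)
            # Fails with AttributeError if the module does not export the name.
            value = getattr(module, value)
        return value

# A string entry resolves lazily against the named module:
registry = LazyConfigDict("collections", example="OrderedDict")
print(registry["example"].__name__)  # -> OrderedDict

# A name the module does not export raises AttributeError at lookup time,
# which is why the class must be re-exported from the configs package:
broken = LazyConfigDict("collections", bad="NotExported")
try:
    broken["bad"]
except AttributeError:
    print("unresolved name")
```

This is why the second bullet matters: the string is only useful if the configs package actually exposes an attribute with that name.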

Before:

error message


Test Plan

1. Verify Transformers does not know deepseek_v32:

from transformers.models.auto.configuration_auto import CONFIG_MAPPING
[k for k in CONFIG_MAPPING if "deepseek" in k.lower()]

Result:
['deepseek_v2', 'deepseek_v3', 'deepseek_vl', 'deepseek_vl_hybrid']

2. Verify vLLM now resolves it lazily:

from vllm.transformers_utils.config import get_config
get_config("deepseek-ai/DeepSeek-V3.2-Exp-Base", trust_remote_code=False)

Produces the expected Transformers warning, confirming correct mapping.


Test Result

  • vLLM version: 0.11.1rc7.dev217
  • Transformers: 4.57.1
  • Before: ImportError on older Transformers versions
  • After: Lazy loading works; deepseek_v32 → DeepseekV3Config resolves correctly

Signed-off-by: Luke <yq0536@gmail.com>
@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small, essential subset of tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@mergify mergify bot added the deepseek Related to DeepSeek models label Nov 18, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to fix an inconsistency in _CONFIG_REGISTRY by using a string reference for DeepseekV3Config to enable lazy loading. While the intention is correct and addresses a potential issue with eager imports, the current implementation is incomplete and will cause a runtime AttributeError. This highlights the importance of testing even for seemingly small changes. My review includes a critical comment that explains why the change will fail and provides guidance on how to correctly implement the lazy loading for this configuration. The fix involves a small change in a different file to make DeepseekV3Config available to the lazy loading mechanism.

  chatglm="ChatGLMConfig",
  deepseek_vl_v2="DeepseekVLV2Config",
- deepseek_v32=DeepseekV3Config,
+ deepseek_v32="DeepseekV3Config",
Contributor


critical

This change to use a string reference for lazy loading is a good idea, but it will cause a runtime AttributeError as implemented.

The LazyConfigDict expects to find the config class within the vllm.transformers_utils.configs module when a string is provided. However, DeepseekV3Config is imported from transformers and is not exposed in that module after your change to remove the direct import in this file.

To fix this, you need to make DeepseekV3Config available in vllm.transformers_utils.configs. You can achieve this by adding from transformers import DeepseekV3Config to vllm/transformers_utils/configs/__init__.py and including "DeepseekV3Config" in its __all__ list. This will preserve the lazy loading behavior, as vllm.transformers_utils.configs is only imported when a config is first accessed from the registry.
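The suggested edit can be sketched as follows. This is an assumed shape for the file, showing only the relevant lines; the existing exports are elided:

```python
# vllm/transformers_utils/configs/__init__.py (sketch of the suggested edit)

# Re-export so LazyConfigDict can resolve the string "DeepseekV3Config".
# This import only runs when the configs package itself is first imported,
# so the registry's lazy-loading behavior is preserved.
from transformers import DeepseekV3Config

__all__ = [
    # ... existing exports ...
    "DeepseekV3Config",
]
```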

Author


Makes sense to me. I have added from transformers import DeepseekV3Config in vllm/transformers_utils/configs/__init__.py and added a comment explaining this case.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 81 to 84
  chatglm="ChatGLMConfig",
  deepseek_vl_v2="DeepseekVLV2Config",
- deepseek_v32=DeepseekV3Config,
+ deepseek_v32="DeepseekV3Config",
  flex_olmo="FlexOlmoConfig",


P1: DeepseekV3 configs now fail to resolve

The registry now stores deepseek_v32 as the string "DeepseekV3Config", but LazyConfigDict.__getitem__ only resolves strings from vllm.transformers_utils.configs. That package does not export DeepseekV3Config, so any model with model_type deepseek_v32 will raise AttributeError when loading the config, even on transformers versions that provide the class. Previously the config loaded correctly when the class was available; after this change Deepseek V3 models cannot be parsed at all.


Author


Solved in the reply to gemini-code-assist[bot]
https://github.com/vllm-project/vllm/pull/28958/files#r2539426365

@yongming-qin
Author

Checked the AI suggestion and updated the PR description.

Signed-off-by: Luke <yq0536@gmail.com>