-
-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[Bugfix][Model] Fix inference for Hunyuan dense models #25354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: anion <1005128408@qq.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly fixes a regression that prevented dense Hunyuan v1 models from running. The fix involves refactoring the model classes to better separate the logic for dense and Mixture-of-Experts (MoE) variants, introducing a common base class HunyuanV1ModelBase
. While the fix is correct, there is significant code duplication in the __init__
methods of HunYuanDenseV1Base
and HunYuanMoEV1Base
. I've provided suggestions to move the common initialization logic to the base class to improve maintainability and reduce redundancy.
Signed-off-by: anion <1005128408@qq.com>
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively addresses a regression that prevented inference on Hunyuan v1 dense models. The refactoring of the class hierarchy, by introducing HunyuanV1ModelBase
for common logic and creating separate HunYuanMoEV1Base
and HunYuanDenseV1Base
for MoE and dense variants respectively, is a clean and correct approach. This properly isolates the MoE-specific logic that was causing runtime errors for dense models. The changes are well-structured and directly fix the reported issue. I have no further recommendations.
Hi @abmfy, I noticed that you have reviewed related modifications and familiar with this area. Would you mind helping to review this? Let me know if you have any further suggestions for this PR.Thanks :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for fixing this issue! @simon-mo
Hi @abmfy |
Signed-off-by: Anion <123177548+Anionex@users.noreply.github.com>
@jeejeelee @Isotr0py It seems the other reviewers are quite busy. Could you please help merge this PR? Thank you! |
…25354) Signed-off-by: anion <1005128408@qq.com> Signed-off-by: Anion <123177548+Anionex@users.noreply.github.com>
Signed-off-by: anion <1005128408@qq.com> Signed-off-by: Anion <123177548+Anionex@users.noreply.github.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
Purpose
This PR fixes a regression that broke inference support for Hunyuan v1 dense models, such as Hunyuan-1.8B-Instruct.
After recent updates to the model support files, the logic no longer correctly handles the architecture of the dense variant, causing the inference process to fail. This functionality was working correctly in previous versions.
This pr restores the necessary logic to ensure that Hunyuan v1 dense models can be loaded and used for inference successfully.
Test Plan
export VLLM_USE_MODELSCOPE=True python3 -m vllm.entrypoints.openai.api_server --model Tencent-Hunyuan/Hunyuan-1.8B-Instruct --trust_remote_code
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md
andexamples
for a new model.