[model] support gemma4 padding_free #88

Merged
Jintao-Huang merged 3 commits into modelscope:main from Jintao-Huang:support_gemma4_padding_free on May 21, 2026

Conversation

@Jintao-Huang
Collaborator

No description provided.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors the Gemma4 configuration and model initialization to support bidirectional attention in vision contexts, and adds a null check for the attention mask in the forward pass. A critical issue was identified in the Gemma4TextGPTModel constructor: sliding window attention parameters were being cleared for all non-vision models, which would disable a core architectural feature. A code suggestion was provided so that these parameters are reset only when the custom bidirectional mask is active with the 'unfused' backend.
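The condition described in this review can be illustrated with a minimal sketch. It is not the code from this pull request: the helper `resolve_window_size`, the attribute name `bidirectional_attention`, and the backend string `"flash"` are assumptions chosen for illustration. Only the 'vision' setting and the 'unfused' backend come from the review text.

```python
from types import SimpleNamespace


def resolve_window_size(config, attention_backend):
    """Return the sliding window to use, or None when it must be cleared.

    Hypothetical helper: attribute names are illustrative and are not
    taken from the mcore_bridge source.
    """
    bidirectional = getattr(config, "bidirectional_attention", None)
    needs_custom_mask = bidirectional == "vision"
    # Clear the window only when the custom bidirectional mask is active
    # AND the backend is 'unfused'. Every other combination keeps it.
    if needs_custom_mask and attention_backend == "unfused":
        return None
    return getattr(config, "window_size", None)


# Illustrative configs (values are made up for the example).
text_only = SimpleNamespace(bidirectional_attention=None, window_size=(1024, 0))
vision = SimpleNamespace(bidirectional_attention="vision", window_size=(1024, 0))
```

With this shape, a text-only configuration keeps its sliding window regardless of backend, which is the behavior the review asked to preserve.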

Comment thread src/mcore_bridge/model/mm_gpts/gemma4.py Outdated
@Jintao-Huang
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the configuration and initialization logic for the gemma4 model. It removes a conditional check in the configuration parser so that window size parameters are set unconditionally, and it adds handling for the 'vision' bidirectional attention setting in Gemma4TextGPTModel, including backend compatibility checks. The review feedback recommends defensive programming: add null checks for window_size and layer_types in the parser to prevent TypeErrors, and use getattr when accessing configuration attributes to avoid potential AttributeErrors.
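The defensive pattern recommended here might look like the sketch below. It is an illustration under stated assumptions, not the parser from this repository: the function `parse_window_settings`, the attribute name `sliding_window`, the output keys, and the `"sliding_attention"` layer type string are all hypothetical.

```python
from types import SimpleNamespace


def parse_window_settings(hf_config):
    """Collect window settings, skipping any that are missing or None.

    Hypothetical helper: names and output keys are illustrative only.
    """
    # getattr with a default avoids AttributeError on configs that lack
    # the attribute entirely.
    window_size = getattr(hf_config, "sliding_window", None)
    layer_types = getattr(hf_config, "layer_types", None)

    result = {}
    # Explicit None checks avoid TypeError from operating on a missing value.
    if window_size is not None:
        result["window_size"] = (window_size, 0)
    if layer_types is not None:
        result["sliding_layers"] = [t == "sliding_attention" for t in layer_types]
    return result


full = SimpleNamespace(
    sliding_window=512,
    layer_types=["sliding_attention", "full_attention"],
)
print(parse_window_settings(full))
print(parse_window_settings(SimpleNamespace()))
```

A config that omits both attributes yields an empty dictionary instead of raising, so callers can decide how to handle the missing values.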

Comment thread src/mcore_bridge/config/parser.py
Comment thread src/mcore_bridge/model/mm_gpts/gemma4.py
@Jintao-Huang Jintao-Huang merged commit 3706bf8 into modelscope:main May 21, 2026
1 check passed

3 participants