Skip to content

[trainer] optimize use_logits_to_keep#9194

Merged
Jintao-Huang merged 3 commits into
modelscope:mainfrom
Jintao-Huang:optimize_use_logits_to_keep
Apr 23, 2026
Merged

[trainer] optimize use_logits_to_keep#9194
Jintao-Huang merged 3 commits into
modelscope:mainfrom
Jintao-Huang:optimize_use_logits_to_keep

Conversation

@Jintao-Huang
Copy link
Copy Markdown
Collaborator

No description provided.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a global version check for transformers >= 5.0.0 and refactors the logic in get_use_logits_to_keep to improve readability and handle multimodal model compatibility. The review feedback suggests caching the result of this logic in self.args.use_logits_to_keep to avoid redundant and expensive inspect.signature calls during training.

Comment thread swift/trainers/mixin.py
@Jintao-Huang
Copy link
Copy Markdown
Collaborator Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a global version check for transformers >= 5.0.0 and refactors the get_use_logits_to_keep logic within SwiftMixin to include more explicit conditional checks for multimodal models and the presence of the logits_to_keep parameter. Feedback suggests that the model should be unwrapped using unwrap_model before inspecting its signature to ensure compatibility with distributed training wrappers like DDP or FSDP. Additionally, it is recommended to use logger.info_once instead of logger.info to avoid duplicate log entries in distributed environments.

Comment thread swift/trainers/mixin.py Outdated
@Jintao-Huang Jintao-Huang merged commit 014f040 into modelscope:main Apr 23, 2026
1 of 3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants