[bugfix] fix optimizer deepspeed#9173

Merged
Jintao-Huang merged 4 commits into modelscope:main from Jintao-Huang:fix_optimizer_deepspeed
Apr 22, 2026
Conversation

@Jintao-Huang (Collaborator)

No description provided.

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request refactors optimizer and scheduler creation by introducing an optimizer_callback and delegating work to HfTrainer static methods. It also adds a mechanism to filter out empty parameter groups during optimizer initialization. The review feedback notes that create_optimizer_and_scheduler should call the local create_optimizer so that the parameter-group filtering is actually applied, and suggests adding a null check on the optimizer for robustness.
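The two review suggestions can be sketched together in a minimal, self-contained way. All names below are hypothetical stand-ins (this is not the actual swift/trainers/mixin.py code): the point is that create_optimizer drops empty parameter groups, since some backends such as DeepSpeed reject them, and create_optimizer_and_scheduler calls the local create_optimizer behind a null check so the filtering is always applied.

```python
# Hedged sketch of the review feedback; class and attribute names are
# assumptions modeled loosely on the transformers Trainer API.

class TrainerSketch:
    def __init__(self, param_groups):
        self.param_groups = param_groups
        self.optimizer = None

    def create_optimizer(self):
        # Filter out empty parameter groups before building the optimizer;
        # an empty group can break optimizer/DeepSpeed initialization.
        groups = [g for g in self.param_groups if g.get("params")]
        # Stand-in for a real torch.optim optimizer instance.
        self.optimizer = {"param_groups": groups}
        return self.optimizer

    def create_optimizer_and_scheduler(self, num_training_steps):
        # Null check suggested in the review: only build the optimizer if
        # it does not already exist, via the *local* create_optimizer so
        # the empty-group filtering is applied.
        if self.optimizer is None:
            self.create_optimizer()
        # ...scheduler creation would follow here...


trainer = TrainerSketch([
    {"params": ["w1", "w2"], "weight_decay": 0.01},
    {"params": [], "weight_decay": 0.0},  # empty group, gets filtered out
])
trainer.create_optimizer_and_scheduler(num_training_steps=100)
print(len(trainer.optimizer["param_groups"]))  # 1
```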

Comment threads (2) on swift/trainers/mixin.py
@Jintao-Huang Jintao-Huang merged commit 00d6bbc into modelscope:main Apr 22, 2026
2 of 3 checks passed
Jintao-Huang added a commit that referenced this pull request Apr 23, 2026