
Conversation

kiritoxkiriko
Contributor

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

With trl >= 0.20.0, saving the model when training PPO with DDP raises AttributeError: 'DistributedDataParallel' object has no attribute 'config'.

#5287 fixed this error when saving the model after training, but the same issue still triggers when saving a checkpoint during training.

Inside _save_checkpoint, trl calls create_model_card to build a model card. That function reads self.model.config, which does not exist on a DDP-wrapped model, so we need to unwrap the model first. See:

# part of trl's create_model_card function
  def create_model_card(
      self,
      model_name: Optional[str] = None,
      dataset_name: Optional[str] = None,
      tags: Union[str, list[str], None] = None,
  ):
      """
      Creates a draft of a model card using the information available to the `Trainer`.

      Args:
          model_name (`str` or `None`, *optional*, defaults to `None`):
              Name of the model.
          dataset_name (`str` or `None`, *optional*, defaults to `None`):
              Name of the dataset used for training.
          tags (`str`, `list[str]` or `None`, *optional*, defaults to `None`):
              Tags to be associated with the model card.
      """
      if not self.is_world_process_zero():
          return

      if hasattr(self.model.config, "_name_or_path") and not os.path.isdir(self.model.config._name_or_path):
          base_model = self.model.config._name_or_path
      else:
          base_model = None
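
For context, a DDP wrapper only resolves its own registered members (parameters, buffers, submodules); arbitrary Python attributes of the wrapped module, such as config, stay reachable only through .module. A self-contained illustration with a toy module and a single-process gloo group (the config dict here just stands in for a transformers config):

  import os

  import torch.distributed as dist
  import torch.nn as nn
  from torch.nn.parallel import DistributedDataParallel as DDP

  class TinyModel(nn.Module):
      def __init__(self):
          super().__init__()
          self.linear = nn.Linear(2, 2)
          self.config = {"_name_or_path": "tiny"}  # stands in for a transformers config

  # single-process "distributed" setup so DDP can be constructed on CPU
  os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
  os.environ.setdefault("MASTER_PORT", "29500")
  dist.init_process_group("gloo", rank=0, world_size=1)

  wrapped = DDP(TinyModel())
  print(hasattr(wrapped, "config"))  # False: nn.Module.__getattr__ raises AttributeError
  print(wrapped.module.config)       # works: the underlying module still has it

  dist.destroy_process_group()
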

This PR fixes the problem by unwrapping the DDP model before calling _save_checkpoint.
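
A minimal sketch of the idea, assuming the trainer subclass keeps the DDP wrapper in self.model during training, as the traceback suggests. HFPPOTrainer below is just trl's PPOTrainer; swift's real class layers additional mixins on top, so treat the structure as illustrative rather than the merged code:

  from torch.nn.parallel import DistributedDataParallel
  from trl import PPOTrainer as HFPPOTrainer

  class PPOTrainer(HFPPOTrainer):
      def _save_checkpoint(self, *args, **kwargs):
          # trl's _save_checkpoint calls create_model_card, which reads
          # `self.model.config`; swap in the underlying module so that
          # attribute lookup succeeds.
          wrapped_model = self.model
          if isinstance(wrapped_model, DistributedDataParallel):
              self.model = wrapped_model.module
          result = super()._save_checkpoint(*args, **kwargs)
          # Put the DDP wrapper back so training continues unchanged.
          self.model = wrapped_model
          return result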

Experiment results

Traceback before the fix, hit when saving a checkpoint during training:

[rank0]:   File "/app/.venv/lib/python3.10/site-packages/swift/llm/train/sft.py", line 235, in train
[rank0]:     trainer.train(trainer.args.resume_from_checkpoint)
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/ppo_trainer.py", line 62, in train
[rank0]:     super().train()
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/swift/trainers/mixin.py", line 676, in train
[rank0]:     res = super().train(*args, **kwargs)
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/trl/trainer/ppo_trainer.py", line 651, in train
[rank0]:     self._save_checkpoint(model, trial=None)
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/ppo_trainer.py", line 66, in _save_checkpoint
[rank0]:     return super()._save_checkpoint(*args, **kwargs)
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/swift/trainers/mixin.py", line 458, in _save_checkpoint
[rank0]:     result = super()._save_checkpoint(*args, **kwargs)
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/trl/trainer/ppo_trainer.py", line 757, in _save_checkpoint
[rank0]:     self.create_model_card(model_name=model_name)
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/trl/trainer/ppo_trainer.py", line 780, in create_model_card
[rank0]:     if hasattr(self.model.config, "_name_or_path") and not os.path.isdir(self.model.config._name_or_path):
[rank0]:   File "/app/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1940, in __getattr__
[rank0]:     raise AttributeError(
[rank0]: AttributeError: 'DistributedDataParallel' object has no attribute 'config'



@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @kiritoxkiriko, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug in the PPO trainer when utilizing Distributed Data Parallel (DDP) for training. It resolves an AttributeError that previously occurred during model saving, which was caused by the trl library's internal create_model_card function attempting to access configuration directly from a DDP-wrapped model. The implemented solution ensures that the model is correctly unwrapped before saving, thereby preventing crashes and enabling seamless checkpointing during distributed PPO training.

Highlights

  • Bug Fix for DDP PPO Training: This pull request resolves an AttributeError that occurs when saving a model during PPO training with Distributed Data Parallel (DDP) enabled, specifically when trl version is 0.20.0 or higher.
  • Root Cause Identification: The error stems from trl's create_model_card function attempting to access self.model.config directly on a DistributedDataParallel object, which does not expose the config attribute directly.
  • Model Unwrapping Mechanism: The fix involves temporarily unwrapping the DDP model before calling the superclass's _save_checkpoint method, allowing trl to access the underlying model's configuration, and then restoring the DDP-wrapped model afterwards.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly fixes an AttributeError during PPO training with DDP by unwrapping the model before saving a checkpoint. The overall change is good. I've provided one suggestion to improve the robustness of the implementation by using a try...finally block. This ensures the model's state is correctly restored even if an error occurs during the checkpointing process.
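
Concretely, the suggestion is to restore the wrapper in a finally block, so that an exception raised mid-checkpoint cannot leave self.model permanently unwrapped; a sketch reusing the hypothetical subclass from above:

  from torch.nn.parallel import DistributedDataParallel
  from trl import PPOTrainer as HFPPOTrainer

  class PPOTrainer(HFPPOTrainer):
      def _save_checkpoint(self, *args, **kwargs):
          wrapped_model = self.model
          if isinstance(wrapped_model, DistributedDataParallel):
              self.model = wrapped_model.module
          try:
              return super()._save_checkpoint(*args, **kwargs)
          finally:
              # runs whether or not checkpointing raised
              self.model = wrapped_model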

@hjh0119
Collaborator

hjh0119 commented Sep 16, 2025

thanks for your contribution

please pass the lint test

kiritoxkiriko and others added 2 commits September 19, 2025 15:43
use more robust error check

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@kiritoxkiriko kiritoxkiriko force-pushed the fix/ppo-bug branch 2 times, most recently from 095442c to e70b3a5 on September 19, 2025 08:18
@kiritoxkiriko
Contributor Author

> thanks for your contribution
>
> please pass the lint test

passed

@hjh0119 hjh0119 merged commit c25d275 into modelscope:main Sep 19, 2025
1 of 5 checks passed