
fix: compile inner model before DDP wrapping to prevent Dynamo tracing DDP internals #4017

Open
anishesg wants to merge 1 commit into huggingface:main from anishesg:fix/ph-issue-3991

Conversation

@anishesg

What does this PR do?

When using torch.compile with multi-GPU (DDP) training via Accelerate, users hit a crash during the forward pass:

```
torch._dynamo.exc.Unsupported: Unsupported method call
  Explanation: Dynamo does not know how to trace method `set_runtime_stats_and_log` of class `Logger`
```

The root cause is in `accelerator.py`'s `prepare_model`: the code wrapped the model with `DistributedDataParallel` first, then applied `torch.compile` to the DDP wrapper. This caused Dynamo to trace into DDP's internal `_pre_forward` hook, which calls `self.logger.set_runtime_stats_and_log()`, a method on a user-defined object that Dynamo cannot trace.
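
To make the failure concrete, here is a minimal sketch of the old ordering (a simplified stand-in for the `prepare_model` logic, not the actual Accelerate source; it assumes a single-node process group launched via `torchrun` and uses a toy `nn.Linear` in place of the user's model):

```python
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
torch.cuda.set_device(local_rank)
model = torch.nn.Linear(8, 8).cuda()

# Old order: wrap with DDP first, then compile the DDP wrapper.
ddp_model = DDP(model, device_ids=[local_rank])
compiled = torch.compile(ddp_model)

# On the first forward pass, Dynamo starts tracing at DDP.forward,
# descends into DDP._pre_forward, and hits
# self.logger.set_runtime_stats_and_log() -- the untraceable call
# that raises the Unsupported error shown above.
out = compiled(torch.randn(4, 8).cuda())
```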

The fix follows the PyTorch-recommended pattern for DDP + `torch.compile`: compile the inner model before wrapping it with DDP. DDP then operates outside the compiled region, so its internal logging and communication hooks are never seen by Dynamo. This is applied to both the MULTI_GPU and MULTI_CPU DDP paths in `prepare_model`. The final compile guard is also updated to skip models that already have compiled submodules (via `has_compiled_regions`), preventing the DDP wrapper from being double-compiled.
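
The same sketch with the fixed ordering, compiling the inner model first so DDP's hooks run eagerly outside the traced region:

```python
# New order: compile the inner model, then wrap the compiled module.
# Dynamo now only ever traces the inner model's graph; DDP's
# _pre_forward logging and gradient communication stay eager.
compiled_inner = torch.compile(model)
ddp_model = DDP(compiled_inner, device_ids=[local_rank])

out = ddp_model(torch.randn(4, 8).cuda())
```

And a hypothetical re-implementation of the `has_compiled_regions` check (the actual helper in Accelerate may differ), showing how the final compile guard avoids re-compiling a wrapper whose inner model is already compiled:

```python
from torch._dynamo import OptimizedModule

def has_compiled_regions(module: torch.nn.Module) -> bool:
    """Return True if any submodule has already been torch.compile'd."""
    return any(isinstance(m, OptimizedModule) for m in module.modules())

# Final compile guard: skip compiling the DDP wrapper when the inner
# model was already compiled before wrapping.
if not has_compiled_regions(ddp_model):
    ddp_model = torch.compile(ddp_model)
```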

Fixes #3991

