[Feature] Feat/diffusion bc loss #3604

Open
theap06 wants to merge 7 commits into pytorch:main from theap06:feat/diffusion-bc-loss

@theap06 theap06 commented Apr 8, 2026

[Feature] Add DiffusionBCLoss objective and Pendulum BC example

Summary

Fixes #3149

Design

The loss:

  1. Samples a random timestep t per batch element
  2. Corrupts the clean demonstration action via _DDPMModule.add_noise(clean_action, t) (forward diffusion)
  3. Runs the score network on (noisy_action || observation || t)
  4. Returns MSE between predicted noise and actual noise as loss_diffusion_bc
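
The four steps above can be sketched in plain PyTorch. This is an illustration only, not the code from this PR: the `add_noise` helper below is a stand-in for `_DDPMModule.add_noise` (whose real signature and return value are not shown here), and the schedule constants, network shape, timestep scaling, and tensor sizes are all assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)

NUM_STEPS = 100                     # assumed number of diffusion steps
OBS_DIM, ACT_DIM, BATCH = 3, 1, 32  # Pendulum-like sizes, chosen for illustration

# Assumed linear-beta schedule and the cumulative product of (1 - beta).
betas = torch.linspace(1e-4, 2e-2, NUM_STEPS)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)


def add_noise(clean_action, t):
    """Hypothetical forward-diffusion helper: returns (noisy_action, noise)."""
    noise = torch.randn_like(clean_action)
    a_bar = alphas_cumprod[t].unsqueeze(-1)  # shape [batch, 1]
    noisy = a_bar.sqrt() * clean_action + (1.0 - a_bar).sqrt() * noise
    return noisy, noise


# Hypothetical score network taking (noisy_action || observation || t).
score_net = nn.Sequential(
    nn.Linear(ACT_DIM + OBS_DIM + 1, 64),
    nn.SiLU(),
    nn.Linear(64, ACT_DIM),
)


def diffusion_bc_loss(observation, clean_action):
    batch = clean_action.shape[0]
    # 1. Sample one random timestep per batch element.
    t = torch.randint(0, NUM_STEPS, (batch,))
    # 2. Corrupt the clean demonstration action (forward diffusion).
    noisy_action, noise = add_noise(clean_action, t)
    # 3. Run the score network on the concatenated input; t is scaled to [0, 1).
    t_feat = (t.float() / NUM_STEPS).unsqueeze(-1)
    pred_noise = score_net(torch.cat([noisy_action, observation, t_feat], dim=-1))
    # 4. MSE between predicted and actual noise.
    return nn.functional.mse_loss(pred_noise, noise)


obs = torch.randn(BATCH, OBS_DIM)
act = torch.randn(BATCH, ACT_DIM).clamp(-2.0, 2.0)
loss = diffusion_bc_loss(obs, act)
loss.backward()
```

In the PR itself this value is returned under the `loss_diffusion_bc` key; the sketch returns a bare tensor to stay independent of TorchRL.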

Supports set_keys() for observation/action key remapping and configurable reduction.
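
As a rough illustration of what a configurable reduction typically means for an elementwise MSE, the sketch below applies `"mean"`, `"sum"`, or `"none"` to a per-element squared error. The `reduce_loss` helper and the set of accepted mode names are assumptions made for this example; the modes actually accepted by `DiffusionBCLoss` are defined in the PR's source.

```python
import torch


def reduce_loss(per_element, reduction="mean"):
    """Hypothetical helper: apply a reduction to an elementwise loss tensor."""
    if reduction == "mean":
        return per_element.mean()
    if reduction == "sum":
        return per_element.sum()
    if reduction == "none":
        return per_element
    raise ValueError(f"Unknown reduction: {reduction!r}")


pred_noise = torch.tensor([[0.5], [1.0], [-1.0]])
true_noise = torch.tensor([[0.0], [1.0], [1.0]])
per_element = (pred_noise - true_noise) ** 2  # squared errors: 0.25, 0.0, 4.0

mean_loss = reduce_loss(per_element, "mean")  # scalar, average over elements
sum_loss = reduce_loss(per_element, "sum")    # scalar, total over elements
raw_loss = reduce_loss(per_element, "none")   # same shape as the input
```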

Files changed

| File | Description |
| --- | --- |
| torchrl/objectives/diffusion_bc.py | DiffusionBCLoss implementation |
| torchrl/objectives/__init__.py | Register DiffusionBCLoss |
| test/objectives/test_diffusion_bc.py | 17 tests (output keys, backward, gradient flow, custom keys, convergence) |
| examples/diffusion_bc_pendulum.py | End-to-end BC training on Pendulum-v1 |

Test plan

  • pytest test/objectives/test_diffusion_bc.py — 17/17 passing
  • pre-commit run — all hooks passing
  • Forward + backward smoke tested locally

theap06 and others added 7 commits April 5, 2026 03:10
…tioned on observations using a fixed linear-beta DDPM scheduler, following
Diffusion Policy (Chi et al., RSS 2023).
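
For readers unfamiliar with the term, a fixed linear-beta DDPM schedule can be sketched as follows. The number of steps and the beta endpoints are assumed values for illustration, not necessarily those used in this PR. The cumulative product of `1 - beta` gives the signal and noise scales used when corrupting a clean action.

```python
import torch

NUM_STEPS = 100                    # assumed number of diffusion steps
BETA_START, BETA_END = 1e-4, 2e-2  # assumed schedule endpoints

# Betas grow linearly from BETA_START to BETA_END and are never trained.
betas = torch.linspace(BETA_START, BETA_END, NUM_STEPS, dtype=torch.float64)
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

# Forward diffusion at step t mixes signal and noise with these weights:
#   noisy = signal_scale[t] * clean + noise_scale[t] * noise
signal_scale = alphas_cumprod.sqrt()
noise_scale = (1.0 - alphas_cumprod).sqrt()
```

Because the squared scales sum to one at every step, the noisy sample keeps unit variance when the clean input has unit variance.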
Implements the ε-prediction denoising loss from Diffusion Policy (Chi et al.,
RSS 2023) as a TorchRL LossModule, completing Phase-1 of the diffusion policy
feature alongside DiffusionActor (pytorch#3596).

- torchrl/objectives/diffusion_bc.py: DiffusionBCLoss subclassing LossModule,
  uses _DDPMModule.add_noise() for the forward diffusion step and computes
  MSE between predicted and actual noise. Supports configurable reduction and
  set_keys() for observation/action key remapping.
- torchrl/objectives/__init__.py: register DiffusionBCLoss in alphabetical order.
- test/objectives/test_diffusion_bc.py: 17 tests covering output keys, scalar
  loss, backward, gradient flow, reduction modes, custom keys, and a training
  convergence check.
- examples/diffusion_bc_pendulum.py: end-to-end BC training on Pendulum-v1
  with expert data collection, training loop, and evaluation.
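
A training convergence check of the kind listed above could look roughly like the sketch below. It uses synthetic data (not Pendulum-v1 and not the PR's test code), a hypothetical small MLP, and assumed hyperparameters, and it only compares the average loss over the first and last few optimization steps.

```python
import torch
from torch import nn

torch.manual_seed(0)

NUM_STEPS = 100  # assumed number of diffusion steps
alphas_cumprod = torch.cumprod(1.0 - torch.linspace(1e-4, 2e-2, NUM_STEPS), dim=0)

# Synthetic "expert" data: the action is a clipped linear function of the observation.
obs = torch.randn(512, 3)
act = (-obs.sum(dim=-1, keepdim=True)).clamp(-2.0, 2.0)

# Hypothetical noise-prediction network over (noisy_action || observation || t).
net = nn.Sequential(nn.Linear(1 + 3 + 1, 64), nn.SiLU(), nn.Linear(64, 1))
optim = torch.optim.Adam(net.parameters(), lr=3e-3)

losses = []
for step in range(300):
    idx = torch.randint(0, obs.shape[0], (128,))  # minibatch of demonstrations
    t = torch.randint(0, NUM_STEPS, (128,))       # one timestep per element
    noise = torch.randn(128, 1)
    a_bar = alphas_cumprod[t].unsqueeze(-1)
    noisy = a_bar.sqrt() * act[idx] + (1.0 - a_bar).sqrt() * noise
    t_feat = t.float().unsqueeze(-1) / NUM_STEPS
    pred = net(torch.cat([noisy, obs[idx], t_feat], dim=-1))
    loss = nn.functional.mse_loss(pred, noise)
    optim.zero_grad()
    loss.backward()
    optim.step()
    losses.append(loss.item())

# Compare the average loss at the start and end of training.
first = sum(losses[:20]) / 20
last = sum(losses[-20:]) / 20
```

Averaging over a window rather than comparing single steps keeps the check less sensitive to minibatch noise.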

pytorch-bot bot commented Apr 8, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3604

Note: Links to docs will display an error until the docs builds have been completed.

⚠️ 17 Awaiting Approval

As of commit ef2554c with merge base f54a7c7:

AWAITING APPROVAL - The following workflows need approval before CI can run:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label Apr 8, 2026

Labels

CLA Signed, Examples, Feature, Modules, Objectives

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Add Diffusion Policy modules (Actor + Loss + Example)

2 participants