Value clipping for PPO loss #1977

## Motivation

It would be nice to add an option to compute the clipped value loss used in OpenAI's Baselines implementation of PPO:

https://github.com/openai/baselines/blob/ea25b9e8b234e6ee1bca43083f8f3cf974143998/baselines/ppo2/model.py#L66-L75

Currently, the `PPOLoss().loss_critic` method calls the `torchrl.objectives.utils.distance_loss` function which supports the `"l1"`, `"l2"` and `"smooth_l1"` loss functions. Perhaps this new clipped value loss function could be implemented as a new loss type within `distance_loss`.
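For reference, here is a minimal PyTorch sketch of the clipped value loss as I read the linked Baselines snippet. The function name `clipped_value_loss` and its argument names are hypothetical and not part of TorchRL, and `old_value` is assumed to be the value estimate stored when the data was collected.

```python
import torch


def clipped_value_loss(value, old_value, target, clip_range=0.2):
    """Hypothetical helper, not a TorchRL API.

    value:      current value prediction
    old_value:  value prediction stored at collection time (assumption)
    target:     value target (e.g. the return estimate)
    """
    # Keep the new prediction within +/- clip_range of the old one.
    value_clipped = old_value + torch.clamp(value - old_value, -clip_range, clip_range)
    loss_unclipped = (value - target).pow(2)
    loss_clipped = (value_clipped - target).pow(2)
    # Take the element-wise maximum (the pessimistic choice), then average.
    return 0.5 * torch.max(loss_unclipped, loss_clipped).mean()


value = torch.tensor([1.0, 2.0])
old_value = torch.tensor([0.0, 2.0])
target = torch.tensor([0.0, 2.5])
loss = clipped_value_loss(value, old_value, target, clip_range=0.2)
```

Unlike `"l1"`, `"l2"` and `"smooth_l1"`, this loss needs a third input (the old value estimate) and a clip range, so fitting it into `distance_loss` would probably require extending that function's arguments.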

OpenAI's Baselines also commonly reports the clipping fraction as a metric. It could be useful to report it from `PPOLoss` for the clipped value loss, and from `ClipPPOLoss` for the `loss_objective` term.
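As a rough sketch of the metric, the clipping fraction can be computed as the share of samples whose deviation exceeds the clip range. The helper name `clip_fraction` is hypothetical, and the two definitions of the deviation below (probability ratio minus one for the policy term, new minus old value estimate for the value term) are my assumptions about what would be reported.

```python
import torch


def clip_fraction(delta, clip_range):
    """Hypothetical helper: fraction of entries with |delta| > clip_range."""
    return (delta.abs() > clip_range).float().mean()


# Policy term (assumption): delta is the probability ratio minus 1.
ratio = torch.tensor([1.0, 1.3, 0.7, 1.1])
policy_clip_frac = clip_fraction(ratio - 1.0, clip_range=0.2)

# Value term (assumption): delta is the new value estimate minus the old one.
value = torch.tensor([1.0, 2.0, 3.0, 4.0])
old_value = torch.tensor([0.0, 2.0, 3.0, 4.5])
value_clip_frac = clip_fraction(value - old_value, clip_range=0.2)
```

Both quantities are detached diagnostics, so they could be returned alongside the loss terms without affecting gradients.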
