PPO crashes with out-of-bounds action during evaluation

**Description**
The PPO training run crashes with an action space violation raised during evaluation. The cause is a missing action clip in the evaluation loop.

The training loop clips actions (control_experiment.py:108), but the evaluation loop does not do so in either the sequential or vectorized paths (control_experiment.py:211 and 238).

PPOActorNetProbabilistic uses a plain Normal distribution with no squashing, so its outputs are unbounded.
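
A minimal sketch of the failure mode and the proposed fix, not the code in control_experiment.py. The helper `clip_action`, the bounds `low`/`high`, and the sampled Gaussian standing in for the policy output are all hypothetical names and values chosen for illustration.

```python
import numpy as np


def clip_action(action, low, high):
    """Clamp each action component into the closed interval [low, high]."""
    return np.clip(action, low, high)


# Hypothetical bounds of a Box-style action space (assumption, not from the repo).
low = np.array([-1.0, -1.0])
high = np.array([1.0, 1.0])

rng = np.random.default_rng(0)

# Stand-in for an unsquashed Gaussian policy head: samples are unbounded,
# so some of them land outside [low, high].
raw_actions = rng.normal(loc=0.0, scale=2.0, size=(1000, 2))
out_of_bounds = bool(np.any((raw_actions < low) | (raw_actions > high)))

# Clipping before the environment step keeps every action inside the bounds.
clipped_actions = clip_action(raw_actions, low, high)
```

Applying the same clip to the action just before the environment step in both the sequential and the vectorized evaluation paths would make evaluation consistent with the training loop.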
