spike(learning): PPO path for DroneTarget-v0 #52

@salim4n

Validate the PPO route for bounded continuous control in DroneTarget-v0.

Acceptance criteria:

  • spike identifies whether PPO should be implemented in TypeScript first, in Rust with Burn, or as a hybrid of the two;
  • rollout batching, advantage estimation, policy/value update and checkpoint shape are documented;
  • minimal experiment produces useful metrics even if learning quality is rough;
  • decision notes list blockers before full implementation.
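
To make the second criterion concrete, here is a minimal TypeScript sketch of two pieces the spike would need to document: advantage estimation (using GAE, one common choice for PPO) and the clipped policy objective. The `Rollout` shape, the function names, and the default hyperparameters are assumptions for illustration only; nothing here is taken from the DroneTarget-v0 codebase, and the spike may settle on a different layout.

```typescript
// Hypothetical rollout record; field names are illustrative assumptions.
interface Rollout {
  rewards: number[];  // r_t for t = 0..T-1
  values: number[];   // critic estimate V(s_t) for t = 0..T-1
  dones: boolean[];   // true if the episode terminated at step t
  lastValue: number;  // bootstrap estimate V(s_T) for the state after the last step
}

// Generalized Advantage Estimation, computed backwards over one rollout.
// gamma and lambda defaults are common starting points, not tuned values.
function computeGae(
  rollout: Rollout,
  gamma = 0.99,
  lambda = 0.95,
): { advantages: number[]; returns: number[] } {
  const T = rollout.rewards.length;
  const advantages = new Array<number>(T).fill(0);
  const returns = new Array<number>(T).fill(0);
  let gae = 0;
  for (let t = T - 1; t >= 0; t--) {
    // A terminal step cuts both the bootstrap value and the accumulated advantage.
    const notDone = rollout.dones[t] ? 0 : 1;
    const nextValue = t === T - 1 ? rollout.lastValue : rollout.values[t + 1];
    const delta = rollout.rewards[t] + gamma * nextValue * notDone - rollout.values[t];
    gae = delta + gamma * lambda * notDone * gae;
    advantages[t] = gae;
    returns[t] = gae + rollout.values[t]; // regression target for the value head
  }
  return { advantages, returns };
}

// PPO clipped surrogate objective over a minibatch, returned as a loss to minimize.
function clippedPolicyLoss(
  logProbNew: number[],
  logProbOld: number[],
  advantages: number[],
  clipEps = 0.2,
): number {
  let total = 0;
  for (let i = 0; i < advantages.length; i++) {
    const ratio = Math.exp(logProbNew[i] - logProbOld[i]);
    const clipped = Math.min(Math.max(ratio, 1 - clipEps), 1 + clipEps);
    total += Math.min(ratio * advantages[i], clipped * advantages[i]);
  }
  return -total / advantages.length;
}
```

Plain arrays keep the sketch independent of any tensor library, so the same arithmetic could serve as a reference when comparing a TS implementation against a Rust Burn one. Logging the mean advantage and the policy loss per update would be one cheap way to get the "useful metrics" asked for above.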
