feat(learning): DQN learner for discrete debug environments #51

@salim4n

Add a first neural learner path for discrete toy/debug environments.

Acceptance criteria:

  • DQN trains against GridWorld or Target2D without users implementing the algorithm;
  • logged metrics include reward, episode length, exploration rate (epsilon) and loss;
  • checkpoint save/load supports inference replay;
  • a CI smoke test shows improvement over a random policy on a deterministic demo.
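
To make the metrics and smoke-test criteria concrete, here is a rough sketch of the shape such a check could take. Everything in it is an assumption for illustration: the 1-D corridor stands in for GridWorld, tabular Q-learning stands in for the DQN (so the example runs on the standard library alone), and the names `step`, `run_episode`, `train` and `smoke_check` are hypothetical, not part of any existing API in this repo.

```python
import random

# Hypothetical 1-D corridor standing in for GridWorld: start at cell 0, goal at cell 5.
N_STATES, GOAL, MAX_STEPS = 6, 5, 50
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state, action):
    """Deterministic transition: -1 per move, +10 on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = nxt == GOAL
    return nxt, (10.0 if done else -1.0), done


def run_episode(policy):
    """Roll out one episode from cell 0; return (total reward, episode length)."""
    state, total = 0, 0.0
    for t in range(1, MAX_STEPS + 1):
        state, reward, done = step(state, policy(state))
        total += reward
        if done:
            break
    return total, t


def train(episodes=300, alpha=0.5, gamma=0.95, seed=0):
    """Tabular Q-learning (stand-in for DQN) with per-episode metrics."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    history = []
    for ep in range(episodes):
        epsilon = max(0.05, 1.0 - ep / 200)  # linear decay, floor at 0.05
        state, total, sq_errors = 0, 0.0, []
        for t in range(1, MAX_STEPS + 1):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])  # exploit
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            td_error = target - q[state][action]
            q[state][action] += alpha * td_error
            sq_errors.append(td_error ** 2)
            state, total = nxt, total + reward
            if done:
                break
        # The four metrics named in the acceptance criteria.
        history.append({
            "reward": total,
            "length": t,
            "epsilon": epsilon,
            "loss": sum(sq_errors) / len(sq_errors),  # mean squared TD error
        })
    return q, history


def smoke_check(seed=0, eval_episodes=200):
    """Return (greedy learned return, mean random-policy return, metrics)."""
    q, history = train(seed=seed)
    rng = random.Random(seed + 1)
    random_mean = sum(
        run_episode(lambda s: rng.choice(ACTIONS))[0] for _ in range(eval_episodes)
    ) / eval_episodes
    learned, _ = run_episode(lambda s: max(ACTIONS, key=lambda a: q[s][a]))
    return learned, random_mean, history


if __name__ == "__main__":
    learned, baseline, history = smoke_check()
    assert learned > baseline, "learner should beat the random baseline"
    print(f"learned={learned:.2f} random={baseline:.2f} last={history[-1]}")
```

The real learner would replace the Q-table with a network, replay buffer and target network, but the CI assertion could keep the same form: fixed seed, greedy evaluation, and a comparison against a seeded random baseline.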
