Feature request: public async evaluator API for dedicated eval devices #3610

@vmoens

Summary

TorchRL could use a public, documented evaluator API for the common setup where:

  • training runs on one device,
  • collection runs on one or more other devices,
  • evaluation runs on a dedicated device,
  • evaluation should not block the training loop.

In the installed build I tested, torchrl.trainers.Evaluator is not importable, and I couldn't find a public MultiAsyncCollector symbol either. That leaves users to write their own background threads or processes that handle:

  • creating a separate eval env,
  • copying policy weights over,
  • running a deterministic rollout,
  • handling logging/video manually,
  • polling/joining results.

Concrete use case

I am training PPO with:

  • collectors on cuda:4 and cuda:6,
  • optimizer on cuda:5,
  • evaluation on cuda:7.

The desired behavior is:

  1. trigger eval every N training iterations,
  2. keep the hot training loop running,
  3. poll the eval result later,
  4. log scalar metrics and optional video once the result is ready.
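
For illustration, the four steps above can be sketched with the standard library alone. This is a minimal sketch, not TorchRL code: `run_eval`, `EVAL_EVERY`, and the dict standing in for policy weights are all hypothetical placeholders, and a real version would run a rollout in a separate eval env on the eval device.

```python
import copy
from concurrent.futures import ThreadPoolExecutor

EVAL_EVERY = 3  # hypothetical value for "N"


def run_eval(weights, step):
    # Placeholder for a deterministic rollout on the eval device;
    # here it only derives a fake score from the weight snapshot.
    return {"step": step, "eval/reward": sum(weights.values())}


def train(num_iters=10):
    weights = {"w": 0.0}   # stand-in for the policy state_dict
    logged = []            # stand-in for a logger
    pending = []           # eval futures whose results are not logged yet
    with ThreadPoolExecutor(max_workers=1) as pool:
        for step in range(1, num_iters + 1):
            weights["w"] += 1.0  # stand-in for an optimizer step
            if step % EVAL_EVERY == 0:
                # Step 1: trigger eval on a frozen copy of the weights.
                snapshot = copy.deepcopy(weights)
                pending.append(pool.submit(run_eval, snapshot, step))
            # Steps 2-4: poll without blocking and log whatever has finished.
            still_running = []
            for fut in pending:
                if fut.done():
                    logged.append(fut.result())
                else:
                    still_running.append(fut)
            pending = still_running
        # Drain evaluations that finish after the last training iteration.
        for fut in pending:
            logged.append(fut.result())
    return logged
```

Even this small version needs a pending list, a drain step at shutdown, and an explicit weight snapshot, which is the kind of code every downstream project currently rewrites.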

That pattern is useful enough that it would be better as a first-class TorchRL API than repeated custom code in downstream projects.

What would help

Something along these lines:

  • a public Evaluator (or similarly named) object that is part of the installed API,
  • support for sync and async modes,
  • explicit support for a dedicated eval device,
  • simple trigger(...), poll(), and wait() semantics,
  • a clear contract for how policy weights are transferred,
  • integration with VideoRecorder / loggers, or at least a recommended pattern documented in TorchRL.
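
To make the request concrete, here is one possible shape for such an object. It is a hypothetical interface sketch using only the standard library: `AsyncEvaluator`, its constructor arguments, and `fake_eval` are assumptions made for illustration and do not describe any existing TorchRL class. The weight-transfer contract assumed here is "snapshot at trigger time".

```python
import copy
from concurrent.futures import ThreadPoolExecutor


class AsyncEvaluator:
    """Hypothetical interface sketch; not an existing TorchRL class."""

    def __init__(self, eval_fn, device="cuda:7", asynchronous=True):
        self.eval_fn = eval_fn    # callable(weights, device) -> dict of metrics
        self.device = device      # dedicated eval device, passed to eval_fn
        self.asynchronous = asynchronous
        self._pool = ThreadPoolExecutor(max_workers=1) if asynchronous else None
        self._future = None
        self._result = None

    def trigger(self, weights):
        """Start an eval on a snapshot of ``weights``.

        Returns False if a previous eval is still running. A finished result
        that was never retrieved is replaced by the new evaluation.
        """
        if self._future is not None and not self._future.done():
            return False
        snapshot = copy.deepcopy(weights)  # contract: weights frozen at trigger
        if self.asynchronous:
            self._future = self._pool.submit(self.eval_fn, snapshot, self.device)
        else:
            self._result = self.eval_fn(snapshot, self.device)
        return True

    def poll(self):
        """Return the metrics if they are ready, else None. Never blocks."""
        if self._future is not None and self._future.done():
            self._result, self._future = self._future.result(), None
        result, self._result = self._result, None
        return result

    def wait(self):
        """Block until the pending eval (if any) finishes and return its metrics."""
        if self._future is not None:
            self._result, self._future = self._future.result(), None
        result, self._result = self._result, None
        return result

    def shutdown(self):
        if self._pool is not None:
            self._pool.shutdown(wait=True)


# Usage with a placeholder eval function.
def fake_eval(weights, device):
    return {"device": device, "eval/reward": weights["w"] * 2}


evaluator = AsyncEvaluator(fake_eval, device="cuda:7")
weights = {"w": 1.0}
evaluator.trigger(weights)
weights["w"] = 100.0        # training keeps mutating the live weights
metrics = evaluator.wait()  # metrics reflect the snapshot, not the live weights
evaluator.shutdown()
```

The same object covers sync mode by passing `asynchronous=False`, in which case `trigger` runs the evaluation inline and `poll` returns the result immediately. A thread is used here only to keep the sketch short; a process-based backend might be a better fit for a real implementation.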

Why this matters

Without this, async eval tends to become a pile of downstream boilerplate that is easy to get subtly wrong:

  • stale weights,
  • blocking behavior,
  • duplicate env setup,
  • awkward video handling,
  • ad hoc thread/process lifecycle management.
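
As one example of how the weight handling goes wrong, the sketch below hands the same background evaluation either a live reference to the weights or a snapshot. It uses a plain dict as a stand-in for policy weights and an event to force the evaluation to start after the next update; all names are hypothetical.

```python
import copy
import threading
from concurrent.futures import ThreadPoolExecutor


def evaluate(weights, start):
    start.wait()  # simulate the eval starting after training has moved on
    return weights["w"]


live = {"w": 1.0}  # stand-in for the policy weights being trained
start = threading.Event()

with ThreadPoolExecutor(max_workers=2) as pool:
    by_reference = pool.submit(evaluate, live, start)                # shares the dict
    by_snapshot = pool.submit(evaluate, copy.deepcopy(live), start)  # frozen copy
    live["w"] = 2.0  # an optimizer step happens while the evals are pending
    start.set()
    seen_by_reference = by_reference.result()
    seen_by_snapshot = by_snapshot.result()
```

The by-reference evaluation sees `2.0` and the snapshot evaluation sees `1.0`, so only the second reports metrics for the weights that were current when the evaluation was requested. A documented contract in the library would remove this ambiguity.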

If there is already a recommended API for this that is just not exported/documented, exposing it would already help a lot.
