Best way to organize self-play #391

@alex-petrenko

I am planning to experiment with population-based training and self-play, similar to DeepMind's recent Q3 CTF paper. The obvious requirement is the ability to train agents to play against copies of other agents on the same map at the same time.

I could probably wrap a multiplayer session into a single multi-agent interface and use ASYNC_PLAYER mode, maybe with an increased tickrate (#209).
However, the optimal way to implement this would be to render the observations for all agents within the same tick, in the same process, in synchronous mode, similar to how it's done in single-player.
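
To make the wrapper idea concrete, here is a minimal sketch of what a single multi-agent interface could look like. Everything in it is hypothetical: `PlayerEnv`, `MultiAgentSession`, and `DummyPlayer` are illustrative names, not part of the library, and `DummyPlayer` stands in for a real game instance so that the sketch runs on its own. It only shows the lock-step contract (one action per agent in, one observation per agent out per tick), not how the engine would render several views in one process.

```python
from typing import Any, Callable, List, Protocol, Sequence, Tuple


class PlayerEnv(Protocol):
    """Hypothetical per-player interface (an assumption, not a real library API)."""

    def reset(self) -> Any: ...

    def step(self, action: Any) -> Tuple[Any, float, bool]: ...


class MultiAgentSession:
    """Presents several per-player environments as one multi-agent environment."""

    def __init__(self, make_player: Callable[[int], PlayerEnv], num_agents: int) -> None:
        # One environment per agent; make_player receives the agent index.
        self.players: List[PlayerEnv] = [make_player(i) for i in range(num_agents)]

    def reset(self) -> List[Any]:
        # Returns one initial observation per agent.
        return [player.reset() for player in self.players]

    def step(self, actions: Sequence[Any]) -> Tuple[List[Any], List[float], bool]:
        if len(actions) != len(self.players):
            raise ValueError("expected exactly one action per agent")
        # Advance every player by one tick before returning anything,
        # so all agents observe the same tick.
        results = [player.step(action) for player, action in zip(self.players, actions)]
        observations = [result[0] for result in results]
        rewards = [result[1] for result in results]
        done = any(result[2] for result in results)
        return observations, rewards, done


class DummyPlayer:
    """Stand-in for a real game instance so the sketch runs without a game engine."""

    def __init__(self, player_id: int, episode_len: int = 3) -> None:
        self.player_id = player_id
        self.episode_len = episode_len
        self.tick = 0

    def reset(self) -> Tuple[int, int]:
        self.tick = 0
        return (self.player_id, self.tick)

    def step(self, action: int) -> Tuple[Tuple[int, int], float, bool]:
        self.tick += 1
        reward = float(action)  # toy reward: simply echoes the action
        return (self.player_id, self.tick), reward, self.tick >= self.episode_len
```

Usage would look like `session = MultiAgentSession(DummyPlayer, num_agents=2)`, followed by `session.reset()` and repeated `session.step([a0, a1])` calls. With real networked game instances, each player would probably need its own process or thread, since stepping them one after another in a single loop may block while waiting for the other players.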

Any thoughts on the right course of action here? Does a multi-agent SYNC mode seem feasible, or would it require changing half the codebase?
