-
-
Notifications
You must be signed in to change notification settings - Fork 422
Description
I am planning to experiment with population-based training and self-play, similar to the recent DeepMind's Q3 CTF paper. The obvious requirement would be the ability to train the agents to play against other agent copies on the same map at the same time.
I could probably wrap a multiplayer session into a single multi-agent interface and use ASYNC_PLAYER mode, maybe with increased tickrate (#209)
However the optimal way to implement this would be to render multiple observations for different agents within the same tick in the same process in synchronous mode, similar to how it's done in single-player.
Any thoughts on what is the right course of action here? Does multi-agent SYNC mode seem feasible or would it require changing half the codebase?