Online Lever Adaptation

This repository bundles the code and experiments for my Master's thesis on online adaptation in multi-agent reinforcement learning (MARL). The thesis is jointly supervised by Jakob Foerster, with support from his research group at FLAIR, and Arnaud Doucet from the Department of Statistics at the University of Oxford.

The top-level scripts demonstrate the basic functionality and structure of this project.

  • 01_step_through_env.py shows how to initialize and step through an iterated lever environment with custom parameters and partner policies (see the environment sketch after this list).
  • 02_q_learning.py combines the environment with a learner of class DQNAgent to perform vanilla deep Q-learning (see the Q-learning sketch below).
  • 03_es_meta_learning.py exemplifies how the OpenES class, which implements the OpenAI-ES evolution strategies algorithm, can be used to learn initial network weights capable of remembering a fixed partner pattern of length three (see the OpenAI-ES sketch below).
  • 04_es_learn_history_representations.py shows how evolution strategies can be used to learn the parameters of an LSTM that yields a history representation suitable for effective Q-learning (see the history-encoder sketch below).
  • 05_learning_with_drqn.py exemplifies the adaptation baseline, a simple deep recurrent Q-learner based on the work of Hausknecht and Stone.
  • 06_step_through_marl_env.py shows how to step through the iterated lever environment with a pair of (possibly learning) agents instead of a fixed partner policy.
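
Since the iterated lever game is central to all of these scripts, the following is a minimal from-scratch sketch of one. The class and the partner pattern below are illustrative only and do not reproduce the repository's actual IteratedLeverEnvironment interface; both players receive a lever's payoff only when they pull the same lever simultaneously.

```python
import random

# Illustrative iterated lever game (not the repository's actual class).
class LeverGame:
    def __init__(self, payoffs, episode_length):
        self.payoffs = payoffs            # payoff of each lever when both players match
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0

    def step(self, action_a, action_b):
        # Players are rewarded only if they pull the same lever.
        reward = self.payoffs[action_a] if action_a == action_b else 0.0
        self.t += 1
        done = self.t >= self.episode_length
        return reward, done

# Partner that cycles through a fixed lever pattern of length three.
pattern = [0, 1, 2]
env = LeverGame(payoffs=[1.0, 1.0, 1.0], episode_length=9)
env.reset()
done = False
while not done:
    partner_action = pattern[env.t % len(pattern)]
    my_action = random.randrange(len(env.payoffs))  # placeholder policy
    reward, done = env.step(my_action, partner_action)
    print(f"t={env.t:2d}  reward={reward}")
```

The same loop structure carries over to the MARL variant in 06_step_through_marl_env.py, where the scripted partner is replaced by a second (possibly learning) agent.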
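
Next, a minimal deep Q-learning sketch in the spirit of 02_q_learning.py. The network shape, replay buffer, and hyperparameters are illustrative guesses, not the configuration of the repository's DQNAgent.

```python
import random
from collections import deque

import torch
import torch.nn as nn

n_obs, n_actions = 4, 3
q_net = nn.Sequential(nn.Linear(n_obs, 32), nn.ReLU(), nn.Linear(32, n_actions))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer = deque(maxlen=10_000)  # stores (obs, action, reward, next_obs, done) tuples
gamma, epsilon = 0.99, 0.1

def act(obs):
    # Epsilon-greedy action selection over the Q-network's outputs.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return q_net(torch.as_tensor(obs, dtype=torch.float32)).argmax().item()

def train_step(batch_size=32):
    # One TD(0) update on a uniform minibatch from the replay buffer.
    if len(buffer) < batch_size:
        return
    obs, actions, rewards, next_obs, dones = map(
        lambda xs: torch.as_tensor(xs, dtype=torch.float32),
        zip(*random.sample(buffer, batch_size)),
    )
    q = q_net(obs).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * (1 - dones) * q_net(next_obs).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```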
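
The core update of OpenAI-ES, the algorithm implemented by the OpenES class, can be sketched as below; population size, noise scale, and learning rate are illustrative. In the meta-learning setting of 03_es_meta_learning.py, the fitness of a candidate parameter vector would roughly correspond to the return obtained when starting from those initial weights.

```python
import numpy as np

def open_es(fitness, theta, n_generations=100, pop_size=64, sigma=0.1, lr=0.03):
    """Maximize fitness(theta) with antithetic Gaussian perturbations."""
    for _ in range(n_generations):
        noise = np.random.randn(pop_size // 2, theta.size)
        noise = np.concatenate([noise, -noise])  # antithetic pairs reduce variance
        returns = np.array([fitness(theta + sigma * eps) for eps in noise])
        # Rank-normalize returns so the update is invariant to their scale.
        ranks = returns.argsort().argsort() / (len(returns) - 1) - 0.5
        theta = theta + lr / (len(noise) * sigma) * noise.T @ ranks
    return theta

# Toy check: climb a simple quadratic towards its maximum at 3.
theta = open_es(lambda w: -np.sum((w - 3.0) ** 2), np.zeros(5))
```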
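
Finally, a sketch of the LSTM-as-history-encoder idea behind 04_es_learn_history_representations.py and the recurrent Q-learner baseline of 05_learning_with_drqn.py; all class names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class HistoryQNet(nn.Module):
    def __init__(self, obs_dim=4, hidden_dim=16, n_actions=3):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, hidden=None):
        # The LSTM compresses the observed history into its hidden state;
        # Q-values are read off that state at every step.
        out, hidden = self.lstm(obs_seq, hidden)
        return self.q_head(out), hidden

net = HistoryQNet()
q_values, hidden = net(torch.randn(1, 10, 4))  # Q-values along a length-10 history
```

Roughly speaking, in the ES setting the LSTM parameters are optimized with OpenES while Q-learning trains the head on top of the resulting history representation; the DRQN baseline instead trains the whole recurrent network by Q-learning.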
