Examples

This directory includes a number of working examples of Acme agents. These examples are not meant to be comprehensive; instead, they illustrate common use cases to which Acme agents can be applied.

Our quickstart guide is the fastest way to get up and running: the notebook shows how to instantiate a simple agent and run it on an environment. You can also take a look at our tutorial, which takes a more in-depth look at the construction of the D4PG agent and highlights the general structure shared by all agents implemented in Acme.
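The core pattern from the quickstart is small enough to sketch here. This is a hedged sketch, not the notebook verbatim: the network sizes and the distributional-critic bounds are placeholder hyperparameters, and any Acme agent compatible with the environment spec could stand in for D4PG.

```python
# A minimal sketch of the quickstart pattern: build an environment, derive
# its spec, construct networks and an agent, then run the standard loop.
import acme
from acme import specs
from acme import wrappers
from acme.agents.tf import d4pg
from acme.tf import networks

import gym
import sonnet as snt

# Wrap a gym task as a dm_env environment (no MuJoCo needed for this one).
environment = wrappers.GymWrapper(gym.make('MountainCarContinuous-v0'))
environment = wrappers.SinglePrecisionWrapper(environment)
spec = specs.make_environment_spec(environment)

num_actions = spec.actions.shape[-1]

# A feed-forward policy squashed to the action spec, plus a distributional
# critic, as used by D4PG. Layer sizes here are placeholder values.
policy_network = snt.Sequential([
    networks.LayerNormMLP((256, 256, num_actions)),
    networks.TanhToSpec(spec.actions),
])
critic_network = snt.Sequential([
    networks.CriticMultiplexer(),
    networks.LayerNormMLP((256, 256), activate_final=True),
    networks.DiscreteValuedHead(vmin=-100., vmax=100., num_atoms=51),
])

# The agent interacts with the environment only through its spec, so any
# compatible agent could be substituted here.
agent = d4pg.D4PG(
    environment_spec=spec,
    policy_network=policy_network,
    critic_network=critic_network,
)

# Run the standard agent/environment interaction loop.
loop = acme.EnvironmentLoop(environment, agent)
loop.run(num_episodes=100)
```

The key design point is that the agent only sees the environment through its spec, which is what makes agents and environments interchangeable throughout these examples.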

Continuous control

We include a number of agents running on continuous control tasks. These agents are representative examples, but any continuous-control algorithm implemented in Acme can be swapped in; a sketch of the shared environment setup follows the list below.

Note that many of the examples, particularly those based on the DeepMind Control Suite, require a MuJoCo license in order to run. See our tutorial for more details, or refer to the dm_control repository for further information.

  • D4PG: a distributed distributional deterministic policy gradient (D4PG) agent, which combines a deterministic policy with a distributional critic, running on the DeepMind Control Suite.
  • D4PG (gym): this example runs the same algorithm on a number of tasks defined in OpenAI Gym. By default it runs the "mountain car" domain, which does not require a MuJoCo license.
  • DMPO: a distributional maximum a posteriori policy optimization (MPO) agent, which combines a distributional critic with a stochastic policy.
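Swapping algorithms mostly amounts to changing which agent class is constructed; the environment-side code is shared. A minimal sketch, assuming a working MuJoCo and dm_control installation:

```python
# Sketch: load a DeepMind Control Suite task (requires MuJoCo) and build
# the environment spec that any continuous-control agent consumes.
from acme import specs
from acme import wrappers
from dm_control import suite

environment = suite.load(domain_name='cartpole', task_name='balance')
environment = wrappers.SinglePrecisionWrapper(environment)
spec = specs.make_environment_spec(environment)

# Control-suite observations are dicts; in the examples the agent's
# observation network concatenates them into a flat vector. Any
# continuous-control agent (e.g. d4pg.D4PG or dmpo.DistributionalMPO)
# can then be built from `spec` and run in the same EnvironmentLoop.
```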

Discrete agents (Atari)

The development of the Arcade Learning Environment, and the coinciding use of Atari games as a benchmark, have played a prominent role in the modern usage and testing of reinforcement learning algorithms. As a result, we also include direct examples of prominent discrete-action algorithms implemented in Acme and running on this environment.

  • DQN: a "classic" benchmark agent for Atari.
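A hedged sketch of the typical Atari setup, following the wrapper and network names in Acme's TensorFlow agents; the specific game and wrapper flags are illustrative choices, not fixed by the example:

```python
# Sketch: wrap a gym Atari environment for Acme and build a DQN agent.
from acme import specs
from acme import wrappers
from acme.agents.tf import dqn
from acme.tf import networks

import gym

environment = wrappers.GymAtariAdapter(gym.make('PongNoFrameskip-v4'))
# Standard Atari preprocessing: grayscaling, resizing, frame stacking.
environment = wrappers.AtariWrapper(environment, to_float=True)
environment = wrappers.SinglePrecisionWrapper(environment)
spec = specs.make_environment_spec(environment)

# DQN needs a network mapping stacked frames to one Q-value per action.
network = networks.DQNAtariNetwork(spec.actions.num_values)
agent = dqn.DQN(environment_spec=spec, network=network)
```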

Offline agents

Acme includes examples of offline agents, i.e. agents trained entirely from external data generated by another agent (the objective they optimize is sketched after the list):

  • BC: a behaviour cloning agent.
  • BC (JAX): a behaviour cloning agent (implemented in JAX).
  • BCQ: an implementation of the batch-constrained Q-learning (BCQ) algorithm.
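At its core, behaviour cloning is supervised learning on logged (observation, action) pairs. The snippet below sketches that objective directly, independent of Acme's learner classes; a squared error is shown for continuous actions, while discrete-action variants would use a cross-entropy loss instead:

```python
# Sketch of the behaviour-cloning loss: regress the policy's output onto
# the demonstrated actions from a fixed dataset.
import sonnet as snt
import tensorflow as tf

def bc_loss(policy_network: snt.Module,
            observations: tf.Tensor,
            demo_actions: tf.Tensor) -> tf.Tensor:
  """Mean squared error between policy actions and demonstrated actions."""
  predicted_actions = policy_network(observations)
  return tf.reduce_mean(tf.square(predicted_actions - demo_actions))

# Training iterates this loss over the offline dataset; unlike the online
# agents above, no environment interaction is required.
```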

Similarly, we also include so-called "from demonstrations" agents, which mix offline and online data:

  • DQfD: the DQfD agent running on hard-exploration tasks within bsuite (e.g. deep sea) using demonstration data.
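The mixing itself is simple to sketch. The helper below is hypothetical: its name, the buffer types, and the demo_fraction hyperparameter are all assumptions for illustration, not Acme's API. It shows each learner batch drawing a fixed fraction of transitions from demonstrations and the rest from online replay:

```python
import random

def sample_mixed_batch(demo_buffer, replay_buffer,
                       batch_size=256, demo_fraction=0.25):
  """Draws a learner batch mixing demonstration and online transitions."""
  num_demo = int(batch_size * demo_fraction)
  batch = random.sample(demo_buffer, num_demo)
  batch += random.sample(replay_buffer, batch_size - num_demo)
  random.shuffle(batch)
  return batch
```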

Behaviour Suite

The Behaviour Suite for Reinforcement Learning (bsuite) defines a collection of tasks and environments which collectively investigate the core capabilities of RL algorithms along a number of different axes. The examples we include show how to run Acme agents on this suite.

  • DQN: an off-policy DQN example;
  • IMPALA: an on-policy IMPALA agent; and
  • MCTS: a model-based agent running on the task suite using either a simulator of the environment or a learned model.
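Because bsuite environments implement the dm_env interface, they plug into the same loop as everything above. A minimal sketch (the bsuite_id shown is just one of the suite's task/seed combinations):

```python
# Sketch: run an Acme agent on a single bsuite task.
from acme import specs
import bsuite

environment = bsuite.load_from_id(bsuite_id='deep_sea/0')
spec = specs.make_environment_spec(environment)

# Build any agent from `spec` (DQN, IMPALA, MCTS, ...), then run it for
# the number of episodes the suite prescribes for this task:
#   loop = acme.EnvironmentLoop(environment, agent)
#   loop.run(num_episodes=environment.bsuite_num_episodes)
```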

For more information see https://github.com/deepmind/bsuite.