GitHub - alpine-chamois/actor-critic: Deep Reinforcement Learning

Advantage Actor-Critic (A2C) Reinforcement Learning (RL) Agent

This A2C RL agent is based on the Asynchronous Advantage Actor-Critic (A3C) agent in Deep Reinforcement Learning in Action, but with tuned hyperparameters and without asynchronous processing.

A2C agents combine a value network (the critic), related to the Deep Q-network (DQN) used by DeepMind, with a policy network (the actor), as in REINFORCE. They sample actions directly from a probability distribution (like a policy network) whilst also supporting rapid online learning (like a DQN, but without the need for experience replay or a target network).
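As a rough illustration of how the two parts interact, the sketch below computes discounted returns for one episode and the advantage (return minus the critic's value estimate) that scales the actor's update. It is plain Python with made-up numbers. The function names and the discount factor `gamma=0.9` are assumptions chosen for illustration, not values taken from this repository.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage = observed return minus the critic's predicted value."""
    return [g - v for g, v in zip(returns, values)]


def a2c_losses(log_probs, returns, values):
    """Actor loss weights log-probabilities by advantage; critic loss is squared error.

    In a real PyTorch implementation the advantage would normally be detached
    from the graph before it is used in the actor loss.
    """
    adv = advantages(returns, values)
    actor_loss = -sum(lp * a for lp, a in zip(log_probs, adv))
    critic_loss = sum(a * a for a in adv)
    return actor_loss, critic_loss


if __name__ == "__main__":
    # Three steps of Cart Pole style rewards (+1 per step survived).
    print(discounted_returns([1.0, 1.0, 1.0]))
```

A positive advantage means the action did better than the critic expected, so its probability is pushed up; a negative advantage pushes it down.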

The A2C agent learns to play the Cart Pole game environment in Gymnasium:

OpenAI Gym, OpenAI, 2022

The agent is a two-headed feed-forward neural network:

Deep Reinforcement Learning in Action, Manning, 2020
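A minimal PyTorch sketch of a two-headed network of this kind is shown here. The class name, the single hidden layer, and its size of 64 are assumptions for illustration and may differ from the network in `src`. The input and output sizes match Cart Pole, which has 4 observation values and 2 discrete actions.

```python
import torch
from torch import nn


class TwoHeadedActorCritic(nn.Module):
    """Shared body with an actor head (action log-probabilities) and a critic head (state value)."""

    def __init__(self, obs_dim: int = 4, n_actions: int = 2, hidden: int = 64):
        super().__init__()
        # Shared feature extractor used by both heads.
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.actor_head = nn.Linear(hidden, n_actions)
        self.critic_head = nn.Linear(hidden, 1)

    def forward(self, obs: torch.Tensor):
        features = self.body(obs)
        # Actor: log-probabilities over the discrete actions.
        log_probs = torch.log_softmax(self.actor_head(features), dim=-1)
        # Critic: a single scalar estimate of the state's value.
        value = self.critic_head(features)
        return log_probs, value


if __name__ == "__main__":
    model = TwoHeadedActorCritic()
    obs = torch.zeros(1, 4)  # a dummy Cart Pole observation
    log_probs, value = model(obs)
    # Sample an action from the actor's distribution rather than taking the argmax.
    action = torch.distributions.Categorical(logits=log_probs).sample()
    print(log_probs.shape, value.shape, action.item())
```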

Here the agent is being trained to play Cart Pole.

And here the trained agent is playing the game unaided:

What about playing other games?

The A2C agent can be used to play other games, as long as main.py is updated to initialise the agent for the new environment and to handle its rewards correctly. There is a branch of this repository that shows how it can be successfully trained to play the Lunar Lander game in Gymnasium.

Why A2C and not A3C or PPO?

Although A3C and PPO agents can perform better than A2C agents, they include additional complexity that makes the fundamentals of RL more difficult to understand when looking at the code. This A2C agent is designed to be a reference for how to implement a Deep RL (DRL) agent using PyTorch. If you want a PPO agent, I recommend using the implementation in Stable Baselines 3. There is a branch of this repository that shows how to implement an equivalent A2C agent using Stable Baselines 3. To convert that agent to a PPO agent, simply replace instances of A2C with instances of PPO.

Getting started

Prerequisites: Python 3.10

Run the install script (Linux):
```
. install.sh
```
or (Windows):
```
install.bat
```

Run the example:

(venv) >python -m actorcritic --train --render
