Implementation of the AlphaZero algorithm
This repo contains:
- a simple but working implementation of the AlphaZero algorithm
- an agent that uses the AlphaZero algorithm to play an OpenAI Gym game (CartPole-v1)
The code builds on the MCTS algorithm implementation.
The agent uses this AlphaZero implementation to play the OpenAI Gym game CartPole.
Execute the code in the notebook to train the agent!
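For readers new to how AlphaZero extends MCTS, the sketch below shows the PUCT selection rule commonly used in AlphaZero-style tree search: each child is scored by its mean value plus an exploration bonus weighted by the network's prior. This is an illustration only, not the code from the notebook; the function names, the dictionary layout of `children`, and the default `c_puct` value are assumptions made for this example.

```python
import math


def puct_score(parent_visits, child_visits, child_value_sum, prior, c_puct=1.0):
    """PUCT score of one child: mean value plus a prior-weighted exploration bonus."""
    # Mean value of the child; unvisited children default to 0.0.
    q = child_value_sum / child_visits if child_visits > 0 else 0.0
    # Exploration bonus: large for high-prior, rarely visited children.
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + u


def select_action(children, c_puct=1.0):
    """Return the action whose child has the highest PUCT score.

    `children` maps action -> {"visits", "value_sum", "prior"}
    (a hypothetical layout chosen for this sketch).
    """
    parent_visits = sum(c["visits"] for c in children.values())
    return max(
        children,
        key=lambda a: puct_score(
            parent_visits,
            children[a]["visits"],
            children[a]["value_sum"],
            children[a]["prior"],
            c_puct,
        ),
    )
```

With two CartPole actions (0 = push left, 1 = push right), an unvisited action with the same prior as a heavily visited one gets the larger exploration bonus, so the search tries it next.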
To set up your Python environment to run the code in this repository, follow the instructions below.
- Create (and activate) a new environment with Python 3.6.
  - Linux or Mac:
    conda create --name AlphaZero python=3.6
    source activate AlphaZero
  - Windows:
    conda create --name AlphaZero python=3.6
    activate AlphaZero
- Clone the repository, then install the required packages (see requirements).
  git clone https://github.com/ciamic/AlphaZero.git
- Create an IPython kernel for the AlphaZero environment.
  python -m ipykernel install --user --name AlphaZero --display-name "AlphaZero"
- Before running code in a notebook, change the kernel to match the AlphaZero environment by using the drop-down Kernel menu.
Requirements:
- Python 3
- numpy
- matplotlib
- gym
- TensorFlow
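To confirm that the active environment has the packages listed above, a small check like the following can be run in a terminal or in the first notebook cell. It is a minimal sketch: the import names in `REQUIRED` (for example `tensorflow` for TensorFlow) and the helper `missing_packages` are assumptions of this example, not part of the repository.

```python
import importlib.util
import sys

# Import names assumed to correspond to the packages in the requirements list.
REQUIRED = ["numpy", "matplotlib", "gym", "tensorflow"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

If any names are reported as missing, install them into the AlphaZero environment before opening the notebook.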