Status: Under construction.
Amca is an RL-based Backgammon agent.
| Dependency | Version Tested On |
|------------|-------------------|
This project aims to formulate Backgammon as a reinforcement learning problem and gauge the performance of common deep reinforcement learning algorithms. This is done by training and evaluating four popular RL algorithms:

- Deep Q-Network (Mnih et al.)
- Proximal Policy Optimization (Schulman et al.)
- Soft Actor-Critic (Haarnoja et al.)
- SARSA (Rummery and Niranjan)
The three deep RL algorithms are tested with the default parameters and implementations provided by the Stable Baselines library. SARSA uses a custom implementation, heavily modified from this repo; its hyperparameters are given in the SarsaAgent object.
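For reference, the core of a tabular SARSA agent can be sketched as below. This is an illustrative sketch only: the class name, field names, and hyperparameter values are assumptions, not the repo's actual SarsaAgent or its tuned hyperparameters.

```python
import random
from collections import defaultdict

class TabularSarsa:
    """Minimal tabular SARSA sketch (illustrative; not the repo's SarsaAgent)."""

    def __init__(self, alpha=0.1, gamma=0.99, epsilon=0.1, n_actions=4):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.n_actions = n_actions
        self.q = defaultdict(float)  # Q[(state, action)] -> estimated value

    def act(self, state):
        # Epsilon-greedy action selection over the tabular Q-values.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next, a_next):
        # On-policy SARSA TD update:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
        td_target = r + self.gamma * self.q[(s_next, a_next)]
        self.q[(s, a)] += self.alpha * (td_target - self.q[(s, a)])
```

Unlike Q-learning, the update bootstraps from the action the policy actually takes next (`a_next`), which is what makes SARSA on-policy.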
- `play.py`: to launch a game against a deep RL trained model. For example, `python play.py ppo amca/models/amca.pkl` will launch the model called `amca.pkl` that was trained using the PPO algorithm.
- `train.py`: to train a deep RL model (with default hyperparameters) to play. For example, `python train.py -n terminator.pkl -a sac -t 1000000` will train an agent called `terminator.pkl` using the SAC algorithm for 1,000,000 steps.
- `sarsa_play.py`: to launch a game against a SARSA trained model. For example, `python sarsa_play.py r2d2.pkl` will launch the model called `r2d2.pkl` that was trained using the SARSA algorithm.
- `sarsa_train.py`: to train a model using SARSA. For example, `python sarsa_train.py jarvis.pkl -g 10000` will train an agent called `jarvis.pkl` using the SARSA algorithm for 10,000 games.