A platform to confront in multiple two-player games agents trained with reinforcement learning.
This project is a joint venture between the School of AI in Angers and in Le Mans, France to explore Reinforcement Learning in an engaging and dynamic setup. For that, we've decided to setup this arena where attendees can submit their own competitors and the system will make them battle against each other to decide who's the best!
An environment is an implementation of a two-player game. This repo contains implementations for:
- Tic-tac-toe
- Quarto
- Connect Four
Each implementation resides inside its own directory in environments
, providing:
environment.py
: a Python implementation of the game rules and its HTML renderizationinfo.yml
: data used to set-up the Environment model in the databasestatic
: static files served on the website under the URL/static/environment-<environment>/
- builder: responsible for preparing a Docker image wrapping each submission
- core: main Django settings, database models and migrations
- data: contains all user data, its subdirectories are mounted inside the multiple containers
- duel_runner: launch and monitors the duels that are scheduled as part of the tournaments and also tests whether a given submission is valid by making it play against itself multiple times without any rule violation
- environments: implement the different games
- example_players: a collect of basic players, mostly used for testing
- nginx: Nginx webproxy configuration
- publisher: responsible for pushing public submissions to the repo rl-arena-public-submissions
- run_duel: given two Docker images and the environment name, runs multiple matches and collect the results
- terraform: controls the infrastructure deployment on DigitalOcean
- tournament_manager: launch and monitors running tournaments, aggregating results and calculating rankings
- web: main Django app with all the user-facing web platform
- Clone this repo
- Install Docker and docker-compose
- Copy
example.env
as.env
and edit it with values that fit you - Run
./prepare_dev.sh
This is a Python script that executes a duel between two players. For that, it takes the Docker-image name of each player and the name of the environment.
It will output a JSON document with the results of the duel with the format:
{
"result": "one of: ERROR, PLAYER_1_WIN, PLAYER_2_WIN, DRAW",
"error_msg": "",
"num_matches": 17,
"player_1_errors": 17,
"player_2_errors": 17,
"other_errors": 17,
"player_1_wins": 17,
"player_2_wins": 17,
"draws": 17,
"player_1_score": 1.7,
"player_2_score": 1.7,
"matches": [{
"result": "one of: PLAYER_1_ERROR, PLAYER_2_ERROR, OTHER_ERROR, PLAYER_1_WIN, PLAYER_2_WIN, DRAW",
"error_msg": "",
"player_1_score": 1.7,
"player_2_score": 1.7,
"first_player": "one of: PLAYER_1, PLAYER_2",
"states": ["result of BaseEnvironment.to_jsonable()"]
}]
}
-
Install Terraform
-
Create a
terraform/secrets.tfvars
file with the necessary tokens -
cd terraform; terraform apply -var-file=secrets.tfvars
-
Inside the recently-created Droplet, execute the following instructions. Note: this script should be executed manually, as there are some interactive steps!
# Build source git clone https://github.com/school-of-ai-angers/rl-arena.git cd rl-arena # Prepare service account keys nano keys/gcp.json # Paste JSON key docker build -t rl-arena . docker build -t rl-arena-nginx nginx # Configure env cp example.env .env nano .env # Prepare publisher repo mkdir -p data/publish_keys ssh-keygen -f data/publish_keys/id_rsa cat data/publish_keys/id_rsa.pub # Prepare database docker-compose up -d db wait 30 docker-compose run --rm -T migrate # Prepare static files docker-compose run --rm collectstatic # Start other services in master node docker-compose up -d web builder publisher tournament_manager auto_scaler # Generate certificate docker-compose up -d nginx docker-compose exec nginx bash # Run it inside: certbot --nginx --register-unsafely-without-email # Get out # Rerun nginx docker-compose stop nginx docker-compose up -d nginx # Setup firewal ufw allow 80 ufw allow 443 ufw allow in on eth1 to any port 5432 proto tcp # Turn off and then create snapshot for worker docker-compose down poweroff