ReversIA - An AI for Reversi/Othello game using AlphaBeta algorithm or UCT Search algorithm

Colin Decourt & Mehamli Théo

This repository contain an AI for Reversi/Othello game using tree search algorithm such as Alpha Beta or Monte Carlo Tree Search algorithms. This was done as part of the courses "Research Algorithm" and "Knowledge Representation" given by Laurent Simon at Bordeaux INP - ENSEIRB MATMECA.

Description

The repository is composed of different important files.

Board and core functions

The Reversi.py file contains core functions to play and develop the AI.

Players

The players folder contains three different AIs to run a local game.

myPlayerBasic

myPlayerBasic.pycontains behavior functions for playing a party with random moves, without any strategie.

myPlayerAlphaBeta

myPlayerAlphaBeta.pycontrains beavior functions for playing a party with an negAlphaBeta with memory algorithm and a MTDF algorithm. These algorithms are inspired from these links : /

We used five different heuristic to find the optimal moves to win :

Pieces differences heuristic : the player tries to maximize his number of tiles.
Corner closeness heuristic : the player tries to avoid the tiles close to the corner. Indeed these tiles are sensitive to a capture if the opponent player in the corner.
Corner occupancy heuristic : the player will try to place tile on the corner to capture tile on diagonals.
Mobility heuristic
Coin parity heuristic

Each heuristics are weigthed to influence the AI to play more some moves. More detail are provide (in french...) in myPlayerAlphaBeta.pycode.

Weaknesses and improvements : From now, it is only possible to play with AlphaBeta with a 10x10 board. Further versions could contain more WEIGHT_BOARD matrices. Moreover heuristics are weighted in the same way during all the game. For example corner closeness and occupancy heursitics could be use more at the end of the party.

myPlayerUCTSearch

myPlayerUCTSearch.pycontains behavior functions for playing a party using an UCT Search algorithm (Monte Carlo Tree Search based algorithm). We chose to implement this algorihtm because it is known as the best tree search algorithm for playing game such as Reversi or Go. UCT search algorithm code can be found in the UCTSearch.pyfile in the utils folder. This code was inspired from the page 6 to 10 of paper. In this file the following functions can be find :

Node class : this class is use to create a new node in the tree and contain informations about parent node, children nodes, the number of time the node has been visited, the current state board and the total simulation reward for this node.
Tree policy function : get a tree policy for a given node. While the node is non terminal and the node is expandable, we expand the tree search. Else we return the best child node.
Best child function : return the best node child.
Default policy function : return the best reward for a given state.
Backup function : perform the backpropagation.
uct_search function : main loop for finding the best move in the tree. We chose to allow a computational budget of 100 for finding the best child.

Usage

Download the code, open a terminal and run :

python localGame.py

This will run a local game between to AI. By default the code launch a party between UCT Search and AlphaBeta. UCT Search play first.

To run custom game :

python localGame.py --blackTile nameOfyourAI --whiteTile nameOfyourAI

Choose an AI among the following list :

UCTSearch
random
AlphaBeta

Exemple :

python localGame.py --blackTile UCTSearch --whiteTile random

Black tiles always begin but this can be change in localGame.py file.

Performances

Alpha Beta with memory algorithm vs Random moves

Number of victory : Alpha Beta always won versus an Random moves AI.

Total time per party : AlphaBeta (in orange) is very long to play depending on the party (time in seconds).

Monte Carlo Tree Search algorithm (UCT Search) vs Random moves

Number of victory : UCT Search always won versus an Random moves AI.

Total time per party : Because of a constant allocation of computational ressources, UCT Search algorithm (in orange) play with a constant time.

Monte Carlo Tree Search algorithm (UCT Search) vs Alpha Beta with memory algorithm

UCT First

Number of victory :

Total time per party : As said previously, UCT Search plays with a constant time contrary to AlphaBeta method.

AlphaBeta First

We noticed that when UCT Search AI plays first, it always won. However, when it plays as the second player the victory is not recurrent.

Number of victory : Here AlphaBeta won 3 times playing first.

Total time per party :

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

ReversIA - An AI for Reversi/Othello game using AlphaBeta algorithm or UCT Search algorithm

Description

Board and core functions

Players

myPlayerBasic

myPlayerAlphaBeta

myPlayerUCTSearch

Usage

Performances

Alpha Beta with memory algorithm vs Random moves

Monte Carlo Tree Search algorithm (UCT Search) vs Random moves

Monte Carlo Tree Search algorithm (UCT Search) vs Alpha Beta with memory algorithm

UCT First

AlphaBeta First

Files

README.md

Latest commit

History

README.md

File metadata and controls

ReversIA - An AI for Reversi/Othello game using AlphaBeta algorithm or UCT Search algorithm

Description

Board and core functions

Players

myPlayerBasic

myPlayerAlphaBeta

myPlayerUCTSearch

Usage

Performances

Alpha Beta with memory algorithm vs Random moves

Monte Carlo Tree Search algorithm (UCT Search) vs Random moves

Monte Carlo Tree Search algorithm (UCT Search) vs Alpha Beta with memory algorithm

UCT First

AlphaBeta First