## Training a specialised model (CNN+RNN) for Othello/Reversi

This notebook presents a new approach to predicting the next move in a game of Othello using supervised learning. The datasets come from the [Fédération Française d'Othello](https://www.ffothello.org/informatique/la-base-wthor/). The model takes two inputs: the board state, which is fed into a Convolutional Neural Network (CNN), and the history of the game, which is fed into a Recurrent Neural Network (RNN). The output of the model is the next move to play. We treat the task as a classification problem and introduce a new type of CNN kernel: the star-shaped kernel.
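
As a rough illustration of what a star-shaped kernel could look like, the sketch below builds a binary mask that keeps only the centre row, the centre column and the two diagonals of a square kernel, i.e. the eight directions along which discs are flipped in Othello. This is an assumption about the shape, not the kernel defined later in the notebook, and `star_mask` is a hypothetical helper name.

```python
def star_mask(size):
    """Return a size x size 0/1 mask with ones on the centre row,
    the centre column and both diagonals (assumed star shape)."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the mask has a centre cell")
    centre = size // 2
    return [
        [1 if (r == centre or c == centre or r == c or r + c == size - 1) else 0
         for c in range(size)]
        for r in range(size)
    ]

# Print a 5x5 mask row by row to visualise the shape.
for row in star_mask(5):
    print(row)
```

Multiplying a convolution kernel element-wise by such a mask would zero out the weights that lie off the eight lines, which is one plausible way to obtain a star-shaped receptive field.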

### Data Handling

In [1]:
import struct   # for reading the .wtb files
import os       # for file/path/directories,...  handling
import pickle   # for saving/loading the data

#### Extracting data from the WThor database

Some functions were taken or adapted from the [dnnothello repo](https://github.com/wjaskowski/dnnothello/blob/master/games/othello_data.py).

The header of a .wtb file is 16 bytes long and contains the following fields:
- 1 byte: century of the file's creation
- 1 byte: year of the file's creation
- 1 byte: month of the file's creation
- 1 byte: day of the file's creation
- 4 bytes (int): number of games in the file ($\leq$ 2 147 483 648)
- 2 bytes (short): 0 here (in other file types: number of players, number of tournaments, or number of empty squares on the board; $\leq$ 65 535)
- 2 bytes (short): year of the games
- 1 byte: size of the board {0: 8x8, 8: 8x8, 10: 10x10}
- 1 byte: game type (1 for "solitaire", 0 otherwise); 0 here
- 1 byte: the games depth
- 1 byte: reserved

Each game is then stored as a record that starts with 8 bytes of game information:
- 2 bytes (short): label of the tournament
- 2 bytes (short): id number of the black player
- 2 bytes (short): id number of the white player
- 1 byte: true score of the black player
- 1 byte: theoretical score of the black player

The moves of the game follow as a 60-byte list, one byte per move.
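
The layout above can be checked with `struct.calcsize` before reading any real file. The sketch below restates the format strings used in this notebook and round-trips a synthetic header; the header values (3 games, year 2001) are made up for illustration.

```python
import struct

# Format strings restated from the layout described above (little-endian, no padding).
HEADER_FORMAT = "<BBBBIHHBBBB"
GAME_INFO_FORMAT = "<HHHBB"
MOVES_FORMAT = "<" + "B" * 60

header_size = struct.calcsize(HEADER_FORMAT)
game_size = struct.calcsize(GAME_INFO_FORMAT) + struct.calcsize(MOVES_FORMAT)
print(header_size, game_size)

# Synthetic header: created 2001-12-31, 3 games, 0 records, games of 2001, 8x8 board.
raw = struct.pack(HEADER_FORMAT, 20, 1, 12, 31, 3, 0, 2001, 8, 0, 0, 0)
fields = struct.unpack(HEADER_FORMAT, raw)
n_games, board_size = fields[4], fields[7]

# A well-formed file should then be exactly this many bytes long.
expected_file_size = header_size + n_games * game_size
print(n_games, board_size, expected_file_size)
```

Comparing `expected_file_size` with `os.path.getsize(path)` is a cheap way to detect a truncated or mis-parsed file.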

In [2]:
BOARD_SIZE = 8

HEADER_LENGTH = 16
HEADER_FORMAT = "<BBBBIHHBBBB"  # Byte, Byte, Byte, Byte, Int, Short, Short, Byte, Byte, Byte, (Reserved) Byte

GAME_INFO_LENGTH = 8    
GAME_INFO_FORMAT = "<HHHBB"     # Short, Short, Short, Byte, Byte

MOVES_LENGTH = 60
MOVES_FORMAT = "<" + "B"*MOVES_LENGTH

POSSIBLE_SIZE = [0, 8]

def read_all_wtb_files(directory):
    """Generator to read all .wtb files in a directory."""
    for file_name in os.listdir(directory):
        if file_name.endswith(".wtb"):
            yield from read_wtb(os.path.join(directory, file_name))

def read_wtb(file_path):
    """Generator to read a .wtb file and yield game information and played moves."""
    with open(file_path, 'rb') as f:
        header = struct.unpack(HEADER_FORMAT, f.read(HEADER_LENGTH))
        assert header[7] in POSSIBLE_SIZE   # Check the board size
        
        for _ in range(header[4]):  # Number of games
            game_info = struct.unpack(GAME_INFO_FORMAT, f.read(GAME_INFO_LENGTH))
            played_moves = struct.unpack(MOVES_FORMAT, f.read(MOVES_LENGTH))
            yield game_info[3], played_moves    # Black player true score, moves

In [3]:
reader = read_wtb('../data/raw/WTH_2001.wtb')
print(next(reader))

full_reader = read_all_wtb_files('../data/raw')
print(next(full_reader))

(11, (56, 64, 53, 46, 35, 63, 34, 66, 65, 74, 37, 43, 57, 33, 76, 24, 75, 26, 83, 36, 73, 38, 25, 16, 14, 15, 17, 47, 13, 68, 48, 58, 52, 28, 67, 23, 12, 61, 32, 42, 31, 86, 51, 41, 27, 84, 85, 82, 71, 18, 72, 11, 21, 22, 62, 81, 77, 78, 88, 87))
(34, (56, 64, 33, 36, 46, 34, 43, 67, 66, 65, 53, 63, 74, 84, 75, 57, 35, 24, 47, 38, 76, 52, 58, 37, 42, 62, 83, 82, 73, 85, 86, 87, 48, 68, 25, 14, 13, 31, 61, 51, 15, 26, 77, 23, 41, 88, 21, 72, 16, 32, 12, 22, 78, 71, 81, 11, 17, 27, 28, 18))


In [4]:
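from utils import *  # project helpers used below: init, Node, set_state, cell_count, replay, generate_moves, cv2_display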

In [5]:
def decode_game(moves):
    """Replay a game's WThor-encoded moves (11-88; a 0 byte ends the list) and return the final bitboard node."""
    own, enemy = init()
    node = Node(None, own, enemy, -1, BOARD_SIZE)
    for move in moves:
        if move == 0:
            break
        node.expand() # Generate the possible moves
        x, y = decode_move(move)
        move = set_state(0, x, y, BOARD_SIZE)
        
        if move not in node.moves:  # the current player has to pass (the other player moves), or the game is over
            node.invert()
            node.expand()
            if move in node.moves:
                node = node.set_child(move)
            else:
                node.set_child(node.moves[0])
        else:
            node = node.set_child(move)
    return node

            
def decode_move(move):
    """Decode a move from the WThor two-digit encoding (11-88) to zero-based (x, y) coordinates."""
    return move // 10 - 1, move % 10 - 1
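
As a worked example of the encoding, the sketch below converts the first moves of the first game printed above into board notation. It assumes the usual WThor convention that the tens digit is the row and the units digit is the column; `to_algebraic` is a hypothetical helper, not part of `utils`.

```python
def decode_move(move):
    """Split a WThor move code (11-88) into zero-based (tens, units) indices."""
    return move // 10 - 1, move % 10 - 1

def to_algebraic(move):
    """Convert a WThor move code to notation such as 'f5' (assumes tens = row, units = column)."""
    row, col = decode_move(move)
    return "abcdefgh"[col] + str(row + 1)

opening = [56, 64, 53]  # first three moves of the first game printed above
print([to_algebraic(m) for m in opening])  # ['f5', 'd6', 'c5']
```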

In [6]:
true_score, game_moves = next(full_reader)
print(f"Expected score: {true_score}")
first_game = decode_game(game_moves)
# replay(first_game, BOARD_SIZE)
print(f"Score : {cell_count(first_game.own_pieces), cell_count(first_game.enemy_pieces)}")
# Skip ahead to the first game whose recorded score matches neither decoded disc count
while true_score in [cell_count(first_game.own_pieces), cell_count(first_game.enemy_pieces)]:
    true_score, game_moves = next(full_reader)
    first_game = decode_game(game_moves)
print(f"Expected score: {true_score}")
print(f"Score : {cell_count(first_game.own_pieces), cell_count(first_game.enemy_pieces)}")
# replay(first_game, BOARD_SIZE)

Expected score: 52
Score : (12, 52)
Expected score: 64
Score : (0, 63)


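The true score is the number of discs of the black player plus the number of empty squares: in the second game above, 63 discs and one empty square give the expected score of 64.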
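
The arithmetic can be sketched as follows, taking the note above at face value. `popcount` and `score_with_empties` are hypothetical helpers (not part of `utils`), and we assume `cell_count` simply counts the set bits of a bitboard.

```python
def popcount(bitboard):
    """Count the set bits (discs) in an integer bitboard."""
    return bin(bitboard).count("1")

def score_with_empties(black_discs, white_discs, n_squares=64):
    """Black's disc count plus the squares left empty at the end of the game."""
    empties = n_squares - black_discs - white_discs
    return black_discs + empties

# Bitboards of the first position printed further down in this notebook:
# each side starts with two discs.
start_own, start_enemy = 34628173824, 68853694464
print(popcount(start_own), popcount(start_enemy))

# The two games shown above: a full board, and a board with one empty square.
print(score_with_empties(52, 12))
print(score_with_empties(63, 0))
```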

In [7]:
def dump_data(directory, output_file, batch_size=1):
    """Dump the data from the .wtb files into numbered pickle files."""
    data = []
    last_index = -1
    data_reader = read_all_wtb_files(directory)
    for i, (score, moves) in enumerate(data_reader):
        game = decode_game(moves)
        move_list = replay(game, BOARD_SIZE, False)
        data.append((score, move_list))
        last_index = i
        if i % batch_size == 0:  # also true for i == 0, so file 0 holds a single game
            print(f"Batch {i // batch_size}", end='\r')
            with open(f"{output_file}_{i // batch_size}.pkl", 'wb') as f:
                pickle.dump(data, f)
            data = []
    # Write the remaining games to the next file index, without re-reading every .wtb file to count the games.
    with open(f"{output_file}_{last_index // batch_size + 1}.pkl", 'wb') as f:
        pickle.dump(data, f)

In [8]:
# dump_data('../data/raw', '../data/processed/data', batch_size=1000)

In [9]:
def load_data(file_path, bound=131):
    """Load and concatenate the first `bound` pickle files written by dump_data."""
    data = []
    for i in range(bound):
        with open(f"{file_path}_{i}.pkl", 'rb') as f:
            data.extend(pickle.load(f))
    return data

In [None]:
loaded_data = load_data('../data/processed/data', 1)
def test_loaded():
    """Step through the loaded games, displaying each position until the user answers 'n'."""
    try:
        for score, game_nodes in loaded_data:
            print(score)
            for game_node in game_nodes:
                if not game_node.moves:
                    game_node.moves = generate_moves(game_node.own_pieces, game_node.enemy_pieces, BOARD_SIZE)[0]
                print(game_node)
                print(game_node.moves)
                print(game_node.value)
                cv2_display(BOARD_SIZE, game_node.own_pieces, game_node.enemy_pieces, game_node.moves, game_node.turn, display_only=True)
                answer = input("Continue ?")
                if answer == 'n':
                    return
    finally:
        cv2.destroyAllWindows()  # close the display windows even when stopping early

34
34628173824, 68853694464, -1
[524288, 17592186044416, 137438953472, 67108864]
None
Press Enter to continue...134217728, 240786604032, 1
[8796093022208, 536870912, 35184372088832]
None
Press Enter to continue...206426865664, 8830586978304, -1
[67108864, 17179869184, 262144, 4398046511104, 1125899906842624]
None
Press Enter to continue...8830452760576, 206561345536, 1
[524288, 274877906944, 2097152, 536870912]
None
Press Enter to continue...206292910080, 8830723293184, -1
[1048576, 2251799813685248, 536870912, 17179869184, 524288, 1125899906842624]
None
Press Enter to continue...8830454857728, 207098216448, 1
[524288, 35184372088832, 274877906944, 4194304]
None
Press Enter to continue...206963998720, 8830589599744, -1
[8192, 1048576, 67108864, 17179869184, 1024, 16384, 4398046511104, 1125899906842624]
None
Press Enter to continue...8830455382016, 207165325312, 1
[35184372088832, 274877906944, 131072, 4194304, 8589934592, 70368744177664]
None
Press Enter to continue...69457936384, 7933

#### Data Preprocessing
Next, we remove duplicates, add symmetries, and view every game from the black player's perspective (if White wins, we invert the board).
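
These three steps can be sketched on a plain 8x8 grid, independently of the bitboard helpers in `utils`. The sketch assumes cells hold 1 (Black), -1 (White) or 0 (empty) and that "inverting" means swapping the two colours; `rotate90`, `mirror`, `symmetries` and `invert` are hypothetical names.

```python
def rotate90(board):
    """Rotate a square board 90 degrees clockwise."""
    return tuple(zip(*board[::-1]))

def mirror(board):
    """Flip a board left to right."""
    return tuple(tuple(row[::-1]) for row in board)

def symmetries(board):
    """Return the 8 symmetries of a board: 4 rotations, each with its mirror image."""
    result = []
    current = tuple(tuple(row) for row in board)
    for _ in range(4):
        result.append(current)
        result.append(mirror(current))
        current = rotate90(current)
    return result

def invert(board):
    """Swap the two colours (1 <-> -1), leaving empty squares untouched."""
    return tuple(tuple(-cell for cell in row) for row in board)

# Starting position: White (-1) on d4/e5, Black (1) on e4/d5.
start = [[0] * 8 for _ in range(8)]
start[3][3] = start[4][4] = -1
start[3][4] = start[4][3] = 1

# Hashable tuples make duplicate removal a simple set operation.
unique_positions = set(symmetries(start))
print(len(unique_positions))
```

Because the boards are stored as tuples of tuples, duplicates that arise from symmetric positions collapse automatically in the set. When symmetries are applied to training samples, the target move has to be transformed with the same rotation or mirror as the board.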