# Training an AI with ConvNet for Reversi Game

-------------------------------------------
Contributor: Zukang Yang

-------------------------------------------

In this project, we train a convolutional neural network (CNN) to play Reversi from the black player's perspective.

### Generating data: 
The Reversi data are generated in Matlab. I modified Professor Long Chen's generatedata.m so that it generates the data with the tree-search AI algorithm and saves them as text files.

### What do the data look like?
The data consist of 

**1).** an 8x8 matrix representing the Reversi board, where 1 is a black stone, -1 is a white stone and 0 is an empty square;

**2).** a vector containing the next best move for the current stone locations, i.e. a positive value is the next best move for the black player and a negative value is the next best move for the white player (the absolute value of each element ranges from 0 to 64, where 0 means pass and 1~64 is the index of the next best move);

**3).** an 8x8 matrix recording all available moves for the current player (1: black or -1: white).
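
To make the label encoding in item 2 concrete, the sketch below decodes a signed move label into a player and a board square. The function name `decode_move` is hypothetical, and the column-major mapping from index to square is an assumption based on Matlab's linear indexing; the generated files may use a different order.

```python
def decode_move(label, order="column-major"):
    """Split a signed move label into (player, square).

    player -- 'black', 'white', or None for a pass (0 carries no sign)
    square -- zero-based (row, col), or None for a pass
    The index-to-square order is an assumption, not taken from the data files.
    """
    label = int(label)
    if abs(label) > 64:
        raise ValueError("move labels must lie between -64 and 64")
    if label == 0:
        return None, None

    player = "black" if label > 0 else "white"
    k = abs(label) - 1                 # zero-based square index, 0..63
    if order == "column-major":        # Matlab-style linear indexing (assumed)
        col, row = divmod(k, 8)
    else:                              # row-major alternative
        row, col = divmod(k, 8)
    return player, (row, col)


print(decode_move(1))     # ('black', (0, 0))
print(decode_move(-9))    # ('white', (0, 1))
print(decode_move(0))     # (None, None)
```

Because a pass is stored as 0 for both players, the sign alone cannot tell whose turn it was; this is why the `player` function in the data preparation code also checks the row parity.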

Training set: 90,000 moves

Test set: 6,000 moves
 
### How to process the data generated from Matlab?
The input encoding follows Section IV, part B (page 3) of *Learning to Play Othello with Deep Neural Networks*. The reversi_data file contains the Reversi boards, i.e. 8x8 matrices. I split each matrix into two matrices of the same size, one containing only the black stones and the other only the white stones. Combining these with the data from the valid_moves file gives an (n, 8, 8, 3) array, which is the training data for the CNN model. Finally, the next_move file provides the classification labels for the CNN model.
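
As a small worked example of this encoding, the sketch below builds the three channels for a single board. The board is assumed to be a standard opening position, and black's four legal moves are written in by hand rather than computed; the variable names are illustrative only.

```python
import numpy as np

# One board in the raw encoding: 1 = black, -1 = white, 0 = empty.
board = np.zeros((8, 8))
board[3, 3] = board[4, 4] = -1     # white stones
board[3, 4] = board[4, 3] = 1      # black stones

# Black's legal moves in this position, entered by hand for the example.
valid = np.zeros((8, 8))
for r, c in [(2, 3), (3, 2), (4, 5), (5, 4)]:
    valid[r, c] = 1

# Channel 0: black stones, channel 1: white stones, channel 2: valid moves.
black_plane = (board == 1).astype(np.float32)
white_plane = (board == -1).astype(np.float32)
valid_plane = valid.astype(np.float32)

# Stack along the last axis and add a leading batch axis -> (1, 8, 8, 3).
x = np.stack([black_plane, white_plane, valid_plane], axis=-1)[np.newaxis]
print(x.shape)   # (1, 8, 8, 3)
```

Every channel holds only zeros and ones, so the network never sees the signed -1/0/1 values of the raw board.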

### Structure of the CNN:
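First convolutional layer: 64 3x3 kernels

Second convolutional layer: 128 3x3 kernels

One 2x2 max-pooling layer (followed by dropout with rate 0.25)

One fully-connected layer with 128 units (followed by dropout with rate 0.25)

One output layer with 65 units (softmax over pass and the 64 board squares)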
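
The tensor shapes and parameter counts implied by this structure can be checked by hand. The sketch below does the arithmetic in plain Python, assuming stride 1 and no padding for the convolutions and a stride-2 pooling window (the Keras defaults used in the model code); the helper `conv_out` and the `summary` list are illustrative names.

```python
def conv_out(size, kernel, stride=1):
    """Output length along one axis of a convolution without padding."""
    return (size - kernel) // stride + 1


side, channels = 8, 3          # input: 8x8 board with 3 channels
summary = []                   # (layer, output shape, trainable parameters)

for name, filters in [("conv1", 64), ("conv2", 128)]:
    params = 3 * 3 * channels * filters + filters    # kernel weights + biases
    side, channels = conv_out(side, 3), filters
    summary.append((name, (side, side, channels), params))

side = side // 2               # 2x2 max pooling halves each spatial axis
summary.append(("maxpool", (side, side, channels), 0))

features = side * side * channels
summary.append(("flatten", (features,), 0))

for name, units in [("dense", 128), ("output", 65)]:
    params = features * units + units                # weights + biases
    features = units
    summary.append((name, (features,), params))

total_params = sum(p for _, _, p in summary)
for row in summary:
    print(row)
print("total parameters:", total_params)   # total parameters: 149697
```

The board shrinks from 8x8 to 6x6 to 4x4 through the two convolutions and to 2x2 after pooling, so the fully-connected layer receives 512 features. Dropout layers add no parameters and do not change the shapes.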


### References
* [Professor Long Chen's GitHub page, with the code needed to generate Reversi data in Matlab](https://github.com/lyc102/reversi)
* [Reversi - Wikipedia](https://en.wikipedia.org/wiki/Reversi)
* [Learning to Play Othello with Deep Neural Networks](https://arxiv.org/pdf/1711.06583.pdf)

## Import relevant packages

In [339]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split  # validation

# for constructing CNN
from keras.utils import to_categorical                # one-hot encode target column
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.losses import categorical_crossentropy
from keras.optimizers import Adadelta

## Data Preparation

In [341]:
reversi = np.loadtxt('reversi_data.txt').reshape(-1, 64)   # Reversi data
move = np.loadtxt('next_move.txt')                         # next move
valid = np.loadtxt('valid_moves.txt').reshape(-1, 64)      # next valid moves

reversi_t = np.loadtxt('reversi_data_test.txt').reshape(-1, 64)   
move_t = np.loadtxt('next_move_test.txt')                         
valid_t = np.loadtxt('valid_moves_test.txt').reshape(-1, 64)      

In [297]:
def player(data, valid, label, turn='black'):
    '''Return a tuple (boards, valid moves, labels) restricted to the specified player.
    turn -- 'black' or 'white' '''
    # A pass is stored as 0 and has no sign, so the row parity is checked as well:
    # black's rows are taken to be the even ones and white's the odd ones.
    if turn == 'black':
        idx = [i for i in np.where(label >= 0)[0] if i % 2 == 0]
    elif turn == 'white':
        idx = [i for i in np.where(label <= 0)[0] if i % 2 == 1]
    else:
        raise ValueError("turn must be either 'black' or 'white'.")

    return (data[idx], valid[idx], label[idx])


def black_and_white(data):
    '''Return a tuple of two numpy arrays that separate the black pieces from the white pieces.
    data -- the reversi data (1: black, -1: white, 0: empty).'''
    black = (data == 1.0) * 1.0    # 1.0 where a black stone sits, 0.0 elsewhere
    white = (data == -1.0) * 1.0   # 1.0 where a white stone sits, 0.0 elsewhere
    return (black, white)


def concatenate(black, white, valid_moves):
    '''Return an (n, 8, 8, 3) array (black, white, valid moves) for training the CNN model.'''
    n = black.shape[0]   # n rows
    black = black.reshape(n, 8, 8, 1)
    white = white.reshape(n, 8, 8, 1)
    valid_moves = valid_moves.reshape(n, 8, 8, 1)
    
    return np.concatenate((black, white, valid_moves), axis=3)

In [343]:
# --------------- training set -----------------------
game, v_moves, n_moves = player(reversi, valid, move)   # v_moves -- valid moves; n_moves -- next moves
black, white = black_and_white(game)                    # split each board into only black and only white

train = concatenate(black, white, v_moves)              # training data
y_train = to_categorical(n_moves, num_classes=65)       # one-hot labels: 0 = pass, 1~64 = board index

# ---------------- test set --------------------------
game_t, v_moves_t, n_moves_t = player(reversi_t, valid_t, move_t)
black_t, white_t = black_and_white(game_t)

X_test = concatenate(black_t, white_t, v_moves_t)
y_test = to_categorical(n_moves_t, num_classes=65)
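
The one-hot labels can be pictured with a plain NumPy sketch. `one_hot_moves` is a hypothetical helper, not part of Keras; it is meant to mirror what `to_categorical` produces for non-negative integer labels when the number of classes is fixed at 65.

```python
import numpy as np


def one_hot_moves(labels, num_classes=65):
    """One-hot encode move labels (0 = pass, 1..64 = board index)."""
    labels = np.asarray(labels).astype(int)
    if labels.size and (labels.min() < 0 or labels.max() >= num_classes):
        raise ValueError("labels must lie in [0, num_classes)")
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype=np.float32)[labels]


labels = np.array([0.0, 19.0, 64.0])   # loaded labels are floats
y = one_hot_moves(labels)
print(y.shape)            # (3, 65)
print(y.argmax(axis=1))   # [ 0 19 64]
```

Fixing the number of classes keeps the label width at 65 even when a particular file happens not to contain the highest index, so it always matches the 65-unit output layer.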

## CNN Construction

In [344]:
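# Hold out 15% of the training data for validation.
# Note: this reassigns X_test and y_test, so the scores reported below are computed on this
# validation split rather than on the separate test files prepared above.
X_train, X_test, y_train, y_test = train_test_split(train, y_train, test_size=0.15, random_state=42)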
batch_size = 128
epochs = 48

# create model
model = Sequential()

# add model layers
model.add(Conv2D(64, kernel_size=(3, 3), strides=(1,1), activation='relu', input_shape=(8,8,3)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(65, activation='softmax'))

model.compile(loss=categorical_crossentropy,
              optimizer=Adadelta(),
              metrics=['accuracy'])

model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(X_test, y_test))

score = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Train on 38250 samples, validate on 6750 samples
Epoch 1/48
...
Epoch 48/48
Test loss: 0.35874681031924704
Test accuracy: 0.883407407283783
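
The network outputs a probability for each of the 65 classes, and nothing in the softmax prevents an illegal square from scoring highest. One possible way to use the predictions during play is to mask them with the valid-move data, as sketched below. The helper `pick_legal_move` is hypothetical, and it assumes that class k (1 to 64) corresponds to entry k-1 of the flattened valid-move row; that correspondence should be checked against the data files.

```python
import numpy as np


def pick_legal_move(probs, valid_flat):
    """Return the most probable legal move class, or 0 (pass) if none is legal.

    probs      -- length-65 vector of class probabilities (class 0 = pass)
    valid_flat -- length-64 vector, non-zero where a move is legal
    Assumes class k maps to valid_flat[k - 1]; this is not confirmed by the data.
    """
    probs = np.asarray(probs, dtype=float)
    legal = np.asarray(valid_flat).reshape(-1) != 0
    if probs.shape != (65,) or legal.shape != (64,):
        raise ValueError("expected 65 probabilities and 64 valid-move flags")
    if not legal.any():
        return 0                                   # no legal square: pass
    masked = np.where(legal, probs[1:], -np.inf)   # hide illegal squares
    return int(np.argmax(masked)) + 1              # shift back to classes 1..64


# Toy scores (not normalised): class 5 scores highest but is not legal here.
probs = np.full(65, 0.001)
probs[5], probs[20] = 0.6, 0.3
valid_flat = np.zeros(64)
valid_flat[19] = valid_flat[43] = 1                # classes 20 and 44 are legal

print(pick_legal_move(probs, valid_flat))          # 20
```

In practice `probs` would be one row of `model.predict(...)` for the current board, and `valid_flat` the matching row of the valid-move data.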
