<a href="https://colab.research.google.com/github/wissam124/iasd-deep-learning-go/blob/master/DeepLearningProject.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Deep Learning Project

This is the page for the Deep Learning Project of the IASD master's program. The goal is to train a network to play the game of Go. To keep training resources fair, every submitted network must have fewer than 1,000,000 parameters. Teams may have at most two students.

The training data comes from self-play games of Facebook's ELF OpenGo program. The training set contains more than 98,000,000 different states in total.

The input data is composed of eight 19x19 planes (color to play, ladders, current state on two planes, two previous states on four planes). The output targets are the policy (a vector of size 361 with 1.0 for the move played and 0.0 for the other moves), the value (1.0 if White won, 0.0 if Black won), and the state at the end of the game (two planes).
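
As a minimal sketch of the policy and value targets described above, the snippet below builds both for a single position in plain Python. The helper names `move_to_index` and `make_targets` are hypothetical, and the row-major mapping from board coordinates to the 361 policy indices is an assumption for illustration; the actual ordering is whatever the golois library uses.

```python
BOARD_SIZE = 19
NUM_MOVES = BOARD_SIZE * BOARD_SIZE  # 361 policy entries


def move_to_index(row, col):
    """Hypothetical row-major mapping from board coordinates to a policy index."""
    if not (0 <= row < BOARD_SIZE and 0 <= col < BOARD_SIZE):
        raise ValueError("move is off the board")
    return row * BOARD_SIZE + col


def make_targets(row, col, white_won):
    """Build the policy and value targets for one position."""
    policy = [0.0] * NUM_MOVES
    policy[move_to_index(row, col)] = 1.0  # 1.0 for the move played
    value = 1.0 if white_won else 0.0      # 1.0 if White won, 0.0 if Black won
    return policy, value


policy, value = make_targets(3, 3, white_won=True)
print(len(policy), policy.index(1.0), value)
```

In the notebook itself these targets are NumPy arrays with one row per position, so the policy array has shape `(N, 361)` and the value array has shape `(N,)`.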

In [1]:
!wget https://www.lamsade.dauphine.fr/~cazenave/DeepLearningProject.zip

--2019-12-12 20:09:52--  https://www.lamsade.dauphine.fr/~cazenave/DeepLearningProject.zip
Resolving www.lamsade.dauphine.fr (www.lamsade.dauphine.fr)... 193.48.71.250
Connecting to www.lamsade.dauphine.fr (www.lamsade.dauphine.fr)|193.48.71.250|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 211774472 (202M) [application/zip]
Saving to: ‘DeepLearningProject.zip’


2019-12-12 20:10:05 (29.2 MB/s) - ‘DeepLearningProject.zip’ saved [211774472/211774472]



In [0]:
!unzip -j DeepLearningProject.zip
# Copy all files into root directory
# !cp -r DeepLearningProject/* .

Archive:  DeepLearningProject.zip
  inflating: Board.h                 
  inflating: Game.h                  
  inflating: Rzone.h                 
  inflating: compileMAC.sh           
  inflating: compile.sh              
  inflating: ls.sh                   
  inflating: golois.cpp              
  inflating: games.data              

In [0]:
!ls -all

In [0]:
!rm -r golois.py

In [0]:
!pip3 install pybind11

In [0]:
!./compile.sh

In [0]:
import tensorflow
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, Dense, Conv2D, Flatten, BatchNormalization, Activation, LeakyReLU, add, SpatialDropout2D
from tensorflow.keras.optimizers import SGD, Adam
from tensorflow.keras import regularizers
from tensorflow.keras.utils import plot_model
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.callbacks import CSVLogger
from matplotlib import pyplot as plt

class GoModel():
    def __init__(self, regParam, learningRate, inputDim, outputDim):
        self.regParam = regParam
        self.learningRate = learningRate
        self.inputDim = inputDim
        self.outputDim = outputDim

    def predict(self, x):
        return self.model.predict(x)

    def fit(self, X, y, epochs, verbose, validation_split, batch_size):
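        # Save the model whenever the training loss improves
        # (checked at the end of every epoch, the default frequency).
        checkpoint = ModelCheckpoint('best_model.h5',
                                     monitor='loss',
                                     verbose=1,
                                     save_best_only=True,
                                     mode='auto')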

        csv_logger = CSVLogger('training.log', separator=',', append=False)

        return self.model.fit(X,
                              y,
                              epochs=epochs,
                              verbose=verbose,
                              validation_split=validation_split,
                              batch_size=batch_size,
                              callbacks=[checkpoint, csv_logger])

    def save_model(self):
        # num_layers is set by subclasses such as NeuralNet.
        self.model.save('./model_' +
                        str(self.num_layers) + 'layers_' +
                        str(self.regParam) + 'reg' +
                        '.h5')

    def summary(self):
        return self.model.summary()

    def plot_model(self):
        plot_model(self.model)

    def display_layers(self):
        # Placeholder: not implemented yet.
        pass


class NeuralNet(GoModel):
    def __init__(self, regParam, learningRate, inputDim, outputDim,
                 hiddenLayers, momentum):
        GoModel.__init__(self, regParam, learningRate, inputDim, outputDim)
        self.hidden_layers = hiddenLayers
        self.momentum = momentum
        self.num_layers = len(hiddenLayers)
        self.model = self.buildModel()

    def convLayer(self, x, numFilters, kernelSize):

        x = Conv2D(filters=numFilters,
                   kernel_size=kernelSize,
                   data_format='channels_last',
                   padding='same',
                   use_bias=False,
                   activation='linear',
                   kernel_regularizer=regularizers.l2(self.regParam))(x)

        # x = SpatialDropout2D(rate=0.2,
        #                      data_format='channels_last')(x)

        x = BatchNormalization(axis=-1)(x)

        x = LeakyReLU()(x)

        return x

    def residualLayer(self, inputLayer, numFilters, kernelSize):

        x = self.convLayer(inputLayer, numFilters, kernelSize)

        x = Conv2D(filters=numFilters,
                   kernel_size=kernelSize,
                   data_format='channels_last',
                   padding='same',
                   use_bias=False,
                   activation='linear',
                   kernel_regularizer=regularizers.l2(self.regParam))(x)

        x = BatchNormalization(axis=-1)(x)

        x = add([inputLayer, x])

        x = LeakyReLU()(x)

        return (x)

    def value_head(self, x):

        x = Conv2D(filters=1,
                   kernel_size=(1, 1),
                   data_format='channels_last',
                   padding='same',
                   use_bias=False,
                   activation='linear',
                   kernel_regularizer=regularizers.l2(self.regParam))(x)
        
        # x = SpatialDropout2D(rate=0.5,
        #                      data_format='channels_last')(x)

        x = BatchNormalization(axis=-1)(x)

        x = LeakyReLU()(x)

        x = Flatten()(x)

        x = Dense(10,
                  use_bias=False,
                  activation='linear',
                  kernel_regularizer=regularizers.l2(self.regParam))(x)

        x = LeakyReLU()(x)

        x = Dense(1,
                  use_bias=False,
                  activation='sigmoid',
                  kernel_regularizer=regularizers.l2(self.regParam),
                  name='value')(x)

        return (x)

    def policy_head(self, x):

        x = Conv2D(filters=2,
                   kernel_size=(1, 1),
                   data_format='channels_last',
                   padding='same',
                   use_bias=False,
                   activation='linear',
                   kernel_regularizer=regularizers.l2(self.regParam))(x)

        # x = SpatialDropout2D(rate=0.5,
        #                      data_format='channels_last')(x)

        x = BatchNormalization(axis=-1)(x)

        x = LeakyReLU()(x)

        x = Flatten()(x)

        x = Dense(self.outputDim, activation='softmax', name='policy')(x)

        return (x)

    def buildModel(self):

        mainInput = Input(shape=self.inputDim, name='board')

        x = self.convLayer(mainInput, self.hidden_layers[0]['numFilters'],
                           self.hidden_layers[0]['kernelSize'])

        if len(self.hidden_layers) > 1:
            for h in self.hidden_layers[1:]:
                x = self.residualLayer(x, h['numFilters'], h['kernelSize'])

        value_head = self.value_head(x)
        policy_head = self.policy_head(x)

        model = Model(inputs=[mainInput], outputs=[policy_head, value_head])
        model.compile(optimizer=Adam(learning_rate=self.learningRate),
                      loss={
                          'value': 'mse',
                          'policy': 'categorical_crossentropy'
                      },
                    #   loss_weights={
                    #       'value': 0.5,
                    #       'policy': 0.5
                    #   },
                      metrics=['accuracy'])

        # Alternative: SGD with momentum instead of Adam.
        # model.compile(optimizer=SGD(learning_rate=self.learningRate, momentum=self.momentum),
        #               loss={
        #                   'value': 'mse',
        #                   'policy': 'categorical_crossentropy'
        #               },
        #               loss_weights={
        #                   'value': 0.5,
        #                   'policy': 0.5
        #               },
        #               metrics=['accuracy'])

        return model


In [0]:
tensorflow.__version__

In [0]:
# coding: utf-8
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
from tensorflow.keras import layers
import golois


planes = 8
moves = 361
dynamicBatch = True  # to test the network without installing the golois library
if dynamicBatch:
    N = 400000
    input_data = np.random.randint(2, size=(N, 19, 19, planes))
    input_data = input_data.astype('float32')

    policy = np.random.randint(moves, size=(N, ))
    # Fix the width to 361 so the shape does not depend on the sampled values.
    policy = keras.utils.to_categorical(policy, num_classes=moves)

    value = np.random.randint(2, size=(N, ))
    value = value.astype('float32')

    end = np.random.randint(2, size=(N, 19, 19, 2))
    end = end.astype('float32')

    golois.getBatch(input_data, policy, value, end)
# else:
#     input_data = np.load('./input_data.npy')
#     policy = np.load('./policy.npy')
#     value = np.load('./value.npy')
#     end = np.load('./end.npy')

In [0]:
input_data.shape

In [0]:
# Training
BATCH_SIZE = 128
EPOCHS = 60
REG_CONST = 0.001
LEARNING_RATE = 0.1
MOMENTUM = 0.9

HIDDEN_CNN_LAYERS = [{
    'numFilters': 64,
    'kernelSize': (3, 3)
}, {
    'numFilters': 64,
    'kernelSize': (3, 3)
}, {
    'numFilters': 64,
    'kernelSize': (3, 3)
}, {
    'numFilters': 64,
    'kernelSize': (3, 3)
}]

In [0]:
nHiddenLayers = len(HIDDEN_CNN_LAYERS)
print(len(HIDDEN_CNN_LAYERS))

In [0]:
# Create Go Neural Network
GoNeuralNet = NeuralNet(REG_CONST, LEARNING_RATE,
                        (19, 19, planes), moves, HIDDEN_CNN_LAYERS,
                        MOMENTUM)

In [0]:
# Display summary of neural network
GoNeuralNet.summary()
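
The summary can be cross-checked against the 1,000,000-parameter limit with a hand count. The sketch below mirrors the layers built by `NeuralNet` for the configuration above (8 input planes, four layers of 64 filters with 3x3 kernels). The helper names are hypothetical, and the count assumes that Keras reports four parameters per channel for `BatchNormalization` (two trainable, two non-trainable), so treat the result as an estimate to compare with the `summary()` output.

```python
def conv_params(kernel, c_in, c_out, use_bias=False):
    """Weights of a 2D convolution with a square kernel."""
    return kernel * kernel * c_in * c_out + (c_out if use_bias else 0)


def bn_params(channels):
    """Batch norm: gamma, beta, moving mean and moving variance per channel."""
    return 4 * channels


def dense_params(n_in, n_out, use_bias=True):
    """Weights of a fully connected layer."""
    return n_in * n_out + (n_out if use_bias else 0)


def count_params(planes=8, filters=64, kernel=3, n_layers=4, board=19, moves=361):
    cells = board * board
    # First convolutional layer: Conv2D (no bias) + BatchNormalization.
    total = conv_params(kernel, planes, filters) + bn_params(filters)
    # Each remaining layer is a residual block with two Conv2D + BN pairs.
    block = 2 * (conv_params(kernel, filters, filters) + bn_params(filters))
    total += (n_layers - 1) * block
    # Value head: 1x1 conv to 1 plane, BN, Dense(10) and Dense(1), no biases.
    total += conv_params(1, filters, 1) + bn_params(1)
    total += dense_params(cells, 10, use_bias=False)
    total += dense_params(10, 1, use_bias=False)
    # Policy head: 1x1 conv to 2 planes, BN, Dense(moves) with bias.
    total += conv_params(1, filters, 2) + bn_params(2)
    total += dense_params(2 * cells, moves)
    return total


total = count_params()
print(total, total < 1_000_000)
```

Under these assumptions, more than half of the budget used goes to the dense layer of the policy head (722 inputs to 361 outputs), so the residual tower can be made deeper before the limit is reached.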

In [0]:
# Plot model
GoNeuralNet.plot_model()
from IPython.display import Image
# No trailing semicolon, so the notebook displays the image.
Image('model.png')

In [0]:
GoNeuralNet.fit(input_data, {
    'policy': policy,
    'value': value
},
                epochs=EPOCHS,
                verbose=1,
                validation_split=0.1,
                batch_size=BATCH_SIZE)

In [0]:
import pandas as pd
df = pd.read_csv('./training.log')
epochs = df['epoch']
plt.clf()
f, ax = plt.subplots(2, 3, figsize=(20,10))
ax[0][0].plot(epochs, df['loss'])
ax[0][0].plot(epochs, df['val_loss'])
ax[0][0].legend(['loss', 'val_loss'])
ax[0][0].set_title('Total loss')
ax[0][1].plot(epochs, df['policy_loss'])
ax[0][1].plot(epochs, df['val_policy_loss'])
ax[0][1].legend(['policy_loss', 'val_policy_loss'])
ax[0][1].set_title('Policy loss')
ax[0][2].plot(epochs, df['value_loss'])
ax[0][2].plot(epochs, df['val_value_loss'])
ax[0][2].legend(['value_loss', 'val_value_loss'])
ax[0][2].set_title('Value loss')
ax[1][1].plot(epochs, df['policy_acc'])
ax[1][1].plot(epochs, df['val_policy_acc'])
ax[1][1].legend(['policy_acc', 'val_policy_acc'])
ax[1][1].set_title('Policy acc')
ax[1][2].plot(epochs, df['value_acc'])
ax[1][2].plot(epochs, df['val_value_acc'])
ax[1][2].legend(['value_acc', 'val_value_acc'])
ax[1][2].set_title('Value accuracy')

In [0]:
a=list(df.columns)
print(a)
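
The accuracy columns in `training.log` are not named the same way in every TensorFlow release: older Keras versions log names such as `policy_acc`, while newer ones typically log `policy_accuracy`. As a small sketch, the hypothetical helper `pick_column` below returns whichever candidate name is present. The two column lists are made-up examples; the real list comes from `list(df.columns)`.

```python
def pick_column(columns, candidates):
    """Return the first candidate name present in columns, or None."""
    for name in candidates:
        if name in columns:
            return name
    return None


# Example column lists; the real ones come from list(df.columns).
old_style = ['epoch', 'loss', 'policy_acc', 'value_acc']
new_style = ['epoch', 'loss', 'policy_accuracy', 'value_accuracy']

for cols in (old_style, new_style):
    print(pick_column(cols, ['policy_acc', 'policy_accuracy']))
```

The plotting cell could then use `df[pick_column(df.columns, [...])]` in place of a hard-coded column name.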

In [0]:
GoNeuralNet.save_model()

In [0]:
!ls -all

In [0]:
from google.colab import files
# Same file name as the one written by GoNeuralNet.save_model()
files.download('model_' +
               str(nHiddenLayers) + 'layers_' +
               str(REG_CONST) + 'reg' +
               '.h5')
files.download('training.log')