# King-Rook vs King-Pawn Chess Problem
## Predicting the winner of this classic chess endgame

This is a chess endgame in which White has only a king and a rook left, and Black has only a king and a pawn, with the pawn on a7. It is the King+Rook side (White) to move. The problem is to predict, from a description of where the pieces stand, whether White can win.

You can download the dataset here: [Chess (King-Rook vs. King-Pawn) Dataset](https://archive.ics.uci.edu/ml/datasets/Chess+%28King-Rook+vs.+King-Pawn%29)

We'll build the solution by following the steps below.

- **0.0.** Data Collection.
- **1.0.** Data description.
- **2.0.** Data preparation.
- **3.0.** Machine Learning Modelling.

<p align="center">
<img src="img/rook-endgames.png">
</p>

# 0.0. Imports 

In [1]:
import pandas as pd
import numpy  as np

from sklearn.preprocessing         import LabelEncoder
from sklearn.model_selection       import train_test_split
from sklearn.model_selection       import cross_val_score
from sklearn.svm                   import SVC
from sklearn.neural_network        import MLPClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics               import accuracy_score

import random
import warnings
warnings.filterwarnings( "ignore" )

## 0.1. Helper Functions

In [29]:
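# Support Vector Machine: random search over `param` (MAX_EVAL draws), scoring each draw on the hold-out set and with cv-fold cross validation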
def svm_model( X_train, X_test, y_train, y_test, cv, MAX_EVAL, param ):
    
    final_result = {
        'Regularization': [],
        'Gamma': [],
        'Tol': [],
        'Max_iter': [],
        'Hold_out': [],
        'Cross_validation': []
    }
    
    for i in range( MAX_EVAL ):
        
        # choose values for parameters randomly
        hp = { k: random.sample( v, 1 )[0] for k, v in param.items() }
        final_result['Regularization'].append( hp['C'] )
        final_result['Gamma'].append( hp['gamma'] )
        final_result['Tol'].append( hp['tol'] )
        final_result['Max_iter'].append( hp['max_iter'] )
        
        # model
        model_svm = SVC( C = hp['C'],
                         gamma = hp['gamma'],
                         tol = hp['tol'],
                         max_iter = hp['max_iter'],
                         random_state = 33)
        
        # fit on the training set and predict on the hold-out set
        y_pred = model_svm.fit( X_train, y_train ).predict( X_test )
        
        # performance - hold-out
        result = accuracy_score( y_test, y_pred )
        
        # performance - cross validation
        result_cv = cross_val_score( model_svm, X_train, y_train, cv=cv )
        
        final_result['Hold_out'].append( format( result, ".2f" ) )
        final_result['Cross_validation'].append( format( np.mean( result_cv ), ".2f" ) )
        
    return pd.DataFrame( final_result ) 

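# Multilayer Perceptron: random search over `param` (MAX_EVAL draws), scoring each draw on the hold-out set and with cv-fold cross validation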
def mlp_model( X_train, X_test, y_train, y_test, cv, MAX_EVAL, param ):
    
    final_result = {
        'Hidden_layer_sizes': [],
        'Activation': [],
        'Solver': [],
        'Hold_out': [],
        'Cross_validation': []
    }
    
    for i in range( MAX_EVAL ):
        
        # choose values for parameters randomly
        hp = { k: random.sample( v, 1 )[0] for k, v in param.items() }
        final_result['Hidden_layer_sizes'].append( hp['hls'] )
        final_result['Activation'].append( hp['activation'] )
        final_result['Solver'].append( hp['solver'])
        
        # model
        model_mlp = MLPClassifier( hidden_layer_sizes = hp['hls'],
                                   activation = hp['activation'],
                                   solver = hp['solver'],
                                   random_state = 33 )
        
        # fit on the training set and predict on the hold-out set
        y_pred = model_mlp.fit( X_train, y_train ).predict( X_test )
        
        # performance - hold-out
        result = accuracy_score( y_test, y_pred )
        
        # performance - cross validation
        result_cv = cross_val_score( model_mlp, X_train, y_train, cv=cv )
        
        final_result['Hold_out'].append( format( result, ".2f" ) )
        final_result['Cross_validation'].append( format ( np.mean( result_cv ), ".2f" ) )
        
    return pd.DataFrame( final_result ) 

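# Linear Discriminant Analysis: random search over `param` (MAX_EVAL draws), scoring each draw on the hold-out set and with cv-fold cross validation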
def lda_model( X_train, X_test, y_train, y_test, cv, MAX_EVAL, param ):
    
    final_result = {
        'Solver': [],
        'Hold_out': [],
        'Cross_validation': []
    }
    
    for i in range( MAX_EVAL ):
        
        # choose values for parameters randomly
        # (note: 'shrinkage' is sampled here but never passed to the model, so only 'solver' varies)
        hp = { k: random.sample( list(v), 1 )[0] for k, v in param.items() }
        final_result['Solver'].append( hp['solver'] )
        
        # model
        model_lda = LinearDiscriminantAnalysis( solver = hp['solver'] )
        
        # fit on the training set and predict on the hold-out set
        y_pred = model_lda.fit( X_train, y_train ).predict( X_test )
        
        # performance - hold-out
        result = accuracy_score( y_test, y_pred )
        
        # performance - cross validation
        result_cv = cross_val_score( model_lda, X_train, y_train, cv=cv )
        
        final_result['Hold_out'].append( format( result, ".2f" ) )
        final_result['Cross_validation'].append( format ( np.mean( result_cv ), ".2f" ) )
        
    return pd.DataFrame( final_result ) 
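
The three helpers draw hyperparameters with the module-level `random` generator, which is never seeded, so every call evaluates a different set of configurations. A minimal sketch of making the draws repeatable, assuming a hypothetical helper `sample_params` that receives a seeded `random.Random` instance (neither name is part of the notebook):

```python
import random

def sample_params(param, rng):
    """Pick one value per hyperparameter from a dict of candidate lists.

    `sample_params` and `rng` are illustrative names, not notebook code.
    """
    return {k: rng.choice(list(v)) for k, v in param.items()}

param = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1, 0.1, 0.01, 0.001],
}

# Two generators built from the same seed produce the same sequence of draws,
# so a k = 5 run and a k = 10 run could score identical configurations.
rng_a = random.Random(33)
rng_b = random.Random(33)
draws_a = [sample_params(param, rng_a) for _ in range(3)]
draws_b = [sample_params(param, rng_b) for _ in range(3)]
print(draws_a == draws_b)  # True
```

With the same configurations in both runs, any difference between the k = 5 and k = 10 tables would come from the number of folds rather than from the hyperparameters.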

## 0.2. Loading Data

In [3]:
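# Note: the file has no header row. With the default header=0, pandas uses the first
# record as column names (f, f.1, ...), so one of the 3196 instances is not counted as data.
df_raw = pd.read_csv( 'datasets/kr-vs-kp.data' )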

# 1.0. Data Description

- Attribute Summaries:
    - Classes (2): White-can-win ("won") and White-cannot-win ("nowin").

- Class Distribution:
    - In 1669 of the positions (52%), White can win.
    - In 1527 of the positions (48%), White cannot win.
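
A quick check of the arithmetic behind these figures. It also gives the majority-class baseline, that is, the accuracy of always predicting "won", which is a useful reference point when reading model accuracies:

```python
# Class counts as stated in the dataset description
won, nowin = 1669, 1527
total = won + nowin

print(total)                           # 3196
print(round(100 * won / total, 1))     # 52.2
print(round(100 * nowin / total, 1))   # 47.8
```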

The format for instances in this database is a sequence of 37 attribute values.
Each instance is a board description for this chess endgame. The first
36 attributes describe the board. The last (37th) attribute is the
classification: "won" or "nowin". A typical board description is

f,f,f,f,f,f,f,f,f,f,f,f,l,f,n,f,f,t,f,f,f,f,f,f,f,t,f,f,f,f,f,f,f,t,t,n,won

The names of the features do not appear in the board descriptions. Instead, each feature corresponds to a particular position in the feature-value list. For example, the head of this list is the value for the feature "bkblk". The following is the list of features, in the order in which their values appear in the feature-value list:

[bkblk,bknwy,bkon8,bkona,bkspr,bkxbq,bkxcr,bkxwp,blxwp,bxqsq,cntxt,dsopp,dwipd,hdchk,katri,mulch,qxmsq,r2ar8,
 reskd,reskr,rimmx,rkxwp,rxmsq,simpl,skach,skewr,skrxp,spcop,stlmt,thrsk,wkcti,wkna8,wknck,wkovl,wkpos,wtoeg]

In the file, there is one instance (board position) per line.
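
Because the file carries no header line, the feature names have to be supplied when loading. A minimal sketch, assuming the single board description quoted above stands in for the file and that the label column is called `target` (a name chosen here for illustration):

```python
import io
import pandas as pd

# Feature names in file order, taken from the list in the dataset description
features = (
    "bkblk,bknwy,bkon8,bkona,bkspr,bkxbq,bkxcr,bkxwp,blxwp,bxqsq,cntxt,dsopp,"
    "dwipd,hdchk,katri,mulch,qxmsq,r2ar8,reskd,reskr,rimmx,rkxwp,rxmsq,simpl,"
    "skach,skewr,skrxp,spcop,stlmt,thrsk,wkcti,wkna8,wknck,wkovl,wkpos,wtoeg"
).split(",")
columns = features + ["target"]  # 'target' is an assumed name for the 37th attribute

# One record standing in for datasets/kr-vs-kp.data
line = "f,f,f,f,f,f,f,f,f,f,f,f,l,f,n,f,f,t,f,f,f,f,f,f,f,t,f,f,f,f,f,f,f,t,t,n,won\n"

# Default behaviour: the only record is consumed as the header
default = pd.read_csv(io.StringIO(line))
print(default.shape)  # (0, 37)

# header=None keeps every line as data; names= attaches the feature names
named = pd.read_csv(io.StringIO(line), header=None, names=columns)
print(named.shape)              # (1, 37)
print(named.loc[0, "target"])   # won
```

For the real file, the same `header=None, names=columns` arguments would be passed together with the file path instead of the `StringIO` object.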

In [4]:
# Shuffling the dataset
df = df_raw.sample( frac = 1 )
df.head()

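|      | f | f.1 | f.2 | f.3 | f.4 | f.5 | f.6 | f.7 | f.8 | f.9 | ... | f.23 | f.24 | f.25 | f.26 | f.27 | f.28 | t.2 | t.3 | n.1 | won |
|------|---|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|------|------|------|------|------|------|-----|-----|-----|-----|
| 14   | f | f | f | f | t | f | f | f | f | f | ... | f | f | f | f | f | f | t | t | n | won |
| 2655 | f | f | f | f | t | t | f | t | t | t | ... | f | f | f | f | f | t | t | t | t | nowin |
| 1861 | t | f | f | f | t | t | t | t | t | f | ... | f | f | f | f | f | f | f | f | n | won |
| 2424 | f | f | f | f | f | t | f | f | f | t | ... | f | f | f | f | f | f | t | t | n | nowin |
| 847  | f | f | f | f | t | f | f | f | f | t | ... | f | f | f | f | f | f | t | t | n | won |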


## 1.1. Data Dimensions

In [3]:
df.shape

(3195, 37)

## 1.2. Data Types

In [5]:
df.dtypes

f       object
f.1     object
f.2     object
f.3     object
f.4     object
f.5     object
f.6     object
f.7     object
f.8     object
f.9     object
f.10    object
f.11    object
l       object
f.12    object
n       object
f.13    object
f.14    object
t       object
f.15    object
f.16    object
f.17    object
f.18    object
f.19    object
f.20    object
f.21    object
t.1     object
f.22    object
f.23    object
f.24    object
f.25    object
f.26    object
f.27    object
f.28    object
t.2     object
t.3     object
n.1     object
won     object
dtype: object

## 1.3. Check NA

In [6]:
df.isnull().sum()

f       0
f.1     0
f.2     0
f.3     0
f.4     0
f.5     0
f.6     0
f.7     0
f.8     0
f.9     0
f.10    0
f.11    0
l       0
f.12    0
n       0
f.13    0
f.14    0
t       0
f.15    0
f.16    0
f.17    0
f.18    0
f.19    0
f.20    0
f.21    0
t.1     0
f.22    0
f.23    0
f.24    0
f.25    0
f.26    0
f.27    0
f.28    0
t.2     0
t.3     0
n.1     0
won     0
dtype: int64

# 2.0. Data Preparation

## 2.1. Encoding

In [5]:
# Categorical encoding: replace the string values of every column (features and target) with integer codes
for i in df:
    df[i] = LabelEncoder().fit_transform( df[i] )
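
`LabelEncoder` numbers the distinct values of a column in sorted order, so for the target column "nowin" becomes 0 and "won" becomes 1. A small sketch on a toy list of labels (not the dataset itself) that shows the mapping and how to reverse it:

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
codes = encoder.fit_transform(["won", "nowin", "won", "nowin"])

# classes_ holds the original values; the position of each value is its code
print(encoder.classes_.tolist())                    # ['nowin', 'won']
print(codes.tolist())                               # [1, 0, 1, 0]
print(encoder.inverse_transform([0, 1]).tolist())   # ['nowin', 'won']
```

Keeping a fitted encoder per column, instead of discarding it inside the loop, is one way to translate predictions back to the original labels later.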

## 2.2. Split dataframe into training and test datasets

In [6]:
X_train, X_test, y_train, y_test = train_test_split( df.iloc[:, :-1], df.iloc[:, -1], test_size=0.3, random_state=1 )
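
The split above is purely random. If the class ratio should be preserved in both parts, `train_test_split` accepts a `stratify` argument. A minimal sketch on synthetic labels with roughly the same 52/48 balance (the arrays below are placeholders, not the chess data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 rows, 52 positives and 48 negatives
X = np.arange(200).reshape(100, 2)
y = np.array([1] * 52 + [0] * 48)

# stratify=y keeps the share of each class close to 52/48 in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

print(len(y_tr), len(y_te))
print(round(y_tr.mean(), 2), round(y_te.mean(), 2))
```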

# 3.0. Machine Learning Modelling

## 3.1. Support Vector Machine

In [7]:
# Parameters and number of iterations
param = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1, 0.1, 0.01, 0.001],
    'tol': [0.0001, 0.001, 0.01, 0.1],
    'max_iter': [-1, 200, 500, 1000]
}

MAX_EVAL = 3
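
The hand-written random search in `svm_model` has a close counterpart in scikit-learn, `RandomizedSearchCV`. The sketch below is an alternative, not the method used in this notebook; it assumes synthetic data from `make_classification` in place of the encoded chess features and reuses the same candidate values:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for X_train / y_train
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

param_distributions = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1, 0.1, 0.01, 0.001],
    'tol': [0.0001, 0.001, 0.01, 0.1],
    'max_iter': [-1, 200, 500, 1000],
}

# n_iter plays the role of MAX_EVAL; random_state makes the draws repeatable
search = RandomizedSearchCV(
    SVC(random_state=33),
    param_distributions=param_distributions,
    n_iter=3,
    cv=5,
    scoring='accuracy',
    random_state=33,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 2))
```

The per-configuration scores are available in `search.cv_results_`, which can be wrapped in a `pd.DataFrame` to get a table similar to the ones produced by the helper functions.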

### 3.1.1. k = 5 for cross validation

In [8]:
svm_model( X_train, X_test, y_train, y_test, 5, MAX_EVAL, param )

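|   | Regularization | Gamma | Tol | Max_iter | Hold_out | Cross_validation |
|---|----------------|-------|-----|----------|----------|------------------|
| 0 | 0.1 | 0.001 | 0.0001 | 1000 | 0.72 | 0.51 |
| 1 | 1.0 | 0.001 | 0.1 | -1 | 0.82 | 0.84 |
| 2 | 0.1 | 0.1 | 0.01 | 1000 | 0.92 | 0.94 |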


### 3.1.2. k = 10 for cross validation

In [9]:
svm_model( X_train, X_test, y_train, y_test, 10, MAX_EVAL, param )

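|   | Regularization | Gamma | Tol | Max_iter | Hold_out | Cross_validation |
|---|----------------|-------|-----|----------|----------|------------------|
| 0 | 10.0 | 0.1 | 0.001 | 200 | 0.99 | 0.99 |
| 1 | 100.0 | 1.0 | 0.01 | -1 | 0.98 | 0.98 |
| 2 | 0.1 | 1.0 | 0.1 | 500 | 0.97 | 0.98 |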


## 3.2. Multilayer Perceptron

In [10]:
# Parameters and number of iterations
param = {
    'hls': [ (50,), (100,), (100, 50,), (150, 100,) ],
    'activation': [ 'logistic', 'relu', 'tanh', 'identity' ],
    'solver': [ 'sgd', 'lbfgs', 'adam' ],
}

MAX_EVAL = 3
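
With `MAX_EVAL = 3`, only a small part of this search space is sampled. A short sketch that enumerates every combination of the candidate values above to show how large the full grid is:

```python
from itertools import product

param = {
    'hls': [(50,), (100,), (100, 50,), (150, 100,)],
    'activation': ['logistic', 'relu', 'tanh', 'identity'],
    'solver': ['sgd', 'lbfgs', 'adam'],
}

# One dict per combination of hidden layers, activation and solver
grid = [dict(zip(param, combo)) for combo in product(*param.values())]

print(len(grid))   # 48
print(grid[0])     # {'hls': (50,), 'activation': 'logistic', 'solver': 'sgd'}
```

Three random draws therefore cover at most 3 of the 48 combinations, so the tables in this section are a sample of the search space rather than a ranking of all settings.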

### 3.2.1. k = 5 for cross validation

In [11]:
mlp_model( X_train, X_test, y_train, y_test, 5, MAX_EVAL, param )

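|   | Hidden_layer_sizes | Activation | Solver | Hold_out | Cross_validation |
|---|--------------------|------------|--------|----------|------------------|
| 0 | (50,) | identity | sgd | 0.91 | 0.94 |
| 1 | (50,) | logistic | adam | 0.96 | 0.96 |
| 2 | (100, 50) | logistic | sgd | 0.54 | 0.51 |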


### 3.2.2. k = 10 for cross validation

In [12]:
mlp_model( X_train, X_test, y_train, y_test, 10, MAX_EVAL, param )

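|   | Hidden_layer_sizes | Activation | Solver | Hold_out | Cross_validation |
|---|--------------------|------------|--------|----------|------------------|
| 0 | (50,) | logistic | adam | 0.96 | 0.96 |
| 1 | (100,) | logistic | sgd | 0.81 | 0.84 |
| 2 | (150, 100) | relu | adam | 0.99 | 0.99 |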


## 3.3. Linear Discriminant Analysis

In [18]:
# Parameters and number of iterations
param = { 'solver': ['svd', 'lsqr', 'eigen'], 'shrinkage': np.arange(0, 1, 0.01) }

MAX_EVAL = 3
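
In scikit-learn, `shrinkage` only works with the `lsqr` and `eigen` solvers, so a sampled value cannot be passed along blindly when the solver is `svd`. A minimal sketch of one way to handle this, assuming a hypothetical helper `lda_kwargs` and synthetic data in place of the chess features:

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def lda_kwargs(hp):
    """Build constructor arguments, dropping 'shrinkage' for the 'svd' solver.

    `lda_kwargs` is an illustrative helper, not part of the notebook.
    """
    kwargs = {'solver': hp['solver']}
    if hp['solver'] != 'svd':
        kwargs['shrinkage'] = float(hp['shrinkage'])
    return kwargs

# Synthetic stand-in for X_train / y_train (no redundant features)
X, y = make_classification(n_samples=200, n_features=10, n_redundant=0, random_state=0)

for hp in ({'solver': 'svd', 'shrinkage': 0.25},
           {'solver': 'lsqr', 'shrinkage': 0.25},
           {'solver': 'eigen', 'shrinkage': 0.25}):
    model = LinearDiscriminantAnalysis(**lda_kwargs(hp)).fit(X, y)
    print(hp['solver'], lda_kwargs(hp), round(model.score(X, y), 2))
```

Recording the shrinkage value in the results table alongside the solver would then show which setting produced each score.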

### 3.3.1. k = 5 for cross validation

In [34]:
lda_model( X_train, X_test, y_train, y_test, 5, MAX_EVAL, param )

|   | Solver | Hold_out | Cross_validation |
|---|--------|----------|------------------|
| 0 | svd | 0.92 | 0.94 |
| 1 | svd | 0.92 | 0.94 |
| 2 | lsqr | 0.92 | 0.94 |


### 3.3.2. k = 10 for cross validation

In [36]:
lda_model( X_train, X_test, y_train, y_test, 10, MAX_EVAL, param )

|   | Solver | Hold_out | Cross_validation |
|---|--------|----------|------------------|
| 0 | svd | 0.92 | 0.94 |
| 1 | lsqr | 0.92 | 0.94 |
| 2 | lsqr | 0.92 | 0.94 |
