## Problem
Given a board position from a Connect-4 game, predict the outcome of the match (win, loss, or draw). The dataset is available [here](http://archive.ics.uci.edu/ml/datasets/connect-4).

### Cleaning the raw data
Extract the feature matrix (X) and the target vector (y) from the .data file.


In [1]:
import pandas as pd

allData = pd.read_csv(r'C:\Users\Saif\.anaconda\connect-4.data', header=None)  # returns a DataFrame
y = allData.values[:, 42]                             # target: the last column (win/loss/draw)
feature = allData.loc[:, 0:41]                        # features: the 42 board cells (label slices are inclusive)
one_hot = pd.get_dummies(feature, drop_first=True)    # one-hot encode each cell, dropping the first category
X = one_hot.values
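
To make the encoding step concrete, here is a minimal sketch on a toy frame. The column names `a1` and `a2` and the three rows are hypothetical, chosen only for illustration; the real file has 42 unnamed cell columns. With `drop_first=True`, each column contributes one indicator per category except the first in sorted order, so a cell that takes three values yields two features.

```python
import pandas as pd

# Hypothetical toy board with two cells: 'b' = blank, 'x' and 'o' = the two players
toy = pd.DataFrame({'a1': ['b', 'x', 'o'],
                    'a2': ['x', 'x', 'b']})

# drop_first=True drops the first category of each column ('b' here),
# leaving one indicator column per remaining category
encoded = pd.get_dummies(toy, drop_first=True)

print(list(encoded.columns))
print(encoded.astype(int).values)
```

Depending on the pandas version, the indicator columns may come back as `uint8` or as `bool`; casting with `astype(int)` gives the same 0/1 view either way.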

In [2]:
X

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 1, 0, ..., 0, 0, 0],
       [0, 1, 1, ..., 1, 0, 0],
       [0, 1, 1, ..., 0, 0, 0]], dtype=uint8)

In [3]:
y

array(['win', 'win', 'win', ..., 'loss', 'draw', 'draw'], dtype=object)

In [4]:
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import KFold
import numpy as np

## 10-fold Cross Validation
The data is divided into 10 parts (folds). A model is trained on nine of them and tested on the remaining one. The process is repeated ten times so that each fold serves as the test set exactly once, and the per-fold accuracies are averaged.
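
The splitting scheme can be sketched on a small array. The 20-row `X_toy` is a hypothetical stand-in for the real feature matrix. Note that `KFold` does not shuffle unless asked to, so each test fold is a contiguous block of rows.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for the feature matrix: 20 samples, 1 feature
X_toy = np.arange(20).reshape(20, 1)

kf = KFold(n_splits=10)          # same setting as used in this notebook
seen = []                        # every index that appears in some test fold
test_sizes = []
for train_idx, test_idx in kf.split(X_toy):
    seen.extend(test_idx.tolist())
    test_sizes.append(len(test_idx))
    print(len(train_idx), 'train rows,', len(test_idx), 'test rows:', test_idx)
```

Each row lands in exactly one test fold. If the rows of a dataset are ordered in some systematic way, passing `shuffle=True` (with a `random_state`) is one option to consider, though it would change the numbers reported here.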

## Decision Tree
A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label. The topmost node in the tree is the root node.
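
Since the classifier below uses `criterion='entropy'`, it may help to see how an attribute test is scored. The following is a minimal sketch of entropy and information gain on hypothetical toy labels, not scikit-learn's internal implementation: a split is preferred when it reduces the entropy of the class labels the most.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after splitting on feature."""
    n = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical toy data: four positions, one binary feature each
labels = ['win', 'win', 'loss', 'loss']
perfect = [1, 1, 0, 0]   # separates the classes completely
useless = [1, 0, 1, 0]   # tells us nothing about the class

print(information_gain(perfect, labels))
print(information_gain(useless, labels))
```

The first feature recovers the full one bit of label entropy, while the second recovers none, so a tree would test the first feature at that node.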

In [5]:
from sklearn.tree import DecisionTreeClassifier
kf = KFold(n_splits=10)
acc = 0
conf = np.zeros((3, 3))

clf = DecisionTreeClassifier(criterion='entropy', random_state=0)
print('Launching Decision Tree Classifier')
fold = 0
for itrain, itest in kf.split(X):                   # iterate over the 10-fold generator
    clf = clf.fit(X[itrain], y[itrain])

    pred = clf.predict(X[itest])

    score = 100 * accuracy_score(y[itest], pred)    # accuracy on this fold, in percent
    acc = acc + score

    conf += confusion_matrix(y[itest], pred)        # accumulate counts across folds

    #print('Accuracy for fold',fold,'=',score)
    fold += 1
print('MEAN ACCURACY = ',acc/10,'%')
print('MEAN CONFUSION MATRIX = ',conf)              # despite the label, conf is the sum over folds, not the mean

Launching Decision Tree Classifier
MEAN ACCURACY =  68.57005314573026 %
MEAN CONFUSION MATRIX =  [[ 1579.  1994.  2876.]
 [ 1982.  9872.  4781.]
 [ 3727.  5873. 34873.]]
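
The matrix printed above can be read more easily as per-class rates. The sketch below assumes scikit-learn's default label ordering (sorted, so `draw`, `loss`, `win`) with rows as true classes and columns as predictions. The counts are copied from the output above, and the row sums give the number of positions in each class.

```python
import numpy as np

# Counts copied from the decision tree output above (summed over the 10 folds)
conf = np.array([[ 1579,  1994,  2876],
                 [ 1982,  9872,  4781],
                 [ 3727,  5873, 34873]], dtype=float)
labels = ['draw', 'loss', 'win']            # assumed order: sorted class labels

accuracy = np.trace(conf) / conf.sum()      # correct predictions / all predictions
recall = np.diag(conf) / conf.sum(axis=1)   # per true class (rows)
precision = np.diag(conf) / conf.sum(axis=0)  # per predicted class (columns)
baseline = conf.sum(axis=1).max() / conf.sum()  # always predicting the largest class

print('accuracy =', round(accuracy, 4), ' majority baseline =', round(baseline, 4))
for name, r, p in zip(labels, recall, precision):
    print(name, 'recall =', round(r, 4), 'precision =', round(p, 4))
```

Under that assumption, the majority-class baseline is close to 66%, which puts the overall accuracy figures in this notebook in context, and `draw` is by far the hardest class to recover.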


## Naive Bayesian
The Naive Bayes classifier is based on Bayes' theorem and is particularly suited to high-dimensional inputs. It is called "naive" because it assumes the features are conditionally independent given the class. Despite its simplicity, Naive Bayes can often outperform more sophisticated classification methods.
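
As an illustration of the idea, here is a minimal hand-rolled sketch of multinomial Naive Bayes with add-one smoothing on hypothetical toy data. It is meant to mirror the reasoning behind a multinomial model, not to reproduce scikit-learn's code: for each class, combine the log prior with the log likelihood of the observed features and pick the class with the highest score.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 3 binary features
X_toy = np.array([[1, 0, 1],
                  [1, 1, 0],
                  [0, 0, 1],
                  [0, 1, 1]])
y_toy = np.array(['win', 'win', 'loss', 'loss'])
x_new = np.array([1, 0, 0])                  # sample to classify

classes = np.unique(y_toy)                   # sorted: ['loss', 'win']
log_post = []
for c in classes:
    Xc = X_toy[y_toy == c]
    log_prior = np.log(len(Xc) / len(X_toy))         # log P(class)
    counts = Xc.sum(axis=0) + 1.0                    # feature counts with add-one smoothing
    log_theta = np.log(counts / counts.sum())        # log P(feature | class)
    log_post.append(log_prior + x_new @ log_theta)   # unnormalised log posterior

pred = classes[int(np.argmax(log_post))]
print(dict(zip(classes, log_post)), '->', pred)
```

The first feature is active only in the `win` rows, so the `win` class receives the higher score for `x_new`.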

In [6]:
from sklearn.naive_bayes import MultinomialNB
print('Launching Naive Bayesian Classifier')
kf = KFold(n_splits = 10)
fold = 0
acc=0
conf = np.zeros((3,3))
clf= MultinomialNB()
for itrain, itest in kf.split(X):
    clf=clf.fit(X[itrain],y[itrain])
    #print('Test indices for fold',fold,itest)
    pred = clf.predict(X[itest])
    #print(pred.tolist())
    score = 100*accuracy_score(y[itest],pred)
    acc = acc+score
    conf += confusion_matrix(y[itest],pred)
    #print('Accuracy for fold',fold,'=',score)
    fold+=1
print('MEAN ACCURACY = ',acc/10,'%')
print('MEAN CONFUSION MATRIX = ',conf)       # despite the label, conf is the sum over folds, not the mean

Launching Naive Bayesian Classifier
MEAN ACCURACY =  71.49049012660404 %
MEAN CONFUSION MATRIX =  [[   83.   748.  5618.]
 [  200.  6170. 10265.]
 [  228.  2201. 42044.]]


## SVM
A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane that categorizes new examples. In two-dimensional space this hyperplane is a line dividing the plane into two parts, with each class lying on one side.
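
The role of the hyperplane can be sketched in two dimensions. The weight vector `w`, the offset `b`, and the points below are hypothetical values picked for illustration, not parameters learned from the Connect-4 data. A linear classifier assigns a point to a class according to the sign of `w · x + b`, that is, according to which side of the line it falls on.

```python
import numpy as np

# Hypothetical separating line x1 - x2 = 0 in the plane
w = np.array([1.0, -1.0])    # normal vector of the hyperplane
b = 0.0                      # offset

points = np.array([[2.0, 0.5],
                   [0.5, 2.0],
                   [3.0, 1.0]])

scores = points @ w + b                  # signed score, proportional to distance from the line
sides = np.where(scores >= 0, 1, -1)     # +1 on one side, -1 on the other

for p, s, side in zip(points, scores, sides):
    print(p, 'score =', s, 'class =', side)
```

For a three-class problem such as this one, a linear SVM typically combines several such hyperplanes, for example one per class in a one-vs-rest scheme, and predicts the class with the highest score.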

In [7]:
from sklearn.svm import LinearSVC
kf = KFold(n_splits = 10)
print('Launching SVM Classifier')
fold = 0
acc = 0
conf = np.zeros((3,3))
clf = LinearSVC()
for itrain, itest in kf.split(X):
    clf = clf.fit(X[itrain],y[itrain])
    #print('Test indices for fold',fold,itest)
    pred = clf.predict(X[itest])
    #print(pred.tolist())
    score = 100*accuracy_score(y[itest],pred)
    acc = acc+score
    conf += confusion_matrix(y[itest],pred)
    #print('Accuracy for fold',fold,'=',score)
    fold+=1
print('MEAN ACCURACY = ',acc/10,'%')
print('MEAN CONFUSION MATRIX = ',conf)       # despite the label, conf is the sum over folds, not the mean

Launching SVM Classifier
MEAN ACCURACY =  74.80032048711588 %
MEAN CONFUSION MATRIX =  [[2.1000e+01 1.5490e+03 4.8790e+03]
 [1.1000e+01 9.9790e+03 6.6450e+03]
 [3.1000e+01 3.9090e+03 4.0533e+04]]
