## Problem
Given a board position from a Connect-4 game, predict the outcome of the match (win, loss, or draw). The dataset is available from the [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml/datasets/connect-4).

### Cleaning the raw data
Extract the feature matrix (X) and the target vector (y) from the .data file. The 42 feature columns are categorical, so they are one-hot encoded before training.


In [1]:
import pandas as pd

allData = pd.read_csv('connect-4.data', header=None)     # returns a DataFrame
y = allData.values[:, 42]                                # column 42: target vector y
feature = allData.loc[:, 0:41]                           # columns 0-41: categorical features
one_hot = pd.get_dummies(feature, drop_first=True)       # one-hot encoding
X = one_hot.values                                       # feature matrix X as a NumPy array

In [2]:
X

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ..., 
       [0, 1, 0, ..., 0, 0, 0],
       [0, 1, 1, ..., 1, 0, 0],
       [0, 1, 1, ..., 0, 0, 0]], dtype=uint8)

In [3]:
y

array(['win', 'win', 'win', ..., 'loss', 'draw', 'draw'], dtype=object)
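
To make the one-hot encoding step in the first cell concrete, here is a minimal sketch on a hypothetical three-row frame. The column names `c0` and `c1` and the toy rows are assumptions made for illustration, and the sketch assumes each cell takes one of the three values `b`, `o`, or `x`.

```python
import pandas as pd

# Hypothetical toy frame: two board cells, three positions.
toy = pd.DataFrame({'c0': ['b', 'x', 'o'],
                    'c1': ['x', 'o', 'b']})

# drop_first=True drops the first level of each column ('b' here),
# so each cell becomes two indicator columns instead of three.
encoded = pd.get_dummies(toy, drop_first=True)

print(list(encoded.columns))
print(encoded.astype(int).values.tolist())
```

With 42 cells and three values per cell, the same encoding would give 42 × 2 = 84 columns in `X`, provided every cell takes all three values somewhere in the data.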

In [4]:
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

## 10-fold Cross Validation
The data is divided into 10 parts (folds). A model is trained on nine of the folds and tested on the remaining one. The process is repeated 10 times so that each fold serves as the test set exactly once, and the 10 accuracy scores are averaged.
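
The splitting scheme can be sketched on a hypothetical 10-sample array. `KFold` without `shuffle` cuts the data into contiguous blocks in row order, so if the rows of the data file happen to be grouped by outcome, individual folds may not be representative. Passing `shuffle=True` (an option this notebook does not use) is one way to check for that.

```python
import numpy as np
from sklearn.model_selection import KFold

toy = np.arange(10)                      # hypothetical stand-in for 10 samples

# Default KFold: contiguous test blocks, in the original row order.
plain = [test.tolist() for _, test in KFold(n_splits=5).split(toy)]
print(plain)

# Shuffled variant (an assumption: not what the notebook uses).
shuffled = [test.tolist()
            for _, test in KFold(n_splits=5, shuffle=True, random_state=0).split(toy)]

# Either way, every sample lands in exactly one test fold.
covered = sorted(i for fold in shuffled for i in fold)
print(covered)
```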

## Decision Tree
A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes the outcome of a test, and each leaf node holds a class label. The topmost node in the tree is the root node.
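
The structure described above can be sketched with a hand-written tree. The tree, its attribute names (`center`, `left`), and the labels it returns are hypothetical and chosen for illustration only; they are not learned from the Connect-4 data.

```python
# Internal nodes test one attribute; branches are keyed by the test outcome;
# leaves are plain class labels. Hypothetical, hand-written tree.
tree = {
    'attribute': 'center',               # root node
    'branches': {
        'x': 'win',                      # leaf
        'o': 'loss',                     # leaf
        'b': {                           # internal node
            'attribute': 'left',
            'branches': {'x': 'win', 'o': 'loss', 'b': 'draw'},
        },
    },
}

def classify(node, sample):
    """Follow the branch matching each test until a leaf label is reached."""
    while isinstance(node, dict):
        node = node['branches'][sample[node['attribute']]]
    return node

print(classify(tree, {'center': 'b', 'left': 'o'}))
```

`DecisionTreeClassifier` learns such tests from the training data rather than having them written by hand. Note that scikit-learn builds binary trees with threshold tests on numeric features, which is one reason the categorical board cells are one-hot encoded first.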

In [5]:
from sklearn.tree import DecisionTreeClassifier
kf = KFold(n_splits=10)
acc = 0
clf = DecisionTreeClassifier(criterion='entropy', random_state=0)
print('Launching Decision Tree Classifier')
fold = 0
for itrain, itest in kf.split(X):                        # iterate over the 10-fold generator
    clf = clf.fit(X[itrain], y[itrain])                  # train on the other 9 folds
    pred = clf.predict(X[itest])                         # predict the held-out fold
    score = 100 * accuracy_score(y[itest], pred)         # accuracy in percent
    acc = acc + score
    print('Accuracy for fold', fold, '=', score)
    fold += 1
print('MEAN ACCURACY = ', acc / 10, '%')

Launching Decision Tree Classifier
Accuracy for fold 0 = 62.0485494375
Accuracy for fold 1 = 76.7613972765
Accuracy for fold 2 = 76.5245707519
Accuracy for fold 3 = 71.6696269982
Accuracy for fold 4 = 64.8608644168
Accuracy for fold 5 = 70.026642984
Accuracy for fold 6 = 71.0183540557
Accuracy for fold 7 = 67.5795706884
Accuracy for fold 8 = 59.763138416
Accuracy for fold 9 = 65.4478164323
MEAN ACCURACY =  68.5700531457 %


## Naive Bayesian
The Naive Bayes classifier is based on Bayes' theorem, combined with the "naive" assumption that the features are conditionally independent given the class. It is particularly suited to inputs of high dimensionality. Despite its simplicity, Naive Bayes is often competitive with more sophisticated classification methods.
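
As a worked illustration of Bayes' theorem with the naive independence assumption, the sketch below scores two classes from hypothetical priors and per-feature likelihoods. All probabilities are made-up numbers, not estimates from the Connect-4 data.

```python
# Hypothetical class priors P(class).
prior = {'win': 0.6, 'loss': 0.4}

# Hypothetical likelihoods P(feature present | class) for two binary features.
likelihood = {
    'win':  [0.8, 0.3],
    'loss': [0.2, 0.5],
}

def posterior(features):
    """Return P(class | features), treating features as independent given the class."""
    joint = {}
    for label in prior:
        p = prior[label]
        for present, p_feat in zip(features, likelihood[label]):
            p *= p_feat if present else (1.0 - p_feat)
        joint[label] = p
    total = sum(joint.values())          # the evidence P(features)
    return {label: p / total for label, p in joint.items()}

post = posterior([1, 0])
print(max(post, key=post.get))
```

This sketch uses a Bernoulli-style likelihood for clarity. The notebook's `MultinomialNB` uses a multinomial likelihood over feature counts instead, but the same theorem underlies both.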

In [6]:
from sklearn.naive_bayes import MultinomialNB
print('Launching Naive Bayesian Classifier')
kf = KFold(n_splits=10)
fold = 0
acc = 0
clf = MultinomialNB()
for itrain, itest in kf.split(X):                        # iterate over the 10-fold generator
    clf = clf.fit(X[itrain], y[itrain])                  # train on the other 9 folds
    #print('Test indices for fold',fold,itest)
    pred = clf.predict(X[itest])                         # predict the held-out fold
    #print(pred.tolist())
    score = 100 * accuracy_score(y[itest], pred)         # accuracy in percent
    acc = acc + score
    print('Accuracy for fold', fold, '=', score)
    fold += 1
print('MEAN ACCURACY = ', acc / 10, '%')


Launching Naive Bayesian Classifier
Accuracy for fold 0 = 65.6601539372
Accuracy for fold 1 = 82.4452338662
Accuracy for fold 2 = 82.9780935465
Accuracy for fold 3 = 73.6530491415
Accuracy for fold 4 = 67.5103611604
Accuracy for fold 5 = 69.0053285968
Accuracy for fold 6 = 76.1397276495
Accuracy for fold 7 = 72.5092524056
Accuracy for fold 8 = 62.6646928201
Accuracy for fold 9 = 62.3390081421
MEAN ACCURACY =  71.4904901266 %


## SVM
A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane that categorizes new examples. In two-dimensional space, this hyperplane is a line dividing the plane into two parts, with each class lying on one side.
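
The two-dimensional case can be sketched directly: a line `w·x + b = 0` splits the plane, and the sign of `w·x + b` says which side a point falls on. The weights, bias, and side names below are hypothetical, not values fitted by `LinearSVC`.

```python
# Hypothetical separating line x1 - x2 = 0, i.e. w = (1, -1), b = 0.
w = (1.0, -1.0)
b = 0.0

def side(point):
    """Classify a 2-D point by the sign of the decision function w.x + b."""
    value = sum(wi * xi for wi, xi in zip(w, point)) + b
    return 'positive' if value >= 0 else 'negative'

print(side((3.0, 1.0)))   # point below the line x2 = x1
print(side((1.0, 3.0)))   # point above the line x2 = x1
```

For the three outcome classes here, `LinearSVC` by default fits one such hyperplane per class (one-vs-rest) and predicts the class with the largest decision value.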

In [7]:
from sklearn.svm import LinearSVC
kf = KFold(n_splits=10)
print('Launching SVM Classifier')
fold = 0
acc = 0
clf = LinearSVC()
for itrain, itest in kf.split(X):                        # iterate over the 10-fold generator
    clf = clf.fit(X[itrain], y[itrain])                  # train on the other 9 folds
    #print('Test indices for fold',fold,itest)
    pred = clf.predict(X[itest])                         # predict the held-out fold
    #print(pred.tolist())
    score = 100 * accuracy_score(y[itest], pred)         # accuracy in percent
    acc = acc + score
    print('Accuracy for fold', fold, '=', score)
    fold += 1
print('MEAN ACCURACY = ', acc / 10, '%')

Launching SVM Classifier
Accuracy for fold 0 = 72.3801065719
Accuracy for fold 1 = 82.7116637063
Accuracy for fold 2 = 84.0290112493
Accuracy for fold 3 = 78.2415630551
Accuracy for fold 4 = 70.1302545885
Accuracy for fold 5 = 72.2616933097
Accuracy for fold 6 = 78.5968028419
Accuracy for fold 7 = 73.5011102887
Accuracy for fold 8 = 67.0318282754
Accuracy for fold 9 = 69.1191709845
MEAN ACCURACY =  74.8003204871 %
