***Tic-Tac-Toe End Game Classification***<br>
The Tic-Tac-Toe end game classification is a binary classification task over the possible board configurations at the end of a tic-tac-toe game.<br>
The dataset is taken from https://archive.ics.uci.edu/ml/machine-learning-databases/tic-tac-toe/ <br>

The tic-tac-toe dataset encodes the complete set of possible board configurations at the end of Tic-Tac-Toe games using:<br>
- 9 attributes (each attribute corresponds to one tic-tac-toe square and takes one of 3 values: x, o, or b)
  - x indicates player 'X'  
  - o indicates player 'O'  
  - b indicates 'blank'
- 2 classes (positive, negative)  

“X” is assumed to play first.<br>
The target concept is “win for X” (i.e., true when “x” has completed one of the 8 possible “three-in-a-row” lines)<br>
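
The target concept can be sketched in a few lines of plain Python. This is only an illustration of the "8 possible ways" rule, not code from the dataset authors; the names `WIN_LINES` and `x_wins` are made up for this example, and the board is assumed to be a list of 9 characters in A1..A9 order.

```python
# The 8 winning lines as index triples into a row-major board
# (index 0 = A1 top-left, ..., index 8 = A9 bottom-right).
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def x_wins(board):
    """Return True if 'x' occupies all three squares of any winning line."""
    return any(all(board[i] == "x" for i in line) for line in WIN_LINES)

# First row of the dataset (labelled positive): x fills the top row.
print(x_wins(list("xxxxooxoo")))  # True
# Last row of the dataset (labelled negative), decoded to letters: no x line.
print(x_wins(list("ooxxxooxx")))  # False
```

A classifier that learns this concept perfectly would reproduce `x_wins` on every board in the dataset.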

The attribute set (A1,...,A9) and the class are represented as follows:<br>
- A1: top-left-square: {x,o,b}<br>
- A2: top-middle-square: {x,o,b}<br>
- A3: top-right-square: {x,o,b}<br>
- A4: middle-left-square: {x,o,b}<br>
- A5: middle-middle-square: {x,o,b}<br>
- A6: middle-right-square: {x,o,b}<br>
- A7: bottom-left-square: {x,o,b}<br>
- A8: bottom-middle-square: {x,o,b}<br>
- A9: bottom-right-square: {x,o,b}<br>
- Class: {positive,negative}<br>

In [1]:
#importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

from sklearn.naive_bayes import BernoulliNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from time import time

In [2]:
#loading the dataset
dataset=pd.read_csv('tic-tac-toe.data', names=['A1','A2','A3','A4','A5','A6','A7','A8', 'A9', 'Class'])
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 958 entries, 0 to 957
Data columns (total 10 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   A1      958 non-null    object
 1   A2      958 non-null    object
 2   A3      958 non-null    object
 3   A4      958 non-null    object
 4   A5      958 non-null    object
 5   A6      958 non-null    object
 6   A7      958 non-null    object
 7   A8      958 non-null    object
 8   A9      958 non-null    object
 9   Class   958 non-null    object
dtypes: object(10)
memory usage: 75.0+ KB


**Observations from the output**<br>
- There are 958 instances (rows) in total, each with 9 attributes and 1 class target variable<br>

In [3]:
dataset['Class'].unique()

array(['positive', 'negative'], dtype=object)

In [4]:
dataset.groupby('Class').describe()

,A1,A1,A1,A1,A2,A2,A2,A2,A3,A3,...,A7,A7,A8,A8,A8,A8,A9,A9,A9,A9
Class,count,unique,top,freq,count,unique,top,freq,count,unique,...,top,freq,count,unique,top,freq,count,unique,top,freq
negative,332,3,o,146,332,3,x,153,332,3,...,o,146,332,3,x,153,332,3,o,146
positive,626,3,x,295,626,3,o,229,626,3,...,x,295,626,3,o,229,626,3,x,295


**Observations from the output**<br>
- The dataset contains 626 positive examples and 332 negative examples<br>
- Class distribution: about 65.3% of the examples are positive (i.e., wins for "x")<br>
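
The class share quoted above follows directly from the two counts. The short calculation below is a sketch that uses only those counts; the variable names are illustrative. It also gives the accuracy a trivial "always predict positive" rule would reach on the full dataset, which is a useful baseline for the models that follow.

```python
# Class counts taken from the groupby output above.
positive, negative = 626, 332

total = positive + negative
positive_share = positive / total

print(total)                            # 958
print(round(100 * positive_share, 1))   # 65.3

# Accuracy of a rule that labels every board "positive" on the full dataset.
majority_baseline = max(positive, negative) / total
print(round(majority_baseline, 3))      # 0.653
```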

In [5]:
dataset.head()

,A1,A2,A3,A4,A5,A6,A7,A8,A9,Class
0,x,x,x,x,o,o,x,o,o,positive
1,x,x,x,x,o,o,o,x,o,positive
2,x,x,x,x,o,o,o,o,x,positive
3,x,x,x,x,o,o,o,b,b,positive
4,x,x,x,x,o,o,b,o,b,positive


In [6]:
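#encode the string labels as integers (LabelEncoder sorts the labels: b=0, o=1, x=2; negative=0, positive=1)
#note: the same encoder object is refit on every column, so afterwards it only holds the mapping of the last column (Class)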
from sklearn.preprocessing import LabelEncoder
le=LabelEncoder()
for column in dataset.columns:
    dataset[column]=le.fit_transform(dataset[column])
dataset.head()

,A1,A2,A3,A4,A5,A6,A7,A8,A9,Class
0,2,2,2,2,1,1,2,1,1,1
1,2,2,2,2,1,1,1,2,1,1
2,2,2,2,2,1,1,1,1,2,1
3,2,2,2,2,1,1,1,0,0,1
4,2,2,2,2,1,1,0,1,0,1
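
The integer codes in the table above can be reproduced without scikit-learn. The sketch below assumes only that `LabelEncoder` assigns codes in sorted label order; `label_mapping` is a hypothetical helper written for this illustration, not part of any library.

```python
def label_mapping(values):
    """Map each distinct label to its rank in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}

# Square attributes: blank, o, x
print(label_mapping(["x", "o", "b"]))            # {'b': 0, 'o': 1, 'x': 2}
# Class column
print(label_mapping(["positive", "negative"]))   # {'negative': 0, 'positive': 1}
```

Note that these codes are arbitrary ranks: the ordering b < o < x has no meaning for the game, which matters for models that treat the values as magnitudes.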


In [7]:
X=dataset.iloc[ : , :-1]
Y=dataset.iloc[ : , -1]
print(X)
print(Y)

     A1  A2  A3  A4  A5  A6  A7  A8  A9
0     2   2   2   2   1   1   2   1   1
1     2   2   2   2   1   1   1   2   1
2     2   2   2   2   1   1   1   1   2
3     2   2   2   2   1   1   1   0   0
4     2   2   2   2   1   1   0   1   0
..   ..  ..  ..  ..  ..  ..  ..  ..  ..
953   1   2   2   2   1   1   1   2   2
954   1   2   1   2   2   1   2   1   2
955   1   2   1   2   1   2   2   1   2
956   1   2   1   1   2   2   2   1   2
957   1   1   2   2   2   1   1   2   2

[958 rows x 9 columns]
0      1
1      1
2      1
3      1
4      1
      ..
953    0
954    0
955    0
956    0
957    0
Name: Class, Length: 958, dtype: int32


In [8]:
#Splitting the dataset into training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 1)

In [9]:
print(X_train)

     A1  A2  A3  A4  A5  A6  A7  A8  A9
414   1   1   0   2   2   2   0   0   0
725   2   0   1   2   0   1   0   2   1
223   2   0   2   2   1   1   2   1   0
757   1   2   2   1   1   1   2   2   0
947   2   1   2   2   1   1   1   2   2
..   ..  ..  ..  ..  ..  ..  ..  ..  ..
767   1   2   2   0   2   2   1   1   1
72    2   2   2   0   0   1   1   0   0
908   0   2   0   2   2   0   1   1   1
235   2   0   2   1   0   2   1   1   2
37    2   2   2   1   0   1   2   1   0

[766 rows x 9 columns]


In [10]:
print(X_test)

     A1  A2  A3  A4  A5  A6  A7  A8  A9
241   2   0   1   2   2   2   0   1   1
849   1   1   0   1   2   2   1   2   2
436   1   0   2   1   2   0   2   2   1
386   1   1   2   1   2   0   2   2   0
345   1   2   0   2   2   2   1   1   0
..   ..  ..  ..  ..  ..  ..  ..  ..  ..
177   2   1   1   1   0   0   2   2   2
605   0   0   2   0   1   2   1   0   2
60    2   2   2   0   1   1   0   2   1
268   2   0   1   0   2   2   1   1   2
784   1   2   1   0   1   2   2   2   1

[192 rows x 9 columns]


In [11]:
print(y_train)

414    1
725    0
223    1
757    0
947    0
      ..
767    0
72     1
908    0
235    1
37     1
Name: Class, Length: 766, dtype: int32


In [12]:
print(y_test)

241    1
849    0
436    1
386    1
345    1
      ..
177    1
605    1
60     1
268    1
784    0
Name: Class, Length: 192, dtype: int32
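
The 766/192 row counts printed above follow from `test_size = 0.2` applied to 958 rows. The arithmetic below is a sketch that assumes the test share is rounded up and the training set takes the remainder, which matches the sizes shown in the output.

```python
import math

n_samples = 958
test_size = 0.2

# 0.2 * 958 = 191.6, rounded up to a whole number of test rows.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 766 192
```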


***BERNOULLI NAIVE BAYES MODEL***
- This naive Bayes classifier is designed for binary (boolean) features; with the default `binarize=0.0`, non-binary inputs are thresholded to 0/1 before fitting.

In [13]:
t1=time()
print("BERNOULLI NAIVE BAYES MODEL")
model=BernoulliNB()
clf = model.fit(X_train,y_train)
print(clf.score(X_test, y_test))
t2= time()
print("total duration=", t2-t1)


BERNOULLI NAIVE BAYES MODEL
0.6510416666666666
total duration= 0.033084869384765625


**Observations from the output**<br>
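The Bernoulli naive Bayes model classifies only about 65% of the test set correctly. This equals the share of positive examples in the test set (125 of 192), so with this encoding the model scores no better than always predicting "positive".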
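
One likely contributor to this weak result can be illustrated in plain Python. The sketch below assumes `BernoulliNB` keeps its default `binarize=0.0`, so every feature value above 0 becomes 1; the `binarize` function here is a hand-written stand-in for that thresholding step, not the library routine.

```python
def binarize(row, threshold=0.0):
    """Map values above the threshold to 1 and everything else to 0."""
    return [1 if value > threshold else 0 for value in row]

# Encoded board from row 3 of dataset.head(): x=2, o=1, b=0.
board = [2, 2, 2, 2, 1, 1, 1, 0, 0]
# The same board with x and o swapped (illustrative, not a dataset row).
swapped = [1, 1, 1, 1, 2, 2, 2, 0, 0]

print(binarize(board))    # [1, 1, 1, 1, 1, 1, 1, 0, 0]
print(binarize(swapped))  # [1, 1, 1, 1, 1, 1, 1, 0, 0]
```

After thresholding, the model only sees whether a square is occupied, not by whom. One-hot encoding the nine attributes would be one way to keep x and o distinct, though that was not tried in this notebook.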

***Decision Tree MODEL***<br>
- This model partitions the dataset into subsets. The partitioning starts with a binary split and continues until no further split is possible or a stopping criterion (here `min_samples_split=10`) is reached

In [14]:
t3=time()
print("DECISION TREE MODEL")
dtModel=DecisionTreeClassifier(min_samples_split=10, random_state=1)
clfDT = dtModel.fit(X_train,y_train)
print(clfDT.score(X_test, y_test))
t4= time()
print("total duration=", t4-t3)


DECISION TREE MODEL
0.7916666666666666
total duration= 0.019254446029663086


***Observation From Output***<br>
- The decision tree classifies 79% of the test set correctly, which is better than the Bernoulli naive Bayes model.<br>
- The decision tree's performance may improve by using entropy as the split criterion, i.e., splitting where the information gain is highest, and by tuning its hyperparameters
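
Entropy and information gain can be computed by hand for a small example. The sketch below is the textbook multiway formulation with illustrative function names and toy data; scikit-learn's tree uses binary threshold splits, so this shows the idea rather than its exact implementation.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(labels, feature_values):
    """Entropy of the labels minus the weighted entropy after grouping by feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

labels = [1, 1, 0, 0]                 # toy class labels, not dataset rows
informative = ["x", "x", "o", "o"]    # separates the classes perfectly
uninformative = ["x", "o", "x", "o"]  # tells us nothing about the class

print(information_gain(labels, informative))
print(information_gain(labels, uninformative))
```

A split on the informative feature removes all uncertainty about the class, while the uninformative one removes none, so a tree built on entropy would choose the former.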

In [15]:
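#entropy-based decision tree tuned with GridSearchCV (scored by F1 on the positive class)
#note: no random_state is set on the tree, so results may vary slightly between runs
pl=Pipeline([('clfDT', DecisionTreeClassifier(criterion='entropy'))])
#note: min_samples_split=1 is not a valid integer value (scikit-learn requires at least 2), so candidates using it fail to fit
parameters={'clfDT__max_depth':(4,16,40), 'clfDT__min_samples_split':(1,4,8), 'clfDT__min_samples_leaf':(1,2,3)}
gs=GridSearchCV(pl, parameters, n_jobs=-1, verbose=1, scoring='f1')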
gs.fit(X_train, y_train)

print("Score:", gs.best_score_)
print("Parameters to be set:")
params=gs.best_estimator_.get_params()

for param_name in sorted(params.keys()):
    print(param_name,params[param_name])

predictions=gs.predict(X_test)
print(classification_report(y_test,predictions))
    


Fitting 5 folds for each of 27 candidates, totalling 135 fits


[Parallel(n_jobs=-1)]: Using backend LokyBackend with 4 concurrent workers.


Score: 0.8713482898232232
Parameters to be set:
clfDT DecisionTreeClassifier(criterion='entropy', max_depth=40, min_samples_split=4)
clfDT__ccp_alpha 0.0
clfDT__class_weight None
clfDT__criterion entropy
clfDT__max_depth 40
clfDT__max_features None
clfDT__max_leaf_nodes None
clfDT__min_impurity_decrease 0.0
clfDT__min_impurity_split None
clfDT__min_samples_leaf 1
clfDT__min_samples_split 4
clfDT__min_weight_fraction_leaf 0.0
clfDT__presort deprecated
clfDT__random_state None
clfDT__splitter best
memory None
steps [('clfDT', DecisionTreeClassifier(criterion='entropy', max_depth=40, min_samples_split=4))]
verbose False
              precision    recall  f1-score   support

           0       0.80      0.79      0.80        67
           1       0.89      0.90      0.89       125

    accuracy                           0.86       192
   macro avg       0.85      0.84      0.84       192
weighted avg       0.86      0.86      0.86       192
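
The f1-score column in the report is the harmonic mean of precision and recall. The sketch below recomputes it for class 1 from the rounded values printed above; the function name is illustrative, and because the inputs are already rounded the result can differ in the last digit for other rows.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Class 1 row of the report: precision 0.89, recall 0.90.
print(round(f1_from(0.89, 0.90), 2))  # 0.89
```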



[Parallel(n_jobs=-1)]: Done  68 tasks      | elapsed:    2.3s
[Parallel(n_jobs=-1)]: Done 135 out of 135 | elapsed:    2.4s finished
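
The "27 candidates, totalling 135 fits" line in the log comes from the size of the parameter grid. The sketch below enumerates the same grid with `itertools.product`; the 5 folds are taken from the log output, and the dictionary keys are shortened for readability.

```python
from itertools import product

# Same value tuples as the GridSearchCV parameter grid above.
grid = {
    "max_depth": (4, 16, 40),
    "min_samples_split": (1, 4, 8),
    "min_samples_leaf": (1, 2, 3),
}

# Every combination of one value per parameter is one candidate.
candidates = list(product(*grid.values()))
n_folds = 5  # as reported in the log

print(len(candidates))            # 27
print(len(candidates) * n_folds)  # 135
```

Each extra value added to any one parameter multiplies the number of fits, which is worth keeping in mind before widening the grid.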


In [16]:
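#refit a decision tree with hand-picked hyperparameters
#note: this uses the default gini criterion and differs from the grid-search best above (criterion='entropy', max_depth=40, min_samples_split=4)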
t5=time()
print("DECISION TREE MODEL")
dtEModel=DecisionTreeClassifier(min_samples_split=2,max_depth=16, min_samples_leaf=1,random_state=1)
clfDTE = dtEModel.fit(X_train,y_train)
print(clfDTE.score(X_test, y_test))
t6= time()
print("total duration=", t6-t5)


DECISION TREE MODEL
0.8333333333333334
total duration= 0.017049789428710938


***Observation From Output***<br>
- The decision tree's accuracy has improved to 83% of the test set, up from 79% with the first tree. The grid-searched entropy tree reached 86% accuracy on the same test set in the classification report above.