## Intro

This notebook is a group effort to evaluate classifiers and regressors on tic-tac-toe board data. Using the provided datasets, we compared several machine learning models and report cross-validated scores and confusion matrices for each. Board squares are encoded as follows:

+ X = +1 
+ O = -1
+ Empty square = 0

+ Goal: produce reasonable moves for player O
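
As a minimal sketch of the encoding above, the helper below turns a board into the 9-element vector the models consume. `encode_board` and the row-by-row square order are assumptions made for illustration; the datasets loaded below are already numeric and do not need this step.

```python
import numpy as np

# Hypothetical helper (not part of the notebook): map a 9-character board
# string to the numeric encoding listed above. Reading the squares row by
# row is an assumption about the dataset layout.
SYMBOL_TO_VALUE = {"X": 1, "O": -1, " ": 0}


def encode_board(board: str) -> np.ndarray:
    """Return a length-9 vector with X = +1, O = -1, empty = 0."""
    if len(board) != 9:
        raise ValueError("a tic-tac-toe board has exactly 9 squares")
    return np.array([SYMBOL_TO_VALUE[c] for c in board], dtype=float)


# X in the top-left and bottom-right corners, O in the top-middle square.
example = encode_board("XO " "   " "  X")
print(example)
```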

In [1]:
import numpy as np
from sklearn.model_selection import cross_val_score, KFold, train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier 
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor
import warnings

### 1. Load datasets

In [2]:
# Load intermediate boards optimal play (single label) dataset
single_data = np.loadtxt("data/tictac_single.txt")
single_X = single_data[:, :-1]
single_y = single_data[:, -1]

In [3]:
# Load intermediate boards optimal play (multi label) dataset
multi_data = np.loadtxt("data/tictac_multi.txt")
multi_X = multi_data[:, :-9]
multi_y = multi_data[:, -9:]

In [4]:
# Load final boards classification dataset
final_data = np.loadtxt("data/tictac_final.txt")
final_X = final_data[:, :-1]
final_y = final_data[:, -1]

### 2. Define classifiers and regressors

In [5]:
# Classifiers
classifiers = {
    "LinearSVM": SVC(kernel="linear"),
    "KNN_classifier": KNeighborsClassifier(n_neighbors=5),
    "MLP_classifier": MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000)
}


# Regressors
regressors = {
    "Linear Regression": LinearRegression(),
    "KNN_regressor": KNeighborsRegressor(n_neighbors=5),
    "MLP_regressor": MLPRegressor(hidden_layer_sizes=(100,), max_iter=1000)
}


### 3. Evaluate classifiers and regressors

In [6]:
cv = KFold(n_splits=10, shuffle=True, random_state=42) # KFold with 10 folds, shuffling, and fixed random seed


# Function to evaluate classifiers
def evaluate_classifiers(X, y, classifiers):
    results = {}
    for name, clf in classifiers.items():
        scores = cross_val_score(clf, X, y, cv=cv)
        results[name] = scores
    return results


# Function to evaluate regressors
def evaluate_regressors(X, y, regressors):
    results = {}
    for name, reg in regressors.items():
        scores = cross_val_score(reg, X, y, cv=cv)
        results[name] = scores
    return results
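
Both functions call `cross_val_score` without a `scoring` argument, so each estimator's own `score` method decides the metric: accuracy for classifiers, R² for regressors. The sketch below checks this on synthetic data; `X_demo`, `y_class`, `y_reg` and `cv_demo` are made-up names and values, not the tic-tac-toe datasets.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: random boards with values in {-1, 0, 1}.
rng = np.random.default_rng(0)
X_demo = rng.integers(-1, 2, size=(200, 9)).astype(float)
y_class = (X_demo.sum(axis=1) > 0).astype(int)  # placeholder class label
y_reg = X_demo @ np.arange(9.0)                 # exactly linear target

cv_demo = KFold(n_splits=5, shuffle=True, random_state=42)

# Classifier: the default score equals scoring="accuracy".
acc_default = cross_val_score(KNeighborsClassifier(), X_demo, y_class, cv=cv_demo)
acc_explicit = cross_val_score(
    KNeighborsClassifier(), X_demo, y_class, cv=cv_demo, scoring="accuracy"
)

# Regressor: the default score equals scoring="r2".
r2_default = cross_val_score(LinearRegression(), X_demo, y_reg, cv=cv_demo)
r2_explicit = cross_val_score(
    LinearRegression(), X_demo, y_reg, cv=cv_demo, scoring="r2"
)

print(np.allclose(acc_default, acc_explicit), np.allclose(r2_default, r2_explicit))
```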

In [7]:
warnings.filterwarnings("ignore")


# Function to print evaluated classifiers
def print_evaluated_classifiers():
    print("Final Boards Classification:")
    final_results_classifiers = evaluate_classifiers(final_X, final_y, classifiers)

    for name, scores in final_results_classifiers.items():
        print(f"{name}:\nAccuracy: {scores.mean():.3f}")

        # Confusion matrix from a single 80/20 hold-out split (separate from the 10-fold accuracy above)
        X_train, X_test, y_train, y_test = train_test_split(final_X, final_y, test_size=0.2, random_state=42)
        clf = classifiers[name]
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        print(f"Confusion Matrix for {name}:\n{confusion_matrix(y_test, y_pred)}\n")




    print("Intermediate Boards Optimal Play (Single Label):")
    single_results_classifiers = evaluate_classifiers(single_X, single_y, classifiers)

    for name, scores in single_results_classifiers.items():
        print(f"{name}:\nAccuracy: {scores.mean():.3f}")

        # Confusion matrix from a single 80/20 hold-out split (separate from the 10-fold accuracy above)
        X_train, X_test, y_train, y_test = train_test_split(single_X, single_y, test_size=0.2, random_state=42)
        clf = classifiers[name]
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        print(f"Confusion Matrix for {name}:\n{confusion_matrix(y_test, y_pred)}\n")

    
    
# Function to print evaluated regressors
def print_evaluated_regressors():
    print("Intermediate Boards Optimal Play (Multi Label):")
    multi_results_regressors = evaluate_regressors(multi_X, multi_y, regressors)

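    for name, scores in multi_results_regressors.items():
        # cross_val_score falls back to each regressor's own score method,
        # so the value printed here is the mean R^2, not a classification accuracy
        print(f"{name}:\n Accuracy: {scores.mean():.3f}")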


In [8]:
print_evaluated_classifiers()
print_evaluated_regressors()

Final Boards Classification:
LinearSVM:
Accuracy: 0.983
Confusion Matrix for LinearSVM:
[[ 61   6]
 [  0 125]]

KNN_classifier:
Accuracy: 0.999
Confusion Matrix for KNN_classifier:
[[ 66   1]
 [  0 125]]

MLP_classifier:
Accuracy: 0.985
Confusion Matrix for MLP_classifier:
[[ 62   5]
 [  0 125]]

Intermediate Boards Optimal Play (Single Label):
LinearSVM:
Accuracy: 0.365
Confusion Matrix for LinearSVM:
[[321   0   0   0   0   0   0   0   0]
 [ 73  18  26   0  36   0   0   0   0]
 [104   0  56   0  35   0   0   0   0]
 [ 75   5  15   3  20   0   0   0   0]
 [130   0   0   0  89   0   0   0   0]
 [ 42   6  16   0   6   0   0   0   0]
 [ 56   6  18   3  11   0   0   0   0]
 [ 26  10  14   0   3   0   0   0   0]
 [ 58   5  24   0   4   0   0   0   0]]

KNN_classifier:
Accuracy: 0.761
Confusion Matrix for KNN_classifier:
[[278   5  13   1  11   3   8   0   2]
 [  6 110   7  10  11   0   5   0   4]
 [ 20   7 152   5   4   1   3   0   3]
 [ 16   8   5  74   8   1   3   1   2]
 [ 21   8   4  1
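
The accuracies above are 10-fold means, while each confusion matrix comes from one 80/20 split, so the two are computed on different test samples. One possible alternative, sketched here on synthetic data, is `cross_val_predict`, which yields one out-of-fold prediction per sample so that a single confusion matrix covers the whole dataset. `X_demo`, `y_demo` and `cv_demo` are placeholders, not the notebook's data.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: random boards and a placeholder binary label.
rng = np.random.default_rng(0)
X_demo = rng.integers(-1, 2, size=(300, 9)).astype(float)
y_demo = (X_demo[:, 4] > 0).astype(int)

cv_demo = KFold(n_splits=10, shuffle=True, random_state=42)

# Every sample is predicted by a model that did not see it during training.
y_oof = cross_val_predict(
    KNeighborsClassifier(n_neighbors=5), X_demo, y_demo, cv=cv_demo
)

cm = confusion_matrix(y_demo, y_oof)
pooled_accuracy = accuracy_score(y_demo, y_oof)
print(cm)
print(f"Pooled out-of-fold accuracy: {pooled_accuracy:.3f}")
```

The diagonal of `cm` divided by its total reproduces `pooled_accuracy`, so the matrix and the score describe the same predictions.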

### 4. Tuning the models

In [9]:
best_knn_classifier = KNeighborsClassifier(n_neighbors=6, weights='distance')
classifiers["KNN_classifier"] = best_knn_classifier

best_mlp_classifier = MLPClassifier(hidden_layer_sizes=(100, 50), max_iter=1000, alpha=0.0001, learning_rate_init=0.001)
classifiers["MLP_classifier"] = best_mlp_classifier
print_evaluated_classifiers()

Final Boards Classification:
LinearSVM:
Accuracy: 0.983
Confusion Matrix for LinearSVM:
[[ 61   6]
 [  0 125]]

KNN_classifier:
Accuracy: 0.999
Confusion Matrix for KNN_classifier:
[[ 66   1]
 [  0 125]]

MLP_classifier:
Accuracy: 0.992
Confusion Matrix for MLP_classifier:
[[ 63   4]
 [  0 125]]

Intermediate Boards Optimal Play (Single Label):
LinearSVM:
Accuracy: 0.365
Confusion Matrix for LinearSVM:
[[321   0   0   0   0   0   0   0   0]
 [ 73  18  26   0  36   0   0   0   0]
 [104   0  56   0  35   0   0   0   0]
 [ 75   5  15   3  20   0   0   0   0]
 [130   0   0   0  89   0   0   0   0]
 [ 42   6  16   0   6   0   0   0   0]
 [ 56   6  18   3  11   0   0   0   0]
 [ 26  10  14   0   3   0   0   0   0]
 [ 58   5  24   0   4   0   0   0   0]]

KNN_classifier:
Accuracy: 0.911
Confusion Matrix for KNN_classifier:
[[299   0   8   0   3   3   6   2   0]
 [  3 130   4   4   4   0   2   2   4]
 [ 12   3 178   2   0   0   0   0   0]
 [ 11   6   3  92   3   0   1   0   2]
 [ 12   7   2   
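
The notebook does not show how the tuned hyperparameters were found. A grid search is one common way to arrive at values like these; the sketch below runs on synthetic data, and the candidate grid, `X_demo` and `y_demo` are assumptions rather than the settings actually searched.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: random boards and a placeholder binary label.
rng = np.random.default_rng(0)
X_demo = rng.integers(-1, 2, size=(300, 9)).astype(float)
y_demo = (X_demo[:, 4] > 0).astype(int)

# Hypothetical candidate grid; the values actually tried are not recorded.
param_grid = {
    "n_neighbors": [3, 5, 6, 8],
    "weights": ["uniform", "distance"],
}

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid,
    cv=KFold(n_splits=5, shuffle=True, random_state=42),
    scoring="accuracy",
)
search.fit(X_demo, y_demo)

print(search.best_params_, round(search.best_score_, 3))
```

After fitting, `search.best_estimator_` is refit on all the data and could take the place of a hand-built estimator in the `classifiers` dictionary.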

In [10]:
best_knn_regressor = KNeighborsRegressor(metric='euclidean', n_neighbors=6, weights='distance')

best_mlp_regressor = MLPRegressor(hidden_layer_sizes=(300, 200, 100), max_iter=1000, alpha=0.0001, learning_rate='adaptive', learning_rate_init=0.001)

regressors["KNN_regressor"] = best_knn_regressor
regressors["MLP_regressor"] = best_mlp_regressor

print_evaluated_regressors()

Intermediate Boards Optimal Play (Multi Label):
Linear Regression:
 Accuracy: 0.001
KNN_regressor:
 Accuracy: 0.851
MLP_regressor:
 Accuracy: 0.853
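
The values printed above are R² scores. If an accuracy-style number is wanted for the multi-label task, the regressor outputs can be converted to decisions first. The sketch below shows two options on synthetic data, assuming each of the nine target columns is 1 when that square is an optimal move and 0 otherwise; the 0.5 threshold and all `*_demo` names are illustrative choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data: random boards, with every empty square flagged
# as "optimal" purely so the example has multi-label targets.
rng = np.random.default_rng(0)
X_demo = rng.integers(-1, 2, size=(400, 9)).astype(float)
Y_demo = (X_demo == 0).astype(float)

X_train, X_test, Y_train, Y_test = train_test_split(
    X_demo, Y_demo, test_size=0.2, random_state=42
)

reg = KNeighborsRegressor(n_neighbors=6, weights="distance")
reg.fit(X_train, Y_train)
Y_pred = reg.predict(X_test)  # continuous scores, one column per square

# Option 1: threshold each output and compare square by square.
per_square_accuracy = np.mean((Y_pred >= 0.5) == (Y_test == 1))

# Option 2: play the highest-scoring square and check that it is labelled optimal.
top_move = Y_pred.argmax(axis=1)
top_move_hit_rate = np.mean(Y_test[np.arange(len(Y_test)), top_move] == 1)

print(f"Per-square accuracy: {per_square_accuracy:.3f}")
print(f"Top-move hit rate:   {top_move_hit_rate:.3f}")
```

The second measure is closer to how a game would use the model, since only one move is played per turn.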


### 5. Results

1) Final Boards Classification:
     + Linear SVM achieved an accuracy of 98.3%.
     + KNN classifier achieved an accuracy of 99.9%.
     + MLP classifier achieved an accuracy of 99.2%.
2) Intermediate Boards Optimal Play (Single Label):
     + Linear SVM achieved an accuracy of 36.5%.
     + KNN classifier achieved an accuracy of 91.1%.
     + MLP classifier achieved an accuracy of 94.9%.
3) Intermediate Boards Optimal Play (Multi Label), scored with R² because `cross_val_score` uses each regressor's default metric:
     + Linear Regression achieved an R² of 0.001.
     + KNN regressor achieved an R² of 0.851.
     + MLP regressor achieved an R² of 0.853.

Reasoning: 
 + The "Final Boards Classification" task involves predicting the winning player based on the final state of the game. All three classifiers (Linear SVM, KNN, and MLP) performed exceptionally well, with high accuracies above 98%. This is likely because the final state of the game contains clear patterns that these classifiers were able to learn effectively.
 + In contrast, the "Intermediate Boards Optimal Play (Single Label)" task involves predicting the best move for the O player based on the current state of the game. Here, the MLP classifier outperformed the others with an accuracy of 94.9%. MLP's success could be attributed to its ability to capture complex patterns in the data, which are necessary for making optimal move predictions in a dynamic game like tic-tac-toe.
 + The "Intermediate Boards Optimal Play (Multi Label)" task aims to predict multiple optimal moves for the O player. The MLP regressor scored marginally higher than the KNN regressor (0.853 versus 0.851), and both scored far above linear regression (0.001). This gap suggests that the mapping from board features to optimal moves is strongly non-linear, which a linear model cannot represent.
 
Overall, the MLP models performed best on the two intermediate-board tasks, while the KNN classifier was marginally ahead on final boards (99.9% versus 99.2%). The nature of the task (classification or regression) also influenced which model performed best. We chose the MLP classifier for the game implementation because it reached the highest accuracy on intermediate boards optimal play (94.9%, ahead of KNN and Linear SVM) and stayed above 99% on final boards classification.
Additionally, the confusion matrices revealed that the MLP classifier achieved better balance in predicting optimal moves across different board configurations compared to the other classifiers. The MLP's ability to capture complex patterns in the data through its multilayer architecture likely contributed to its superior performance, making it a suitable choice for accurately predicting optimal moves in the tic-tac-toe game implementation.
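
Because the classifier is not perfectly accurate, a game loop may want to guard against it proposing an occupied square. The sketch below is one possible guard, not the notebook's game code: `choose_move` is a hypothetical helper that ranks squares by `predict_proba` and only considers empty ones. It assumes the class labels are square indices 0 to 8, and the demo model is trained on synthetic data.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier


def choose_move(model, board):
    """Return the index of an empty square, preferring the model's top-ranked one.

    Assumes the model's class labels are square indices 0-8, which is an
    assumption about the label encoding rather than something the notebook states.
    """
    board = np.asarray(board, dtype=float)
    empty = np.flatnonzero(board == 0)
    if empty.size == 0:
        raise ValueError("board is full")

    proba = model.predict_proba(board.reshape(1, -1))[0]
    scores = np.full(9, -np.inf)  # occupied or unseen squares stay at -inf
    for label, p in zip(model.classes_, proba):
        square = int(label)
        if board[square] == 0:
            scores[square] = p

    if np.isneginf(scores).all():
        return int(empty[0])  # fallback: no empty square has a known class
    return int(scores.argmax())


# Synthetic stand-in data: every row is forced to contain an empty square,
# and the placeholder label is simply the first empty square.
rng = np.random.default_rng(0)
X_demo = rng.integers(-1, 2, size=(300, 9)).astype(float)
X_demo[np.arange(300), rng.integers(0, 9, size=300)] = 0.0
y_demo = (X_demo == 0).argmax(axis=1)

demo_model = MLPClassifier(hidden_layer_sizes=(20,), max_iter=300, random_state=0)
demo_model.fit(X_demo, y_demo)

board = [1, -1, 0, 1, 0, 0, -1, 0, 0]
move = choose_move(demo_model, board)
print(f"O plays square {move}")
```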

### 6. Saving the model

In [11]:
from joblib import dump

# Save the model to a file
dump(classifiers["MLP_classifier"], 'model.joblib')

['model.joblib']
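
To use the saved file elsewhere, `joblib.load` restores the fitted estimator. The round trip below is a minimal sketch on a small synthetic model written to a temporary directory, so it does not touch `model.joblib`; the data and the KNN model are placeholders.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data and model.
rng = np.random.default_rng(0)
X_demo = rng.integers(-1, 2, size=(100, 9)).astype(float)
y_demo = (X_demo[:, 4] > 0).astype(int)
model = KNeighborsClassifier(n_neighbors=3).fit(X_demo, y_demo)

# Save and reload inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.joblib")
    dump(model, path)
    restored = load(path)

# The restored estimator should reproduce the original predictions.
same_predictions = np.array_equal(model.predict(X_demo), restored.predict(X_demo))
print(same_predictions)
```

joblib files are pickle-based, so they should only be loaded from trusted sources, ideally with the same scikit-learn version that wrote them.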