# Challenge 1 - Tic Tac Toe

In this lab you will apply deep learning to a dataset of [Tic Tac Toe](https://en.wikipedia.org/wiki/Tic-tac-toe) board configurations.

The Tic Tac Toe board has 9 squares, which are coded as the following picture shows:

![Tic Tac Toe Grids](tttboard.jpg)

The first 9 columns of the dataset record which mark (`x` or `o`) occupies each square. A square with no mark is labeled `b`. The last column, `class`, tells you whether Player X (who always moves first in Tic Tac Toe) wins in this configuration. Note that when `class` is `False`, either Player O wins the game or it ends in a draw.

Follow the steps suggested below to conduct a neural network analysis using TensorFlow and Keras. You will build a deep learning model to predict whether Player X wins the game or not.

## Step 1: Data Engineering

This dataset is almost in the ready-to-use state so you do not need to worry about missing values and so on. Still, some simple data engineering is needed.

1. Read `tic-tac-toe.csv` into a dataframe.
1. Inspect the dataset. Determine if the dataset is reliable by eyeballing the data.
1. Convert the categorical values to numeric in all columns.
1. Separate the inputs and output.
1. Normalize the input data.
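One way to go beyond eyeballing in step 2 is to recompute the label from the board and compare it with `class`. The sketch below is a minimal illustration, not part of the original lab: the helper name `x_has_line` is hypothetical, and the column names are assumed to match the CSV header shown later in this notebook (`TL` through `BR`).

```python
# Column names are assumed to follow the header of tic-tac-toe.csv.
CELLS = ["TL", "TM", "TR", "ML", "MM", "MR", "BL", "BM", "BR"]

# The eight winning lines, as index triples into CELLS.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def x_has_line(row):
    """Return True if 'x' fills any winning line of a board given as a mapping."""
    marks = [row[name] for name in CELLS]
    return any(all(marks[i] == "x" for i in line) for line in LINES)

# A board with x across the top row.
first_row = dict(zip(CELLS, "x x x x o o x o o".split()))
print(x_has_line(first_row))
```

With the dataframe loaded, `(data.apply(x_has_line, axis=1) == data["class"]).all()` should be `True` if the labels are consistent with the boards.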

In [None]:
# The cell below imports the following libraries:

# pandas is a data manipulation library that provides data structures and functions to manipulate and analyze data.
# train_test_split is a function from the sklearn.model_selection module that splits a dataset into two parts, a training set and a testing set, so that the model can be trained and evaluated on separate data.
# fetch_openml is a function from the sklearn.datasets module that retrieves datasets from the OpenML platform.
# ConvergenceWarning is a warning from the sklearn library that is raised when a model fails to converge.
# MLPClassifier is a class from the sklearn.neural_network module that implements a Multi-Layer Perceptron (MLP) classifier, a type of artificial neural network.
# preprocessing is a module from the sklearn library that provides preprocessing functions for data.
# OneHotEncoder is a class from the sklearn.preprocessing module that converts categorical variables into numerical variables through one-hot encoding.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_openml
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier
from sklearn import preprocessing
from sklearn.preprocessing import OneHotEncoder

In [2]:
data = pd.read_csv('tic-tac-toe.csv')

In [3]:
data.head()

Unnamed: 0,TL,TM,TR,ML,MM,MR,BL,BM,BR,class
0,x,x,x,x,o,o,x,o,o,True
1,x,x,x,x,o,o,o,x,o,True
2,x,x,x,x,o,o,o,o,x,True
3,x,x,x,x,o,o,o,b,b,True
4,x,x,x,x,o,o,b,o,b,True


In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 958 entries, 0 to 957
Data columns (total 10 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   TL      958 non-null    object
 1   TM      958 non-null    object
 2   TR      958 non-null    object
 3   ML      958 non-null    object
 4   MM      958 non-null    object
 5   MR      958 non-null    object
 6   BL      958 non-null    object
 7   BM      958 non-null    object
 8   BR      958 non-null    object
 9   class   958 non-null    bool  
dtypes: bool(1), object(9)
memory usage: 68.4+ KB


In [None]:
# The cell below one-hot encodes the pandas data frame data.

# OneHotEncoder(drop='first') creates an instance of the OneHotEncoder class; drop='first' drops the first category of each column to avoid the "dummy variable trap" (perfectly collinear columns).
# .fit(data) fits the encoder on the data frame and learns the categories in each column, including the boolean class column.
# encoder.get_feature_names_out(input_features=data.columns) returns the names of the newly created one-hot encoded columns.
# pd.DataFrame(encoder.transform(data).toarray(),columns=cols) creates a new data frame data_encoded with the one-hot encoded data, and the columns are named using the cols variable.
# data_encoded.head() returns the first 5 rows of data_encoded.
# The resulting data_encoded data frame is the one-hot encoded representation of data. Because b is the first category, a blank square is encoded as 0 in both its _o and _x columns.

In [7]:
encoder = OneHotEncoder(drop='first').fit(data)
cols = encoder.get_feature_names_out(input_features=data.columns)
data_encoded = pd.DataFrame(encoder.transform(data).toarray(),columns=cols)
data_encoded.head()

Unnamed: 0,TL_o,TL_x,TM_o,TM_x,TR_o,TR_x,ML_o,ML_x,MM_o,MM_x,MR_o,MR_x,BL_o,BL_x,BM_o,BM_x,BR_o,BR_x,class_True
0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0
1,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0
2,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0
3,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0
4,0.0,1.0,0.0,1.0,0.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0


In [None]:
# The cell below splits the one-hot encoded data into features X and target y.

# data_encoded.drop('class_True', axis=1) creates a new data frame X by dropping the 'class_True' column from data_encoded. This column represents the target, so it will not be used as a feature.
# data_encoded['class_True'] selects the 'class_True' column as a Series y (1.0 when Player X wins, 0.0 otherwise). This column represents the target.
# Thus, X contains the one-hot encoded features and y contains the target.

In [8]:
X=data_encoded.drop('class_True', axis=1)
y=data_encoded['class_True']

## Step 2: Build Neural Network

To build the neural network, you can refer to the code you wrote while following the [Deep Learning with Python, TensorFlow, and Keras tutorial](https://www.youtube.com/watch?v=wQ8BIBpya2k) in the lesson. It is very similar to what you will do in this lab.

1. Split the training and test data.
1. Create a `Sequential` model.
1. Add several layers to your model. Make sure you use ReLU as the activation function for the middle layers. Use Softmax for the output layer because each output has a single label and all the label probabilities add up to 1.
1. Compile the model using `adam` as the optimizer and `sparse_categorical_crossentropy` as the loss function. For metrics, use `accuracy` for now.
1. Fit the training data.
1. Evaluate your neural network model with the test data.
1. Save your model as `tic-tac-toe.model`.

In [9]:
X_train, X_test, y_train, y_test = train_test_split(X, y)

In [None]:
# The cell below applies standard scaling to the training and test data.

# preprocessing.StandardScaler() creates an instance of the StandardScaler class.
# scaler.fit_transform(X_train) fits the scaler on the training data X_train and returns the scaled data.
# scaler.transform(X_test) scales the test data X_test using the parameters learned from the training data.
# Standard scaling is a preprocessing step that scales the features so that they have zero mean and unit variance. This can help the machine learning algorithm perform better by removing any inherent scale differences between the features.
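The fit-on-train, transform-on-test pattern can be checked on a tiny example. The arrays below are made up for illustration; the point is that the test row is scaled with the mean and standard deviation learned from the training rows only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up 0/1 features, similar in spirit to one-hot encoded columns.
train = np.array([[0.0, 1.0], [1.0, 1.0], [1.0, 0.0], [0.0, 0.0]])
test = np.array([[1.0, 1.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
test_scaled = scaler.transform(test)        # reuses the train statistics

print(train_scaled.mean(axis=0))  # per-column mean of the scaled training data
print(train_scaled.std(axis=0))   # per-column standard deviation
print(test_scaled)
```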

In [10]:
scaler = preprocessing.StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
# The cell below creates an instance of a Multi-layer Perceptron (MLP) classifier, which is a type of artificial neural network.

# MLPClassifier(hidden_layer_sizes=(15,15), solver='sgd', verbose=10, max_iter=500, random_state=1) creates an instance of the MLPClassifier class.
# hidden_layer_sizes=(15, 15) specifies the number of nodes in each hidden layer, with two hidden layers of 15 nodes each.
# solver='sgd' specifies the optimization algorithm to use, in this case Stochastic Gradient Descent (SGD).
# verbose=10 turns on progress messages; scikit-learn treats verbose as a boolean, so any truthy value prints the loss at each iteration.
# max_iter=500 specifies the maximum number of iterations; for the 'sgd' solver this is the number of epochs (passes over the training data).
# random_state=1 sets the random seed for reproducibility.
# This instance of the MLPClassifier class will be used to fit and make predictions with the data.





In [11]:
NeuralNetwork = MLPClassifier(hidden_layer_sizes= (15,15), solver='sgd', verbose=10, max_iter=500, random_state=1)


In [None]:
# The cell below trains the MLP classifier on the training data and evaluates its performance on both the training and test data.

# NeuralNetwork.fit(X_train, y_train) trains the MLP classifier on the training data X_train and target y_train.
# NeuralNetwork.score(X_train, y_train) returns the mean accuracy on the given training data and target. The result is printed with a message "Training set score: %f".
# NeuralNetwork.score(X_test, y_test) returns the mean accuracy on the given test data and target. The result is printed with a message "Test set score: %f".
# The score method returns the mean accuracy of the classifier, which is the number of correct predictions divided by the total number of predictions. A high score indicates that the classifier is making accurate predictions.
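As a quick illustration of what the score measures, mean accuracy can be computed by hand. The label arrays below are made up; they are not outputs of the classifier in this notebook.

```python
import numpy as np

# Made-up true labels and predictions for five samples.
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0])

# Accuracy = number of correct predictions / total number of predictions.
accuracy = np.mean(y_true == y_pred)
print(accuracy)
```

Four of the five predictions match, so the accuracy is 0.8.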

In [12]:
NeuralNetwork.fit(X_train, y_train)

print("Training set score: %f" % NeuralNetwork.score(X_train, y_train))
print("Test set score: %f" % NeuralNetwork.score(X_test, y_test))

Iteration 1, loss = 0.79896907
Iteration 2, loss = 0.79321973
Iteration 3, loss = 0.78459831
Iteration 4, loss = 0.77537208
Iteration 5, loss = 0.76563471
Iteration 6, loss = 0.75631730
Iteration 7, loss = 0.74828337
Iteration 8, loss = 0.74016936
Iteration 9, loss = 0.73329525
Iteration 10, loss = 0.72707442
Iteration 11, loss = 0.72135838
Iteration 12, loss = 0.71635584
Iteration 13, loss = 0.71202812
Iteration 14, loss = 0.70795889
Iteration 15, loss = 0.70414780
Iteration 16, loss = 0.70088931
Iteration 17, loss = 0.69781675
Iteration 18, loss = 0.69494937
Iteration 19, loss = 0.69231716
Iteration 20, loss = 0.68966015
Iteration 21, loss = 0.68718923
Iteration 22, loss = 0.68495732
Iteration 23, loss = 0.68264461
Iteration 24, loss = 0.68050575
Iteration 25, loss = 0.67846301
Iteration 26, loss = 0.67650031
Iteration 27, loss = 0.67451880
Iteration 28, loss = 0.67267840
Iteration 29, loss = 0.67080309
Iteration 30, loss = 0.66903883
Iteration 31, loss = 0.66734064
Iteration 32, los

Iteration 343, loss = 0.47830530
Iteration 344, loss = 0.47785839
Iteration 345, loss = 0.47740362
Iteration 346, loss = 0.47694485
Iteration 347, loss = 0.47650411
Iteration 348, loss = 0.47603646
Iteration 349, loss = 0.47557912
Iteration 350, loss = 0.47511356
Iteration 351, loss = 0.47465569
Iteration 352, loss = 0.47418055
Iteration 353, loss = 0.47374679
Iteration 354, loss = 0.47327610
Iteration 355, loss = 0.47282289
Iteration 356, loss = 0.47235657
Iteration 357, loss = 0.47190716
Iteration 358, loss = 0.47144663
Iteration 359, loss = 0.47098070
Iteration 360, loss = 0.47051632
Iteration 361, loss = 0.47004859
Iteration 362, loss = 0.46959078
Iteration 363, loss = 0.46914884
Iteration 364, loss = 0.46866838
Iteration 365, loss = 0.46819978
Iteration 366, loss = 0.46776039
Iteration 367, loss = 0.46726498
Iteration 368, loss = 0.46681509
Iteration 369, loss = 0.46635160
Iteration 370, loss = 0.46589448
Iteration 371, loss = 0.46541937
Iteration 372, loss = 0.46496203
Iteration 



In [13]:
# !pip install tensorflow
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# The cell below creates a sequential model in TensorFlow's tf.keras API.

# The Sequential model is a linear stack of layers. Each layer is added to the model using the add method, which takes a layer instance as argument. The layers are stacked in the order they are added.

# The layers in this model are as follows:

# tf.keras.layers.Flatten flattens the input data into a 1D array, which is required as the input for a dense (fully connected) layer.
# tf.keras.layers.Dense is a dense (fully connected) layer. The arguments specify the number of neurons in the layer (128, 128, 128, 64, 32) and the activation function (tf.nn.relu). The activation function rectified linear unit (ReLU) is used in the hidden layers and is a common choice for neural network activation functions.
# The final dense layer has two neurons and uses the softmax activation function (tf.nn.softmax). The softmax activation is often used for multi-class classification problems to produce a probability distribution over the classes.
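The softmax step can be sketched in NumPy to show why the two outputs form a probability distribution. This is a hand-written illustration of the formula, not the TensorFlow implementation, and the `logits` values are made up.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: exponentiate and normalize so each row sums to 1."""
    # Subtracting the row maximum keeps np.exp from overflowing.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)

# Made-up raw outputs of a two-neuron final layer for two samples.
logits = np.array([[2.0, 0.0], [-1.0, 3.0]])
probs = softmax(logits)

print(probs)
print(probs.sum(axis=1))  # each row sums to 1
```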

In [14]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(64,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(32,activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(2,activation=tf.nn.softmax))

In [None]:
# This line of code compiles the model with the Adam optimizer, using sparse categorical crossentropy as the loss function and accuracy as the evaluation metric.
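Sparse categorical crossentropy takes integer class labels (here 0 or 1) and penalizes the model by the negative log of the probability it assigned to the true class. The NumPy function below is a hand-rolled sketch of that definition with made-up values, not the Keras implementation.

```python
import numpy as np

def sparse_crossentropy_sketch(y_true, probs):
    """Mean negative log-probability assigned to the true class of each row."""
    # Pick, for every row, the probability at the index of its true label.
    picked = probs[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(picked)))

# Made-up softmax outputs and integer labels for two samples.
probs = np.array([[0.9, 0.1], [0.2, 0.8]])
y_true = np.array([0, 1])

loss = sparse_crossentropy_sketch(y_true, probs)
print(loss)
```

Confident, correct predictions push the loss toward 0, which is why low loss and high accuracy tend to go together.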

In [15]:
model.compile(optimizer = 'adam',loss = 'sparse_categorical_crossentropy',metrics = ['accuracy'])


In [None]:
# This line of code trains the model on the training data X_train and y_train for 20 epochs.

In [16]:
model.fit(X_train,y_train,epochs =20)


Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x2409cc45b20>

In [None]:
# This code evaluates the model on the test data X_test and y_test and returns the loss and accuracy of the model on the test data. The val_loss and val_acc variables store the loss and accuracy respectively, which can then be printed to get the values.

In [17]:
val_loss,val_acc = model.evaluate(X_test,y_test)
print(val_loss)
print(val_acc)

0.03359709680080414
0.9916666746139526


In [None]:
# The cell below saves the trained neural network with model.save and loads it back with tf.keras.models.load_model. Because the path 'model' has no file extension, this TensorFlow version writes a SavedModel directory (hence the "Assets written to: model\assets" message) rather than a single H5 file.

In [18]:
model.save('model')
model_lab = tf.keras.models.load_model('model')



INFO:tensorflow:Assets written to: model\assets




In [None]:
# The cell below uses the trained neural network model to make predictions on the test data. The predict function of the model returns the predicted class probabilities for each sample in the test data (X_test). The resulting predictions object is a 2D numpy array with shape (number of samples in X_test, number of classes in the target).

In [19]:
predictions = model_lab.predict(X_test)



## Step 3: Make Predictions

Now load your saved model and use it to make predictions on a few random rows in the test dataset. Check if the predictions are correct.
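A possible way to spot-check a few random rows is sketched below. The helper `check_random_rows` is a hypothetical name introduced for illustration, and the demo probabilities are made up so the block runs without a trained model.

```python
import numpy as np

def check_random_rows(probabilities, y_true, n_rows=5, seed=0):
    """Pick random rows and compare the predicted class with the true label."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true).astype(int)
    n_rows = min(n_rows, len(y_true))
    idx = rng.choice(len(y_true), size=n_rows, replace=False)
    # The predicted class is the column with the highest probability.
    predicted = np.argmax(probabilities[idx], axis=1)
    return idx, predicted, predicted == y_true[idx]

# Stand-in values; in the notebook these would come from the loaded model.
demo_probs = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]])
demo_labels = [1.0, 0.0, 0.0, 0.0]

idx, predicted, correct = check_random_rows(demo_probs, demo_labels, n_rows=4)
for i, p, ok in zip(idx, predicted, correct):
    print(f"row {i}: predicted {p}, correct: {ok}")
```

In the notebook, the same call would look like `check_random_rows(model_lab.predict(X_test), y_test)`.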

In [20]:
# your code here

In [None]:
# np.argmax(predictions, axis=1) returns, for each row, the index of the largest value along the second axis (the columns), which is the predicted class (0 or 1). The full predictions array contains the softmax probabilities for each class.

In [21]:
print(np.argmax(predictions, axis=1)) 
predictions

[1 0 0 1 1 1 0 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 0 1 1 0 1 0 1 1 1
 1 0 1 1 1 1 1 1 1 0 1 1 0 0 1 1 1 0 0 1 0 1 0 1 0 0 1 1 1 1 0 1 1 0 1 1 1
 1 1 1 0 0 1 1 1 0 0 1 1 1 1 1 1 1 0 1 0 0 1 1 0 0 1 1 0 1 0 1 1 1 0 1 0 1
 0 1 0 1 1 1 1 1 1 1 0 1 1 1 0 1 1 0 1 1 0 1 1 1 1 1 1 0 1 0 0 0 0 0 1 1 1
 0 1 1 1 1 0 1 1 1 1 0 1 1 1 0 0 0 1 0 0 1 0 0 1 0 1 1 1 0 0 1 1 1 1 1 1 0
 1 0 0 0 1 0 1 0 1 0 0 1 1 0 0 0 1 0 1 0 1 1 0 0 1 1 1 1 0 1 1 1 1 1 0 0 1
 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 0 1 1]


array([[3.37086618e-04, 9.99662876e-01],
       [9.99312639e-01, 6.87410182e-04],
       [9.99966502e-01, 3.35108452e-05],
       [4.01379184e-05, 9.99959826e-01],
       [9.30544388e-07, 9.99999046e-01],
       [2.85972455e-05, 9.99971390e-01],
       [9.99550641e-01, 4.49295942e-04],
       [1.76196886e-06, 9.99998212e-01],
       [1.50778578e-05, 9.99984980e-01],
       [1.29785081e-02, 9.87021506e-01],
       [3.56956519e-07, 9.99999642e-01],
       [2.76053231e-03, 9.97239470e-01],
       [5.73461839e-05, 9.99942660e-01],
       [2.29937432e-05, 9.99976993e-01],
       [4.51164342e-05, 9.99954939e-01],
       [4.61907848e-06, 9.99995351e-01],
       [9.99341905e-01, 6.58042962e-04],
       [9.99635100e-01, 3.64893262e-04],
       [9.99915361e-01, 8.45852119e-05],
       [3.05284266e-06, 9.99996901e-01],
       [6.21824729e-05, 9.99937773e-01],
       [7.27889346e-06, 9.99992728e-01],
       [2.88679119e-04, 9.99711335e-01],
       [5.34281775e-04, 9.99465764e-01],
       [7.446423

## Step 4: Improve Your Model

Did your model achieve low loss (<0.1) and high accuracy (>0.95)? If not, try to improve your model.

But how? There are many things you can adjust in TensorFlow, and in the next challenge you'll learn about them. In this challenge, let's just try a few things to see if they help.

* Add more layers to your model. If the data are complex you need more layers. But don't use more layers than you need. If adding more layers does not improve the model performance you don't need additional layers.
* Adjust the learning rate when you compile the model. This means you will create a custom `tf.keras.optimizers.Adam` instance where you specify the learning rate you want. Then pass the instance to `model.compile` as the optimizer.
    * `tf.keras.optimizers.Adam` [reference](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam).
    * Don't worry if you don't understand what the learning rate does. You'll learn about it in the next challenge.
* Adjust the number of epochs when you fit the training data to the model. Your model performance continues to improve as you train more epochs. But eventually it will reach the ceiling and the performance will stay the same.
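The learning-rate suggestion above can be sketched as follows. The layer sizes, the `build_model` helper, and the value `0.0005` are assumptions chosen for illustration, not recommended settings. The 18 input features correspond to the one-hot encoded columns produced in Step 1.

```python
import tensorflow as tf

def build_model(n_features, learning_rate=0.001):
    """Build and compile a small classifier with a custom Adam learning rate."""
    model = tf.keras.models.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    # Pass an Adam instance instead of the string 'adam' to set the rate.
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    model.compile(
        optimizer=optimizer,
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

model_tuned = build_model(n_features=18, learning_rate=0.0005)
model_tuned.summary()
```

Training then works as before, for example `model_tuned.fit(X_train, y_train, epochs=30)`, where the epoch count is another value to experiment with.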

In [None]:
# your code here

**Which approach(es) did you find helpful to improve your model performance?**

In [None]:
# your answer here