# Challenge 1 - Tic Tac Toe

In this lab you will apply deep learning to a dataset of [Tic Tac Toe](https://en.wikipedia.org/wiki/Tic-tac-toe) board configurations.

A Tic Tac Toe board has 9 cells, which are coded as the following picture shows:

![Tic Tac Toe Grids](tttboard.jpg)

In the first 9 columns of the dataset you can find which mark (`x` or `o`) occupies each cell. If a cell has no mark, it is labeled `b`. The last column is `class`, which tells you whether Player X (who always moves first in Tic Tac Toe) wins in this configuration. Note that when `class` has the value `False`, either Player O wins the game or it ends in a draw.
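To make the `class` label concrete, here is a minimal sketch that checks whether `x` holds a complete line. It assumes the nine cells are stored row by row, from top-left to bottom-right; that ordering is an assumption about the board coding, and `WIN_LINES` and `x_wins` are illustrative names that do not come from the dataset or the lab.

```python
# The 8 winning lines, as index triples into a row-major 9-cell board.
# Assumption: cell 0 is top-left and cell 8 is bottom-right.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def x_wins(board):
    """Return True if 'x' occupies every cell of at least one winning line."""
    return any(all(board[i] == "x" for i in line) for line in WIN_LINES)


full_top_row = ["x", "x", "x", "x", "o", "o", "x", "o", "o"]
no_line_for_x = ["x", "o", "x", "o", "x", "o", "o", "x", "o"]

print(x_wins(full_top_row))   # True: the top row is all 'x'
print(x_wins(no_line_for_x))  # False: no row, column, or diagonal is all 'x'
```

The network in this lab has to learn an equivalent rule from examples instead of being given it.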

Follow the steps suggested below to conduct a neural network analysis using TensorFlow and Keras. You will build a deep learning model to predict whether Player X wins the game or not.

## Step 1: Data Engineering

This dataset is almost ready to use, so you do not need to worry about missing values and similar issues. Still, some simple data engineering is needed.

1. Read `tic-tac-toe.csv` into a dataframe.
1. Inspect the dataset. Determine if the dataset is reliable by eyeballing the data.
1. Convert the categorical values to numeric in all columns.
1. Separate the inputs and output.
1. Normalize the input data.
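The conversion and normalization steps above can be sketched in plain Python. Label encoding assigns integer codes in sorted order of the category names, which is how scikit-learn's `LabelEncoder` treats string labels, and min-max scaling then maps the codes onto the range 0 to 1. The helpers `label_encode` and `min_max_scale` are hypothetical stand-ins for illustration, not library functions.

```python
def label_encode(values):
    """Map each distinct value to an integer code, in sorted order."""
    classes = sorted(set(values))
    codes = {name: i for i, name in enumerate(classes)}
    return [codes[v] for v in values], codes


def min_max_scale(values):
    """Rescale numbers linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


column = ["x", "o", "b", "x", "b"]  # made-up sample of one board column
encoded, codes = label_encode(column)
scaled = min_max_scale(encoded)

print(codes)    # {'b': 0, 'o': 1, 'x': 2}
print(encoded)  # [2, 1, 0, 2, 0]
print(scaled)   # [1.0, 0.5, 0.0, 1.0, 0.0]
```

Note that this encoding imposes an order (`b` < `o` < `x`) that has no meaning in the game; one-hot encoding is a common alternative when that matters.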

In [None]:
#!pip install tensorflow


In [2]:
#To install Keras
#!pip install keras



In [80]:
# your code here
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Load the data
data = pd.read_csv('tic-tac-toe.csv')

# Inspect the data
print(data.head())

# Encode categorical values to numeric
encoder = LabelEncoder()
data = data.apply(encoder.fit_transform)

# Separate features and target
X = data.drop('class', axis=1)
y = data['class']

  TL TM TR ML MM MR BL BM BR  class
0  x  x  x  x  o  o  x  o  o   True
1  x  x  x  x  o  o  o  x  o   True
2  x  x  x  x  o  o  o  o  x   True
3  x  x  x  x  o  o  o  b  b   True
4  x  x  x  x  o  o  b  o  b   True


In [76]:
y

0      1
1      1
2      1
3      1
4      1
      ..
953    0
954    0
955    0
956    0
957    0
Name: class, Length: 958, dtype: int64

In [27]:
'''from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Normalize the data
scaler = MinMaxScaler()
X = scaler.fit_transform(X)'''


## Step 2: Build Neural Network

To build the neural network, you can refer to the code you wrote while following the [Deep Learning with Python, TensorFlow, and Keras tutorial](https://www.youtube.com/watch?v=wQ8BIBpya2k) in the lesson. It is very similar to what you will be doing in this lab.

1. Split the training and test data.
1. Create a `Sequential` model.
1. Add several layers to your model. Make sure you use ReLU as the activation function for the middle layers. Use Softmax for the output layer because each sample has a single label and all the label probabilities add up to 1.
1. Compile the model using `adam` as the optimizer and `sparse_categorical_crossentropy` as the loss function. For metrics, use `accuracy` for now.
1. Fit the training data.
1. Evaluate your neural network model with the test data.
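1. Save your model as `tic-tac-toe.keras`. (Older tutorials use names such as `tic-tac-toe.model`, but Keras 3 expects a `.keras` or `.h5` extension when saving.)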
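To see why Softmax pairs with `sparse_categorical_crossentropy`, here is a minimal plain-Python sketch of both. Softmax turns the raw scores of the output layer into probabilities that sum to 1, and the sparse loss is the negative log of the probability assigned to the true class, where the true class is given as an integer index. This is a simplified illustration with made-up scores; real implementations add numerical safeguards and average the loss over a batch.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(true_class, probabilities):
    """Loss for one sample whose label is an integer class index."""
    return -math.log(probabilities[true_class])


probs = softmax([2.0, 0.5])  # made-up scores for classes 0 and 1
loss_if_class_0 = sparse_categorical_crossentropy(0, probs)
loss_if_class_1 = sparse_categorical_crossentropy(1, probs)

print("probabilities:", probs, "sum:", sum(probs))
print("loss if true class is 0:", loss_if_class_0)
print("loss if true class is 1:", loss_if_class_1)
```

The loss is small when the model puts most of its probability on the true class and grows as that probability shrinks.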

In [81]:
# your code here
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print("Training data shape:", X_train.shape)
print("Testing data shape:", X_test.shape)


Training data shape: (766, 9)
Testing data shape: (192, 9)
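The shapes above, and the `24/24` and `6/6` counters that appear in the Keras logs, follow from simple arithmetic. The sketch below assumes the test share is rounded up and that the batch size is 32, as in the training cells of this notebook.

```python
import math

n_samples = 958   # number of rows in the dataset
test_size = 0.2   # fraction passed to train_test_split
batch_size = 32   # batch size used when fitting and predicting

n_test = math.ceil(n_samples * test_size)             # test rows, rounded up
n_train = n_samples - n_test                          # remaining rows go to training
steps_per_epoch = math.ceil(n_train / batch_size)     # batches per training epoch
predict_batches = math.ceil(n_test / batch_size)      # batches when predicting on the test set

print(n_train, n_test)                   # 766 192
print(steps_per_epoch, predict_batches)  # 24 6
```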


In [83]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

batch_size = 32
num_classes = 2
epochs = 50

# Build the neural network
model = Sequential()

model.add(Dense(64, activation='relu', input_shape=(9,)))  # First hidden layer
model.add(Dropout(0.2))
model.add(Dense(32, activation='relu'))                      # Second hidden layer
model.add(Dropout(0.2))
model.add(Dense(16, activation='relu'))                      # Third hidden layer
model.add(Dropout(0.2))
model.add(Dense(8, activation='relu'))                      # Fourth hidden layer
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))                    # Output layer

model.summary()

# Define the optimizer (Adagrad here; Adam is tried in Step 4)
my_opt = tf.keras.optimizers.Adagrad(learning_rate=0.01, epsilon=0.1)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=my_opt,
              metrics=['accuracy'])

# Train the model
# Labels stay as integer class ids because the loss is sparse_categorical_crossentropy;
# one-hot labels from to_categorical would need categorical_crossentropy instead.

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test))

# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)




Epoch 1/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 2s 20ms/step - accuracy: 0.5885 - loss: 0.6784 - val_accuracy: 0.6510 - val_loss: 0.6555
Epoch 2/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.6656 - loss: 0.6428 - val_accuracy: 0.6510 - val_loss: 0.6529
Epoch 3/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 1s 24ms/step - accuracy: 0.6597 - loss: 0.6427 - val_accuracy: 0.6510 - val_loss: 0.6491
Epoch 4/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.6662 - loss: 0.6514 - val_accuracy: 0.6510 - val_loss: 0.6470
Epoch 5/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.6669 - loss: 0.6360 - val_accuracy: 0.6510 - val_loss: 0.6451
Epoch 6/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.6539 - loss: 0.6432 - val_accuracy: 0.6510 - val_loss: 0.6448
Epoch 7/50
24/24 ━━━━━━━━

In [54]:
# Save the trained model
model.save('tic-tac-toe.keras')

## Step 3: Make Predictions

Now load your saved model and use it to make predictions on a few random rows in the test dataset. Check if the predictions are correct.

In [55]:
# your code here
from tensorflow.keras.models import load_model

# Load the saved model
loaded_model = load_model('tic-tac-toe.keras')

In [44]:
import numpy as np

# Make predictions on the test set with the model loaded from disk
predictions = loaded_model.predict(X_test)

# Convert predictions to class labels (0 or 1)
predicted_classes = np.argmax(predictions, axis=1)

# Display the first 10 predictions alongside the true labels
print("Predicted Classes:", predicted_classes[:10])
print("True Classes:     ", np.asarray(y_test)[:10])

6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step
Predicted Classes: [1 1 1 1 0 1 1 0 0 1]
True Classes:      [[1. 0.]
 [0. 1.]
 [0. 1.]
 [1. 0.]
 [1. 0.]
 [0. 1.]
 [0. 1.]
 [0. 1.]
 [0. 1.]
 [1. 0.]]
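In the output above, the predictions are class indices while the true labels are printed as one-hot rows, so the two cannot be compared directly. A minimal sketch of the comparison, using the ten values copied from that output and assuming that position 0 of each one-hot row stands for class 0:

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])


# Values copied from the printed output above
predicted = [1, 1, 1, 1, 0, 1, 1, 0, 0, 1]
true_one_hot = [
    [1, 0], [0, 1], [0, 1], [1, 0], [1, 0],
    [0, 1], [0, 1], [0, 1], [0, 1], [1, 0],
]

# Convert one-hot rows back to class indices, then count agreements
true_labels = [argmax(row) for row in true_one_hot]
matches = sum(p == t for p, t in zip(predicted, true_labels))
accuracy = matches / len(predicted)

print(true_labels)  # [0, 1, 1, 0, 0, 1, 1, 1, 1, 0]
print(accuracy)     # 0.5
```

Ten rows are a very small sample, so this figure says little about the model as a whole; `model.evaluate` on the full test set is the more reliable measure.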


## Step 4: Improve Your Model

Did your model achieve low loss (<0.1) and high accuracy (>0.95)? If not, try to improve your model.

But how? There are many things you can adjust in TensorFlow, and you will learn about them in the next challenge. In this challenge, let's just try a few things to see if they help.

* Add more layers to your model. If the data are complex you need more layers. But don't use more layers than you need. If adding more layers does not improve the model performance you don't need additional layers.
* Adjust the learning rate when you compile the model. This means you will create a custom `tf.keras.optimizers.Adam` instance where you specify the learning rate you want. Then pass the instance to `model.compile` as the optimizer.
    * `tf.keras.optimizers.Adam` [reference](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam).
    * Don't worry if you don't understand what the learning rate does. You'll learn about it in the next challenge.
* Adjust the number of epochs when you fit the training data to the model. Your model performance continues to improve as you train more epochs. But eventually it will reach the ceiling and the performance will stay the same.
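For intuition about the learning rate and the number of steps, here is a toy sketch that minimizes the one-variable function f(w) = (w - 3)^2 with plain gradient descent. It is not Adam and not a neural network; the function, the starting point, and the three rates are made up purely for illustration.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize f(w) = (w - 3)^2 and return the final value of w."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)     # derivative of (w - 3)^2
        w -= learning_rate * grad  # step against the gradient
    return w


for lr in (0.001, 0.1, 1.1):
    print(f"learning_rate={lr}: w after 50 steps = {gradient_descent(lr)}")
```

With the smallest rate, 50 steps are not enough to get close to the minimum at w = 3; with the middle rate, w ends up very close to 3; with the largest rate, each step overshoots and w moves further away. More steps (epochs) help the slow case but cannot rescue the diverging one.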

In [84]:
# your code here
# First step: remove the Dropout layers, reduce the number of layers, and switch to the Adam optimizer

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

batch_size = 32
num_classes = 2
epochs = 50

# Build the neural network
model = Sequential()

model.add(Dense(32, activation='relu', input_shape=(9,)))  # First hidden layer
model.add(Dense(16, activation='relu'))                   # Second hidden layer
model.add(Dense(num_classes, activation='softmax'))       # Output layer

model.summary()

# Define the optimizer
my_opt = Adam(learning_rate=0.001)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=my_opt,
              metrics=['accuracy'])

# Train the model

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test))

# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

'''Results with these modifications 
Test loss: 0.4359966814517975
Test accuracy: 0.8125'''

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 2s 18ms/step - accuracy: 0.6071 - loss: 0.6879 - val_accuracy: 0.6146 - val_loss: 0.6462
Epoch 2/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6593 - loss: 0.6307 - val_accuracy: 0.6198 - val_loss: 0.6287
Epoch 3/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6807 - loss: 0.6059 - val_accuracy: 0.6458 - val_loss: 0.6190
Epoch 4/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.6820 - loss: 0.6170 - val_accuracy: 0.6771 - val_loss: 0.6144
Epoch 5/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6880 - loss: 0.5969 - val_accuracy: 0.6771 - val_loss: 0.6102
Epoch 6/50
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.7002 - loss: 0.5910 - val_accuracy: 0.6719 - val_loss: 0.6052
Epoch 7/50
24/24 ━━━━━━━━

'Results with these modifications \nTest loss: 0.4658966362476349\nTest accuracy: 0.7760416865348816'

In [None]:
# New adjustments: increase epochs to 100, lower the learning rate from 0.001 to 0.0005, and add an EarlyStopping callback
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

batch_size = 32
num_classes = 2
epochs = 100

# Build the neural network
model = Sequential()

model.add(Dense(32, activation='relu', input_shape=(9,)))  # First hidden layer
model.add(Dense(16, activation='relu'))                   # Second hidden layer
model.add(Dense(num_classes, activation='softmax'))       # Output layer

model.summary()

# Define the optimizer
my_opt = Adam(learning_rate=0.0005)

# Early stopping
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=my_opt,
              metrics=['accuracy'])

# Train the model

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test),
                    callbacks=[early_stop])

# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

'''Results with these modifications 
Test loss: 0.4369371831417084
Test accuracy: 0.7916666865348816'''

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 3s 35ms/step - accuracy: 0.6435 - loss: 0.6637 - val_accuracy: 0.6198 - val_loss: 0.6664
Epoch 2/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6340 - loss: 0.6596 - val_accuracy: 0.6510 - val_loss: 0.6553
Epoch 3/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - accuracy: 0.6568 - loss: 0.6380 - val_accuracy: 0.6562 - val_loss: 0.6489
Epoch 4/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6448 - loss: 0.6463 - val_accuracy: 0.6562 - val_loss: 0.6434
Epoch 5/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6442 - loss: 0.6312 - val_accuracy: 0.6510 - val_loss: 0.6392
Epoch 6/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6755 - loss: 0.6128 - val_accuracy: 0.6510 - val_loss: 0.6350
Epoch 7/100
24/24 ━━

'Results with these modifications \nTest loss: 0.4446100890636444\nTest accuracy: 0.8125'
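The `EarlyStopping` callback used in the cell above can be thought of as a counter over the monitored `val_loss`. The sketch below is a simplified plain-Python model of that rule, not the Keras implementation, and the loss values are made up. Training stops once `patience` epochs in a row fail to improve on the best value seen so far; with `restore_best_weights=True`, the weights from the best epoch are the ones kept.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stopped_epoch, best_epoch), both 1-based, for per-epoch val_loss values."""
    best_loss = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training ran for all epochs
    return len(val_losses), best_epoch


val_losses = [0.66, 0.60, 0.55, 0.56, 0.57, 0.54, 0.58, 0.59, 0.60]  # made-up values
stopped, best = early_stopping_epoch(val_losses, patience=3)
print(stopped, best)  # 9 6
```

In this made-up run, the dip at epoch 6 resets the counter, so training stops at epoch 9 and the epoch-6 weights would be restored. A larger `patience` tolerates longer plateaus at the cost of more training time.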

In [None]:
# New adjustments: double the number of neurons in each layer to increase learning capacity.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

batch_size = 32
num_classes = 2
epochs = 100

# Build the neural network
model = Sequential()

model.add(Dense(64, activation='relu', input_shape=(9,)))  # First hidden layer
model.add(Dense(32, activation='relu'))                    # Second hidden layer
model.add(Dense(num_classes, activation='softmax'))        # Output layer

model.summary()

# Define the optimizer
my_opt = Adam(learning_rate=0.0005)

# Early stopping
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=my_opt,
              metrics=['accuracy'])

# Train the model

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test),
                    callbacks=[early_stop])

# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

'''Results after changes:
Test loss: 0.35729217529296875
Test accuracy: 0.8489583134651184'''

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 4s 38ms/step - accuracy: 0.5125 - loss: 0.6917 - val_accuracy: 0.6510 - val_loss: 0.6449
Epoch 2/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.6380 - loss: 0.6389 - val_accuracy: 0.6510 - val_loss: 0.6354
Epoch 3/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - accuracy: 0.6703 - loss: 0.6068 - val_accuracy: 0.6510 - val_loss: 0.6263
Epoch 4/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 1s 18ms/step - accuracy: 0.6607 - loss: 0.6090 - val_accuracy: 0.6510 - val_loss: 0.6176
Epoch 5/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.6834 - loss: 0.5784 - val_accuracy: 0.6667 - val_loss: 0.6105
Epoch 6/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6626 - loss: 0.6030 - val_accuracy: 0.6771 - val_loss: 0.6048
Epoch 7/100
24/24

'Results after changes:\nTest loss: 0.4216085374355316\nTest accuracy: 0.8333333134651184'

In [None]:
# New Adjustments: Increase the patience parameter on the Early Stopping function to 15 and the epochs to 150.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

batch_size = 32
num_classes = 2
epochs = 150

# Build the neural network
model = Sequential()

model.add(Dense(64, activation='relu', input_shape=(9,)))  # First hidden layer
model.add(Dense(32, activation='relu'))                    # Second hidden layer
model.add(Dense(num_classes, activation='softmax'))        # Output layer

model.summary()

# Define the optimizer
my_opt = Adam(learning_rate=0.0005)

# Early stopping
early_stop = EarlyStopping(monitor='val_loss', patience=15, restore_best_weights=True)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=my_opt,
              metrics=['accuracy'])

# Train the model

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test),
                    callbacks=[early_stop])

# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

''' Results:
Test loss: 0.33100640773773193
Test accuracy: 0.8645833134651184'''

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 2s 18ms/step - accuracy: 0.6338 - loss: 0.6475 - val_accuracy: 0.6615 - val_loss: 0.6142
Epoch 2/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 1s 25ms/step - accuracy: 0.6317 - loss: 0.6338 - val_accuracy: 0.6771 - val_loss: 0.6031
Epoch 3/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - accuracy: 0.6828 - loss: 0.5937 - val_accuracy: 0.6979 - val_loss: 0.5961
Epoch 4/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6807 - loss: 0.6146 - val_accuracy: 0.6875 - val_loss: 0.5926
Epoch 5/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.7025 - loss: 0.5744 - val_accuracy: 0.6927 - val_loss: 0.5878
Epoch 6/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 19ms/step - accuracy: 0.7076 - loss: 0.5814 - val_accuracy: 0.6979 - val_loss: 0.5861
Epoch 7/150
24/24

In [None]:
# New adjustments: use a learning rate scheduler (ReduceLROnPlateau) to reduce the learning rate throughout training, and add L2 regularization.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2  # needed for kernel_regularizer=l2(...) below

batch_size = 32
num_classes = 2
epochs = 150

# Build the neural network
model = Sequential()

model.add(Dense(64, activation='relu', input_shape=(9,), kernel_regularizer=l2(0.001)))  # First hidden layer
model.add(Dense(32, activation='relu', kernel_regularizer=l2(0.001)))                    # Second hidden layer
model.add(Dense(num_classes, activation='softmax'))                                      # Output layer

model.summary()

# Define the optimizer
my_opt = Adam(learning_rate=0.0005)

# Early stopping
early_stop = EarlyStopping(monitor='val_loss', patience=15, restore_best_weights=True)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=my_opt,
              metrics=['accuracy'])

# Reduce learning rate when a metric has stopped improving
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-5)

# Train the model

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test),
                    callbacks=[early_stop, reduce_lr])

# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

'''Results:
Test loss: 0.33169957995414734
Test accuracy: 0.8854166865348816'''

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 2s 18ms/step - accuracy: 0.6506 - loss: 0.7066 - val_accuracy: 0.6510 - val_loss: 0.6767 - learning_rate: 5.0000e-04
Epoch 2/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - accuracy: 0.6503 - loss: 0.6834 - val_accuracy: 0.6510 - val_loss: 0.6664 - learning_rate: 5.0000e-04
Epoch 3/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 13ms/step - accuracy: 0.6776 - loss: 0.6566 - val_accuracy: 0.6562 - val_loss: 0.6583 - learning_rate: 5.0000e-04
Epoch 4/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 1s 20ms/step - accuracy: 0.6450 - loss: 0.6672 - val_accuracy: 0.6562 - val_loss: 0.6506 - learning_rate: 5.0000e-04
Epoch 5/150
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 17ms/step - accuracy: 0.6438 - loss: 0.6661 - val_accuracy: 0.6562 - val_loss: 0.6435 - learning_rate: 5.0000e-04
Epoch 6/150
24/24 ━━━━━━━━━━━━━━━━━━━━

In [143]:
# New Adjustments: Update the Learning Rate Scheduler, add a new layer and add regularization.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2

batch_size = 32
num_classes = 2
epochs = 200

# Build the neural network
model = Sequential()

model.add(Dense(128, activation='relu', input_shape=(9,), kernel_regularizer=l2(0.0005)))  # First hidden layer
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.0005)))                    # Second hidden layer
model.add(Dense(32, activation='relu', kernel_regularizer=l2(0.0005)))                    # Third hidden layer
model.add(Dense(16, activation='relu', kernel_regularizer=l2(0.0005)))                    # Fourth hidden layer
model.add(Dense(num_classes, activation='softmax'))                                       # Output layer

model.summary()

# Define the optimizer
my_opt = Adam(learning_rate=0.01)

#Callbacks:
# Early stopping
early_stop = EarlyStopping(monitor='val_loss', patience=15, restore_best_weights=True)

# Reduce learning rate when a metric has stopped improving
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=10, min_lr=1e-6)

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=my_opt,
              metrics=['accuracy'])


# Train the model

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test),
                    callbacks=[early_stop, reduce_lr])

# Evaluate the model
score = model.evaluate(X_test, y_test, verbose=0)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

'''Results:
Test loss: 0.33169957995414734
Test accuracy: 0.8854166865348816'''

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/200
24/24 ━━━━━━━━━━━━━━━━━━━━ 2s 16ms/step - accuracy: 0.6322 - loss: 0.7574 - val_accuracy: 0.6510 - val_loss: 0.6787 - learning_rate: 0.0100
Epoch 2/200
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.7029 - loss: 0.6381 - val_accuracy: 0.6510 - val_loss: 0.6524 - learning_rate: 0.0100
Epoch 3/200
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.7062 - loss: 0.6231 - val_accuracy: 0.7135 - val_loss: 0.6336 - learning_rate: 0.0100
Epoch 4/200
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.7536 - loss: 0.5767 - val_accuracy: 0.7292 - val_loss: 0.6126 - learning_rate: 0.0100
Epoch 5/200
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.7383 - loss: 0.5631 - val_accuracy: 0.7760 - val_loss: 0.5867 - learning_rate: 0.0100
Epoch 6/200
24/24 ━━━━━━━━━━━━━━━━━━━━ 0s

'Results:\nTest loss: 0.33169957995414734\nTest accuracy: 0.8854166865348816'
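The `ReduceLROnPlateau` callback in the cell above lowers the learning rate when `val_loss` stops improving. The sketch below is a simplified plain-Python model of that rule using the same settings (`factor=0.2`, `patience=10`, `min_lr=1e-6`, starting rate 0.01). The `min_delta` default of `1e-4` is assumed to match the Keras default, cooldown is ignored, and the loss values are made up.

```python
def reduce_lr_schedule(val_losses, initial_lr, factor, patience, min_lr, min_delta=1e-4):
    """Return the learning rate in effect during each epoch (simplified plateau rule)."""
    lr = initial_lr
    best = float("inf")
    wait = 0  # epochs since val_loss last improved by more than min_delta
    rates = []
    for loss in val_losses:
        rates.append(lr)
        if loss < best - min_delta:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # shrink, but never below min_lr
                wait = 0
    return rates


# Made-up run: two improving epochs, then a long plateau
val_losses = [0.60, 0.50] + [0.50] * 11
rates = reduce_lr_schedule(val_losses, initial_lr=0.01, factor=0.2, patience=10, min_lr=1e-6)

for epoch, rate in enumerate(rates, start=1):
    print(epoch, rate)
```

In this made-up run the rate stays at 0.01 for twelve epochs and drops to about 0.002 for the thirteenth. Because the scheduler's patience (10) is shorter than the early-stopping patience (15), a plateau leads to at least one rate reduction before training is stopped.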

**Which approach(es) did you find helpful to improve your model performance?**

In [144]:
# Save the trained model after optimization with the following results: Test loss: 0.034142881631851196 & Test accuracy: 0.9947916865348816
model.save('tic-tac-toe.keras')

In [None]:
# your answer here
'''The optimization process followed a structured, step-by-step approach to systematically improve the model's performance, focusing on balancing accuracy and loss:

Baseline Model Setup:

Started with a neural network containing two hidden layers and applied L2 regularization to control overfitting.
Used Adam optimizer with an initial learning rate of 0.005 and EarlyStopping to prevent unnecessary training.

Increasing Model Complexity:

Added additional hidden layers (128, 64, 32, 16 neurons) to increase the model's capacity for learning complex patterns.
Reduced L2 regularization from 0.001 to 0.0005 to allow for more flexible weight updates.

Learning Rate Adjustment:

Lowered the learning rate to 0.001 to enable finer adjustments to the weights, especially in later epochs.
Implemented ReduceLROnPlateau to dynamically reduce the learning rate when val_loss plateaued.

EarlyStopping Tuning:

Adjusted EarlyStopping's patience to ensure the model was given enough time to converge without overfitting.

Final Refinement:

Increased the initial learning rate to 0.01 for faster convergence.
Combined ReduceLROnPlateau with EarlyStopping to strike a balance between convergence speed and generalization.

This iterative process of modifying the network architecture, regularization, and learning rate, combined with careful use of callbacks, 
ultimately resulted in achieving near-perfect accuracy and minimal loss.'''
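As a rough illustration of the L2 regularization mentioned in the answer above, the sketch below computes the penalty that is added to the loss, taken as the regularization strength times the sum of squared weights. The weights and the base loss are made up, and `l2_penalty` is a hypothetical helper, not a Keras function.

```python
def l2_penalty(weights, strength):
    """L2 penalty: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]  # made-up weights of one layer
data_loss = 0.40            # made-up cross-entropy loss

for strength in (0.001, 0.0005):
    penalty = l2_penalty(weights, strength)
    print(f"strength={strength}: penalty={penalty:.6f}, total loss={data_loss + penalty:.6f}")
```

Halving the strength from 0.001 to 0.0005 halves the penalty, so large weights are discouraged less. When a penalty of this kind is part of the reported loss, loss values from regularized and unregularized models are not directly comparable, while accuracy is.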