
# Turing Machine and Deep Learning 2025
_Author: Satchit Chatterji (satchit.chatterji@gmail.com)_

## Lecture 4 -- Neural Network Playground
> Today's question: **How do NNs work?**

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

In [None]:
import pandas as pd
from sklearn.neural_network import MLPClassifier
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, SimpleRNN

In [None]:
# Load the CIFAR-10 dataset
cifar10 = keras.datasets.cifar10
(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Preprocess the data
train_images = train_images / 255.0
test_images = test_images / 255.0

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 5s 0us/step


In [None]:
train_images.shape

(50000, 32, 32, 3)

### Artificial Neural Network
The easiest way to start using a neural network is a ready-made model. With scikit-learn's ```MLPClassifier``` you do not have to manually "build" the network yourself. It is a multi-layer perceptron (MLP): a fully connected, feed-forward Artificial Neural Network (ANN).

The MLPClassifier starts with an input layer that takes in the data.
It then uses one or more hidden layers to recognize patterns in the data.
Finally, an output layer converts these findings into classifications. For binary classification it has a single neuron that generates one probability score; for multi-class classification it has as many neurons as there are classes and generates a probability score for each class.
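
The input, hidden and output layers described above can be sketched in plain NumPy. This is only an illustration with random, untrained weights, not scikit-learn's implementation; the helper names (`relu`, `softmax`, `forward`) and the weight scale are assumptions made for this sketch. The layer sizes match the flattened CIFAR-10 images and the hidden layers used in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Hidden-layer activation: negative values become 0
    return np.maximum(x, 0.0)

def softmax(z):
    # Turn raw output scores into probabilities that sum to 1 per row
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# 3072 inputs (32*32*3), hidden layers of 32 and 64 neurons, 10 output classes
sizes = [32 * 32 * 3, 32, 64, 10]
weights = [rng.normal(0, 0.01, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    h = x
    # Hidden layers: linear map followed by ReLU
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    # Output layer: one probability per class
    return softmax(h @ weights[-1] + biases[-1])

batch = rng.random((4, sizes[0]))  # 4 fake flattened images
probs = forward(batch)
print(probs.shape)  # (4, 10)
```

Training would adjust `weights` and `biases` so that the highest probability lands on the correct class; here they are random, so the probabilities are close to uniform.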

MLPClassifier has a number of hyperparameters, the ones we use here include:

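```hidden_layer_sizes```: the number of neurons in each hidden layer (here, one hidden layer with 32 neurons followed by one with 64)

```max_iter```: the maximum number of training iterations (for the default ```adam``` solver, this is the number of epochs, i.e. passes over the training data)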

To get a validation curve, we set ```early_stopping=True```. scikit-learn then holds out part of the training data as a validation set, computes the accuracy score on it after every iteration, and stops training once that score no longer improves.
The size of the validation set is determined by ```validation_fraction``` (0.1 by default; we set it to 0.2 here).

There are many other (hyper)parameters that can be set, such as the activation function or the solver.
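
To make the stopping criterion more concrete, here is a simplified sketch of a patience-based early-stopping rule. It is not scikit-learn's internal code: the function name `epochs_until_stop` and its `patience` argument are hypothetical, and the rule only approximates what ```MLPClassifier``` does with its ```tol``` and ```n_iter_no_change``` parameters.

```python
def epochs_until_stop(val_scores, tol=1e-4, patience=10):
    """Return the (1-based) epoch at which training would stop.

    Training stops once the validation score has failed to improve
    by more than `tol` for `patience` consecutive epochs.
    """
    best = float("-inf")
    stale = 0  # consecutive epochs without a real improvement
    for epoch, score in enumerate(val_scores, start=1):
        if score > best + tol:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    # The score kept improving often enough: train for all epochs
    return len(val_scores)

# Made-up validation accuracies, for illustration only
scores = [0.30, 0.35, 0.40, 0.40, 0.39, 0.40, 0.41]
print(epochs_until_stop(scores, patience=2))  # 5
```

With a patience of 2, training stops at epoch 5, even though the score would have improved again at epoch 7. A larger patience trades longer training for a smaller risk of stopping too early.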

In [None]:
# Build the MLPClassifier
model = MLPClassifier(hidden_layer_sizes=(32, 64), max_iter=500, early_stopping=True, validation_fraction=0.2)

# Flatten each image into a 3072-value vector and train the model
# (ravel() turns the (50000, 1) label array into the 1-D array scikit-learn expects)
train_images_reshaped = train_images.reshape(train_images.shape[0], -1)
model.fit(train_images_reshaped, train_labels.ravel())

# Get the training loss and validation accuracy values from the model
train_loss = model.loss_curve_
val_acc = model.validation_scores_

# Plot the training losses
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(range(len(train_loss)), train_loss, label='Training Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training Loss')

# Plot the validation accuracies
plt.subplot(1, 2, 2)
plt.plot(range(len(val_acc)), val_acc, label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Validation Accuracy')

# Show the plot
plt.tight_layout()
plt.show()

### Convolutional Neural Network
For problems that involve visual data (computer vision), a Convolutional Neural Network (CNN) is more commonly used.

A CNN is like an extended version of the ANN above. Instead of only fully connected hidden layers, it uses convolutional and pooling layers, followed by dense layers.

#### Convolutional layer
Each of our images is represented as a three-dimensional array: a height and width of 32 pixels, and three RGB values for each of those pixels. The convolutional layer slides small filters (3x3 in the model below) over the image to detect spatial features such as edges, textures or patterns. Each filter maps its input to a feature map.

#### Pooling layer
The pooling layer then reduces the size of these feature maps using max pooling: it keeps only the maximum value in each region (2x2 in the model below). This reduces the computational cost and makes the model more robust to small shifts in the input.

#### Dense layer
After the feature maps are flattened into a single vector, the dense layers convert the extracted features into a final classification decision.
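
To see how many values reach the dense layers, the arithmetic below follows the image size through the layers of the model built in the next cell. It assumes 3x3 convolutions without padding and 2x2 pooling, which is what that cell uses; the helper names `conv_out` and `pool_out` are just for this sketch.

```python
def conv_out(size, kernel=3):
    # A convolution without padding loses (kernel - 1) pixels per dimension
    return size - kernel + 1

def pool_out(size, pool=2):
    # Pooling divides each dimension by the pool size, rounding down
    return size // pool

size = 32              # input images are 32x32
size = conv_out(size)  # Conv2D(32, (3, 3)) -> 30
size = pool_out(size)  # MaxPooling2D      -> 15
size = conv_out(size)  # Conv2D(64, (3, 3)) -> 13
size = pool_out(size)  # MaxPooling2D      -> 6
size = conv_out(size)  # Conv2D(64, (3, 3)) -> 4

# The last convolution has 64 filters, so Flatten() yields 4 * 4 * 64 values
flattened = size * size * 64
print(size, flattened)  # 4 1024
```

So the first dense layer receives a vector of 1024 features per image, compared with the 3072 raw pixel values the MLPClassifier had to work with.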

In [None]:
# Build the CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(train_images, train_labels, epochs=20, batch_size=32, validation_data=(test_images, test_labels))

# Get the training and validation loss values from the history object
train_loss = history.history['loss']
val_loss = history.history['val_loss']

# Get the training and validation accuracy values from the history object
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

# Plot the training and validation losses
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(range(len(train_loss)), train_loss, label='Training Loss')
plt.plot(range(len(val_loss)), val_loss, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()

# Plot the training and validation accuracies
plt.subplot(1, 2, 2)
plt.plot(range(len(train_acc)), train_acc, label='Training Accuracy')
plt.plot(range(len(val_acc)), val_acc, label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

# Show the plot
plt.tight_layout()
plt.show()

### Recurrent Neural Network
Recurrent Neural Networks (RNNs) function a bit differently. While the ANN and CNN above are feed-forward models (they only pass information forward), an RNN also feeds information back into itself. It processes its input one step at a time, saves the output of its recurrent nodes as a hidden state, and feeds this state back in together with the next step's input. An often-used example is predicting words in sentences: to predict which word comes next, the RNN doesn't just look at the current word, but also takes into account the words that came before it.
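
The feedback loop can be sketched in a few lines of NumPy. This is a simplified illustration with random, untrained weights, not Keras' `SimpleRNN` implementation: the names `W_x`, `W_h`, `b` and `simple_rnn` are assumptions, and the sketch uses a tanh activation, whereas the cell further down uses ReLU.

```python
import numpy as np

rng = np.random.default_rng(0)
timesteps, features, units = 32, 96, 32

W_x = rng.normal(0, 0.1, size=(features, units))  # input -> hidden state
W_h = rng.normal(0, 0.1, size=(units, units))     # previous state -> hidden state
b = np.zeros(units)

def simple_rnn(sequence):
    """Process the sequence step by step and return the final hidden state."""
    h = np.zeros(units)  # empty memory before the first step
    for x_t in sequence:
        # The new state depends on the current input AND the previous state
        h = np.tanh(x_t @ W_x + h @ W_h + b)
    return h

sequence = rng.random((timesteps, features))
final_state = simple_rnn(sequence)
print(final_state.shape)  # (32,)
```

The same weights are reused at every step, and the final hidden state summarises the whole sequence. In the model below, this final state is what gets passed on to the dense layers.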

An RNN uses mostly the same layers as a CNN. An RNN model is built from recurrent layers, which reuse their previous output as described above, followed by dense layers (and optionally pooling layers).

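RNNs are generally used for natural language processing and other text-related purposes, so the visual data we analyze here is not ideal. To feed an image to the RNN, we treat it as a sequence of 32 rows, where each row is a vector of 32*3 = 96 values.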
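
The small check below illustrates how reshaping turns each image into a sequence of pixel rows. It uses a made-up array of two fake images instead of the real CIFAR-10 data, with the same reshape as in the cell that follows.

```python
import numpy as np

# Two fake "images" with the CIFAR-10 shape (32, 32, 3); values are just a running count
images = np.arange(2 * 32 * 32 * 3).reshape(2, 32, 32, 3)

# Merge the width and colour axes: each image becomes 32 time steps of 96 values
sequences = images.reshape(images.shape[0], 32, 32 * 3)
print(sequences.shape)  # (2, 32, 96)

# Time step 5 of the first sequence is exactly pixel row 5 of the first image
print(np.array_equal(sequences[0, 5], images[0, 5].ravel()))  # True
```

The RNN therefore reads each image from top to bottom, one row per time step, with the three colour values of every pixel in that row laid out next to each other.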

In [None]:
# Build the RNN model
model = Sequential([
    SimpleRNN(32, activation='relu', input_shape=(32, 32*3)),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Reshape the images and train the model
train_images_reshaped = train_images.reshape(train_images.shape[0], 32, 32*3)
test_images_reshaped = test_images.reshape(test_images.shape[0], 32, 32*3)
history = model.fit(train_images_reshaped, train_labels, epochs=20, batch_size=32, validation_data=(test_images_reshaped, test_labels))

# Get the training and validation loss values from the history object
train_loss = history.history['loss']
val_loss = history.history['val_loss']

# Get the training and validation accuracy values from the history object
train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

# Plot the training and validation losses
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(range(len(train_loss)), train_loss, label='Training Loss')
plt.plot(range(len(val_loss)), val_loss, label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')
plt.legend()

# Plot the training and validation accuracies
plt.subplot(1, 2, 2)
plt.plot(range(len(train_acc)), train_acc, label='Training Accuracy')
plt.plot(range(len(val_acc)), val_acc, label='Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

# Show the plot
plt.tight_layout()
plt.show()