## Setup

In [1]:
from keras.models import Sequential

2023-03-09 14:21:24.404208: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


- Conv2D: This class is used to define a 2D convolutional layer in a CNN model. This layer is responsible for extracting features from the input data.
- MaxPooling2D: This class is used to define a 2D max pooling layer in a CNN model. This layer is responsible for down-sampling the input data.
- Flatten: This class is used to flatten the output of the convolutional and max pooling layers, before passing it to the dense layers.
- Dense: This class is used to define a fully connected layer in a neural network model.

In [2]:
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

MNIST is a dataset of handwritten digits that is often used as a benchmark for training and evaluating machine learning models. It contains 60,000 training images and 10,000 test images, each a 28x28 grayscale picture of a digit, along with the corresponding labels (i.e., the digit written in each image).

In [3]:
from keras.datasets import mnist

## Prepare the data

Load the MNIST data. load_data() returns two tuples (a tuple stores multiple items in a single variable), one for the training set and one for the test set,
where X_train and X_test are the image data, and y_train and y_test are the corresponding labels.

In [9]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


Reshape and normalize the data
A CNN expects a 4D tensor with the shape (batch_size, height, width, channels).
This step adds the missing channel dimension, which is 1 in this case because the images are grayscale.
For an RGB (red, green, blue) image it would be 3.

In [10]:
X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)
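As a small sketch of the same reshape on synthetic data (the array `images` is a made-up stand-in, not the real MNIST set), passing -1 lets NumPy infer the number of images instead of hard-coding 60000 or 10000:

```python
import numpy as np

# Synthetic stand-in for MNIST: 5 grayscale images of 28x28 pixels.
images = np.zeros((5, 28, 28), dtype=np.uint8)

# -1 asks NumPy to infer the batch size from the array itself.
with_channel = images.reshape(-1, 28, 28, 1)

# np.expand_dims is an equivalent way to append the channel axis.
also_with_channel = np.expand_dims(images, axis=-1)

print(with_channel.shape)       # (5, 28, 28, 1)
print(also_with_channel.shape)  # (5, 28, 28, 1)
```

Inferring the batch size keeps the line working if the number of samples changes, for example when only a subset of the data is used.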

Scale the pixel values to the range 0 to 1, which is a common preprocessing step for image data.

In [11]:
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255 # pixels are 8-bit integers in [0, 255], so dividing by 255 maps them to [0, 1]
X_test /= 255
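The scaling can be checked on a tiny synthetic array (the values in `pixels` are made up for illustration). The sketch also shows why the cast to float comes first: in-place division on an integer array raises an error in NumPy.

```python
import numpy as np

# Synthetic 8-bit pixels covering the full range an image can contain.
pixels = np.array([0, 51, 102, 255], dtype=np.uint8)

# Cast to float32 before dividing; uint8 cannot hold fractional values.
scaled = pixels.astype('float32') / 255

print(scaled.min(), scaled.max())
```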

Convert the labels to categorical
to_categorical converts the integer labels to one-hot encoded vectors: each label becomes a vector of zeros with a single 1 at the index of its class.
This step is necessary because the categorical crossentropy loss used later compares the 10 probabilities produced by the softmax output layer
against a target vector of the same length.
The argument 10 is the number of classes (the digits 0-9).

In [12]:
from keras.utils import to_categorical
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
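To make the encoding concrete, here is a minimal NumPy sketch of what one-hot encoding produces. It is an illustration using `np.eye`, not the Keras implementation, and the three labels are made up:

```python
import numpy as np

labels = np.array([5, 0, 4])   # integer class labels (made-up examples)
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot[0])               # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
print(one_hot.argmax(axis=1))   # [5 0 4]
```

Taking the argmax of each row recovers the original integer label, which is also how a predicted class is read off the softmax output.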

## Build the model

Create a sequential model
A sequential model is a linear stack of layers, where the output of one layer is the input of the next.

In [13]:
model = Sequential()


Add a convolutional layer with 32 filters, a kernel size of 3x3, and a ReLU activation function
The ReLU activation function computes max(0, x): it returns the input of a neuron unchanged if it is positive,
and returns 0 otherwise.

In [14]:
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1))) # input is a 28x28 image with 1 color channel.
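A minimal NumPy sketch of ReLU applied element-wise (the helper name `relu` and the sample values are illustrative only):

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives become 0, positives pass through.
    return np.maximum(0, x)

values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(values).tolist())  # [0.0, 0.0, 0.0, 1.5, 3.0]
```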

Add a max pooling layer with a pool size of 2x2
This layer takes the maximum over each non-overlapping 2x2 window of the input, halving both the height and the width.
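The operation can be sketched in NumPy on a single 4x4 feature map. The values are made up, and the reshape trick assumes the height and width are divisible by 2:

```python
import numpy as np

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 5],
                        [0, 1, 7, 2],
                        [3, 6, 4, 8]])

# Split the 4x4 map into non-overlapping 2x2 blocks, then take each block's max.
h, w = feature_map.shape
pooled = feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

print(pooled.tolist())  # [[4, 5], [6, 8]]
```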

In [15]:
model.add(MaxPooling2D(pool_size=(2, 2)))

# Add a convolutional layer with 64 filters, a kernel size of 3x3, and a ReLU activation function
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))

# Add a max pooling layer with a pool size of 2x2
model.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten the output from the previous layers
model.add(Flatten())
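The spatial size after each layer can be traced by hand. The sketch below assumes the Keras defaults used above (no padding and stride 1 for Conv2D, non-overlapping windows for MaxPooling2D); the helper names `conv_out` and `pool_out` are hypothetical:

```python
def conv_out(size, kernel=3):
    # Without padding and with stride 1, each side shrinks by kernel - 1.
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling floors the division.
    return size // pool

size = 28
size = conv_out(size)   # 26
size = pool_out(size)   # 13
size = conv_out(size)   # 11
size = pool_out(size)   # 5
flattened = size * size * 64  # 64 feature maps from the second Conv2D

print(size, flattened)  # 5 1600
```

So Flatten() turns each 5x5x64 output into a vector of 1,600 values.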

Add a fully connected layer with 128 units and a ReLU activation function
This layer has 128 neurons, each connected to every output of the previous layer.

In [16]:
model.add(Dense(128, activation='relu'))
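"Fully connected" translates directly into a parameter count. A quick sketch, assuming the flattened input has 5 x 5 x 64 = 1,600 values (which follows from the layer settings above with Keras defaults):

```python
inputs = 5 * 5 * 64   # flattened feature vector length (assumed from the layers above)
units = 128

weights = inputs * units   # one weight per (input, neuron) pair
biases = units             # one bias per neuron
print(weights + biases)    # 204928
```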

Add a final output layer with 10 units and a softmax activation function
The softmax function converts the output of the final layer into a probability distribution over the 10 possible classes: every value lies between 0 and 1, and the values sum to 1.

In [17]:
model.add(Dense(10, activation='softmax'))
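A minimal NumPy sketch of softmax on three made-up scores (the helper name `softmax` and the input `logits` are illustrative, not taken from Keras internals):

```python
import numpy as np

def softmax(logits):
    # Subtracting the max avoids overflow and does not change the result.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)

print(probs)
print(probs.sum())
```

The largest score receives the largest probability, and the outputs always sum to 1.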

## Train the model

Compile the model with a categorical crossentropy loss function and an Adam optimizer.
- loss: This argument specifies the loss function that training minimizes. A loss function measures how far the model's predictions are from the true labels; lower values mean better predictions.
- optimizer: This argument specifies the optimization algorithm that the model should use to update its weights during training.
- metrics: This argument specifies the metric(s) reported during training and evaluation. Metrics are only monitored; they are not used to update the weights.

In [None]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
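For a single sample, categorical crossentropy reduces to the negative log of the probability assigned to the true class. The sketch below is a simplified NumPy version; the clipping constant `eps` is an assumption added here to avoid log(0), and the predictions are made up:

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); with one-hot labels only the true class contributes.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred))

y_true = np.array([0.0, 0.0, 1.0])          # one-hot label for class 2
confident = np.array([0.05, 0.05, 0.90])    # mostly correct prediction
wrong = np.array([0.70, 0.20, 0.10])        # mostly incorrect prediction

loss_confident = categorical_crossentropy(y_true, confident)
loss_wrong = categorical_crossentropy(y_true, wrong)

print(loss_confident)  # about 0.105
print(loss_wrong)      # about 2.303
```

A confident correct prediction gives a loss near 0, while assigning low probability to the true class is penalized heavily.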

Train the model on the training data
- X_train and y_train: These arguments specify the training data and labels. X_train is the input data and y_train is the corresponding target data.
- epochs: This argument specifies the number of times the model should iterate over the entire training data.
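- batch_size: This argument specifies the number of samples per gradient update. The training data is split into batches of this size, and the weights are updated once after each batch.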

In [None]:
model.fit(X_train, y_train, epochs=10, batch_size=32)

The model is trained on X_train with the corresponding y_train labels, using the categorical crossentropy loss function and the Adam optimizer, for 10 epochs with a batch size of 32.
The loss and the accuracy metric are reported after each epoch.
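The relationship between epochs, batch size, and weight updates is simple arithmetic. A short sketch using the numbers from this notebook:

```python
import math

num_samples = 60000
batch_size = 32
epochs = 10

# One gradient update per batch; the last batch may be smaller, hence ceil.
updates_per_epoch = math.ceil(num_samples / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)  # 1875 18750
```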

## Evaluate the trained model

Evaluate the model on the test data with the evaluate() function.
- X_test and y_test: These arguments specify the test data and labels, respectively. X_test is the input data and y_test is the corresponding target data.
- evaluate() returns the loss followed by each compiled metric; here the two values are unpacked into test_loss and test_acc.
- test_loss is the value of the loss function (here categorical_crossentropy) on the test data.
- test_acc is the value of the accuracy metric on the test data.

In [None]:
test_loss, test_acc = model.evaluate(X_test, y_test)

The test accuracy is a measure of how well the model is able to make predictions on unseen data.

In [None]:
print('Test accuracy:', test_acc)

Note that evaluate() only makes predictions on the data provided and computes the loss and metrics; it does not update the model weights. It is used to check how well a trained model generalizes to data it has not seen during training.
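For one-hot labels, accuracy is the fraction of samples whose highest-probability class matches the true class. A minimal NumPy sketch with made-up predictions for 4 samples and 3 classes:

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples over 3 classes.
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
# One-hot labels for the same 4 samples.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

predicted = probs.argmax(axis=1)   # most likely class per sample
actual = y_true.argmax(axis=1)     # true class per sample
accuracy = (predicted == actual).mean()

print(accuracy)  # 0.75
```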

Reflection:

In [None]:
import tensorflow as tf
from tensorflow.keras import layers

# Define the CNN model
model = tf.keras.Sequential([
  layers.Conv2D(32, (3, 3), activation='relu', input_shape=(height, width, channels)),  # Keras expects (height, width, channels)
  layers.MaxPooling2D((2, 2)),
  layers.Conv2D(64, (3, 3), activation='relu'),
  layers.MaxPooling2D((2, 2)),
  layers.Conv2D(128, (3, 3), activation='relu'),
  layers.MaxPooling2D((2, 2)),
  layers.Flatten(),
  layers.Dense(64, activation='relu'),
  layers.Dense(num_classes, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_data, train_labels, epochs=10, validation_data=(val_data, val_labels))

# Evaluate the model
test_loss, test_acc = model.evaluate(test_data, test_labels)

# Make predictions using the model
predictions = model.predict(new_data)


In the above code, height, width, and channels are placeholders for the dimensions of the input data, and num_classes is the number of different behavior classes that the model can predict. train_data, train_labels, val_data, val_labels, test_data, test_labels, and new_data are placeholders for the training, validation, test, and new data, respectively. All of these names must be defined before the code will run.

Note that you'll need to preprocess the input data before passing it into the model. This can involve normalization, cropping, resizing, and other transformations that depend on your specific data format and application. Additionally, you may need to adjust the architecture and hyperparameters of the model to achieve optimal performance.
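As one possible sketch of such preprocessing, the hypothetical helper `preprocess` below center-crops square grayscale images, scales them to [0, 1], and appends a channel axis. The crop size of 24 and the random input batch are assumptions chosen only for illustration:

```python
import numpy as np

def preprocess(images, crop=24):
    # Hypothetical helper: center-crop, scale to [0, 1], and append a
    # channel axis so the result has shape (batch, height, width, 1).
    _, h, w = images.shape
    top, left = (h - crop) // 2, (w - crop) // 2
    cropped = images[:, top:top + crop, left:left + crop]
    return (cropped.astype('float32') / 255)[..., np.newaxis]

# Random 8-bit images standing in for real data.
batch = np.random.randint(0, 256, size=(8, 28, 28), dtype=np.uint8)
processed = preprocess(batch)

print(processed.shape)  # (8, 24, 24, 1)
```

Whatever steps are chosen, the same preprocessing should be applied to the training, validation, test, and new data.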