<a href="https://colab.research.google.com/github/khanfs/AI-Research/blob/main/CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Building a CNN to Classify Images from the MNIST Dataset**

**Convolutional Neural Networks** (CNNs) are widely used for image-related tasks such as classification. They are designed to process data with a grid-like topology, such as images (2D grids of pixels) or audio signals (1D sequences of samples).

In mathematics, convolution is an operation that combines two functions to produce a third function that describes how one function modifies the other. In the context of neural networks, convolutional layers use a convolution operation to extract features from the input data.

A network is called convolutional when it uses this operation to process its input. A small filter (also called a kernel) slides over the input, and at each location the filter's weights are multiplied element-wise with the overlapping patch of the input and summed to give a single value. The values from all locations form a feature map, and a layer with several filters produces one feature map per filter. Together these feature maps form a new representation of the input that later layers can process further.
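The sliding-filter operation described above can be sketched in plain NumPy. This is a minimal illustration, not how Keras implements `Conv2D`: the function name `conv2d_valid`, the 5x5 image, and the edge-detecting kernel are all hypothetical. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # Multiply the patch by the kernel element-wise and sum the result
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Hypothetical 5x5 "image" with a vertical edge between columns 1 and 2
image = np.array([
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
], dtype=float)

# Hypothetical 3x3 kernel that responds to dark-to-bright vertical edges
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
print(feature_map)        # each row is [3. 3. 0.]
```

The feature map is large where the filter overlaps the edge and zero where the patch is uniform, which is the sense in which a filter "detects" a local pattern.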

Convolutional layers are particularly well-suited to images and other spatial data because the same filter is applied at every location, so a local pattern can be detected wherever it appears in the image. They are not inherently invariant to rotation; robustness to orientation usually has to be learned from varied training data. By stacking multiple convolutional layers on top of each other, a neural network can learn increasingly complex and abstract features, leading to better performance on tasks such as image recognition and object detection.

In [None]:
pip install tensorflow

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
import tensorflow as tf # Import the TensorFlow library
from tensorflow.keras.datasets import mnist # Import the MNIST dataset from the TensorFlow Keras library
from tensorflow.keras.models import Sequential # Import the Sequential class, which is a linear stack of neural network layers
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D # Import the Dense, Flatten, Conv2D, and MaxPooling2D layers, which are different types of neural network layers that we will use to build our model

**MNIST Database**: The Modified National Institute of Standards and Technology (MNIST) database is a large collection of handwritten digits that is commonly used for training image processing systems. It consists of 70,000 small, square 28x28 pixel grayscale images of single handwritten digits from 0 to 9, split into a training set of 60,000 images and a test set of 10,000 images. Each image is labeled with the digit it represents.

In [None]:
# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data() # Load the MNIST dataset and split it into training and testing sets

## **1. Data Preprocessing**

Here the input images are preprocessed by normalizing their pixel values to the range 0 to 1. MNIST pixels are stored as 8-bit integers from 0 to 255, so dividing by 255.0 (the maximum value) maps them onto that range. Normalized inputs generally help the model converge faster during training and reduce the risk of numerical instability.

x_train and x_test are the training and test images respectively, and the same scaling is applied to both so that the model sees test data in the same range as the data it was trained on.

In [None]:
# Preprocess the data
x_train = x_train / 255.0 # Normalize the pixel values in the input images to be between 0 and 1
x_test = x_test / 255.0 # Normalize the pixel values in the test images to be between 0 and 1
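The effect of the division can be checked on a small stand-in array. This is a sketch under assumptions: `fake_images` is random data made up for illustration rather than real MNIST images, so the check runs without downloading the dataset.

```python
import numpy as np

# Hypothetical stand-in for a batch of MNIST images: 4 images of 28x28 uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing a uint8 array by a Python float produces a float64 array
scaled = fake_images / 255.0

print(fake_images.dtype, fake_images.min(), fake_images.max())
print(scaled.dtype, scaled.min(), scaled.max())
```

The shape is unchanged; only the data type and value range differ. If memory matters, the result could additionally be cast with `scaled.astype("float32")`.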

## **2. Build the Model**

This code defines the architecture of a convolutional neural network (CNN) model using the Keras library:

**model = Sequential()**: This creates a new sequential model object, which is a linear stack of layers. It allows us to add layers to the model in a sequential manner.

**model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))**: This adds a 2D convolutional layer to the model with 32 filters, each with a 3x3 kernel size. The 'relu' activation function is applied to the output of this layer. The input_shape parameter specifies the shape of the input data, which is a 28x28 grayscale image with one channel.

**model.add(MaxPooling2D((2, 2)))**: This adds a max pooling layer to the model with a pool size of 2x2. This layer reduces the spatial dimensions of the output from the previous convolutional layer by taking the maximum value in each 2x2 window.

**model.add(Flatten())**: This flattens the output of the previous layer into a 1D array, which can be fed into a fully connected layer.

**model.add(Dense(64, activation='relu'))**: This adds a fully connected layer with 64 neurons, each with a 'relu' activation function.

**model.add(Dense(10, activation='softmax'))**: This adds the output layer of the model, which has 10 neurons (one for each class in the dataset) and a 'softmax' activation function. The output of this layer represents the probability distribution over the 10 classes.

Overall, this code defines a simple CNN architecture with one convolutional layer, one max pooling layer, and two fully connected layers. The output of the model is a probability distribution over the 10 classes in the dataset. The loss function, optimizer, and evaluation metric are specified later, when the model is compiled.
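To make the max pooling step concrete, here is a minimal NumPy sketch of 2x2 pooling with stride 2 on a single feature map. The helper `max_pool_2x2` and the 4x4 input are hypothetical; Keras applies the same idea to every channel of every sample.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Take the maximum of each non-overlapping 2x2 window (stride 2)."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Drop a trailing row or column if the size is odd
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Reshape to (h2, 2, w2, 2) so each 2x2 window gets its own pair of axes
    windows = trimmed.reshape(h2, 2, w2, 2)
    return windows.max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
])

pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 2]
#  [2 8]]
```

Each output value keeps only the strongest response in its window, which halves the height and width while preserving where, roughly, a feature was found.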

In [None]:
# Build the model
model = Sequential() # Create a new sequential model
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1))) # Add a convolutional layer with 32 filters of size 3x3 and a ReLU activation function to the model, with an input shape of 28x28x1 (since the images are grayscale)
model.add(MaxPooling2D((2, 2))) # Add a max pooling layer with a pool size of 2x2 to the model
model.add(Flatten()) # Flatten the output of the previous layer into a 1D array
model.add(Dense(64, activation='relu')) # Add a fully connected layer with 64 units and a ReLU activation function to the model
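model.add(Dense(10, activation='softmax')) # Add the output layer with 10 units (one per digit) and a softmax activation function to the model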
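The output shape and parameter count of each layer can be worked out by hand. The arithmetic below is a sketch that assumes the Keras defaults for this model (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the variable names are illustrative only.

```python
# Input: 28x28 grayscale image with 1 channel
height, width, channels = 28, 28, 1

# Conv2D(32, (3, 3)): without padding, each spatial dimension shrinks by k - 1
filters, k = 32, 3
conv_h, conv_w = height - k + 1, width - k + 1   # 26 x 26
conv_params = (k * k * channels + 1) * filters   # 9 weights + 1 bias per filter

# MaxPooling2D((2, 2)): halves each spatial dimension, no parameters
pool_h, pool_w = conv_h // 2, conv_w // 2        # 13 x 13

# Flatten: 13 * 13 * 32 values per image
flat = pool_h * pool_w * filters

# Dense layers: (inputs + 1 bias) * units
dense1_params = (flat + 1) * 64
dense2_params = (64 + 1) * 10

total_params = conv_params + dense1_params + dense2_params
print(flat)          # 5408
print(conv_params)   # 320
print(total_params)  # 347146
```

Nearly all of the parameters sit in the first dense layer, because it connects every one of the 5,408 flattened values to each of its 64 units. Calling `model.summary()` on the built model should report the same figures.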

## **3. Compile Model**

This line of code compiles the neural network model. It specifies the optimizer, loss function, and evaluation metric that will be used during training.

**optimizer='adam'**: This specifies the optimizer that will be used to update the weights of the neural network during training. In this case, the optimizer is 'adam' (Adam), a widely used variant of stochastic gradient descent that adapts the step size for each weight.

**loss='sparse_categorical_crossentropy'**: This specifies the loss function that will be used to measure the difference between the predicted output of the model and the true output. In this case, the loss function is 'sparse_categorical_crossentropy', which is commonly used for multiclass classification problems where the target labels are integers.

**metrics=['accuracy']**: This specifies the evaluation metric that will be used to measure the performance of the model during training and testing. In this case, the evaluation metric is 'accuracy', which measures the proportion of correct predictions made by the model.

Overall, these three settings define how the neural network will be trained and evaluated, and can have a significant impact on the performance of the model.

In [None]:
# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
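To show what the loss measures, the sketch below computes softmax probabilities and the sparse categorical cross-entropy by hand in NumPy. It is a simplified illustration rather than the Keras implementation (which, among other things, guards against taking the log of zero); the function names, logits, and labels are made up for the example.

```python
import numpy as np

def softmax(logits):
    """Turn each row of raw scores into a probability distribution."""
    # Subtracting the row maximum avoids overflow and does not change the result
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(labels, probs):
    """Mean negative log-probability assigned to the true class of each sample."""
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked))

# Hypothetical raw scores for 2 samples and 3 classes, with integer labels
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([0, 1])

probs = softmax(logits)
loss = sparse_categorical_crossentropy(labels, probs)
print(probs.sum(axis=1))  # each row sums to 1
print(loss)
```

"Sparse" refers to the label format: the labels are plain integers such as 0 and 1, used here to index the probability of the true class, rather than one-hot vectors. The loss is small when the true class gets a probability close to 1 and grows as that probability falls.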

## **4. Train the Model**

This line of code trains the convolutional neural network model. It specifies the training data, the number of epochs, and the validation data to be used during the training process.

**x_train.reshape(-1, 28, 28, 1)**: This specifies the input training data to the model, which is a set of images in this case. The reshape() function converts the input data into a 4-dimensional tensor of shape (num_samples, height, width, channels), where num_samples is the number of training examples (60,000 here; the -1 tells reshape() to infer it), height and width are the dimensions of each image, and channels is the number of color channels (1 for grayscale images, 3 for RGB images). During training, Keras splits these samples into batches, 32 per batch by default.

**y_train**: This specifies the target training data, which is a set of labels that correspond to the images in the x_train input data.

**epochs=5**: This specifies the number of epochs (iterations over the entire training set) to be used during training. In this case, the model will be trained for 5 epochs.

**validation_data=(x_test.reshape(-1, 28, 28, 1), y_test)**: This specifies the validation data, a set of images and labels that the model is not trained on. In this notebook the test set is used for this purpose. The reshape() function converts the validation data into the same 4-dimensional tensor format as the training data. The model is evaluated on the validation data at the end of each epoch, which shows how well it performs on data it has not been trained on and helps to detect overfitting (when the model memorizes the training data without generalizing well to new data).

Overall, this line of code sets up the training process for a convolutional neural network model, and specifies the input and output data to be used during training, the number of epochs to train for, and the validation data to monitor the performance of the model.
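The reshape step described above can be tried on a small stand-in array. This sketch uses a hypothetical array of six blank images in place of `x_train` so that it runs without the dataset.

```python
import numpy as np

# Hypothetical stand-in for x_train: 6 images of 28x28 pixels, no channel axis
images = np.zeros((6, 28, 28))

# -1 lets NumPy infer the first dimension from the total number of elements
reshaped = images.reshape(-1, 28, 28, 1)
print(reshaped.shape)  # (6, 28, 28, 1)

# np.expand_dims adds the same trailing channel axis
expanded = np.expand_dims(images, axis=-1)
print(expanded.shape)  # (6, 28, 28, 1)
```

Neither call changes the pixel values. They only add the channel axis that `Conv2D` expects, so either form could be applied once during preprocessing instead of at every call site.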

In [None]:
# Train the model
model.fit(x_train.reshape(-1, 28, 28, 1), y_train, epochs=5, validation_data=(x_test.reshape(-1, 28, 28, 1), y_test))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f41e75ea520>

## **5. Evaluate the Model**

This code evaluates the performance of the trained convolutional neural network model on the test dataset. It computes the model's loss and accuracy on the test dataset and prints the accuracy score.

**model.evaluate(x_test.reshape(-1, 28, 28, 1), y_test)**: This evaluates the trained model on the test data, which is a set of images and labels that the model has not seen before. The evaluate() function takes two arguments: the test input data x_test (reshaped to match the format used during training) and the corresponding target labels y_test. It returns two values: the loss (a measure of the difference between the predicted outputs and the true outputs) and the accuracy (the proportion of correct predictions made by the model).

**loss, accuracy = model.evaluate(...)**: This line of code assigns the returned values from the evaluate() function to two variables, loss and accuracy, using Python's tuple unpacking syntax. The loss variable stores the loss value computed by the model on the test data, and the accuracy variable stores the accuracy score computed by the model on the test data.

**print('Test accuracy:', accuracy)**: This line of code prints the accuracy score of the model on the test data, which is stored in the accuracy variable. The output of this line will be a string that says "Test accuracy:" followed by the actual accuracy score.

Overall, this code is used to evaluate the performance of a trained convolutional neural network model on a test dataset, and print the accuracy score of the model on the test data.

In [None]:
# Evaluate the model
loss, accuracy = model.evaluate(x_test.reshape(-1, 28, 28, 1), y_test)
print('Test accuracy:', accuracy)

Test accuracy: 0.9843000173568726
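The accuracy figure is the fraction of test images whose most probable predicted class matches the true label. A minimal NumPy sketch of that calculation follows, using made-up probabilities for four samples and three classes rather than real model output.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
true_labels = np.array([0, 1, 2, 1])

# The predicted class is the index of the largest probability in each row
predicted = probs.argmax(axis=1)
accuracy = np.mean(predicted == true_labels)

print(predicted)  # [0 1 2 0]
print(accuracy)   # 0.75
```

With the trained model, the same calculation could be applied to the output of `model.predict(x_test.reshape(-1, 28, 28, 1))` and compared against `y_test`.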
