# Convolutional Neural Networks (CNN) Documentation

## Introduction
A Convolutional Neural Network (CNN) is a deep learning model that takes an image as input, learns which features in it matter (such as edges, textures, and shapes), and uses those features to tell one class of image from another. This notebook walks through building, training, and evaluating a CNN in Python with TensorFlow and Keras, and explains the main hyperparameters along the way.


In [1]:
# Required Libraries
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Installation (if not already installed)
# matplotlib is included because the training cell uses it for plotting
# !pip install numpy tensorflow matplotlib


## Data Loading and Preprocessing
This section loads the MNIST dataset, a widely used dataset of handwritten digits, and prepares it for training. Here is a brief explanation of each step:

**Load Dataset**: The mnist.load_data() function loads the MNIST dataset, returning two tuples: (X_train, y_train) for training data and labels, and (X_test, y_test) for test data and labels.

**Reshape Data**: The MNIST dataset originally consists of 28x28 pixel grayscale images. The reshape() function is used to reshape the data to fit the model. Here, we add an extra dimension to the data to indicate the single channel (grayscale) since convolutional layers expect input in the shape (height, width, channels).

**Normalize Data**: The pixel values of the images are scaled from [0, 255] to the range [0, 1] by dividing by 255. Keeping the inputs in a small, consistent range generally helps the optimization algorithm converge faster.

**One-Hot Encode Labels**: The to_categorical() function is used to convert the class labels into one-hot encoded format. In the MNIST dataset, there are 10 classes (digits 0 through 9), so each label is represented as a binary vector of length 10, with a 1 at the index corresponding to the class and 0s elsewhere.

**Display Dataset Information**: Finally, the shapes of the training and test datasets are printed to verify that the preprocessing steps were successful. The training data shape indicates the number of samples, height, width, and number of channels, while the test data shape provides similar information for the test set.
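
The steps above can be tried on a tiny synthetic batch using only NumPy, without downloading MNIST. This is a minimal sketch: the random `images` and the `labels` array are made-up stand-ins, and the `np.eye` indexing is one common way to produce the same one-hot layout that `to_categorical()` returns, not the Keras implementation itself.

```python
import numpy as np

# Synthetic stand-in for MNIST (assumption): 4 random 28x28 uint8 "images".
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 3])

# Reshape: add a trailing channel axis, (4, 28, 28) -> (4, 28, 28, 1).
images = images.reshape(images.shape[0], 28, 28, 1)

# Normalize: scale pixel values from [0, 255] to [0, 1].
images = images.astype("float32") / 255

# One-hot encode: row i of the 10x10 identity matrix is the vector for class i.
one_hot = np.eye(10, dtype="float32")[labels]

print("Image batch shape:", images.shape)
print("Label batch shape:", one_hot.shape)
print("First label vector:", one_hot[0])
```

Taking `np.argmax(one_hot, axis=1)` recovers the original integer labels, which is the same trick the prediction cell later uses to turn one-hot vectors back into class indices.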

In [2]:
# Load dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Reshape data to fit the model
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

# Normalize the data
X_train = X_train.astype('float32') / 255
X_test = X_test.astype('float32') / 255

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Display dataset information
print("Training data shape:", X_train.shape)
print("Test data shape:", X_test.shape)


Training data shape: (60000, 28, 28, 1)
Test data shape: (10000, 28, 28, 1)


## Building the CNN Model

In [3]:
# Build the model
model = Sequential()

# Convolutional layer with 32 filters, kernel size of 3x3, ReLU activation function
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
print("Added Conv2D layer with 32 filters and 3x3 kernel size.")

# MaxPooling layer with pool size of 2x2
model.add(MaxPooling2D(pool_size=(2, 2)))
print("Added MaxPooling layer with 2x2 pool size.")

# Second Convolutional layer with 64 filters, kernel size of 3x3, ReLU activation function
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
print("Added second Conv2D layer with 64 filters and 3x3 kernel size.")

# Second MaxPooling layer with pool size of 2x2
model.add(MaxPooling2D(pool_size=(2, 2)))
print("Added second MaxPooling layer with 2x2 pool size.")

# Flatten the data for fully connected layers
model.add(Flatten())
print("Added Flatten layer.")

# Fully connected layer with 128 neurons, ReLU activation function
model.add(Dense(128, activation='relu'))
print("Added Dense layer with 128 neurons.")

# Dropout layer to prevent overfitting
model.add(Dropout(0.5))
print("Added Dropout layer with 0.5 rate.")

# Output layer with 10 neurons (one for each class), softmax activation function
model.add(Dense(10, activation='softmax'))
print("Added output layer with 10 neurons and softmax activation function.")

# Compile the model with Adam optimizer and categorical crossentropy loss function
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
print("Compiled the model with Adam optimizer and categorical crossentropy loss function.")

# Display the model summary
model.summary()


Added Conv2D layer with 32 filters and 3x3 kernel size.
Added MaxPooling layer with 2x2 pool size.
Added second Conv2D layer with 64 filters and 3x3 kernel size.
Added second MaxPooling layer with 2x2 pool size.
Added Flatten layer.
Added Dense layer with 128 neurons.
Added Dropout layer with 0.5 rate.
Added output layer with 10 neurons and softmax activation function.
Compiled the model with Adam optimizer and categorical crossentropy loss function.
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 13, 13, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 11, 11, 64)        18496   
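
The output shapes and parameter counts in the summary rows shown above can be reproduced by hand. The sketch below assumes the Keras defaults used in this notebook (stride 1 and `'valid'` padding for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper function names are illustrative and not part of Keras.

```python
def conv2d_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, pool):
    """Output size of non-overlapping pooling (stride equals pool size)."""
    return size // pool


def conv2d_param_count(kernel_h, kernel_w, in_channels, filters):
    """One weight per kernel cell per input channel per filter, plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


after_conv1 = conv2d_output_size(28, 3)           # 28 - 3 + 1 = 26
after_pool1 = pool_output_size(after_conv1, 2)    # 26 // 2 = 13
after_conv2 = conv2d_output_size(after_pool1, 3)  # 13 - 3 + 1 = 11

print("Spatial sizes:", after_conv1, after_pool1, after_conv2)
print("conv2d params:", conv2d_param_count(3, 3, 1, 32))     # 3*3*1*32 + 32 = 320
print("conv2d_1 params:", conv2d_param_count(3, 3, 32, 64))  # 3*3*32*64 + 64 = 18496
```

Pooling layers have no trainable weights, which is why `max_pooling2d` reports 0 parameters.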

## Hyperparameters
Hyperparameters are settings that are chosen before training begins rather than learned from the data. Some important hyperparameters in CNNs include:

- **Learning Rate**: Controls how much to change the model in response to the estimated error each time the model weights are updated.
- **Batch Size**: The number of samples processed before the model is updated.
- **Epochs**: The number of times the entire dataset is passed forward and backward through the neural network.
- **Kernel Size**: The size of the filter in the convolutional layers.
- **Number of Filters**: The number of filters in the convolutional layers.
- **Dropout Rate**: The fraction of neurons to drop during training to prevent overfitting.
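
To make the dropout rate concrete, here is a minimal NumPy sketch of "inverted" dropout, the variant in which surviving activations are scaled up by `1 / (1 - rate)` during training so that the expected activation stays the same. The `dropout` function is a hypothetical helper written for illustration; it is not the Keras `Dropout` layer.

```python
import numpy as np


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return x  # dropout is a no-op at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob  # True where a unit is kept
    return x * mask / keep_prob


rng = np.random.default_rng(42)
activations = np.ones((1000, 128), dtype="float32")
dropped = dropout(activations, rate=0.5, rng=rng)

print("Fraction zeroed:", float((dropped == 0).mean()))
print("Mean activation:", float(dropped.mean()))
```

With a rate of 0.5, roughly half of the values are zeroed and the survivors are doubled, so the mean activation stays close to 1.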

In this example, we use:
- Optimizer: Adam
- Loss function: Categorical Crossentropy
- Metrics: Accuracy
- Batch size: 128
- Epochs: 5
- Kernel size: 3x3
- Number of filters: 32 and 64
- Dropout rate: 0.5
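
Batch size, epochs, and the validation split together determine how many weight updates the model performs. The arithmetic below is a rough sketch that assumes the 60,000-image MNIST training set and the `validation_split=0.2` passed to `model.fit()` in the training cell; the variable names are illustrative.

```python
import math

n_samples = 60000        # MNIST training set size
validation_split = 0.2   # fraction held out for validation by model.fit()
batch_size = 128
epochs = 5

n_val = round(n_samples * validation_split)  # samples held out for validation
n_train = n_samples - n_val                  # samples actually used for training

# One weight update per batch; a final partial batch still counts as a step.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print("Training samples:", n_train)
print("Validation samples:", n_val)
print("Steps per epoch:", steps_per_epoch)
print("Total weight updates:", total_updates)
```

Under these assumptions each epoch consists of 375 steps, which is the step count Keras shows in its progress bar during training.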


## Model Training

In [4]:
# Train the model
history = model.fit(X_train, y_train, epochs=5, batch_size=128, validation_split=0.2)

# Visualize training process
import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0, 1])
plt.legend(loc='lower right')
plt.show()


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5



In [None]:
# Example predictions
predictions = model.predict(X_test[:5])
print("Predicted probabilities:\n", predictions)
print("Predicted classes:\n", np.argmax(predictions, axis=1))
print("Actual classes:\n", np.argmax(y_test[:5], axis=1))


## Model Evaluation

In [None]:
# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
print('\nTest accuracy:', test_acc)


## Conclusion
In this notebook, we have built and trained a Convolutional Neural Network using TensorFlow and Keras. We covered the preprocessing of data, building the model, setting hyperparameters, training, and evaluating the model.

### Further Reading
- [Deep Learning with Python by François Chollet](https://www.manning.com/books/deep-learning-with-python)
- [TensorFlow Documentation](https://www.tensorflow.org/learn)
- [Keras Documentation](https://keras.io/)

### Improvements
- Experiment with different architectures and hyperparameters.
- Use cross-validation for more robust model evaluation.
- Apply this approach to other image datasets and problem domains.
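
As a starting point for the cross-validation suggestion, the sketch below splits sample indices into k shuffled folds using only NumPy. The `k_fold_indices` helper is hypothetical and written for illustration; in practice a fresh model would need to be built and trained for each fold, which is omitted here.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k shuffled, near-equal folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        # Every fold except the i-th is used for training.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx


for fold, (train_idx, val_idx) in enumerate(k_fold_indices(60000, k=5)):
    print(f"Fold {fold}: {len(train_idx)} train, {len(val_idx)} validation")
```

Each index array can be used to select rows, for example `X_train[train_idx]` and `y_train[train_idx]`, and the per-fold validation scores can then be averaged.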
