# Convolutional Neural Network (CNN) for MNIST Classification

## Introduction
In this notebook, we implement a Convolutional Neural Network (CNN) to classify the MNIST dataset. MNIST is a widely used machine learning benchmark of 70,000 grayscale images of handwritten digits (0 to 9), each 28x28 pixels, split into 60,000 training images and 10,000 test images.

## Objective
The main objective of this notebook is to demonstrate the effectiveness of CNNs in classifying handwritten digits and to provide a step-by-step guide to building and training a CNN model with the Keras API, which commonly runs on top of the TensorFlow library.

## Author
- **Jose Ruben Garcia Garcia**
- March 2024

## Reference
- Practical Machine Learning Python Problems Solver


# Creation

## Model Architecture and Parameters
The CNN model consists of several layers, each serving a specific purpose:

- **Convolutional Layers (`Conv2D`)**: Two convolutional layers with 32 and 64 filters, respectively, each using 3x3 kernels, extract features from the input images. Both layers use the ReLU activation function to introduce non-linearity.
  
- **MaxPooling Layer (`MaxPooling2D`)**: A single max-pooling layer with a 2x2 pool size follows the second convolutional layer. It downsamples the feature maps and reduces computational cost.
  
- **Dropout Layers (`Dropout`)**: Dropout layers reduce overfitting by randomly deactivating units during training. A rate of 0.25 is applied after the max-pooling layer and a rate of 0.5 after the first fully connected layer.
  
- **Fully Connected Layers (`Dense`)**: After flattening, a 128-unit fully connected layer with ReLU activation is followed by a 10-unit output layer. The output layer uses the softmax activation function to produce a probability for each digit class.
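
As a rough check on the architecture described above, the following sketch traces the tensor shape and parameter count layer by layer using plain Python arithmetic. It assumes 3x3 kernels, stride 1, "valid" padding, and a 128-unit hidden dense layer, matching the code cell in this notebook. The helper names (`conv2d`, `max_pool`, `dense`, `trace_model`) are hypothetical and exist only to illustrate the arithmetic; they are not Keras functions.

```python
# Illustrative shape/parameter arithmetic; all helper names are hypothetical.
# Assumes stride 1 and 'valid' padding for the convolutions.
# Dropout layers are omitted: they change neither shapes nor parameter counts.

def conv2d(shape, filters, kernel=3):
    """Return (output_shape, n_params) for a valid-padded, stride-1 convolution."""
    h, w, c = shape
    out_shape = (h - kernel + 1, w - kernel + 1, filters)
    n_params = kernel * kernel * c * filters + filters  # weights + biases
    return out_shape, n_params

def max_pool(shape, pool=2):
    """Return (output_shape, 0); pooling has no trainable parameters."""
    h, w, c = shape
    return (h // pool, w // pool, c), 0

def dense(n_inputs, units):
    """Return (output_shape, n_params) for a fully connected layer."""
    return (units,), n_inputs * units + units  # weights + biases

def trace_model(input_shape=(28, 28, 1), num_classes=10):
    layers = []
    shape, n = conv2d(input_shape, 32)
    layers.append(("Conv2D(32)", shape, n))
    shape, n = conv2d(shape, 64)
    layers.append(("Conv2D(64)", shape, n))
    shape, n = max_pool(shape)
    layers.append(("MaxPooling2D", shape, n))
    flat = shape[0] * shape[1] * shape[2]
    layers.append(("Flatten", (flat,), 0))
    shape, n = dense(flat, 128)
    layers.append(("Dense(128)", shape, n))
    shape, n = dense(128, num_classes)
    layers.append((f"Dense({num_classes})", shape, n))
    total = sum(n for _, _, n in layers)
    return layers, total

layers, total = trace_model()
for name, shape, n in layers:
    print(f"{name:<14} {str(shape):<14} {n:>9,}")
print("Total parameters:", total)
```

Under these assumptions the flattened feature vector has 9,216 entries, so the 128-unit dense layer accounts for most of the roughly 1.2 million trainable parameters.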

## Model Compilation and Training
The model is compiled with categorical cross-entropy loss and the Adadelta optimizer, and accuracy is tracked as a metric. Training runs for 9 epochs with a batch size of 128. After each epoch, performance is reported on held-out data to monitor generalization; in the code cell at the end of this notebook, the test set serves as this validation data.
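
To make the loss concrete, here is a minimal pure-Python sketch of softmax followed by categorical cross-entropy for a single sample, plus the number of batches one epoch takes at the stated batch size. The function names and the example scores are made up for illustration, and the `eps` clipping is an assumption of this sketch rather than a statement about how Keras implements the loss internally.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def one_hot(label, num_classes=10):
    """Encode an integer label as a one-hot vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # eps guards against log(0); the exact value is an assumption of this sketch
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

# Hypothetical raw scores for one image whose true label is 3
logits = [0.0] * 10
logits[3] = 2.0
probs = softmax(logits)
loss = categorical_crossentropy(one_hot(3), probs)

print(f"p(class 3) = {probs[3]:.4f}, loss = {loss:.4f}")
print("batches per epoch:", steps_per_epoch(60000, 128))
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the true class, so it falls toward zero as that probability approaches 1.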

## Evaluation
After training, the model is evaluated on the test dataset to assess its performance on data it was not trained on, using two metrics: test loss and test accuracy. Test loss (categorical cross-entropy) measures how far the predicted probabilities are from the true labels, so it reflects the confidence of the predictions as well as their correctness. Accuracy measures the proportion of correctly classified images. Together, these metrics indicate how well the model generalizes and where further improvement may be needed.
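
The accuracy metric can be sketched in a few lines of plain Python: take the class with the highest predicted probability for each sample and count how often it matches the true label. The labels and probabilities below are made up for illustration, and the function names are hypothetical.

```python
def predicted_class(probs):
    """Index of the largest probability (argmax)."""
    return max(range(len(probs)), key=lambda i: probs[i])

def accuracy(y_true, y_prob):
    """Fraction of samples whose argmax prediction matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_prob) if predicted_class(p) == t)
    return correct / len(y_true)

# Made-up integer labels and 3-class probability vectors
labels = [0, 2, 1, 2]
probs = [
    [0.8, 0.1, 0.1],  # predicts 0 (correct)
    [0.2, 0.5, 0.3],  # predicts 1 (wrong, true label is 2)
    [0.1, 0.7, 0.2],  # predicts 1 (correct)
    [0.3, 0.3, 0.4],  # predicts 2 (correct)
]

print("accuracy:", accuracy(labels, probs))
```

Accuracy treats the confident first prediction and the marginal fourth one identically, which is why the loss is reported alongside it.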

## Conclusion
Developing this Convolutional Neural Network (CNN) for MNIST classification yields several conclusions about image classification with deep learning. First, CNN architectures are effective at extracting hierarchical features from images. Second, data preparation matters: this notebook scales pixel values to the range [0, 1], and techniques such as data augmentation can further improve performance. Third, hyperparameters and optimization choices, such as the optimizer, batch size, number of epochs, and dropout rates, affect convergence and generalization, which calls for systematic experimentation and fine-tuning. Overall, this project not only demonstrates my proficiency in implementing machine learning techniques but also underscores the importance of a methodical approach and critical analysis in model development.



In [None]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

# Parameters
batch_size = 128
num_classes = 10
epochs = 9

# Input image dimensions
img_rows, img_cols = 28, 28

# Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Add an explicit channel axis (1 channel, grayscale) in the position the backend expects
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# Convert integer class labels (0-9) to one-hot vectors
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# Define the CNN model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

# Compile the model
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

# Train the model (note: the test set is reused here as validation data)
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
