
# Convolutional Neural Networks
You should build an end-to-end machine learning pipeline using a convolutional neural network model. In particular, you should do the following:
- Load the `mnist` dataset using [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). You can find this dataset in the datasets folder.
- Split the dataset into training and test sets using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).
- Build an end-to-end machine learning pipeline, including a [convolutional neural network](https://keras.io/examples/vision/mnist_convnet/) model.
- Optimize your pipeline by validating your design decisions.
- Test the best pipeline on the test set and report various [evaluation metrics](https://scikit-learn.org/0.15/modules/model_evaluation.html).  
- Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.
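
One way to validate design decisions without touching the test set is to carve a separate validation split out of the training data. The sketch below is only an illustration under stated assumptions: `X_demo` and `y_demo` are hypothetical synthetic arrays, not the course data, and the 60/20/20 proportions are an arbitrary choice. It calls `train_test_split` twice and passes `stratify` so that class proportions stay similar in every split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 1000 flattened 28x28 "images", 10 balanced classes.
rng = np.random.default_rng(0)
X_demo = rng.random((1000, 784))
y_demo = np.repeat(np.arange(10), 100)

# First split: hold out 20% as the final test set.
X_trainval, X_test_demo, y_trainval, y_test_demo = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# Second split: 25% of the remaining 80% becomes the validation set (20% overall).
X_train_demo, X_val_demo, y_train_demo, y_val_demo = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval, random_state=42
)

print(X_train_demo.shape, X_val_demo.shape, X_test_demo.shape)
```

With a split like this, the validation arrays could be passed as `validation_data` to `model.fit`, leaving the test arrays for a single final evaluation.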

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical
import numpy as np

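# Load the CSV (the /content path is Colab's working directory).
df = pd.read_csv('/content/mnist.csv')

# Features are the pixel columns; 'id' and the 'class' label are dropped.
X = df.drop(['id', 'class'], axis=1)
y = df['class']

# Scale pixel intensities to the [0, 1] range.
X = X / 255.0

image_height = 28
image_width = 28

# Reshape flat pixel rows into (samples, height, width, channels) for Conv2D.
X = X.values.reshape(-1, image_height, image_width, 1)

# One-hot encode the labels for the categorical cross-entropy loss.
num_classes = len(y.unique())
y_categorical = to_categorical(y, num_classes=num_classes)

# Hold out 20% of the samples as the test set.
X_train, X_test, y_train_categorical, y_test_categorical = train_test_split(
    X, y_categorical, test_size=0.2, random_state=42
)

# Pixels were already scaled above; these aliases keep the names used in later cells.
X_train_normalized = X_train
X_test_normalized = X_test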

print(f'X_train_normalized shape: {X_train_normalized.shape}')
print(f'y_train_categorical shape: {y_train_categorical.shape}')
print(f'X_test_normalized shape: {X_test_normalized.shape}')
print(f'y_test_categorical shape: {y_test_categorical.shape}')
batch_size = 128
epochs = 20


X_train_normalized shape: (3200, 28, 28, 1)
y_train_categorical shape: (3200, 10)
X_test_normalized shape: (800, 28, 28, 1)
y_test_categorical shape: (800, 10)


In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(image_height, image_width, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(num_classes, activation='softmax')
])

model.summary()



In [None]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(
    X_train_normalized, y_train_categorical,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(X_test_normalized, y_test_categorical)
)

Epoch 1/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 5s 155ms/step - accuracy: 0.3303 - loss: 1.9883 - val_accuracy: 0.8413 - val_loss: 0.6316
Epoch 2/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 4s 143ms/step - accuracy: 0.7942 - loss: 0.6605 - val_accuracy: 0.9200 - val_loss: 0.2995
Epoch 3/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 4s 165ms/step - accuracy: 0.8905 - loss: 0.3427 - val_accuracy: 0.9275 - val_loss: 0.2348
Epoch 4/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 6s 213ms/step - accuracy: 0.9156 - loss: 0.2841 - val_accuracy: 0.9550 - val_loss: 0.1675
Epoch 5/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 4s 177ms/step - accuracy: 0.9390 - loss: 0.2114 - val_accuracy: 0.9525 - val_loss: 0.1522
Epoch 6/20
25/25 ━━━━━━━━━━━━━━━━━━━━ 2s 98ms/step - accuracy: 0.9498 - loss: 0.1798 - val_accuracy: 0.9538 - val_loss: 0.1259
Epoch 7/20
25/25

In [None]:
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

loss, accuracy = model.evaluate(X_test_normalized, y_test_categorical, verbose=0)
print(f'Test Loss: {loss:.4f}')
print(f'Test Accuracy: {accuracy:.4f}')

predictions = model.predict(X_test_normalized)

y_predicted_labels = np.argmax(predictions, axis=1)
y_true_labels = np.argmax(y_test_categorical, axis=1)

print('\nClassification Report:')
print(classification_report(y_true_labels, y_predicted_labels))

print('\nConfusion Matrix:')
print(confusion_matrix(y_true_labels, y_predicted_labels))

Test Loss: 0.0958
Test Accuracy: 0.9675
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step

Classification Report:
              precision    recall  f1-score   support

           0       0.97      1.00      0.99        70
           1       0.98      1.00      0.99       100
           2       0.97      0.90      0.94        73
           3       0.99      0.93      0.96        86
           4       0.94      0.96      0.95        80
           5       0.97      0.98      0.98        64
           6       1.00      0.98      0.99        90
           7       0.99      0.99      0.99        67
           8       0.93      0.96      0.94        94
           9       0.95      0.97      0.96        76

    accuracy                           0.97       800
   macro avg       0.97      0.97      0.97       800
weighted avg       0.97      0.97      0.97       800


Confusion Matrix:
[[ 70   0   0   0   0   0   0   0   0   0]
 [  0 100   0   0   0   0   0   0   0   0
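
The per-class precision and recall in the classification report can be derived directly from a confusion matrix. The sketch below illustrates the arithmetic on a small made-up 3-class matrix (the values in `cm` are hypothetical, not taken from this notebook's output), using the scikit-learn convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted class).
cm = np.array([
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # column sums = number of predictions per class
recall = true_positives / cm.sum(axis=1)     # row sums = number of true samples per class (support)
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_positives.sum() / cm.sum()

for label, (p, r, f) in enumerate(zip(precision, recall, f1)):
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```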

In [None]:
score = model.evaluate(X_test_normalized, y_test_categorical, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.0957527756690979
Test accuracy: 0.9674999713897705


## Summary:

### Data Analysis Key Findings

*   **Data Preparation:** The `mnist.csv` dataset (4,000 samples) was loaded with Pandas and separated into features (the pixel columns, with `id` dropped) and target (`class`). Pixel values were scaled to the [0, 1] range and reshaped to 28x28x1, and the labels were one-hot encoded for multiclass classification. The data was then split 80/20 into 3,200 training and 800 test samples with `random_state=42`.
*   **CNN Model Architecture:** A Keras Sequential CNN model was constructed with two `Conv2D` layers (32 and 64 filters) followed by `MaxPooling2D` layers, a `Flatten` layer, a `Dense` layer (128 units, 'relu' activation) with a `Dropout` layer (0.5), and a final `Dense` output layer with 'softmax' activation for 10 classes. The model had a total of 225,034 parameters.
*   **Training Process:** The model was compiled with the `adam` optimizer, the `categorical_crossentropy` loss, and the `accuracy` metric. It was trained for 20 epochs with a batch size of 128 (25 steps per epoch) on `X_train_normalized` and `y_train_categorical`. The test split (`X_test_normalized`, `y_test_categorical`) was also passed as `validation_data`, so the per-epoch validation metrics were computed on the same samples used for the final evaluation.
*   **Final Evaluation Results:**
    *   The model achieved a **Test Loss of 0.096** and a **Test Accuracy of 0.9675** on the 800-sample test split. Because this split also served as validation data during training, it was not strictly unseen.
    *   The classification report showed high performance across all 10 digit classes, with most precision, recall, and F1-scores at or above 0.95 and several reaching 0.99 or 1.00. The weakest figures were a recall of 0.90 for digit 2 and a precision of 0.93 for digit 8.
    *   The macro average and weighted average F1-scores were both **0.97**, indicating that performance was consistent across classes whose support ranged from 64 to 100 samples.
    *   The confusion matrix confirmed few misclassifications: an accuracy of 0.9675 corresponds to 26 errors out of 800 test samples.

### Insights or Next Steps

*   The current CNN model reaches 96.75% test accuracy on the provided `mnist.csv` data (4,000 samples). For deployment, consider optimizing the model for inference speed and size using techniques like quantization or pruning, especially if target environments have resource constraints.
*   To further validate the model's robustness, it would be beneficial to test its performance on a more diverse dataset of handwritten digits, or introduce slight image augmentations (e.g., rotations, shifts) during training to improve generalization to real-world variations.
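
As a rough illustration of the shift augmentation mentioned above, the following NumPy-only sketch pads each image with zeros and crops a randomly offset 28x28 window. It is a simplified stand-in rather than the notebook's method: `random_shift` and its parameters are hypothetical names, and in a Keras pipeline preprocessing layers such as `RandomTranslation` or `RandomRotation` would be a more typical choice.

```python
import numpy as np


def random_shift(images, max_shift=2, seed=None):
    """Shift each image by up to max_shift pixels per axis, zero-filling the vacated border."""
    rng = np.random.default_rng(seed)
    n, h, w, _ = images.shape
    # Zero-pad height and width so that a shifted crop never reads out of bounds.
    padded = np.pad(
        images, ((0, 0), (max_shift, max_shift), (max_shift, max_shift), (0, 0))
    )
    shifted = np.empty_like(images)
    for i in range(n):
        # An offset equal to max_shift reproduces the original image.
        dy, dx = rng.integers(0, 2 * max_shift + 1, size=2)
        shifted[i] = padded[i, dy:dy + h, dx:dx + w, :]
    return shifted


# Hypothetical batch: four blank 28x28x1 images with a 4x4 bright square in the middle.
batch = np.zeros((4, 28, 28, 1), dtype=np.float32)
batch[:, 12:16, 12:16, :] = 1.0

augmented = random_shift(batch, max_shift=2, seed=0)
print(augmented.shape)
```

Applying a function like this to each training batch (and never to the test data) exposes the model to slightly displaced digits without changing the labels.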
