
In [1]:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

-Define model and data parameters: It sets the number of classes (10 for the digits 0-9) and the input shape for the images (28x28 pixels with a single color channel).

-Load and preprocess the MNIST dataset: It loads the MNIST dataset, which consists of 60,000 training and 10,000 test images of handwritten digits (28x28 grayscale) with their corresponding labels. The pixel values are scaled to the range [0, 1], and each image is given a trailing channel axis so that its shape is (28, 28, 1).

-Convert labels to one-hot encoding: It converts the class labels to one-hot encoded vectors using keras.utils.to_categorical.

In [2]:
# Model / data parameters
num_classes = 10
input_shape = (28, 28, 1)

# Load the data and split it between train and test sets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)
print("x_train shape:", x_train.shape)
print(x_train.shape[0], "train samples")
print(x_test.shape[0], "test samples")


# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
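
The preprocessing steps above can be reproduced on a tiny made-up batch with NumPy alone. This is only a sketch under assumptions: `fake_images` and `fake_labels` are invented stand-ins for the MNIST arrays, and indexing into `np.eye` is used as an equivalent of `keras.utils.to_categorical` for integer labels, not as its actual implementation.

```python
import numpy as np

num_classes = 10

# Hypothetical stand-in for MNIST: 3 random 28x28 uint8 images and their labels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)
fake_labels = np.array([5, 0, 9])

# Scale pixel values from [0, 255] to [0, 1]
x = fake_images.astype("float32") / 255

# Add a trailing channel axis: (3, 28, 28) -> (3, 28, 28, 1)
x = np.expand_dims(x, -1)

# One-hot encode: row i of the identity matrix is the one-hot vector for class i
y = np.eye(num_classes, dtype="float32")[fake_labels]

print("x shape:", x.shape)
print("y shape:", y.shape)
print("one-hot for label 5:", y[0])
```

Each row of `y` contains a single 1 at the index of the class, which is the format the categorical cross-entropy loss expects.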


-Define the neural network model: The model is a sequential neural network with the following layers:

Input layer that takes the input shape (28, 28, 1).
Convolutional layer with 32 filters and ReLU activation. The cell below uses a 7x7 kernel, one of the sizes tried in place of 3x3.
Max-pooling layer with a 2x2 pool size.
Another convolutional layer with 64 filters, a 3x3 kernel, and ReLU activation.
Another max-pooling layer with a 2x2 pool size.
A flatten layer to transform the 2D feature maps into a 1D vector.
A dropout layer with a 50% dropout rate to reduce overfitting.
A dense layer with 10 units and softmax activation for class prediction.

-Model summary: It prints a summary of the model architecture.

In [6]:
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(7, 7), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_2 (Conv2D)           (None, 22, 22, 16)        800       
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 11, 11, 16)        0         
 g2D)                                                            
                                                                 
 conv2d_3 (Conv2D)           (None, 9, 9, 64)          9280      
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 4, 4, 64)          0         
 g2D)                                                            
                                                                 
 flatten_1 (Flatten)         (None, 1024)              0         
                                                                 
 dropout_1 (Dropout)         (None, 1024)             
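
The output shapes and parameter counts in a summary like the one above can be checked by hand. For an unpadded, stride-1 convolution the output side is `input - kernel + 1` and the parameter count is `kernel * kernel * channels_in * filters + filters`. The helper names below (`conv2d_valid`, `trace_model`) are hypothetical and exist only for this sketch. Note that the printed summary lists 16 filters for the first convolution, while the model cell specifies 32, so the sketch computes both cases.

```python
def conv2d_valid(size, channels_in, filters, kernel):
    """Return (output side length, parameter count) for a stride-1, unpadded Conv2D."""
    out_size = size - kernel + 1
    params = kernel * kernel * channels_in * filters + filters  # weights + biases
    return out_size, params


def trace_model(first_filters, first_kernel, num_classes=10):
    """Follow a 28x28x1 input through conv -> pool -> conv(64, 3x3) -> pool -> flatten -> dense."""
    size, conv1_params = conv2d_valid(28, 1, first_filters, first_kernel)
    conv1_shape = (size, size, first_filters)
    size //= 2  # 2x2 max pooling halves each side, rounding down
    size, conv2_params = conv2d_valid(size, first_filters, 64, 3)
    conv2_shape = (size, size, 64)
    size //= 2
    flat = size * size * 64
    dense_params = flat * num_classes + num_classes
    return {
        "conv1_shape": conv1_shape,
        "conv1_params": conv1_params,
        "conv2_shape": conv2_shape,
        "conv2_params": conv2_params,
        "flat": flat,
        "total_params": conv1_params + conv2_params + dense_params,
    }


# 16 filters reproduces the printed summary; 32 filters matches the model cell
for filters in (16, 32):
    print(filters, trace_model(first_filters=filters, first_kernel=7))
```

With 16 filters this gives a (22, 22, 16) first feature map with 800 parameters and 9280 parameters in the second convolution, as printed. With 32 filters the same layers would have 1600 and 18496 parameters.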

-Set training parameters: It defines the batch size (128) and the number of training epochs (15).

-Compile the model: The model is compiled with the categorical cross-entropy loss function, the Adam optimizer, and accuracy as the metric.

-Train the model: The model is trained on the training data for 15 epochs with a batch size of 128. 10% of the training data is held out for validation and is not used to update the weights.

In [7]:
batch_size = 128
epochs = 15

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<keras.src.callbacks.History at 0x7f2c4aebf8b0>
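
The effect of `validation_split=0.1` and `batch_size=128` on the amount of work per epoch can be worked out with simple arithmetic. This is a sketch: it assumes the 60,000 training samples reported earlier and that the held-out samples are taken from the end of the arrays before shuffling, as the Keras documentation for `validation_split` describes.

```python
import math

num_samples = 60000
validation_split = 0.1
batch_size = 128
epochs = 15

# Samples kept for training; the remainder is used for validation
num_train = int(num_samples * (1 - validation_split))
num_val = num_samples - num_train

# Number of weight updates per epoch (the last batch may be smaller)
steps_per_epoch = math.ceil(num_train / batch_size)
last_batch_size = num_train - (steps_per_epoch - 1) * batch_size
total_steps = steps_per_epoch * epochs

print("train samples:", num_train)
print("validation samples:", num_val)
print("steps per epoch:", steps_per_epoch)
print("last batch size:", last_batch_size)
print("total steps:", total_steps)
```

Under these assumptions each epoch consists of 422 batches over 54,000 samples, and the remaining 6,000 samples are only used to compute the validation metrics.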

-Evaluate the model on the test set: The script evaluates the trained model on the test data and prints the test loss and test accuracy.

In [5]:
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.031006624922156334
Test accuracy: 0.9908000230789185
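
The two reported numbers are the mean categorical cross-entropy and the fraction of samples whose highest-probability class matches the label. The NumPy sketch below illustrates both on made-up predictions. The function names and the clipping constant `eps` are assumptions for the example and are not taken from Keras internals.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -sum(y_true * log(y_pred)) over the samples."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))


def accuracy(y_true, y_pred):
    """Fraction of samples where the predicted class equals the true class."""
    return float(np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1)))


# Made-up one-hot labels and softmax outputs for 3 samples over 3 classes
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype="float32")
y_pred = np.array(
    [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.6, 0.3, 0.1]], dtype="float32"
)

print("loss:", categorical_crossentropy(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
```

The third sample is misclassified with low probability on the true class, so it dominates the loss even though two of the three predictions are correct.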


Changing the kernel size to 5x5 or 7x7 increases the accuracy.



Increasing the Number of Epochs:

Training for more epochs gives the model more passes over the training data, so the training loss typically keeps decreasing and the training accuracy keeps improving.
However, training for many more epochs without monitoring validation performance can lead to overfitting: the model starts to memorize the training data and generalizes worse to unseen data.
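
One common way to keep the number of epochs under control is early stopping: training ends once the validation loss has stopped improving for a few epochs. Keras offers this as `keras.callbacks.EarlyStopping` (with parameters such as `monitor`, `patience`, and `restore_best_weights`). The framework-free sketch below only illustrates the patience logic. The function name and the validation losses are made up for the example and are not results from this notebook.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch), both 1-based.

    stop_epoch is None if the patience limit is never reached.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Illustrative values only: the loss improves until epoch 4, then rises
illustrative_val_losses = [0.20, 0.12, 0.09, 0.08, 0.085, 0.09, 0.10]
stop_epoch, best_epoch = early_stopping_epoch(illustrative_val_losses, patience=2)
print("stop at epoch:", stop_epoch, "best epoch:", best_epoch)
```

With these illustrative losses and a patience of 2, training would stop at epoch 6, and the best validation loss was reached at epoch 4.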


Increasing the Number of Filters:

Increasing the number of filters can allow the network to capture more complex and fine-grained features in the input data.
It increases the network's capacity to learn from the data, which can be beneficial for tasks with intricate or detailed patterns.
However, using too many filters can also increase the number of model parameters and computation, potentially leading to longer training times and an increased risk of overfitting, especially if the dataset is small.
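
To make the cost of extra filters concrete, the sketch below counts the parameters of the two convolutional layers for different filter counts in the first layer. It assumes the layer layout used in this notebook (a 7x7 convolution on a single-channel input followed by a 3x3 convolution with 64 filters). The function name `conv_params` is hypothetical.

```python
def conv_params(kernel, channels_in, filters):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * channels_in * filters + filters


conv_totals = {}
for first_filters in (16, 32, 64):
    conv1 = conv_params(7, 1, first_filters)   # first layer sees 1 input channel
    conv2 = conv_params(3, first_filters, 64)  # second layer sees first_filters channels
    conv_totals[first_filters] = conv1 + conv2
    print(first_filters, "filters ->", conv1, "+", conv2, "=", conv1 + conv2)
```

Doubling the filters in the first layer roughly doubles the parameters of both layers, because the second convolution has one set of 3x3 weights per input channel.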