MNIST Website: https://yann.lecun.com/exdb/mnist/

## Fully Connected Neural Network, Steps:
1. Flatten each input image from 2D to 1D (28 x 28 pixels become a 784-element vector)
2. Normalize the pixel values to the range [0, 1] (divide by 255)
3. One-hot encode the class labels (digits 0 to 9)
4. Build a model architecture (Sequential) with Dense (fully connected) layers
5. Train the model and make predictions
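
The first three steps can be sketched in plain NumPy on a small synthetic batch. This is only an illustration: the `preprocess` helper and the random stand-in data are assumptions and do not appear in the notebook, which applies `reshape`, division by 255, and `to_categorical` to the real MNIST arrays.

```python
import numpy as np

def preprocess(images, labels, n_classes=10):
    """Hypothetical helper: flatten, scale to [0, 1], and one-hot encode."""
    # Step 1: flatten each (height, width) image into one row vector
    flat = images.reshape(images.shape[0], -1).astype("float32")
    # Step 2: scale 8-bit pixel values from [0, 255] to [0, 1]
    flat /= 255.0
    # Step 3: one-hot encode integer labels by picking rows of an identity matrix
    one_hot = np.eye(n_classes, dtype="float32")[labels]
    return flat, one_hot

# Synthetic stand-in for MNIST: 5 random 28x28 uint8 images with labels in 0-9
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 1, 3])

X, Y = preprocess(images, labels)
print(X.shape, Y.shape)  # (5, 784) (5, 10)
```

Taking `argmax` along axis 1 of the one-hot matrix recovers the original integer labels, which is a quick way to sanity-check the encoding.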

In [9]:
!pip install -q np_utils




In [12]:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense  # only Dense layers are used in this section
from tensorflow.keras.utils import to_categorical

In [13]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [14]:
print("X_train shape", X_train.shape)
print("y_train shape", y_train.shape)
print("X_test shape", X_test.shape)
print("y_test shape", y_test.shape)

X_train shape (60000, 28, 28)
y_train shape (60000,)
X_test shape (10000, 28, 28)
y_test shape (10000,)


In [15]:
# Flattening the images from 28x28 pixels to 1D vectors of 784 pixels
X_train = X_train.reshape(60000, 784)
X_test = X_test.reshape(10000, 784)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

In [16]:
# Normalizing the data
X_train /= 255
X_test /= 255

In [17]:
# One-hot encoding the labels with to_categorical from tensorflow.keras.utils
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = to_categorical(y_train, n_classes)
Y_test = to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)

Shape before one-hot encoding:  (60000,)
Shape after one-hot encoding:  (60000, 10)


In [18]:
# Building a linear stack of layers with the sequential model
model = Sequential()

In [19]:
# Hidden layer
model.add(Dense(100, input_shape = (784,), activation = 'relu'))

In [20]:
# Output layer
model.add(Dense(10, activation = 'softmax'))

In [21]:
# Model summary
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 100)               78500     
                                                                 
 dense_1 (Dense)             (None, 10)                1010      
                                                                 
Total params: 79510 (310.59 KB)
Trainable params: 79510 (310.59 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
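
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The `dense_params` helper below is a hypothetical name introduced only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 100)  # hidden layer: 784 inputs, 100 units
output = dense_params(100, 10)   # output layer: 100 inputs, 10 units
total = hidden + output

print(hidden, output, total)  # 78500 1010 79510
# Each float32 parameter takes 4 bytes
print(round(total * 4 / 1024, 2), "KB")  # 310.59 KB
```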


In [22]:
# Compiling the model
model.compile(loss = 'categorical_crossentropy', metrics = ['accuracy'], optimizer = 'adam')
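
To make the loss concrete, here is a minimal NumPy sketch of a softmax output followed by categorical cross-entropy. It is an illustration under simplifying assumptions (three classes, hand-picked logits, a fixed clipping constant `eps`), not the Keras implementation, which may differ in numerical details.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Batch mean of -sum(y_true * log(y_pred)); clipping avoids log(0)
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Illustrative values only: 2 samples, 3 classes
logits = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1))  # each row sums to 1
print(loss)
```

Because the targets are one-hot, only the predicted probability of the true class contributes to each sample's loss, so the loss falls as that probability approaches 1.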

In [23]:
# Training the model for N epochs
N = 10
model.fit(X_train, Y_train, batch_size = 128, epochs = N, validation_data = (X_test, Y_test))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x2353c25e4f0>
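
Step 5 also mentions making predictions, which the cells above stop short of. In the notebook the class probabilities would come from something like `model.predict(X_test)`; the sketch below uses a small hand-written probability array instead (an assumption, so that it runs without a trained model) to show how softmax outputs become class labels and an accuracy figure.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples over 3 classes
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])
y_true = np.array([1, 0, 1, 1])  # hypothetical integer labels

y_pred = probs.argmax(axis=1)                # most probable class per row
accuracy = float((y_pred == y_true).mean())  # fraction of correct predictions

print(y_pred)    # [1 0 2 1]
print(accuracy)  # 0.75
```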

## CNN model

One major advantage of convolutional networks (ConvNets) over fully connected networks is that the input images do not need to be flattened to 1D: convolutional layers operate directly on 2D image data, which preserves the spatial relationships between neighbouring pixels.
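
A small NumPy illustration of what flattening does to spatial structure: two pixels that are vertical neighbours in a 28x28 image end up 28 positions apart in the flattened vector, and a Dense layer has no built-in notion that they are adjacent. The pixel coordinates below are arbitrary.

```python
import numpy as np

width = 28
# Label every pixel of a 28x28 grid with its position in the flattened vector
flat_index = np.arange(28 * width).reshape(28, width)

row, col = 10, 10                 # arbitrary pixel
here = flat_index[row, col]
right = flat_index[row, col + 1]  # horizontal neighbour
below = flat_index[row + 1, col]  # vertical neighbour

print(right - here)  # 1
print(below - here)  # 28
```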

In [25]:
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPool2D, Flatten
from tensorflow.keras.utils import to_categorical

In [26]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [27]:
# Reshaping the 28x28 images to (28, 28, 1) by adding a channel dimension (1 = grayscale)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
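
The reshape above only appends a trailing channel axis; the pixel data is unchanged. A quick check on a dummy batch (the array `batch` is a stand-in, not the MNIST data) shows it is equivalent to `np.expand_dims`:

```python
import numpy as np

# Dummy batch of 4 grayscale 28x28 images filled with distinct values
batch = np.arange(4 * 28 * 28, dtype="float32").reshape(4, 28, 28)

with_channel = batch.reshape(batch.shape[0], 28, 28, 1)  # as in the cell above
expanded = np.expand_dims(batch, axis=-1)                # equivalent alternative

print(with_channel.shape)                      # (4, 28, 28, 1)
print(np.array_equal(with_channel, expanded))  # True
```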

In [28]:
# Normalizing the data
X_train /= 255
X_test /= 255

In [29]:
# One-hot encoding the labels with to_categorical from tensorflow.keras.utils
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = to_categorical(y_train, n_classes)
Y_test = to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)

Shape before one-hot encoding:  (60000,)
Shape after one-hot encoding:  (60000, 10)


In [30]:
# Building a linear stack of layers with the sequential model
model = Sequential()

In [31]:
# Convolutional layer
model.add(Conv2D(25, kernel_size = (3, 3), strides = (1, 1), padding = 'valid', activation = 'relu', input_shape = (28, 28, 1)))
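# Note: pool_size = (1, 1) leaves the 26x26 feature maps unchanged; (2, 2) would halve each spatial dimension
model.add(MaxPool2D(pool_size = (1, 1)))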
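
The spatial sizes produced by these two layers follow from the standard sliding-window formula. The sketch below uses a hypothetical helper, `window_output_size`, and assumes the pooling stride equals the pool size, which is the Keras default when `strides` is not given.

```python
def window_output_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

conv = window_output_size(28, kernel=3, stride=1)        # 'valid' padding means padding=0
pool_1x1 = window_output_size(conv, kernel=1, stride=1)  # pool_size (1, 1), as used here
pool_2x2 = window_output_size(conv, kernel=2, stride=2)  # pool_size (2, 2), for comparison

print(conv, pool_1x1, pool_2x2)  # 26 26 13
```

With a 1x1 window the pooling layer passes its input through unchanged, so the Flatten layer that follows receives 26 * 26 * 25 values.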

In [32]:
# Flatten output of convolutional layer
model.add(Flatten())

In [33]:
# Hidden layer
model.add(Dense(100, activation = 'relu'))

In [34]:
# Output layer
model.add(Dense(10, activation = 'softmax'))

In [35]:
# Compiling the model
model.compile(loss = 'categorical_crossentropy', metrics = ['accuracy'], optimizer = 'adam')

In [36]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 25)        250       
                                                                 
 max_pooling2d (MaxPooling2  (None, 26, 26, 25)        0         
 D)                                                              
                                                                 
 flatten (Flatten)           (None, 16900)             0         
                                                                 
 dense_2 (Dense)             (None, 100)               1690100   
                                                                 
 dense_3 (Dense)             (None, 10)                1010      
                                                                 
Total params: 1691360 (6.45 MB)
Trainable params: 1691360 (6.45 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
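
The counts in this summary can be reproduced by hand. A Conv2D layer has one kernel of size height x width x input channels per filter, plus one bias per filter; the Dense layers are counted as before. Both helper functions are hypothetical names used only for this check.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """One kernel per filter plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

conv = conv2d_params(3, 3, 1, 25)       # 3x3 kernels, 1 input channel, 25 filters
flat = 26 * 26 * 25                     # size of the flattened feature maps
hidden = dense_params(flat, 100)
output = dense_params(100, 10)
total = conv + hidden + output

print(conv, flat, hidden, output)  # 250 16900 1690100 1010
print(total)                       # 1691360
```

Almost all of the parameters sit in the first Dense layer, because it is connected to every one of the 16900 flattened activations.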

In [37]:
# Training the model for E epochs
E = 10
model.fit(X_train, Y_train, batch_size = 128, epochs = E, validation_data = (X_test, Y_test))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x23539ea3820>