# Task 6P

MNIST is a widely used benchmark for training and testing machine learning algorithms, particularly image classifiers. It consists of 60,000 training images and 10,000 test images, each a 28x28 grayscale image of a handwritten digit from 0 to 9.

#### Key Characteristics:

- **Size**: 70,000 images in total (60,000 for training, 10,000 for testing)
- **Image Size**: 28x28 pixels
- **Format**: Grayscale images
- **Labels**: Each image is associated with a corresponding digit label (0-9)
- **Distribution**: The dataset is relatively balanced, with approximately the same number of images for each digit.
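
The balance claim can be checked by counting labels per class. The snippet below is a minimal sketch on a hypothetical toy label list; `class_balance` is an illustrative helper, not part of Keras. With MNIST, the same function could be applied to `y_train.tolist()` before one-hot encoding.

```python
from collections import Counter

def class_balance(labels):
    """Return {label: fraction of samples} for an iterable of integer labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: counts[label] / total for label in sorted(counts)}

# Hypothetical toy labels, used only to illustrate the calculation.
toy_labels = [0, 1, 1, 2, 2, 2, 3, 3, 3, 3]
fractions = class_balance(toy_labels)
print(fractions)
```

A perfectly balanced 10-class dataset would give a fraction of 0.1 for every digit.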

#### Import libraries

In [1]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

#### Load and preprocess the MNIST dataset

In [2]:
# Load the MNIST dataset as NumPy arrays
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each 28x28 image into a 784-element vector and scale pixel values from [0, 255] to [0, 1]
x_train = x_train.reshape(-1, 784).astype('float32') / 255
x_test = x_test.reshape(-1, 784).astype('float32') / 255

# Convert class labels to one-hot encoded vectors
num_classes = 10
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

# Print the shape of the training and test sets
print("Training set shape:", x_train.shape, y_train.shape)
print("Test set shape:", x_test.shape, y_test.shape)

Training set shape: (60000, 784) (60000, 10)
Test set shape: (10000, 784) (10000, 10)
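
To make the preprocessing steps concrete, here is a small NumPy-only sketch of what flattening, scaling, and one-hot encoding do. The two-image batch and its labels are hypothetical stand-ins for MNIST data, and `np.eye` indexing is used here as an illustrative equivalent of `to_categorical`, not as its implementation.

```python
import numpy as np

# Hypothetical stand-in for a batch of two 28x28 uint8 images and their labels.
images = np.zeros((2, 28, 28), dtype=np.uint8)
images[0, 0, 0] = 255   # brightest possible pixel
images[1, 5, 5] = 51    # 51 / 255 = 0.2
labels = np.array([3, 7])

# Flatten each image to 784 values and scale to [0, 1].
flat = images.reshape(-1, 28 * 28).astype("float32") / 255

# One-hot encode: row i of the identity matrix is the vector for class i.
num_classes = 10
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(flat.shape, flat.min(), flat.max())
print(one_hot)
```

Each label becomes a length-10 vector with a single 1 at the index of its digit, which is the target format that `categorical_crossentropy` expects.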


#### Create, train, and evaluate the MLP model

In [3]:
# Single-hidden-layer MLP: 784 inputs -> 128 ReLU units -> 10-way softmax
model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(num_classes, activation='softmax')])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# The test set doubles as validation data here: Keras reports metrics on it each epoch but never trains on it
model.fit(x_train, y_train, epochs=10, batch_size=128, validation_data=(x_test, y_test))

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)



Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Test accuracy: 0.9771000146865845
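
For a sense of the model's size, the trainable parameters of a dense layer are its weights plus its biases. The sketch below works through that arithmetic for the 784-128-10 network above; `dense_params` and `mlp_params` are illustrative helper names, not Keras functions.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

def mlp_params(layer_sizes):
    """Total trainable parameters for an MLP given [input, hidden..., output] sizes."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))

# 784 inputs -> 128 hidden units -> 10 outputs
baseline = mlp_params([784, 128, 10])
print(baseline)  # 101770
```

The first layer accounts for 100,480 of those parameters, so most of the model's capacity sits between the input pixels and the hidden layer.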


#### Vary the number of hidden layers

In [4]:
# Train one model per depth, each with 100 ReLU units per hidden layer
hidden_layers = [2, 4, 6, 8, 10]
for num_layers in hidden_layers:
    # No input_shape is given, so Keras infers it from the data on the first call to fit
    model = Sequential()
    for _ in range(num_layers):
        model.add(Dense(100, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10, batch_size=128, verbose=0)
    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"Hidden layer: {num_layers} - Test accuracy: {test_acc}")

Hidden layer: 2 - Test accuracy: 0.9776999950408936
Hidden layer: 4 - Test accuracy: 0.9781000018119812
Hidden layer: 6 - Test accuracy: 0.9757000207901001
Hidden layer: 8 - Test accuracy: 0.9710999727249146
Hidden layer: 10 - Test accuracy: 0.9747999906539917
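
Each depth above was trained once, so differences of a few tenths of a percentage point are hard to separate from run-to-run variance. One way to check is to repeat every configuration over several seeds and report the mean and standard deviation. The sketch below assumes a hypothetical `train_and_eval(config, seed)` callable that would wrap the training loop above; a fake scoring function stands in for it so the example runs without TensorFlow.

```python
import statistics

def summarize_runs(train_and_eval, configs, seeds):
    """Run each config once per seed and return {config: (mean, stdev)} of the scores."""
    summary = {}
    for config in configs:
        scores = [train_and_eval(config, seed) for seed in seeds]
        summary[config] = (statistics.mean(scores), statistics.stdev(scores))
    return summary

# Hypothetical stand-in for a real training run; it returns made-up scores, not measurements.
def fake_train_and_eval(num_layers, seed):
    return 0.97 + 0.001 * seed - 0.0005 * num_layers

results = summarize_runs(fake_train_and_eval, configs=[2, 4], seeds=[0, 1, 2])
for num_layers, (mean, std) in results.items():
    print(f"{num_layers} hidden layers: {mean:.4f} +/- {std:.4f}")
```

In a real run, the callable would set the random seed, build the model, train it, and return the test accuracy. Two depths would then count as different only if their means differ by clearly more than the spread.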


#### Vary the hidden layer size

In [5]:
# Train one single-hidden-layer model per width
hidden_sizes = [50, 100, 150, 200]
for hidden_size in hidden_sizes:
    model = Sequential([
        Dense(hidden_size, activation='relu', input_shape=(784,)),
        Dense(num_classes, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=10, batch_size=128, verbose=0)
    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
    print(f"Hidden layer size: {hidden_size} - Test accuracy: {test_acc}")

Hidden layer size: 50 - Test accuracy: 0.9695000052452087
Hidden layer size: 100 - Test accuracy: 0.9775999784469604
Hidden layer size: 150 - Test accuracy: 0.978600025177002
Hidden layer size: 200 - Test accuracy: 0.979200005531311
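
The gain from each step in width can be read directly off the accuracies printed above. The snippet below is a small arithmetic sketch that uses those reported values, rounded to four decimals, to compute the improvement between consecutive sizes.

```python
# Test accuracies reported above, rounded to four decimals.
accuracy_by_width = {50: 0.9695, 100: 0.9776, 150: 0.9786, 200: 0.9792}

widths = sorted(accuracy_by_width)
gains = {
    (small, large): round(accuracy_by_width[large] - accuracy_by_width[small], 4)
    for small, large in zip(widths, widths[1:])
}
for (small, large), gain in gains.items():
    print(f"{small} -> {large} units: {gain:+.4f}")
```

Going from 50 to 100 units adds about 0.8 percentage points, while each later step of 50 units adds 0.1 points or less.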


### Key Findings

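- **Impact of number of hidden layers**: With 100 units per layer, depth had little effect. Test accuracy peaked at 4 hidden layers (0.9781), essentially tied with 2 layers (0.9777), and was slightly lower for 6, 8, and 10 layers (0.9757, 0.9711, 0.9748). These differences are under one percentage point and come from a single run per configuration, so they may partly reflect run-to-run variance. Overfitting is one possible explanation for the lower accuracy at greater depth, but training accuracy was not recorded, so these results cannot confirm it.
- **Impact of hidden layer size**: With a single hidden layer, test accuracy rose with every increase in width, from 0.9695 at 50 units to 0.9792 at 200 units, with diminishing returns beyond 100 units. No decline appeared within the tested range. Much larger layers could overfit, but this experiment did not test them.
- **Optimal configuration**: The best depth and width depend on the dataset and the problem, and are usually found through experimentation such as hyperparameter tuning. Among the configurations tested here, a single hidden layer of 200 units gave the highest test accuracy (0.9792).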
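
When tuning hyperparameters, configurations are usually compared on a validation set carved out of the training data, so that the test set is kept for a final, unbiased estimate. The sketch below shows one way to make such a split with NumPy; `split_train_val` is a hypothetical helper and the toy arrays stand in for `x_train` and `y_train`.

```python
import numpy as np

def split_train_val(x, y, val_fraction=0.1, seed=0):
    """Shuffle the sample indices and hold out a fraction of them for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])

# Hypothetical toy data: 10 samples with 2 features each, where row i starts with 2 * i.
x = np.arange(20, dtype="float32").reshape(10, 2)
y = np.arange(10)
(x_tr, y_tr), (x_val, y_val) = split_train_val(x, y, val_fraction=0.2)
print(x_tr.shape, x_val.shape)
```

Keras offers a built-in alternative through the `validation_split` argument of `model.fit`, which holds out the last fraction of the training arrays without shuffling them first.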

_____