Train a fully connected neural network (multilayer perceptron) to classify handwritten digits using the MNIST dataset by building a simple neural network, experimenting with activation functions, and tuning hyperparameters.

1. Objective

Train a fully connected neural network (MLP) on the MNIST dataset for digit classification (0–9).
You will:

Load the MNIST dataset

Build an MLP-based network architecture

Try different activation functions

Tune key hyperparameters

Train and evaluate the model

2. Import Required Libraries

In [1]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

3. Load & Preprocess MNIST Dataset

In [2]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values (0-255 → 0-1)
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
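
As a quick check on what these two preprocessing steps do, here is a minimal NumPy sketch on a small synthetic batch. The random `fake_images` and the `fake_labels` array are stand-ins for MNIST, not real data, and `np.eye(10)[labels]` is used as an assumed equivalent of `to_categorical(labels, 10)` for integer labels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a batch of four 28x28 grayscale images (values 0-255)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 3])

# Normalize: dividing by 255.0 converts uint8 pixels to floats in [0, 1]
scaled = fake_images / 255.0

# One-hot encode: row i of the identity matrix is the encoding of class i
one_hot = np.eye(10)[fake_labels]

print(scaled.dtype, float(scaled.min()) >= 0.0, float(scaled.max()) <= 1.0)
print(one_hot.shape)
print(one_hot[0])  # a single 1.0 at index 3, zeros elsewhere
```

Scaling keeps the inputs in a small, consistent range, and the one-hot labels match the ten-unit softmax output used with `categorical_crossentropy` later on.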



4. Build a Simple MLP Model

The model has three layers:

Flatten layer (28×28 → 784)

Dense(128) – hidden layer with ReLU activation

Dense(10) – output layer (softmax, so the ten outputs form a probability distribution over the digits)
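
The three layers above amount to the following forward pass. This is a NumPy sketch with randomly initialised weights (the names `W1`, `b1`, `W2`, `b2` and `forward` are illustrative, not Keras internals), so the outputs are meaningless as predictions; it only shows the shapes involved and the softmax normalisation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialised parameters (illustrative; Keras uses its own initialisers)
W1 = rng.normal(0.0, 0.05, size=(784, 128))
b1 = np.zeros(128)
W2 = rng.normal(0.0, 0.05, size=(128, 10))
b2 = np.zeros(10)

def forward(images):
    """Flatten -> Dense(128, relu) -> Dense(10, softmax)."""
    x = images.reshape(len(images), -1)            # (batch, 28, 28) -> (batch, 784)
    h = np.maximum(0.0, x @ W1 + b1)               # hidden layer with ReLU
    logits = h @ W2 + b2                           # (batch, 10)
    logits = logits - logits.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)    # softmax: each row sums to 1

batch = rng.random((5, 28, 28))                    # synthetic inputs already in [0, 1]
probs = forward(batch)
print(probs.shape)        # (5, 10)
print(probs.sum(axis=1))  # each entry is 1 up to rounding
```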

In [3]:
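model = Sequential([
    tf.keras.Input(shape=(28, 28)),    # declare the input shape explicitly
    Flatten(),                         # 28×28 image → 784-element vector
    Dense(128, activation='relu'),     # hidden layer
    Dense(10, activation='softmax')    # one probability per digit class
])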

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

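
The `loss` and `metrics` arguments passed to `compile` can be illustrated by hand. This is a minimal sketch, assuming one-hot labels and rows of predicted probabilities; the three example rows use only three classes for brevity and are made up, and the small `eps` clip is an assumption added to avoid `log(0)`.

```python
import numpy as np

# Made-up example: three samples, three classes
y_true = np.array([[0, 0, 1],
                   [1, 0, 0],
                   [0, 1, 0]], dtype=float)
y_pred = np.array([[0.1, 0.2, 0.7],
                   [0.8, 0.1, 0.1],
                   [0.5, 0.3, 0.2]])

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(true * log(pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    """Fraction of samples whose most probable class matches the label."""
    return float(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(round(loss, 4), round(acc, 4))
```

The third sample is misclassified (the highest probability is on class 0 while the label is class 1), so it contributes the largest loss term and lowers the accuracy to two out of three.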


5. Train the Model

In [4]:
history = model.fit(x_train, y_train,
                    validation_split=0.1,
                    epochs=10,
                    batch_size=32)

Epoch 1/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 9s 5ms/step - accuracy: 0.8686 - loss: 0.4675 - val_accuracy: 0.9627 - val_loss: 0.1332
Epoch 2/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 7s 4ms/step - accuracy: 0.9629 - loss: 0.1293 - val_accuracy: 0.9710 - val_loss: 0.0962
Epoch 3/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9760 - loss: 0.0798 - val_accuracy: 0.9725 - val_loss: 0.0936
Epoch 4/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 7s 4ms/step - accuracy: 0.9823 - loss: 0.0603 - val_accuracy: 0.9745 - val_loss: 0.0853
Epoch 5/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9865 - loss: 0.0451 - val_accuracy: 0.9747 - val_loss: 0.0862
Epoch 6/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9888 - loss: 0.0363 - val_accuracy: 0.9752 - val_loss: 0.0838
Epoch 7/10
...

6. Evaluate the Model

In [5]:
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_acc)

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9770 - loss: 0.0974
Test Accuracy: 0.9793000221252441
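
The step counts shown in the training and evaluation logs (1688 and 313) follow from the dataset sizes. A small sketch of that arithmetic, assuming the standard MNIST split of 60,000 training and 10,000 test images and an evaluation batch size of 32 (the Keras default when none is given):

```python
import math

n_train_total = 60_000     # assumed size of the MNIST training set
n_test = 10_000            # assumed size of the MNIST test set
validation_split = 0.1     # as passed to model.fit
batch_size = 32            # as passed to model.fit; also assumed for evaluate

n_val = round(n_train_total * validation_split)   # samples held out for validation
n_fit = n_train_total - n_val                     # samples actually used for weight updates

train_steps = math.ceil(n_fit / batch_size)       # batches per training epoch
eval_steps = math.ceil(n_test / batch_size)       # batches in model.evaluate

print(n_fit, n_val, train_steps, eval_steps)      # 54000 6000 1688 313
```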


7. Experiment With Different Activation Functions

In [6]:
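from tensorflow.keras.layers import LeakyReLU

model = Sequential([
    tf.keras.Input(shape=(28, 28)),
    Flatten(),
    Dense(256),                        # no built-in activation; LeakyReLU follows as its own layer
    LeakyReLU(alpha=0.1),              # slope 0.1 for negative inputs (Keras 3 names this argument negative_slope)
    Dense(128, activation='tanh'),
    Dense(10, activation='softmax')
])

# Compile before training, as in section 4
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])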
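
For reference, the hidden-layer activations used in this notebook can be written in a few lines of NumPy. This is an illustrative sketch (the function names are not Keras APIs); the 0.1 slope matches the value passed to `LeakyReLU` above.

```python
import numpy as np

def relu(x):
    # Zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.1):
    # Like ReLU, but negative inputs keep a small slope instead of being zeroed
    return np.where(x > 0, x, alpha * x)

def tanh(x):
    # Smooth, zero-centred squashing into (-1, 1)
    return np.tanh(x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))
print(leaky_relu(x))
print(tanh(x))
```

The difference between the first two only shows up for negative inputs, where LeakyReLU still passes a scaled-down signal (and therefore a nonzero gradient).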



8. Tune Hyperparameters

1. Number of Hidden Layers

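In [7]:
hidden_layers = [
    Dense(256, activation='relu'),
    Dense(128, activation='relu'),
    Dense(64, activation='relu'),
]
model = Sequential([tf.keras.Input(shape=(28, 28)), Flatten(), *hidden_layers, Dense(10, activation='softmax')])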
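
Adding hidden layers increases the number of trainable parameters. The sketch below counts weights and biases for a stack of dense layers; `count_dense_params` is a hypothetical helper, and the deeper configuration assumes the three hidden layers above sit between the 784-element flattened input and a 10-unit output layer.

```python
def count_dense_params(layer_sizes):
    """Total weights + biases for consecutive fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out   # weight matrix plus one bias per unit
    return total

simple = count_dense_params([784, 128, 10])           # model from section 4
deeper = count_dense_params([784, 256, 128, 64, 10])  # three hidden layers

print(simple)  # 101770
print(deeper)  # 242762
```

Most of the parameters sit in the first layer, because it connects all 784 input pixels to every hidden unit.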

2. Change Optimizer

In [8]:
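# Pick one optimizer name and pass it to model.compile
optimizer = 'sgd'      # plain gradient descent with a fixed learning rate; often needs more epochs
optimizer = 'rmsprop'  # adapts the step size per parameter
optimizer = 'adam'     # adaptive steps plus momentum; a common default
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])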
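
To see how the update rules differ, here is a toy sketch that minimises f(w) = (w - 3)^2 with hand-written SGD and Adam steps. The objective, the function names, and the learning rate of 0.1 are illustrative assumptions, and the Adam constants (0.9, 0.999, 1e-7) are assumed to match the usual defaults; this is not how Keras implements its optimizers internally.

```python
import math

def grad(w):
    # Gradient of f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

def run_sgd(w=0.0, lr=0.1, steps=300):
    for _ in range(steps):
        w -= lr * grad(w)                        # step proportional to the raw gradient
    return w

def run_adam(w=0.0, lr=0.1, steps=300, beta1=0.9, beta2=0.999, eps=1e-7):
    m = 0.0                                      # running mean of gradients (momentum)
    v = 0.0                                      # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)             # bias correction for the zero initialisation
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)   # step scaled by gradient magnitude
    return w

w_sgd = run_sgd()
w_adam = run_adam()
print(w_sgd, w_adam)  # both should end close to the minimum at w = 3
```

On a one-dimensional quadratic both methods reach the minimum; the practical differences show up on real networks, where gradient scales vary widely between parameters.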

3. Change Learning Rate

In [9]:
from tensorflow.keras.optimizers import Adam
model.compile(
    optimizer=Adam(learning_rate=0.0005),
    loss="categorical_crossentropy",
    metrics=["accuracy"]
)
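
The effect of the learning rate is easiest to see on a toy problem. The sketch below runs plain gradient descent on the illustrative objective f(w) = w^2 for a few learning rates; the objective and the specific rates are assumptions chosen for demonstration, not values tuned for MNIST.

```python
def descend(lr, w=1.0, steps=20):
    """Run gradient descent on f(w) = w^2 and return the final w."""
    for _ in range(steps):
        w -= lr * 2.0 * w      # gradient of w^2 is 2w
    return w

results = {lr: abs(descend(lr)) for lr in (0.01, 0.1, 1.1)}
for lr, distance in results.items():
    print(lr, distance)        # distance from the minimum at w = 0 after 20 steps
```

A very small rate makes slow progress, a moderate rate gets close to the minimum in the same number of steps, and a rate that is too large overshoots further on every step and diverges.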

4. Add Dropout

In [11]:

model = Sequential([
    tf.keras.Input(shape=(28, 28)),
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.3),                      # randomly zero 30% of the activations during training
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
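
What a `Dropout(0.3)` layer does during training can be sketched in NumPy. This follows the usual "inverted dropout" formulation, in which the kept activations are scaled by 1 / (1 - rate) so the expected value is unchanged; the function name and the `training` flag are illustrative.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Zero a random fraction `rate` of x and rescale the rest (training only)."""
    if not training or rate == 0.0:
        return x                                  # at inference the layer is the identity
    keep_mask = rng.random(x.shape) >= rate       # True with probability 1 - rate
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((1000, 256))

dropped = dropout(activations, 0.3, rng)
zero_fraction = float((dropped == 0).mean())
mean_after = float(dropped.mean())
print(zero_fraction, mean_after)                  # roughly 0.3 and roughly 1.0

unchanged = dropout(activations, 0.3, rng, training=False)
```

Because a different random subset of units is dropped on every batch, the network cannot rely on any single hidden unit, which is what makes dropout act as a regulariser.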


9. Final Improved Model (Recommended)

In [ ]:
model = Sequential([
    tf.keras.Input(shape=(28, 28)),
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.3),
    Dense(128, activation='relu'),
    Dropout(0.2),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          validation_split=0.1,
          epochs=15,
          batch_size=64)