Model Description:

This is a feedforward neural network (FNN) implemented using TensorFlow to classify handwritten digits from the MNIST dataset. The model consists of:

Input Layer: 784 neurons (flattened 28×28 images).
Hidden Layers:
1st Hidden Layer: 128 neurons, ReLU activation.
2nd Hidden Layer: 64 neurons, ReLU activation.
Output Layer: 10 neurons (one for each digit), producing raw logits.
Key Features:

Uses softmax cross-entropy loss for classification.
Optimized with the Adam optimizer (learning rate 0.01).
Trained for 10 epochs with a batch size of 100.
Evaluates training accuracy after each epoch and test accuracy once after training.
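
As a rough check on the training schedule above, the number of parameter updates follows from the dataset size and the batch size. The sketch below assumes the standard MNIST training split of 60,000 images; the variable names are illustrative only.

```python
# Assumption: the standard MNIST training split has 60,000 images.
n_train_examples = 60_000
batch_size = 100
n_epochs = 10

# 60,000 divides evenly by 100, so every batch is full.
batches_per_epoch = n_train_examples // batch_size
total_updates = batches_per_epoch * n_epochs

print(batches_per_epoch, total_updates)
```

Under these assumptions, each epoch performs 600 optimizer steps, for 6,000 steps over the whole run.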

In [2]:
import tensorflow as tf
import numpy as np
from tensorflow.keras.datasets import mnist  # Only for dataset loading

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Flatten each 28x28 image to a 784-vector and scale pixel values to [0, 1]
x_train = x_train.reshape(-1, 784).astype(np.float32) / 255.0
x_test = x_test.reshape(-1, 784).astype(np.float32) / 255.0

# Convert labels to one-hot encoding manually
num_classes = 10
y_train = np.eye(num_classes, dtype=np.float32)[y_train]  # float32 to match the logits
y_test = np.eye(num_classes, dtype=np.float32)[y_test]

# Define network parameters
n_input = 784    # Input layer (28x28 pixels)
n_hidden1 = 128  # First hidden layer neurons
n_hidden2 = 64   # Second hidden layer neurons
n_output = 10    # Output layer (10 digits)
learning_rate = 0.01
n_epochs = 10
batch_size = 100

# Initialize weights and biases
weights = {
    'h1': tf.Variable(tf.random.normal([n_input, n_hidden1], stddev=0.1)),
    'h2': tf.Variable(tf.random.normal([n_hidden1, n_hidden2], stddev=0.1)),
    'out': tf.Variable(tf.random.normal([n_hidden2, n_output], stddev=0.1))
}

biases = {
    'b1': tf.Variable(tf.zeros([n_hidden1])),
    'b2': tf.Variable(tf.zeros([n_hidden2])),
    'out': tf.Variable(tf.zeros([n_output]))
}

# Define the neural network model (Feedforward)
def neural_network(x):
    layer_1 = tf.nn.relu(tf.matmul(x, weights['h1']) + biases['b1'])
    layer_2 = tf.nn.relu(tf.matmul(layer_1, weights['h2']) + biases['b2'])
    output_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return output_layer

# Loss function (Cross-entropy)
def compute_loss(y_true, y_pred):
    return tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y_pred, labels=y_true))

# Optimizer (Adam)
optimizer = tf.optimizers.Adam(learning_rate)

# Training loop
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(60000).batch(batch_size)

for epoch in range(n_epochs):
    avg_loss = 0.0
    for batch_x, batch_y in train_dataset:
        with tf.GradientTape() as tape:
            predictions = neural_network(batch_x)
            loss = compute_loss(batch_y, predictions)
        
        # Compute gradients for all weights and biases
        trainable_vars = list(weights.values()) + list(biases.values())
        gradients = tape.gradient(loss, trainable_vars)

        # Apply gradients
        optimizer.apply_gradients(zip(gradients, trainable_vars))
        
        avg_loss += loss.numpy() / len(train_dataset)

    # Compute training accuracy
    correct_preds = tf.equal(tf.argmax(neural_network(x_train), axis=1), tf.argmax(y_train, axis=1))
    train_acc = tf.reduce_mean(tf.cast(correct_preds, tf.float32)).numpy()
    print(f"Epoch {epoch+1}, Loss: {avg_loss:.4f}, Training Accuracy: {train_acc:.4f}")


# Testing model accuracy
test_preds = neural_network(x_test)
correct_test_preds = tf.equal(tf.argmax(test_preds, axis=1), tf.argmax(y_test, axis=1))
test_acc = tf.reduce_mean(tf.cast(correct_test_preds, tf.float32)).numpy()
print(f"Test Accuracy: {test_acc:.4f}")


Epoch 1, Loss: 0.2271, Training Accuracy: 0.9649
Epoch 2, Loss: 0.1247, Training Accuracy: 0.9679
Epoch 3, Loss: 0.1036, Training Accuracy: 0.9692
Epoch 4, Loss: 0.0995, Training Accuracy: 0.9747
Epoch 5, Loss: 0.0846, Training Accuracy: 0.9780
Epoch 6, Loss: 0.0755, Training Accuracy: 0.9808
Epoch 7, Loss: 0.0786, Training Accuracy: 0.9823
Epoch 8, Loss: 0.0677, Training Accuracy: 0.9854
Epoch 9, Loss: 0.0629, Training Accuracy: 0.9860
Epoch 10, Loss: 0.0680, Training Accuracy: 0.9757
Test Accuracy: 0.9617


Description of code:

This TensorFlow-based neural network classifies handwritten digits from the MNIST dataset using a feedforward architecture with two hidden layers and an output layer.

Data Preprocessing: Loads MNIST, reshapes images (28×28 → 784), normalizes pixel values, and one-hot encodes labels.
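
The preprocessing steps can be tried on a small array without downloading MNIST. This is a minimal sketch: the three random "images" and their labels are hypothetical stand-ins, not real data.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 3 random 28x28 "images" and their labels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9])

# Flatten each 28x28 image into a 784-vector and scale pixels to [0, 1].
flat = images.reshape(-1, 784).astype(np.float32) / 255.0

# One-hot encode: row i of the identity matrix is the encoding of class i.
one_hot = np.eye(10, dtype=np.float32)[labels]

print(flat.shape, one_hot.shape)
print(one_hot[0])
```

Indexing the identity matrix with the label array is the same trick the notebook uses; each output row has a single 1 at the position of its class.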


Model Architecture:

Input Layer: 784 neurons
Hidden Layers: 128 and 64 neurons (ReLU activation)
Output Layer: 10 neurons (raw logits)
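
The layer sizes above determine how many trainable parameters the network has. A short sketch of the count, assuming each layer is fully connected with one bias per neuron, as in the weights and biases dictionaries of the code:

```python
# Layer widths: input, two hidden layers, output.
layer_sizes = [784, 128, 64, 10]

# Each fully connected layer has n_in * n_out weights plus n_out biases.
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer, total_params)
```

Almost all of the parameters sit in the first layer, because it connects 784 inputs to 128 neurons.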


Loss & Optimization: Uses softmax cross-entropy loss and Adam optimizer.
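
To make the loss concrete, here is a plain-Python sketch of softmax cross-entropy for a single example. It is meant to illustrate the quantity that tf.nn.softmax_cross_entropy_with_logits computes per example, not to reproduce TensorFlow's implementation; the function name and the three-class inputs are illustrative.

```python
import math

def softmax_cross_entropy(logits, one_hot_label):
    """Cross-entropy between softmax(logits) and a one-hot label, for one example."""
    # Subtract the max logit for numerical stability; softmax is unchanged by this shift.
    max_logit = max(logits)
    shifted = [z - max_logit for z in logits]
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    log_probs = [z - log_sum_exp for z in shifted]
    return -sum(y * lp for y, lp in zip(one_hot_label, log_probs))

# True class is index 0 in all three cases.
uniform_loss = softmax_cross_entropy([0.0, 0.0, 0.0], [1, 0, 0])
confident_loss = softmax_cross_entropy([5.0, 0.0, 0.0], [1, 0, 0])
wrong_loss = softmax_cross_entropy([0.0, 5.0, 0.0], [1, 0, 0])

print(uniform_loss, confident_loss, wrong_loss)
```

Equal logits give a loss of ln(3) for three classes; a large logit on the correct class pushes the loss toward zero, and a large logit on a wrong class makes it grow.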


Training:
Performs forward propagation on each mini-batch of 100 examples.
Computes the loss and its gradients using tf.GradientTape.
Updates weights and biases via the Adam optimizer.
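
The update step can be illustrated with the textbook Adam rule applied to one scalar parameter. This is a sketch, not TensorFlow's code: the function adam_step is hypothetical, and the beta1, beta2, and eps values are assumed common defaults; only the learning rate of 0.01 comes from the notebook.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-7):
    """One textbook Adam update for a single scalar parameter (illustrative)."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# First step (t=1) from w=1.0 with gradient 0.5.
w, m, v = 1.0, 0.0, 0.0
w, m, v = adam_step(w, grad=0.5, m=m, v=v, t=1)
print(w)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient's magnitude.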


Evaluation: Computes training accuracy after each epoch and test accuracy once after training (0.9617 in the run above).
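
Accuracy here means the fraction of examples whose largest logit falls on the true class. A small NumPy sketch of that comparison, using made-up logits for four examples and three classes:

```python
import numpy as np

# Made-up logits for 4 examples and 3 classes (illustrative values only).
logits = np.array([[2.0, 0.1, -1.0],
                   [0.2, 0.3, 0.1],
                   [-0.5, 0.0, 1.5],
                   [1.0, 0.9, 0.8]])
one_hot_labels = np.eye(3)[[0, 2, 2, 0]]

predicted = logits.argmax(axis=1)        # index of the largest logit per row
actual = one_hot_labels.argmax(axis=1)   # recover class indices from one-hot rows
accuracy = float((predicted == actual).mean())

print(predicted, accuracy)
```

Applying softmax before argmax would not change the result, since softmax preserves the ordering of the logits.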