<a href="https://colab.research.google.com/github/ShreyanshSharma17/Neural_network/blob/master/Experiment3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Experiment 3

Write a program to implement a three-layer neural network using the TensorFlow library (only, no Keras)
to classify the MNIST handwritten digits dataset.
Demonstrate the implementation of the feed-forward and back-propagation approaches.

Model Description:

This is a simple neural network designed to classify handwritten digits from
the MNIST dataset. The model is implemented using TensorFlow 2.x without the Keras layers API,
giving hands-on experience with TensorFlow's lower-level operations.
The network consists of three layers. The feed-forward pass is written out explicitly, and
backpropagation is performed by computing gradients with tf.GradientTape.

Model Architecture:

The model has three layers:

Input Layer: The MNIST images are 28x28 pixels, which are flattened into a 1D vector of 784 values (28 * 28 = 784). This forms the input layer.
Hidden Layer: A fully connected (dense) layer with 128 neurons. The activation function used here is ReLU (Rectified Linear Unit), which introduces non-linearity into the model and allows it to learn complex patterns.
Output Layer: The final layer is a fully connected layer with 10 neurons. Each neuron corresponds to one of the 10 possible digits (0-9). The model outputs raw logits (i.e., unnormalized scores) for each class; the softmax is applied inside the loss function rather than in the model itself.
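
The shapes implied by this architecture can be checked with a small NumPy sketch. It is an illustration only: it uses random placeholder inputs rather than MNIST images, and the names `forward` and `params` and the batch size of 4 are assumptions made for this example, not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Parameter shapes matching the architecture described above
params = {
    "w1": rng.normal(size=(784, 128)),
    "b1": np.zeros(128),
    "w2": rng.normal(size=(128, 10)),
    "b2": np.zeros(10),
}

def forward(x, p):
    """Feed-forward pass: flatten -> dense + ReLU -> dense (raw logits)."""
    x = x.reshape(-1, 784)
    hidden = np.maximum(x @ p["w1"] + p["b1"], 0.0)
    return hidden @ p["w2"] + p["b2"]

batch = rng.random(size=(4, 28, 28, 1))  # placeholder images in [0, 1)
logits = forward(batch, params)
n_params = sum(v.size for v in params.values())

print(logits.shape)  # (4, 10)
print(n_params)      # 101770
```

The parameter count is 784 * 128 + 128 + 128 * 10 + 10 = 101,770.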

In [None]:
import tensorflow as tf
import tensorflow_datasets as tfds

# Load the MNIST dataset using tensorflow_datasets
mnist_data, info = tfds.load('mnist', with_info=True, as_supervised=True)

# Prepare the training and test sets
train_data, test_data = mnist_data['train'], mnist_data['test']

# Normalize the data
def preprocess(image, label):
    image = tf.cast(image, tf.float32) / 255.0  # Normalize the image to [0, 1]
    return image, label

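# Map the preprocessing function, shuffle individual examples, then batch.
# repeat() makes the training set infinite, so the training loop below
# relies on max_steps to terminate.
train_data = train_data.map(preprocess).shuffle(60000).batch(128).repeat()
test_data = test_data.map(preprocess).batch(128)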
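
In an input pipeline, the relative order of shuffling and batching matters: shuffling before batching mixes individual examples, while shuffling after batching only reorders fixed batches. A plain-Python sketch of the difference, using ten integers as stand-in examples (the helpers `batch` and `shuffled` are hypothetical and are not tf.data APIs):

```python
import random

def batch(items, size):
    """Group consecutive items into lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def shuffled(items, seed):
    """Return a shuffled copy of `items` (a full-buffer shuffle)."""
    out = list(items)
    random.Random(seed).shuffle(out)
    return out

examples = list(range(10))

# batch -> shuffle: the batch contents are fixed, only their order changes
batch_then_shuffle = shuffled(batch(examples, 5), seed=0)

# shuffle -> batch: each batch is cut from a new ordering of all examples
shuffle_then_batch = batch(shuffled(examples, seed=0), 5)

print(batch_then_shuffle)
print(shuffle_then_batch)
```

In the first variant every batch always contains the same examples, which reduces the randomness seen by the optimizer from one pass to the next.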

# Define the three-layer neural network model

In [None]:
class ThreeLayerNN(tf.Module):
    def __init__(self):
        super().__init__()
        # Initialize weights and biases
        self.w1 = tf.Variable(tf.random.normal([784, 128]))  # Weights for the first layer
        self.b1 = tf.Variable(tf.zeros([128]))  # Biases for the first layer
        self.w2 = tf.Variable(tf.random.normal([128, 10]))  # Weights for the second layer (output layer)
        self.b2 = tf.Variable(tf.zeros([10]))  # Biases for the second layer

    def __call__(self, x):
        # Feed-forward process: Apply the layers and activation functions
        x = tf.reshape(x, [-1, 784])  # Flatten input to 1D vector (28x28 = 784)

        # First layer (input -> hidden)
        hidden = tf.matmul(x, self.w1) + self.b1
        hidden = tf.nn.relu(hidden)  # ReLU activation function

        # Second layer (hidden -> output)
        output = tf.matmul(hidden, self.w2) + self.b2
        return output

# Instantiate the model
model = ThreeLayerNN()

# Define the loss function
def compute_loss(logits, labels):
    return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# Define the optimizer (Adam, an adaptive variant of gradient descent)
optimizer = tf.optimizers.Adam()

# Training step
def train_step(model, images, labels):
    with tf.GradientTape() as tape:
        logits = model(images)  # Forward pass
        loss = compute_loss(logits, labels)  # Compute the loss
    gradients = tape.gradient(loss, model.trainable_variables)  # Backpropagation
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))  # Update weights
    return loss

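# Training loop
# Note: train_data is repeated indefinitely, so the inner loop never exhausts
# the dataset. Training ends when max_steps is reached, all within "Epoch 1".
max_steps = 6000
steps_taken = 0
epochs = 10
for epoch in range(epochs):
    total_loss = 0.0
    steps_in_epoch = 0
    for step, (images, labels) in enumerate(train_data):
        if steps_taken >= max_steps:
            print(f"Training stopped after {steps_taken} steps.")
            break
        loss = train_step(model, images, labels)
        total_loss += float(loss)  # Accumulate as a Python float
        steps_taken += 1
        steps_in_epoch += 1

        if step % 100 == 0:
            print(f"Epoch {epoch+1}, Step {step}, Loss: {loss.numpy()}")

    # Average over the steps actually run, not the last loop index
    print(f"Epoch {epoch+1}, Average Loss: {total_loss / max(steps_in_epoch, 1)}")
    if steps_taken >= max_steps:
        break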

# Evaluate the model on the test set
def evaluate(model, test_data):
    correct_predictions = 0
    total_predictions = 0
    for images, labels in test_data:
        logits = model(images)
        predicted_classes = tf.argmax(logits, axis=1)
        correct_predictions += tf.reduce_sum(tf.cast(tf.equal(predicted_classes, labels), tf.int32))
        total_predictions += len(labels)

    accuracy = correct_predictions / total_predictions
    print(f"Test Accuracy: {accuracy.numpy():.4f}")

# Run the evaluation
evaluate(model, test_data)

Epoch 1, Step 0, Loss: 126.90067291259766
Epoch 1, Step 100, Loss: 27.15457534790039
Epoch 1, Step 200, Loss: 11.818521499633789
Epoch 1, Step 300, Loss: 16.30792236328125
Epoch 1, Step 400, Loss: 5.188839912414551
Epoch 1, Step 500, Loss: 3.3088412284851074
Epoch 1, Step 600, Loss: 8.113788604736328
Epoch 1, Step 700, Loss: 3.8102915287017822
Epoch 1, Step 800, Loss: 4.728327751159668
Epoch 1, Step 900, Loss: 3.5811080932617188
Epoch 1, Step 1000, Loss: 1.6203113794326782
Epoch 1, Step 1100, Loss: 2.0492782592773438
Epoch 1, Step 1200, Loss: 6.201204299926758
Epoch 1, Step 1300, Loss: 3.633856773376465
Epoch 1, Step 1400, Loss: 2.5876033306121826
Epoch 1, Step 1500, Loss: 2.502194881439209
Epoch 1, Step 1600, Loss: 2.4886116981506348
Epoch 1, Step 1700, Loss: 1.356600046157837
Epoch 1, Step 1800, Loss: 2.093639850616455
Epoch 1, Step 1900, Loss: 2.7216124534606934
Epoch 1, Step 2000, Loss: 0.9919930100440979
Epoch 1, Step 2100, Loss: 2.443976879119873
Epoch 1, Step 2200, Loss: 1.62967
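
To put the logged loss values in context, the sketch below computes the same quantity as `tf.nn.sparse_softmax_cross_entropy_with_logits` followed by `tf.reduce_mean`, written in NumPy for illustration. The function name `sparse_softmax_xent` and the toy logits are assumptions of this example. When all ten logits are equal the loss is ln(10), about 2.3026, which is the value for a uniform guess; a loss far above that, such as the one logged at step 0, indicates large logits that favor wrong classes.

```python
import numpy as np

def sparse_softmax_xent(logits, labels):
    """Mean cross-entropy between integer labels and softmax(logits)."""
    # Subtract the row-wise max before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct class for each example
    picked = log_probs[np.arange(len(labels)), labels]
    return -picked.mean()

labels = np.array([3, 7])

uniform = np.zeros((2, 10))          # no preference among the classes
confident_wrong = np.zeros((2, 10))
confident_wrong[:, 0] = 50.0         # large logit on a wrong class

print(sparse_softmax_xent(uniform, labels))          # ln(10), about 2.3026
print(sparse_softmax_xent(confident_wrong, labels))  # close to 50
```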

Performance Evaluation
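In the captured run, the batch loss falls from about 126.9 at step 0 to roughly 1 to 4 for most logged steps after step 1000, with occasional spikes (for example 6.20 at step 1200).
All logged steps are reported under Epoch 1 because the training dataset is repeated indefinitely and training is bounded by max_steps rather than by passes over the data.
The test accuracy is printed by evaluate() after training; the captured output ends at step 2200, before that line, so no accuracy figure is quoted here.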
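
The relationship between training steps and passes over the data can be made concrete with a little arithmetic. The sketch assumes the standard MNIST training split of 60,000 images together with the batch size and max_steps used in the code; the variable names are chosen for this example.

```python
import math

num_train_examples = 60_000  # size of the standard MNIST training split
batch_size = 128
max_steps = 6000

# One pass over the data needs this many batches (the last one is partial)
steps_per_epoch = math.ceil(num_train_examples / batch_size)

# Number of passes over the training data covered by max_steps
passes = max_steps / steps_per_epoch

print(steps_per_epoch)   # 469
print(round(passes, 2))  # 12.79
```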

My Comments
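The custom neural network is small (one hidden layer of 128 ReLU units) yet sufficient for a complete classification pipeline.
It shows the feed-forward pass explicitly and backpropagation through tf.GradientTape.
Hyperparameter adjustments could further improve it; likely candidates are the weight initialization scale (the unit-variance normal initialization coincides with a very high initial loss), the learning rate, and the hidden layer size.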
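
For reference, the gradients that `tf.GradientTape` derives automatically for this kind of architecture can also be written out by hand. The NumPy sketch below is an illustration under assumptions, not the notebook's method: it uses a tiny network (4 inputs, 5 hidden units, 3 classes) with random data, and the function name `loss_and_grads` is hypothetical. It compares one analytic gradient entry against a finite-difference estimate.

```python
import numpy as np

def loss_and_grads(x, y, w1, b1, w2, b2):
    """Forward pass, mean cross-entropy loss, and hand-derived gradients."""
    n = x.shape[0]
    z1 = x @ w1 + b1
    h = np.maximum(z1, 0.0)                      # ReLU
    logits = h @ w2 + b2
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)    # softmax
    loss = -np.log(probs[np.arange(n), y]).mean()

    # Backward pass (chain rule, layer by layer)
    dlogits = probs.copy()
    dlogits[np.arange(n), y] -= 1.0              # softmax + cross-entropy
    dlogits /= n
    dw2 = h.T @ dlogits
    db2 = dlogits.sum(axis=0)
    dz1 = (dlogits @ w2.T) * (z1 > 0)            # gradient through ReLU
    dw1 = x.T @ dz1
    db1 = dz1.sum(axis=0)
    return loss, (dw1, db1, dw2, db2)

rng = np.random.default_rng(0)
x = rng.random((8, 4))
y = rng.integers(0, 3, size=8)
w1, b1 = rng.normal(size=(4, 5)) * 0.5, np.zeros(5)
w2, b2 = rng.normal(size=(5, 3)) * 0.5, np.zeros(3)

loss, (dw1, db1, dw2, db2) = loss_and_grads(x, y, w1, b1, w2, b2)

# Central finite-difference estimate of d(loss)/d(w1[0, 0])
eps = 1e-6
w1_plus, w1_minus = w1.copy(), w1.copy()
w1_plus[0, 0] += eps
w1_minus[0, 0] -= eps
numeric = (loss_and_grads(x, y, w1_plus, b1, w2, b2)[0]
           - loss_and_grads(x, y, w1_minus, b1, w2, b2)[0]) / (2 * eps)

print(abs(numeric - dw1[0, 0]))  # should be very small
```

A plain gradient-descent update would subtract a learning rate times each gradient from the corresponding parameter; the notebook instead delegates the update to the Adam optimizer.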