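## MNIST digit classification with NumPy and Matplotlib

This notebook trains a small neural network on MNIST using NumPy, loads the data with torchvision, and plots the results with Matplotlib. It includes: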

- Data loading and preprocessing: Load the MNIST dataset and normalize the images.
- Neural network implementation: A simple fully connected neural network with one hidden layer.
- Training and evaluation: Use forward propagation, backward propagation, and weight updates for training.
- Visualization: Plot training loss and some sample predictions.

In [1]:
from IPython.display import clear_output

In [2]:
# Install the necessary packages
%pip install numpy matplotlib torchvision

clear_output()

In [3]:
import numpy as np
import matplotlib.pyplot as plt
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

## Neural Network Architecture
- Input layer: 784 neurons (28x28 pixels)
- Hidden layer: 128 neurons with ReLU activation
- Output layer: 10 neurons with softmax activation
- Loss function: Cross-entropy loss
- Optimization: Full-batch gradient descent (each epoch performs a single update computed on the entire training set)
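
To make the layer sizes concrete, the sketch below pushes a small random batch through layers of these sizes and prints the shape at each step, along with the total parameter count. The batch size of 4 and the random inputs are placeholders chosen for illustration, not MNIST data.

```python
import numpy as np

rng = np.random.default_rng(0)

batch = 4  # hypothetical batch size, chosen only for illustration
X = rng.standard_normal((batch, 784))        # flattened 28x28 images
W1 = rng.standard_normal((784, 128)) * 0.01  # input -> hidden
b1 = np.zeros((1, 128))
W2 = rng.standard_normal((128, 10)) * 0.01   # hidden -> output
b2 = np.zeros((1, 10))

Z1 = X @ W1 + b1                # shape (4, 128)
A1 = np.maximum(0, Z1)          # ReLU keeps the shape
Z2 = A1 @ W2 + b2               # shape (4, 10)
exp = np.exp(Z2 - Z2.max(axis=1, keepdims=True))
A2 = exp / exp.sum(axis=1, keepdims=True)  # softmax: each row sums to 1

n_params = W1.size + b1.size + W2.size + b2.size
print(Z1.shape, Z2.shape, A2.shape, n_params)
```

The parameter count is 784 * 128 + 128 + 128 * 10 + 10 = 101770.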

In [4]:
# Sigmoid helpers (not used by the ReLU network defined below)
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return sigmoid(x) * (1 - sigmoid(x))

def relu(x):
    return np.maximum(0, x)

def relu_derivative(x):
    return (x > 0).astype(float)

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

def initialize_weights(input_size, hidden_size, output_size):
    weights = {
        "W1": np.random.randn(input_size, hidden_size) * 0.01,
        "b1": np.zeros((1, hidden_size)),
        "W2": np.random.randn(hidden_size, output_size) * 0.01,
        "b2": np.zeros((1, output_size))
    }
    return weights

def forward_propagation(X, weights):
    Z1 = X.dot(weights["W1"]) + weights["b1"]
    A1 = relu(Z1)
    Z2 = A1.dot(weights["W2"]) + weights["b2"]
    A2 = softmax(Z2)
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache

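def compute_loss(Y, A2):
    # Mean cross-entropy over the batch; Y is one-hot, A2 holds softmax probabilities
    m = Y.shape[0]
    # Clip so that a probability that underflows to zero does not produce log(0)
    true_class_probs = np.clip(A2[np.arange(m), np.argmax(Y, axis=1)], 1e-12, 1.0)
    log_probs = -np.log(true_class_probs)
    loss = np.sum(log_probs) / m
    return loss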

def backward_propagation(X, Y, weights, cache):
    m = X.shape[0]
    A1, A2 = cache["A1"], cache["A2"]
    dZ2 = A2 - Y
    dW2 = A1.T.dot(dZ2) / m
    db2 = np.sum(dZ2, axis=0, keepdims=True) / m
    dA1 = dZ2.dot(weights["W2"].T)
    dZ1 = dA1 * relu_derivative(cache["Z1"])
    dW1 = X.T.dot(dZ1) / m
    db1 = np.sum(dZ1, axis=0, keepdims=True) / m
    gradients = {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}
    return gradients

def update_weights(weights, gradients, learning_rate):
    for key in weights.keys():
        weights[key] -= learning_rate * gradients[f"d{key}"]
    return weights

def train(X, Y, input_size, hidden_size, output_size, epochs=10, learning_rate=0.1):
    weights = initialize_weights(input_size, hidden_size, output_size)
    losses = []
    for epoch in range(epochs):
        A2, cache = forward_propagation(X, weights)
        loss = compute_loss(Y, A2)
        gradients = backward_propagation(X, Y, weights, cache)
        weights = update_weights(weights, gradients, learning_rate)
        losses.append(loss)
        # Report the loss every epoch
        print(f"Epoch {epoch}, Loss: {loss}")
    return weights, losses

def predict(X, weights):
    A2, _ = forward_propagation(X, weights)
    predictions = np.argmax(A2, axis=1)
    return predictions

def one_hot_encode(y, num_classes):
    one_hot = np.zeros((y.size, num_classes))
    one_hot[np.arange(y.size), y] = 1
    return one_hot
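
One way to gain confidence in the backward pass is a numerical gradient check: perturb each weight slightly, measure the change in loss, and compare against the analytic gradient. The sketch below re-implements the same forward and backward formulas in compact form so that it runs on its own. The helper names (`forward`, `loss_fn`, `analytic_grads`, `numeric_grad`), the tiny 5-4-3 network, and the random data are assumptions made for illustration.

```python
import numpy as np

def forward(X, w):
    Z1 = X @ w["W1"] + w["b1"]
    A1 = np.maximum(0, Z1)
    Z2 = A1 @ w["W2"] + w["b2"]
    e = np.exp(Z2 - Z2.max(axis=1, keepdims=True))
    return Z1, A1, e / e.sum(axis=1, keepdims=True)

def loss_fn(X, Y, w):
    _, _, A2 = forward(X, w)
    return -np.mean(np.log(A2[np.arange(len(Y)), Y.argmax(axis=1)]))

def analytic_grads(X, Y, w):
    # Same formulas as backward_propagation: softmax + cross-entropy gives dZ2 = A2 - Y
    m = X.shape[0]
    Z1, A1, A2 = forward(X, w)
    dZ2 = A2 - Y
    dZ1 = (dZ2 @ w["W2"].T) * (Z1 > 0)
    return {"W1": X.T @ dZ1 / m, "b1": dZ1.sum(axis=0, keepdims=True) / m,
            "W2": A1.T @ dZ2 / m, "b2": dZ2.sum(axis=0, keepdims=True) / m}

def numeric_grad(X, Y, w, key, eps=1e-5):
    # Central finite difference, one parameter at a time
    grad = np.zeros_like(w[key])
    for idx in np.ndindex(*w[key].shape):
        old = w[key][idx]
        w[key][idx] = old + eps
        plus = loss_fn(X, Y, w)
        w[key][idx] = old - eps
        minus = loss_fn(X, Y, w)
        w[key][idx] = old  # restore the original value
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 5))
Y = np.eye(3)[rng.integers(0, 3, size=6)]  # random one-hot labels
w = {"W1": rng.standard_normal((5, 4)), "b1": rng.standard_normal((1, 4)),
     "W2": rng.standard_normal((4, 3)), "b2": rng.standard_normal((1, 3))}

grads = analytic_grads(X, Y, w)
max_err = max(np.abs(grads[k] - numeric_grad(X, Y, w, k)).max() for k in w)
print(f"largest analytic-vs-numeric difference: {max_err:.2e}")
```

A difference many orders of magnitude smaller than the gradients themselves suggests the analytic formulas are consistent with the loss. The check is only practical on a tiny network, since it needs two loss evaluations per parameter.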

In [5]:
# Load MNIST dataset using torchvision
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.MNIST(root="./data", train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root="./data", train=False, download=True, transform=transform)



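The transform used above first converts each image to floats in [0, 1] (`ToTensor`) and then applies `Normalize((0.5,), (0.5,))`, which computes `(x - 0.5) / 0.5`. The result is pixel values in [-1, 1]. The NumPy sketch below reproduces that arithmetic on a few sample pixel values; the array `pixels` is made up for illustration.

```python
import numpy as np

pixels = np.array([0, 64, 128, 255], dtype=np.uint8)  # sample raw pixel values

scaled = pixels.astype(np.float32) / 255.0  # what ToTensor does to uint8 images
normalized = (scaled - 0.5) / 0.5           # Normalize with mean 0.5 and std 0.5

print(normalized)  # all values lie in [-1, 1]

# Undo the normalization, e.g. before displaying an image in its original range
restored = normalized * 0.5 + 0.5
```
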
In [6]:
train_loader = DataLoader(train_dataset, batch_size=len(train_dataset))
test_loader = DataLoader(test_dataset, batch_size=len(test_dataset))

In [7]:
# Extract data from DataLoader
train_data = next(iter(train_loader))
test_data = next(iter(test_loader))

In [8]:
X_train = train_data[0].numpy().reshape(-1, 28 * 28)
Y_train = one_hot_encode(train_data[1].numpy(), 10)

X_test = test_data[0].numpy().reshape(-1, 28 * 28)
Y_test = one_hot_encode(test_data[1].numpy(), 10)
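
As a small illustration of what the reshape and one-hot steps do, the sketch below applies them to a toy batch of two images in the (N, C, H, W) layout that the DataLoader produces. The arrays `images` and `labels` are placeholders, not MNIST data.

```python
import numpy as np

# Toy batch shaped like a DataLoader batch of MNIST images: (N, C, H, W)
images = np.arange(2 * 1 * 28 * 28, dtype=np.float32).reshape(2, 1, 28, 28)
labels = np.array([3, 0])

# Flatten each image into a 784-element row
X = images.reshape(-1, 28 * 28)

# One-hot encode: row i gets a 1 in column labels[i]
Y = np.zeros((labels.size, 10))
Y[np.arange(labels.size), labels] = 1

print(X.shape, Y.shape)
print(Y[0])

# argmax along axis 1 recovers the integer labels
recovered = Y.argmax(axis=1)
```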

In [9]:
# Hyperparameters
input_size = 784
hidden_size = 128
output_size = 10
epochs = 10
learning_rate = 0.1

In [None]:
# Train the model
weights, losses = train(X_train, Y_train, input_size, hidden_size, output_size, epochs, learning_rate)
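
The `train` function above computes one update per epoch from the entire training set. If stochastic (mini-batch) gradient descent is wanted instead, a loop along the following lines could be used. This is a sketch under assumptions: `sgd_train`, its `batch_size` and `seed` parameters, and the synthetic three-class data are illustrative and not part of the notebook.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sgd_train(X, Y, hidden=16, epochs=30, lr=0.1, batch_size=32, seed=0):
    """Mini-batch SGD for a one-hidden-layer ReLU/softmax network (sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = Y.shape[1]
    W1 = rng.standard_normal((d, hidden)) * 0.01
    b1 = np.zeros((1, hidden))
    W2 = rng.standard_normal((hidden, k)) * 0.01
    b2 = np.zeros((1, k))
    losses = []
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        epoch_loss = 0.0
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], Y[idx]
            m = len(idx)
            # Forward pass on the mini-batch
            z1 = xb @ W1 + b1
            a1 = np.maximum(0, z1)
            a2 = softmax(a1 @ W2 + b2)
            epoch_loss += -np.sum(np.log(np.clip(a2[yb == 1], 1e-12, 1.0)))
            # Backward pass (gradients averaged over the mini-batch)
            dz2 = (a2 - yb) / m
            dz1 = (dz2 @ W2.T) * (z1 > 0)
            # One parameter update per mini-batch
            W2 -= lr * (a1.T @ dz2)
            b2 -= lr * dz2.sum(axis=0, keepdims=True)
            W1 -= lr * (xb.T @ dz1)
            b1 -= lr * dz1.sum(axis=0, keepdims=True)
        losses.append(epoch_loss / n)
    return (W1, b1, W2, b2), losses

# Synthetic stand-in for MNIST: three separated Gaussian blobs in 2-D
rng = np.random.default_rng(1)
centers = np.array([[0.0, 2.0], [2.0, -1.0], [-2.0, -1.0]])
labels = rng.integers(0, 3, size=300)
X_demo = centers[labels] + 0.5 * rng.standard_normal((300, 2))
Y_demo = np.eye(3)[labels]

params, demo_losses = sgd_train(X_demo, Y_demo)
print(f"first epoch loss {demo_losses[0]:.3f}, last epoch loss {demo_losses[-1]:.3f}")
```

With mini-batches the weights are updated many times per epoch rather than once, which usually makes far better use of a fixed epoch budget than a single full-batch step.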


In [None]:
# Plot the loss curve
plt.plot(losses)
plt.title("Training Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.show()

In [None]:
# Evaluate the model
predictions = predict(X_test, weights)
accuracy = np.mean(predictions == np.argmax(Y_test, axis=1))
print(f"Test Accuracy: {accuracy}")
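
Overall accuracy hides which digits get confused with which. A confusion matrix and per-class accuracy can be computed with NumPy alone, as sketched below. The `confusion_matrix` helper and the toy label arrays are assumptions for illustration; in the notebook, `np.argmax(Y_test, axis=1)` and `predictions` would take their place with `num_classes=10`.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # unbuffered add handles repeated index pairs
    return cm

# Toy labels standing in for the true test labels and the model's predictions
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

cm = confusion_matrix(y_true, y_pred, 3)
per_class_acc = cm.diagonal() / cm.sum(axis=1)  # correct count / true count per class

print(cm)
print(per_class_acc)
```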

In [None]:
# Visualize some predictions
plt.figure(figsize=(10, 10))
for i in range(16):
    plt.subplot(4, 4, i + 1)
    plt.imshow(X_test[i].reshape(28, 28), cmap="gray")
    plt.title(f"Pred: {predictions[i]}")
    plt.axis("off")
plt.show()