# MNIST with micrograd
In this notebook, you will implement a 2-layer (784-800-10) fully connected
feed-forward neural network for MNIST classification.

In [None]:
from keras.datasets import mnist
import matplotlib.pyplot as plt
import numpy as np
import csv
import micrograd.nn as nn
from micrograd.engine import Value
%matplotlib inline

First, let's set up our dataset. Keras automatically splits the MNIST data into
train and test segments for us.

The "x" variables are the images, while the "y" variables are the ground truths.

In [None]:
(train_x, train_y), (test_x, test_y) = mnist.load_data()
print(f"{train_x.shape=}\n{train_y.shape=}\n{test_x.shape=}\n{test_y.shape=}")

Let's visualize one of the data points:

In [None]:
plt.imshow(train_x[0], cmap="gray")
plt.title(f"Ground truth={train_y[0]}");

Now that we have our data loaded, we can initialize our model, using the
abstractions we wrote in `micrograd/nn.py` (imported above as `nn`).

Remember that we are looking to create a multi-layer perceptron, with one hidden
layer of dimension 800, an input layer of 784, and output layer of 10.

Since the model takes a flat 784-dimensional input, we need to reshape each
28x28 image into a vector of length 784 and normalize the pixel values to the
range [0, 1].

In [None]:
model = nn.MLP(784, [800, 10])

train_x = train_x.reshape(-1, 784) / 255
test_x = test_x.reshape(-1, 784) / 255
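
To see what the reshape and normalization do, here is a minimal sketch on a
made-up batch of two images. The array `fake_images` is an illustrative
stand-in, not MNIST data, and the sketch assumes pixel values are `uint8` in the
range 0 to 255.

```python
import numpy as np

# Hypothetical stand-in for a batch of two 28x28 grayscale images.
fake_images = np.zeros((2, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255    # brightest possible pixel
fake_images[1, 27, 27] = 51   # a dim pixel in the last position

# Flatten each image to a 784-vector and scale pixels into [0, 1].
flat = fake_images.reshape(-1, 784) / 255

print(flat.shape)    # (2, 784)
print(flat[0, 0])    # 1.0
print(flat[1, 783])  # approximately 0.2
```

The `-1` lets NumPy infer the number of images, so the same line works for both
the train and test arrays.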

Copy your implementation of softmax from step two:

In [None]:
def softmax(z: list[Value], C: int = 10) -> list[Value]:
  pass  # TODO

Now we can try evaluating the model (with random weights).

In [None]:
softmax(model(train_x[0]))
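
A quick way to sanity-check the output is to compare it against a softmax
computed on plain floats. The helper below, `softmax_floats`, is a hypothetical
reference for checking numbers only; it is not the `Value`-based implementation
the exercise asks for, and it cannot be used for backpropagation.

```python
import math

def softmax_floats(z: list[float]) -> list[float]:
    """Reference softmax on plain floats (hypothetical helper)."""
    m = max(z)  # subtracting the max avoids overflow and leaves the result unchanged
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_floats([1.0, 2.0, 3.0])
print(probs)
print(sum(probs))  # should be very close to 1.0
```

Whatever the weights are, a correct softmax returns 10 non-negative values that
sum to approximately 1.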

Unfortunately, our engine is far too slow to train this much larger model. As
such, pre-trained weights have been provided in the `weights.csv` file. The code
below loads these weights into your model.

In [None]:
print(len(model.layers[0].parameters()))
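
As a rough cross-check on the number printed above, the expected parameter count
can be worked out by hand. This sketch assumes each neuron holds one weight per
input plus a single bias; `dense_param_count` is a hypothetical helper, and your
own `micrograd/nn.py` may lay out parameters differently.

```python
def dense_param_count(n_in: int, n_out: int) -> int:
    """Weights plus one bias per output neuron (assumed layout)."""
    return n_in * n_out + n_out

hidden = dense_param_count(784, 800)  # first layer: 784 inputs, 800 neurons
output = dense_param_count(800, 10)   # second layer: 800 inputs, 10 neurons

print(hidden)           # 628000
print(hidden + output)  # 636010
```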

In [None]:
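# Each row of weights.csv holds one parameter value in its first column.
with open("weights.csv", "r") as f:
  reader = csv.reader(f)
  weights = list(reader)

# Assign the values in order; row i is written to the i-th model parameter.
for i, p in enumerate(model.parameters()):
  p.data = float(weights[i][0])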
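
Because the weights are matched to parameters purely by position, it can be
worth confirming that the file has exactly one row per parameter before
assigning anything. The sketch below shows one possible check on a small
in-memory file; `read_flat_weights`, `fake_csv`, and `n_params` are illustrative
names, not part of the notebook.

```python
import csv
import io

def read_flat_weights(f) -> list[float]:
    """Read one float per row from a CSV file object (hypothetical helper)."""
    return [float(row[0]) for row in csv.reader(f) if row]

# Illustrative in-memory file standing in for weights.csv.
fake_csv = io.StringIO("0.5\n-1.25\n3.0\n")
weights = read_flat_weights(fake_csv)
n_params = 3  # in the notebook this would be len(model.parameters())

# Fail loudly on a mismatch instead of loading a partial set of weights.
if len(weights) != n_params:
    raise ValueError(f"expected {n_params} weights, found {len(weights)}")

print(weights)  # [0.5, -1.25, 3.0]
```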

Now that the pre-trained weights are loaded, let's try running the model on some
test examples. Try changing `SAMPLE_IDX` to see different examples.

NOTE: This may be quite slow, depending on your computer.

In [None]:
SAMPLE_IDX = 0

pred = softmax(model(test_x[SAMPLE_IDX]))
pred_idx = pred.index(max(pred, key=lambda x: x.data))
plt.imshow(test_x[SAMPLE_IDX].reshape(28,28), cmap="gray")
plt.title(f"Ground truth={test_y[SAMPLE_IDX]} Prediction={pred_idx}");