# Micrograd - Arnie's version

I built this project to solidify my understanding of the Stanford/DeepLearning.AI Coursera course.

I watched Andrej Karpathy's Micrograd lecture around May, before taking the class (early June to end of July), and have not re-watched it since.

To really test my understanding, I built this *without* re-watching Karpathy's lecture. That is, I started from a rough understanding of autograd and worked from first principles: differentiation, partial derivatives, the chain rule, software design, backpropagation, linear regression, optimizers, and loss functions.

Notes:
- I did have a working micrograd implementation, but my loss was incorrect.
- After working through the backprop by hand and still not finding the bug, I skimmed through Karpathy's micrograd code.
- I took the opportunity to:
  - Add methods I hadn't thought of, such as `__sub__`, `__rsub__`, ....
  - Clean up my implementation of Neuron (notably, passing in input_feature and a single x sample, rather than x as an array itself)
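
As a rough illustration of the reflected operators mentioned in the notes, here is a minimal sketch. `Scalar` is a hypothetical, forward-pass-only stand-in and not the `Value` class from this project; it only shows how Python dispatches `x - obj` to `__rsub__` when the left operand is a plain number.

```python
class Scalar:
    """Toy stand-in for a Value-like class (hypothetical, no gradients)."""

    def __init__(self, value):
        self.value = float(value)

    def __add__(self, other):
        other = other if isinstance(other, Scalar) else Scalar(other)
        return Scalar(self.value + other.value)

    def __neg__(self):
        return Scalar(-self.value)

    def __sub__(self, other):
        # Scalar - x: reuse addition and negation
        other = other if isinstance(other, Scalar) else Scalar(other)
        return self + (-other)

    def __rsub__(self, other):
        # x - Scalar: called when the left operand does not know how to subtract a Scalar
        return Scalar(other) - self

    def __repr__(self):
        return f"Scalar({self.value})"


a = Scalar(6.0)
left = a - 2      # dispatches to __sub__
right = 10 - a    # dispatches to __rsub__
print(left, right)
```

Building subtraction out of addition and negation means only those two operations would need their own backward rules in a full autograd class.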

In the end, I realized the problem was that I had forgotten to call zero_grad(), so the gradients accumulated across iterations and the loss bounced around.
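
The effect of a missing zero_grad() call can be reproduced without any autograd engine. The sketch below is an illustration under simple assumptions (one weight, one sample, a hand-derived gradient), not this project's code; `train` and `squared_error_grad` are hypothetical names.

```python
def squared_error_grad(w, x, y):
    """Gradient of (w * x - y) ** 2 with respect to w, via the chain rule."""
    return 2.0 * (w * x - y) * x


def train(zero_grad_each_step, steps=20, lr=0.01):
    """Fit w on a single sample; optionally 'forget' to reset the gradient."""
    x, y = 3.0, 6.0  # the ideal weight is 2.0
    w, grad = 0.0, 0.0
    losses = []
    for _ in range(steps):
        if zero_grad_each_step:
            grad = 0.0  # what zero_grad() is for
        grad += squared_error_grad(w, x, y)  # backward passes accumulate with +=
        w -= lr * grad
        losses.append((w * x - y) ** 2)
    return w, losses


w_clean, clean_losses = train(zero_grad_each_step=True)
w_stale, stale_losses = train(zero_grad_each_step=False)

print("with zero_grad   :", [round(l, 3) for l in clean_losses[:8]])
print("without zero_grad:", [round(l, 3) for l in stale_losses[:8]])
```

With the reset, the loss shrinks on every step. Without it, the stale gradient keeps pushing the weight past the optimum, so the loss falls for a few steps and then rises again.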

In principle, I don't take shortcuts, but given my urgency to progress as fast as possible in ML and since I already had a working implementation (with correct intuition), it did not make sense to spend more time debugging (I can always do that later).

Extra notes: Obviously I did not use Copilot xD.

In [1]:
# Install requirements

!pip install pydantic



## Operation Tests

In [2]:
from micrograd.engine.value import Value

a = Value(6.0)
b = Value(3.0)

c = a * b
print(c)

d = a / b
print(d)

e = Value(100.0)
f = e.log()
# print(f)

Val(value=18.0000, grad=0.0000, parents=(6.0 * 3.0))
Val(value=2.0000, grad=0.0000, parents=(6.0 * 0.3333333333333333))


In [3]:
f = f.log()
# print(f)
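
The printed `parents` of `a / b` suggest that division is built as multiplication by the reciprocal. One way to sanity-check the local derivatives behind operations like these is a finite-difference comparison. The sketch below uses plain floats and a hypothetical helper `numerical_grad`; it does not call the `Value` class.

```python
import math


def numerical_grad(f, x, h=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


a, b = 6.0, 3.0

# Each entry pairs a numerical estimate with the analytic derivative
checks = {
    "d(a*b)/da": (numerical_grad(lambda t: t * b, a), b),
    "d(a/b)/db": (numerical_grad(lambda t: a / t, b), -a / b**2),
    "d(log e)/de": (numerical_grad(math.log, 100.0), 1 / 100.0),
}

for name, (approx, exact) in checks.items():
    print(f"{name}: numeric={approx:.6f} analytic={exact:.6f}")
```

The same comparison could be run against the gradients an autograd engine reports after a backward pass.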

## Simple Linear Regression Example

In [4]:
from micrograd.nets.SimpleLinearRegression import SimpleLinearRegression
from micrograd.loss_functions.MSE import MSE
from micrograd.optimizers.SimpleOptimizer import SimpleOptimizer

from micrograd.engine.value import Value

# Set-up model, optimizers and loss function
model = SimpleLinearRegression()
criterion = MSE()

num_epochs = 1000
lr = 0.01

optimizer = SimpleOptimizer(model.parameters(), lr)

# Dataset
x = [Value(1.0), Value(4.0), Value(9.0)]
y = [Value(1.0), Value(8.0), Value(18.0)]

for epoch in range(num_epochs):
    # Forward pass: one prediction per sample (plain linear model, no activation)
    y_pred = [model([xi]) for xi in x]

    loss = criterion(y_pred, y)

    optimizer.zero_grad()

    loss.backward()
    optimizer.step()

    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.value:.4f}')

test_x = [Value(6.0)]
test_y_pred = model(test_x)

print(f"Prediction for x={test_x[0].value} is y_pred={test_y_pred.value} (Expected is 12)")

Epoch [100/1000], Loss: 0.0575
Epoch [200/1000], Loss: 0.0442
Epoch [300/1000], Loss: 0.0427
Epoch [400/1000], Loss: 0.0425
Epoch [500/1000], Loss: 0.0425
Epoch [600/1000], Loss: 0.0425
Epoch [700/1000], Loss: 0.0425
Epoch [800/1000], Loss: 0.0425
Epoch [900/1000], Loss: 0.0425
Epoch [1000/1000], Loss: 0.0425
Prediction for x=6.0 is y_pred=11.816328060532037 (Expected is 12)
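
As a cross-check on the training result, the closed-form least-squares fit for the same three points can be computed directly. This sketch assumes SimpleLinearRegression amounts to one weight plus a bias, which is not shown in this notebook. The `1/(2n)` scaling of the loss is also an inference from the printed numbers, not something confirmed from the MSE class.

```python
xs = [1.0, 4.0, 9.0]
ys = [1.0, 8.0, 18.0]
n = len(xs)

x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Ordinary least squares for a line y = slope * x + intercept
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
intercept = y_mean - slope * x_mean

# Sum of squared errors at the optimum
sse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
prediction = slope * 6.0 + intercept

print(f"slope={slope:.4f}, intercept={intercept:.4f}")
print(f"prediction at x=6: {prediction:.4f}")
print(f"SSE / n    = {sse / n:.4f}")
print(f"SSE / (2n) = {sse / (2 * n):.4f}")
```

The closed-form prediction at x=6 agrees with the trained model's 11.816, and `SSE / (2n)` agrees with the plateau of 0.0425. So the loss stops improving because the three points are not collinear, not because training stalled.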


## Hot Dog Classifier
Inspired by Jian-Yang from Silicon Valley, I built a simple hot dog classifier.

I downloaded a dataset from HuggingFace and used my Micrograd implementation to train a model and evaluate it on test images :)

In [5]:
!pip install Pillow
!pip install numpy



In [6]:
# Load data and pre-process it
import os
from PIL import Image
import numpy as np
from micrograd.engine.value import Value

IMAGE_SIZE = 225

def load_and_preprocess_images(folder_path, label, image_size):
    images = []
    labels = []
    for filename in os.listdir(folder_path):
        if filename.endswith(('.jpg', '.jpeg', '.png')):
            img_path = os.path.join(folder_path, filename)
            with Image.open(img_path) as img:
                img = img.resize((image_size, image_size))
                img = img.convert('L')  # Convert to grayscale
                img_array = np.array(img).flatten() / 255.0  # Normalize to [0, 1]
                images.append([Value(float(pixel)) for pixel in img_array])
                labels.append(Value(float(label)))
    return images, labels

# Load training data
train_hotdog_path = 'dataset/hotdog_nothotdog/train/hotdog'
train_nothotdog_path = 'dataset/hotdog_nothotdog/train/not_hotdog'

x_train_hotdog, y_train_hotdog = load_and_preprocess_images(train_hotdog_path, 1, IMAGE_SIZE)
x_train_nothotdog, y_train_nothotdog = load_and_preprocess_images(train_nothotdog_path, 0, IMAGE_SIZE)

x_train = x_train_hotdog + x_train_nothotdog
y_train = y_train_hotdog + y_train_nothotdog

# Load validation data
val_hotdog_path = 'dataset/hotdog_nothotdog/val/hotdog'
val_nothotdog_path = 'dataset/hotdog_nothotdog/val/not_hotdog'

x_valid_hotdog, y_valid_hotdog = load_and_preprocess_images(val_hotdog_path, 1, IMAGE_SIZE)
x_valid_nothotdog, y_valid_nothotdog = load_and_preprocess_images(val_nothotdog_path, 0, IMAGE_SIZE)

x_valid = x_valid_hotdog + x_valid_nothotdog
y_valid = y_valid_hotdog + y_valid_nothotdog

# Shuffle the training data
combined = list(zip(x_train, y_train))
np.random.shuffle(combined)
x_train, y_train = zip(*combined)

print(f"Training samples: {len(x_train)}")
print(f"Validation samples: {len(x_valid)}")

Training samples: 200
Validation samples: 50


In [7]:
from micrograd.nets.ComplexLogisticRegression import ComplexLogisticRegression

from micrograd.loss_functions.BinaryCrossEntropy import BinaryCrossEntropy
from micrograd.optimizers.SimpleOptimizer import SimpleOptimizer

import random


model = ComplexLogisticRegression(image_size=IMAGE_SIZE)
criterion = BinaryCrossEntropy()

num_epochs = 1000
lr = 0.02

optimizer = SimpleOptimizer(model.parameters(), lr)

# Train the model
for epoch in range(num_epochs):

    y_pred = [model(xi) for xi in x_train]

    # for prediction in y_pred:
        # print(f"Prediction: {prediction}")

    loss = criterion(y_pred, y_train)

    optimizer.zero_grad()

    loss.backward()

    # Dynamically adjust the learning rate; the checks cascade, so the last matching threshold wins
    if (loss.value < 20.0):
        optimizer.set_learn_rate(0.04)
    
    if (loss.value < 15.0):
        optimizer.set_learn_rate(0.02)    

    if (loss.value < 10.0):
        optimizer.set_learn_rate(0.001)

    if (loss.value < 2.0):
        optimizer.set_learn_rate(random.uniform(0.0001, 0.002))

    if (loss.value < 0.85):
        optimizer.set_learn_rate(random.uniform(0.0000001, 0.00001))

    # Early stopping
    if (loss.value < 0.70):
        print(f'Early stopping at epoch {epoch+1}, loss: {loss.value:.4f}')
        break

    optimizer.step()

    if (epoch + 1) % 1 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.value} - (step: {optimizer.get_learn_rate()})')

Epoch [1/1000], Loss: 18.39591028953272 - (step: 0.04)
Epoch [2/1000], Loss: 18.132157515113217 - (step: 0.04)
Epoch [3/1000], Loss: 17.985969556895956 - (step: 0.04)
Epoch [4/1000], Loss: 17.79856883488213 - (step: 0.04)
Epoch [5/1000], Loss: 17.57129163371318 - (step: 0.04)
Epoch [6/1000], Loss: 14.607587045929161 - (step: 0.02)
Early stopping at epoch 7, loss: 2.4208
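
The cascading `if` checks in the training loop can also be read as a single lookup from loss to learning rate. The sketch below restates the same thresholds as a standalone function. `pick_learning_rate` is a hypothetical helper, not part of SimpleOptimizer, and it returns the current rate unchanged when no threshold applies.

```python
import random


def pick_learning_rate(loss, current_lr, rng=random):
    """Return the rate for the tightest threshold that `loss` falls under."""
    if loss < 0.85:
        return rng.uniform(0.0000001, 0.00001)
    if loss < 2.0:
        return rng.uniform(0.0001, 0.002)
    if loss < 10.0:
        return 0.001
    if loss < 15.0:
        return 0.02
    if loss < 20.0:
        return 0.04
    return current_lr  # no threshold matched: keep whatever was set before


lr = 0.02
for loss in (25.0, 18.4, 14.6, 5.0):
    lr = pick_learning_rate(loss, lr)
    print(f"loss={loss:>5} -> lr={lr}")
```

Checking from the tightest threshold outward gives the same result as the cascade, where later assignments overwrite earlier ones, but makes the precedence explicit.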


In [8]:
# Export the model and serialize it to JSON

exported_model = model.export()
model_json = exported_model.model_dump_json()
print(model_json)

# Write the model to a file
with open('model_weights.json', 'w') as f:
    f.write(model_json)

Exporting model...
{"layers":[{"input_features":225,"output_features":3,"activation":"linear_activation","neurons":[{"values":[0.8326694434135926,0.8653004940246046,0.3575845181427726,0.1947515208968238,0.029102446590822294,0.8109089889425899,0.04853141427575596,0.6245817133439922,0.08544539703794027,0.30272382227925576,0.26675499157529653,0.8834605003728313,0.26869382868199526,0.48020011629024434,0.3552782953539262,0.25445039077070797,0.09730523567084494,0.06427740051879416,0.398665332186434,0.061530901886086974,0.19993514416263755,0.007028087036286916,0.09016701665663628,0.20587455358745801,0.47581884377844763,0.7742250603489343,0.16015109789667054,0.3524284788489767,0.6782049302904016,0.4731117653159907,0.04494060589865303,0.9716073609031944,0.741290362638135,0.5855292208911226,0.39708157076095807,0.06204290999238257,0.5616020145439548,0.44935801357467425,0.5284439367146707,0.1693121025933208,0.041208162534073375,0.9213932798748579,0.05647854384199753,0.6998856836004788,0.9793317593

In [9]:
from micrograd.nets.NeuralNet import NeuralNet
from micrograd.schemas.schemas import ModelSchema

# Read the model from the file
with open('model_weights.json', 'r') as f:
    model_json = f.read()

# Parse and validate the JSON using pydantic
model_schema = ModelSchema.model_validate_json(model_json)

# Create a new NeuralNet instance from the validated schema
imported_model = NeuralNet.from_json(model_schema)

print("Model successfully imported and validated.")


Importing model...
Imported model!
Model successfully imported and validated.
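
The export and import steps boil down to a JSON round trip plus structural checks. The sketch below shows that idea with only the standard library, as a stand-in for the pydantic schema used above. The dictionary uses only the keys visible in the printed export, the weights are made up, and `check_layer` along with the rule of one neuron per output feature are assumptions.

```python
import json

# Hypothetical miniature export, using only keys visible in the printed JSON
exported = {
    "layers": [
        {
            "input_features": 2,
            "output_features": 1,
            "activation": "linear_activation",
            "neurons": [{"values": [0.25, -0.5]}],
        }
    ]
}


def check_layer(layer):
    """Basic structural checks; the real schema may enforce different rules."""
    for key in ("input_features", "output_features", "activation", "neurons"):
        if key not in layer:
            raise ValueError(f"missing key: {key}")
    if len(layer["neurons"]) != layer["output_features"]:
        raise ValueError("expected one neuron per output feature")  # assumption
    return layer


text = json.dumps(exported)      # serialize, as when writing the weights file
restored = json.loads(text)      # parse, as when reading it back
for layer in restored["layers"]:
    check_layer(layer)

print(restored == exported)
```

Comparing the restored structure with the original is a cheap way to confirm that no weights were lost or altered in the round trip.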


In [10]:
print(loss)
print(loss.value)
print(f'Loss: {loss.value:.4f}')
print(f'Epoch [], Loss: {loss.value:.4f}')

Val(value=2.4208, grad=1.0000, parents=(484.155516258618 * 0.005))
2.42077758129309
Loss: 2.4208
Epoch [], Loss: 2.4208


In [11]:
# Test the model
n = len(x_valid)
correct_guesses = 0
wrong_guesses = 0

y_pred = [model(xi) for xi in x_valid]

predictions = []

for pred in y_pred:
    if pred.value <= 0.5:
        predictions.append(0)
    else:
        predictions.append(1)
    
for i in range(n):
    # print(f'Prediction: {predictions[i]} with confidence {y_pred[i].value:.2f} for actual: {y_valid[i].value}')

    if predictions[i] == y_valid[i].value:
        correct_guesses += 1
    else:
        wrong_guesses += 1

print(f"Correct guesses: {correct_guesses}, Wrong guesses: {wrong_guesses}, Total: {n} - Accuracy: {(correct_guesses/n)*100}%")
print(f"Loss: {loss}")


Correct guesses: 25, Wrong guesses: 25, Total: 50 - Accuracy: 50.0%
Loss: Val(value=2.4208, grad=1.0000, parents=(484.155516258618 * 0.005))


In [12]:
# Test the imported model
n = len(x_valid)
correct_guesses = 0
wrong_guesses = 0

y_pred = [imported_model(xi) for xi in x_valid]

predictions = []

for pred in y_pred:
    if pred.value <= 0.5:
        predictions.append(0)
    else:
        predictions.append(1)
    
for i in range(n):
    # print(f'Prediction: {predictions[i]} with confidence {y_pred[i].value:.2f} for actual: {y_valid[i].value}')

    if predictions[i] == y_valid[i].value:
        correct_guesses += 1
    else:
        wrong_guesses += 1

print(f"Correct guesses: {correct_guesses}, Wrong guesses: {wrong_guesses}, Total: {n} - Accuracy: {(correct_guesses/n)*100}%")
print(f"Loss: {loss}")

Correct guesses: 25, Wrong guesses: 25, Total: 50 - Accuracy: 50.0%
Loss: Val(value=2.4208, grad=1.0000, parents=(484.155516258618 * 0.005))
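
The two evaluation cells repeat the same thresholding and counting logic, which could be folded into one helper. The sketch below works on plain floats. `accuracy` is a hypothetical function, and with this project's objects one would pass `pred.value` and `label.value` instead.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    # Same rule as the cells above: <= threshold maps to 0, otherwise 1
    predictions = [1 if p > threshold else 0 for p in probabilities]
    correct = sum(int(pred == int(label)) for pred, label in zip(predictions, labels))
    return correct / len(labels)


probs = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 1]
print(f"Accuracy: {accuracy(probs, labels) * 100}%")
```

Calling one helper for both the trained and the imported model would also make it harder for the two evaluations to drift apart.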


In [13]:
y_pred = Value(0.1)
y = Value(1.0)

loss_test = Value(0.0)
loss_test += (-(y * y_pred.log()) - ((Value(1.0)-y)*(Value(1.0)-y_pred).log()))
loss_test += (-(y * y_pred.log()) - ((Value(1.0)-y)*(Value(1.0)-y_pred).log()))


print(loss_test)

Val(value=4.6052, grad=0.0000, parents=(2.3025850929940455 + 2.3025850929940455))
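
The check above can be reproduced with plain floats. The sketch below sums the binary cross-entropy term `-(y * log(p) + (1 - y) * log(1 - p))` over two identical samples. The clipping constant `eps` is an addition for numerical safety and is not something the cell above does.

```python
import math


def binary_cross_entropy_sum(y_true, y_pred, eps=1e-12):
    """Summed (not averaged) binary cross-entropy over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0 (assumption)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total


two_samples = binary_cross_entropy_sum([1.0, 1.0], [0.1, 0.1])
print(f"{two_samples:.4f}")
```

Each sample contributes `-log(0.1)`, about 2.3026, so the sum matches the 4.6052 printed above. The training loss shown earlier as `484.155516258618 * 0.005` suggests the loss class then divides the sum by the 200 training samples, though that is an inference from the printout.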
