micrograd-cpp

A lightweight, scalar-valued automatic differentiation engine and neural network library implemented from scratch in Modern C++ (C++17). Inspired by Andrej Karpathy's micrograd, built for clarity and systems correctness.


What is this?

Most ML engineers use PyTorch or TensorFlow without understanding what happens under the hood. This project implements the core machinery from scratch:

  • A Value node that tracks data, gradients, and graph edges
  • Automatic construction of a Directed Acyclic Graph (DAG) during the forward pass
  • A topological sort + chain rule backward pass for exact gradient computation
  • A full MLP built on top of the engine

No external dependencies — only the C++ standard library.


Architecture

Value (node)
  ├── data          → the scalar value
  ├── grad          → accumulated gradient
  ├── _backward     → std::function for local gradient computation
  └── _prev         → shared_ptr edges to parent nodes

Neuron → Layer → MLP

Memory model: Every node is heap-allocated via std::shared_ptr. The graph owns its nodes through _prev edges — no manual memory management, no leaks. std::enable_shared_from_this ensures safe self-referencing during the backward pass.

Backward pass: Topological sort via DFS, then reverse traversal applying each node's _backward lambda. Raw pointers used internally for traversal (no ownership) — shared_ptr overhead only where lifetime management is needed.


Supported Operations

Operation    Backward Rule
a + b        ∂/∂a = 1, ∂/∂b = 1
a * b        ∂/∂a = b, ∂/∂b = a
tanh(a)      ∂/∂a = 1 - tanh²(a)
relu(a)      ∂/∂a = 1 if a > 0, else 0
a + 2.0      scalar overload
3.0 * a      scalar overload

Demos

XOR Classification

A non-linear problem that cannot be solved by a single linear layer.

epoch 0   | loss: 5.57
epoch 500 | loss: 0.046
epoch 900 | loss: 0.019

[0, 0] pred: -0.95  target: -1  ✅
[0, 1] pred:  0.93  target:  1  ✅
[1, 0] pred:  0.94  target:  1  ✅
[1, 1] pred: -0.93  target: -1  ✅

2D Spiral Classification

Two interleaved spirals, with the learned decision boundary rendered as ASCII art in the terminal.

............................................................
....................++++++++++++++++++++....................
.................+++++++++++++++++++++++....................
..............++++++++++++++++++++++++......................
.........++++++++++++++++++++++++++.........................
++++++++++++++++++++++++++++++++............................

Build

Requirements: g++ with C++17 support

git clone https://github.com/fourlhs/micrograd-cpp
cd micrograd-cpp
g++ -std=c++17 -Wall -I include src/main.cpp -o micrograd
./micrograd

With CMake:

mkdir build && cd build
cmake ..
make
./micrograd
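
The repo ships its own CMakeLists.txt; a minimal version matching this layout could look like the following (a sketch, not necessarily the actual file):

```cmake
cmake_minimum_required(VERSION 3.10)
project(micrograd-cpp CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(micrograd src/main.cpp)
target_include_directories(micrograd PRIVATE include)
```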

Project Structure

micrograd-cpp/
├── include/
│   ├── value.h      ← Value node, operators, backward
│   ├── nn.h         ← Neuron, Layer, MLP, zero_grad
│   └── spiral.h     ← dataset generation, ASCII visualization
├── src/
│   └── main.cpp     ← XOR and Spiral demos
├── CMakeLists.txt
└── README.md

Key Design Decisions

Why shared_ptr for graph nodes? A single Value can be a parent to multiple nodes (e.g. a * a). Shared ownership prevents premature deallocation while the graph is alive.

Why += for gradients? If a node appears multiple times in the graph, its gradient contributions must be accumulated — not overwritten.

Why topological sort before backward? Gradients must flow from output to input in dependency order. Topological sort guarantees every node receives its full gradient before passing it to its children.

Why raw pointers inside backward()? The topological sort needs to visit nodes without affecting their lifetime. Raw pointers express "I observe this, I don't own it" — no unnecessary reference count overhead.


Inspired By

Andrej Karpathy's micrograd — the original scalar-valued autograd engine in Python that this project reimplements in C++.