ddlees/NAND2LLM

NAND2LLM

NAND2LLM is a from-first-principles computer science and machine learning project inspired by Nand2Tetris.

Note: NAND2LLM is an independent educational project inspired by the pedagogical style of Nand2Tetris. It is not affiliated with, endorsed by, or sponsored by Nand2Tetris, Noam Nisan, Shimon Schocken, MIT Press, or Coursera.

Summary

Where Nand2Tetris starts with a NAND gate and builds toward a working computer and a playable version of Tetris, NAND2LLM starts with the core loop underneath modern learning systems and builds toward a tiny, inspectable language model and assistant.

This loop is the heartbeat of the project:

prediction -> loss -> gradient -> update -> better prediction
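
As a minimal sketch of that loop, assume a one-parameter model `w * x`, a toy dataset generated by `y = 3 * x`, and mean squared error as the loss. These choices, along with the learning rate and step count, are illustrative assumptions and not part of the curriculum itself.

```python
# Toy data generated by the rule y = 3 * x; the model should recover w close to 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0                # single learnable parameter (assumed starting point)
learning_rate = 0.01   # assumed step size


def mean_squared_error(weight):
    return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


initial_loss = mean_squared_error(w)

for step in range(200):
    # prediction
    preds = [w * x for x in xs]
    # loss (computed here to mirror the loop; a real run would log it)
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)
    # gradient: d(loss)/dw = mean(2 * (prediction - target) * x)
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
    # update: step against the gradient
    w -= learning_rate * grad

final_loss = mean_squared_error(w)  # better prediction: loss has shrunk
```

Each pass through the loop body is one turn of the cycle. Everything later in the project can be read as a larger model, a richer loss, or a more automated way of obtaining `grad`.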

The goal is not to train a competitive model. It is to demystify large language models and make them feel understandable, buildable, and testable.

By the end, the reader should have built enough of the technology stack to understand how a small language model works from the inside out, having covered the following concepts:

  • Arrays
  • Tensors
  • Autograd
  • Neural networks
  • Tokenization
  • Embeddings
  • Attention
  • Transformers
  • Training loops
  • Decoding
  • Evaluation
  • Retrieval
  • Tool use
  • Serving
  • Scaling pressure
  • GPU optimization

The final artifact of this project is TinyTutor: a small assistant trained, tuned, and retrieval-augmented over the NAND2LLM material itself. The reader will ultimately build the model, teach the model what was built, and ask the model to explain the build.

Project Status

This project is in its early stages. The project scaffolding and the curriculum design are expected to change.

The repository is intended to grow into:

  • A GitHub learning project
  • A book
  • An interactive course

For now, the priority is to make the project executable, testable, and easy to contribute to.

Guiding Principles

Build the Machinery

The reader should implement the core machinery before leaning on established libraries or frameworks. The NAND2LLM curriculum is intentionally conservative about its dependencies.

The project should build, at minimum:

  • Arrays, tensors, and shape-aware operations
  • Scalar and tensor automatic differentiation
  • Neural network layers
  • Optimizers
  • Tokenizers
  • Embeddings
  • Attention
  • Transformer blocks
  • Training loops
  • Sampling and decoding
  • Evaluation harnesses
  • Retrieval components
  • Tool-calling abstractions
  • Basic inference serving

Libraries for tests, CLIs, serialization, plotting, documentation, and comparison against mature frameworks are acceptable, but the central learning machinery should be built by the reader.

Earn Every Abstraction

Each abstraction should solve a concrete problem introduced by the previous layer. For example:

  • When vectors and matrices become too limiting, introduce tensors.
  • When hand-derived gradients become tedious and error-prone, introduce autograd.
  • When fixed-size context representations become insufficient, introduce attention and transformers.
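
To make the autograd step concrete, here is a minimal sketch of a scalar reverse-mode engine. The class name `Value` and its fields are hypothetical, chosen for illustration; the project's actual design may differ. Only addition and multiplication are supported.

```python
class Value:
    """A scalar that remembers how it was computed (hypothetical name)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def backward(self):
        # Order nodes so that a node's gradient is complete before it is
        # pushed to its parents (reverse topological order).
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad  # chain rule


x = Value(2.0)
y = Value(3.0)
z = x * y + x   # by hand: dz/dx = y + 1 = 4, dz/dy = x = 2
z.backward()
```

After `z.backward()`, `x.grad` and `y.grad` match the hand-derived values in the comment, which is the kind of check that motivates replacing manual derivations with a graph.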

Keep the Math Approachable

The project assumes basic algebra skills to start and should build the necessary mathematical skills incrementally:

  • Scalars, vectors, and matrices
  • Functions and composition
  • Rates of change
  • Derivatives and gradients
  • Probability
  • Entropy and cross-entropy
  • Optimization
  • Dot products and similarity
  • Attention as learned routing over information

NAND2LLM should introduce the math when it becomes useful, make it concrete in code, then give the formal version after the mathematical intuition has landed.
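
As one example of making the math concrete in code before formalizing it, the sketch below computes cross-entropy for a single prediction from raw scores (logits). The function names are illustrative assumptions, not project APIs.

```python
import math


def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    # Negative log of the probability assigned to the correct class.
    probs = softmax(logits)
    return -math.log(probs[target_index])


# A model with no opinion over 4 classes pays log(4).
uniform_loss = cross_entropy([0.0, 0.0, 0.0, 0.0], 0)
# A confident and correct model pays very little.
confident_right = cross_entropy([5.0, 0.0, 0.0, 0.0], 0)
# A confident and wrong model pays a lot.
confident_wrong = cross_entropy([5.0, 0.0, 0.0, 0.0], 1)
```

The formal version follows from the same code: for a target class $t$ and predicted probabilities $p$, the loss is $-\log p_t$.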

Teach Both Modeling and Systems

NAND2LLM is a dual-track project:

  • The Python Track teaches mathematical, modeling, and training ideas as clearly as possible.
  • The Rust Track teaches the systems substrate:
    • memory layout
    • performance
    • error handling
    • explicit APIs
    • deployment-oriented implementation

A Go Track may appear later as an optional comparison point for approachable service implementation.

Touch on Scaling

Because scaling is fundamental to modern language models, the NAND2LLM material cannot ignore GPU programming.

The main path should not require GPU expertise up front. Instead, the project should focus on understandable CPU implementations, and then use those implementations to motivate:

  • Why matrix multiplication dominates neural network workloads
  • Why batching improves throughput
  • Why data layout matters
  • Why memory bandwidth matters
  • Why parallel execution matters
  • Why GPUs are effective
  • Why mature tensor libraries and compilers exist
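
A naive CPU implementation is enough to see where the pressure comes from. The sketch below multiplies two matrices stored as lists of rows and counts multiply-add operations; the function name and the layer sizes in the final lines are assumptions for illustration.

```python
def matmul(a, b):
    """Naive matrix multiply over lists of rows.

    Returns (product, number_of_multiply_adds).
    """
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    out = [[0.0] * m for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]
                ops += 1
    return out, ops


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
product, ops = matmul(a, b)  # product == [[19.0, 22.0], [43.0, 50.0]], ops == 8

# The count grows as n * k * m. For an assumed batch of 32 inputs passing
# through one 512 x 512 weight matrix:
batch_ops = 32 * 512 * 512   # 8_388_608 multiply-adds for a single layer
```

The three nested loops are independent across `i` and `j`, which is the property that batching, careful data layout, and parallel hardware exploit.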

Incremental Lessons

The NAND2LLM project should evolve in small, testable increments. A good contribution should include some combination of:

  • Lesson material
  • Implementation
  • Exercises
  • Tests
  • Diagrams
  • Experiments
  • Failure cases
  • A lead-in to the next abstraction

Curriculum Outputs

Part I: The Computation of Learning

  1. Introduction - An outline of the project and development environment
  2. Numbers, bits, and arrays - The minimal array substrate
  3. The Dot Product Machine - A linear model
  4. Measuring Wrongness - Loss functions
  5. Learning by Descent - Gradient descent over simple models
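
As a rough sketch of what the array substrate and the dot product machine in Part I might look like, the example below stores a 2D array in a flat, row-major list and uses a dot product as a linear prediction. `Array2D` and `dot` are hypothetical names, and the weights are made up for illustration.

```python
class Array2D:
    """Minimal row-major 2D array over a flat list (hypothetical name)."""

    def __init__(self, rows, cols, data=None):
        self.shape = (rows, cols)
        self.strides = (cols, 1)  # elements to skip per step along each axis
        self.data = list(data) if data is not None else [0.0] * (rows * cols)
        if len(self.data) != rows * cols:
            raise ValueError("data length does not match shape")

    def __getitem__(self, index):
        r, c = index
        return self.data[r * self.strides[0] + c * self.strides[1]]


def dot(u, v):
    if len(u) != len(v):
        raise ValueError("length mismatch")
    return sum(a * b for a, b in zip(u, v))


grid = Array2D(2, 3, [1, 2, 3, 4, 5, 6])
row1 = [grid[1, c] for c in range(3)]     # [4, 5, 6]
prediction = dot(row1, [0.5, 0.0, 1.0])   # 4 * 0.5 + 5 * 0.0 + 6 * 1.0 = 8.0
```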

Part II: Building Neural Networks

  1. Scalar Autograd - A tiny computational graph engine
  2. Tensor Autograd - Shape-aware differentiable operations
  3. Neural Networks - MLP layers, activations, batching
  4. Optimization - SGD, momentum, Adam, clipping, schedules
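
As a sketch of the optimizer chapter's starting point, here is one common formulation of SGD with momentum applied to the toy objective f(p) = p². The function name, hyperparameters, and objective are assumptions for illustration.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.1, momentum=0.9):
    """One update: v <- momentum * v + g, then p <- p - lr * v."""
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_params = [p - lr * v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Minimize f(p) = p ** 2, whose gradient is 2 * p.
params, velocity = [5.0], [0.0]
for _ in range(200):
    grads = [2 * p for p in params]
    params, velocity = sgd_momentum_step(params, grads, velocity)
```

With `momentum=0.0` this reduces to plain SGD, which makes the two easy to compare on the same problem.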

Part III: Turning Language Into Data

  1. Text, corpora, and datasets - Dataset pipeline
  2. Tokenization - Character, byte, and simple BPE tokenizers
  3. Embeddings - Token and positional embeddings
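
The tokenization chapter could begin from something like the sketch below: a byte-level tokenizer plus a single BPE-style merge step. The function names and the choice of 256 as the first merged token id are assumptions for illustration.

```python
from collections import Counter


def encode_bytes(text):
    # Byte-level tokenization: every UTF-8 byte is a token id in 0..255.
    return list(text.encode("utf-8"))


def decode_bytes(ids):
    return bytes(ids).decode("utf-8")


def most_common_pair(ids):
    # Count adjacent id pairs and return the most frequent one.
    pairs = Counter(zip(ids, ids[1:]))
    return pairs.most_common(1)[0][0]


def merge_pair(ids, pair, new_id):
    # Replace each non-overlapping occurrence of `pair` with `new_id`.
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


ids = encode_bytes("abababc")          # [97, 98, 97, 98, 97, 98, 99]
pair = most_common_pair(ids)           # (97, 98), the bytes for "ab"
merged = merge_pair(ids, pair, 256)    # [256, 256, 256, 99]
```

Repeating the count-and-merge step with fresh ids is the core of a simple BPE trainer.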

Part IV: From Sequence Models to Transformers

  1. Next Token Prediction - Crude text generator
  2. Attention - Scaled dot-product attention
  3. Multi-head self-attention - Reusable self-attention layer
  4. Transformer block - Residual, norm, feed-forward, dropout
  5. TinyGPT - Small autoregressive language model
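
For orientation, scaled dot-product attention with a causal mask can be sketched in plain Python as below. The function names and the toy inputs are assumptions; a real implementation would operate on tensors, not nested lists.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=True):
    """Scaled dot-product attention over lists of vectors.

    Returns (outputs, weights), one entry per query position.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        # With a causal mask, position i may only look at positions 0..i.
        visible = range(i + 1) if causal else range(len(keys))
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d_k)
            for j in visible
        ]
        weights = softmax(scores)
        out = [
            sum(w * values[j][c] for w, j in zip(weights, visible))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy self-attention: the same vectors serve as queries, keys, and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(x, x, x)
```

The first position can only attend to itself, so `outputs[0]` equals `x[0]`; later positions are weighted averages of the values they are allowed to see.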

Part V: Training, Inspecting, and Improving

  1. Training Runs as Experiments - Configs, metrics, checkpoints
  2. Sampling and Decoding - Greedy, temperature, top-k, top-p
  3. Evaluation - Perplexity, validation loss, prompt regressions
  4. Model Introspection - Embeddings, logits, attention, activations
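
The decoding strategies listed above differ mainly in how they turn logits into a choice. A minimal sketch covering greedy, temperature, and top-k sampling follows; the function name and signature are assumptions, and top-p is left out for brevity.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=None, rng=random):
    """Pick a token index from raw logits."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Rank token indices from highest to lowest logit.
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        candidates = candidates[:top_k]  # keep only the k best tokens
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = [logits[i] / temperature for i in candidates]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(candidates, weights=weights, k=1)[0]


logits = [0.1, 2.0, -1.0, 1.5]
greedy = sample_next_token(logits, temperature=0)
rng = random.Random(0)  # seeded generator for repeatable experiments
samples = [sample_next_token(logits, temperature=0.8, top_k=2, rng=rng) for _ in range(50)]
```

Passing a seeded `random.Random` keeps sampling runs repeatable, which fits the chapter's framing of training runs as experiments.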

Part VI: A Model Assistant

  1. Instruction Tuning - Simple instruction-following model
  2. Conversation and Context Windows - Tiny chat interface
  3. Retrieval-augmented generation - Local document assistant
  4. Tools and Function Calling - Structured action selection
  5. Safety, Alignment, and Boundaries - Policy-aware assistant shell
  6. Serving the Model - Local inference server and UI
