How LLMs Actually Work

A visual, interactive guide to how large language models are built — from raw internet text to a conversational assistant.

Live site: https://ynarwal.github.io/how-llms-work/

Based on Andrej Karpathy's Intro to Large Language Models lecture.

What's inside

Data Collection — how the web is scraped and filtered into training data (Common Crawl, FineWeb)
Tokenization — how text is broken into subword tokens via Byte Pair Encoding (BPE)
Neural Network Training — the loss function, gradient descent, and what a forward pass looks like
Inference & Sampling — how the model generates text token by token, and how temperature works
The Base Model — what a model knows after pre-training and what it can't do yet
Post-Training — RLHF, instruction tuning, and how a base model becomes an assistant
LLM Psychology — hallucinations, context windows, and how to think about what models "know"
RAG — retrieval-augmented generation: embeddings, vector search, and context injection
Full Pipeline Summary — end-to-end visual of every stage
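
As a taste of the Inference & Sampling topic above, here is a minimal sketch of how temperature reshapes a next-token distribution before sampling. It is not code from this repo: the function names `softmax_with_temperature` and `sample_next_token` and the three toy logits are assumptions made up for illustration, and a real model would produce logits over a vocabulary of many thousands of tokens.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for a three-token vocabulary (illustrative only).
toy_logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(toy_logits, temperature=0.5)  # peaked
hot = softmax_with_temperature(toy_logits, temperature=2.0)   # flatter
```

With a low temperature the top-scoring token takes most of the probability mass, so output is more predictable; with a high temperature the distribution flattens and less likely tokens are sampled more often.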

Files

File	Description
`index.html`	Main site (v2 redesign)
`v1.html`	Original dark-theme version
`transcript.txt`	Full Karpathy lecture transcript
`council.py`	LLM council fact-checker (runs via `uv run council.py`)
`report.html`	Latest council fact-check report

HN discussion

This guide was posted to Hacker News, where it generated heated debate, mostly about the content being LLM-generated. Fair point, but the content isn't the AI's: every claim, figure, and framing is traced directly to Karpathy's lecture, not hallucinated by a model.

Vibe check

The code and content in this repo are mostly LLM-generated (Claude via Claude Code). The ideas, direction, and editorial decisions are mine; the implementation was largely written by AI. The council fact-checker exists precisely because of this: automated content warrants automated verification.

