Visual flashcards on how LLMs work.
Click any card to open it full size.
Study in Anki: download llm-flashcards.apkg (these 30 cards) and import it into Anki. The front of each note is the concept name; the back is the card image.
I work on LLM efficiency at LLMs Research, and a lot of that work happens on a whiteboard. Drawing a thing forces you to know what you're drawing. A vague hand-wave on a slide hides confusion. A diagram doesn't.
After enough whiteboards I had a stack of diagrams. The stack turned into a study set for myself. I tightened the lines, kept the labels honest, and put them on cards. That's the set.
The cards are for someone who has used an LLM API and wants the layer underneath. Some technical background helps. No heavy math.
The full set has 332 cards across 22 topics:
| Tokenization (12) | Embeddings and retrieval (14) | Transformer architecture (30) |
| Architecture variants (16) | Training (18) | Distributed training (10) |
| Scaling laws (10) | Fine-tuning (15) | RLHF and alignment (19) |
| Inference and decoding (19) | Quantization (12) | Prompting (19) |
| Reasoning (15) | Context management (10) | RAG (24) |
| Agents and tools (22) | Multimodal (8) | Advanced concepts (6) |
| Evaluation (16) | Safety (17) | Interpretability (7) |
| APIs and practical use (13) |
Three formats: a PDF (332 pages, printable), an .apkg for Anki spaced-repetition review, and every card as a separate image. New cards get added regularly, and past buyers get every update free.
CC BY-NC-ND 4.0. Share the cards with credit and a link back to this repo. No repackaging, no reselling, no modified versions, no commercial use. Full text in LICENSE.
If something on a card is wrong or unclear, open an issue. If you want a card on a concept that is not in the set yet, open one too. I read them.
LLMs Research is an independent applied research lab. We work on LLM efficiency: inference, KV cache compression, adaptive compute, multi-agent systems. The set started as study notes for that work.
Website · Newsletter · X · LinkedIn





























