The Engineering Handbook

Open-source engineering curricula under one CC BY-SA 4.0 license. HLD and DSA today; more to come.

HLD Handbook: 159 chapters + 22 trade-off pages · 181 pages · ~773,000 words · 719 Mermaid diagrams · 3,100+ citations
DSA Handbook: 120 chapters across 15 parts · 37 pattern decision pages · 5 long-form editorials · ~470,000 words · 226 Mermaid diagrams · sibling Python / Java / C++ / Go solutions for 155 LeetCode problems · 46 interactive widget specs

Read free at handbook.academy — landing page links to both books · HLD at hld.handbook.academy · DSA at dsa.handbook.academy (public beta)

Start reading · HLD curriculum · DSA curriculum · Trade-offs · Contributing

What this is
Reading the handbooks online
What's in this repo
Why this exists
Who this is for
How these handbooks compare
Start reading
The full HLD curriculum — 12 parts + 22-page Trade-offs Library
The full DSA curriculum — 15 parts + 37 pattern decision pages + editorials + 46 widgets
Study plans
Project statistics
Quality standards
Contributing
Project structure
Development setup
Governance
FAQ
License
Citation
Acknowledgments
Community

What this is

This repository hosts two sibling open-source engineering handbooks under content/, sharing the same CC BY-SA 4.0 license, the same contribution workflow, and the same CI quality bar:

The HLD Handbook (content/hld/) — an opinionated, end-to-end textbook on high-level software design, distributed systems, and modern infrastructure. 159 teaching chapters across 12 parts plus a 22-page Trade-offs Library — 181 pages, ~773,000 words, 719 Mermaid diagrams, 3,100+ citations. Equivalent in scope to a ~2,400-page book — longer than Designing Data-Intensive Applications and Alex Xu's System Design Interview volumes 1+2 combined. Covers TCP/IP and the OS contract through to LLM serving, multi-agent orchestration, and post-quantum crypto.
The DSA Handbook (content/dsa/) — a practice-first data-structures-and-algorithms curriculum aimed at coding interviews and competitive programming. 120 chapters across 15 parts plus 37 pattern decision pages, 5 long-form LC editorials, and 46 interactive widget specs. ~470,000 words, 226 Mermaid diagrams. Each chapter teaches a structure or pattern, then walks the canonical LeetCode problems, with sibling sol.py / sol.java / sol.cpp / sol.go files for 155 problems. Python is inlined in the chapter; the other three languages are linked.

More curricula are planned. This repo is structured as an umbrella so additional handbooks (e.g. operating systems, databases, ML systems) can land alongside HLD and DSA over time, sharing the same workflow and license.

Every concept in either handbook is taught inline as a full-length article: introduction, first-principles explanation, diagrams, worked examples, trade-offs, production gotchas, citations to primary sources. No stubs. No "coming soon." No external blog redirects. No bullet outlines that tell you to go somewhere else to actually learn.

Every .md file under content/ also renders natively on GitHub — all 945 Mermaid diagrams display in the GitHub UI, all footnotes work, all cross-references resolve. The websites add search, dark mode, per-chapter diagram zoom, social cards, OG images, the interactive DSA widgets, and fast client-side navigation.

Both handbooks are continuously updated. Every chapter's frontmatter declares date_created and date_updated, so you can see exactly how fresh each page is. We care about numbers being right today, not right in 2022.
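As a rough illustration of how those two fields could be consumed, the sketch below reads date_created and date_updated from a chapter and reports how many days have passed since the last update. It assumes frontmatter delimited by `---` lines with `key: YYYY-MM-DD` entries; that layout and the helper names are assumptions for illustration, not taken from the repo.

```python
from datetime import date


def read_frontmatter_dates(markdown_text):
    """Return the date_created / date_updated values found in the frontmatter.

    Assumes (hypothetically) a leading block delimited by '---' lines that
    contains 'key: YYYY-MM-DD' entries.
    """
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    dates = {}
    for line in lines[1:]:
        if line.strip() == "---":  # end of the frontmatter block
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("date_created", "date_updated"):
            dates[key.strip()] = date.fromisoformat(value.strip().strip("\"'"))
    return dates


def days_since_update(markdown_text, today):
    """Number of days between date_updated and the given day."""
    return (today - read_frontmatter_dates(markdown_text)["date_updated"]).days
```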

Reading the handbooks online

Each book has its own subdomain. The umbrella landing page links to both:

Site	URL	What it serves
Landing page	handbook.academy	Two cards — pick HLD or DSA.
HLD Handbook	hld.handbook.academy	The full HLD curriculum.
DSA Handbook	dsa.handbook.academy	The full DSA curriculum, with interactive widgets.

Each subdomain serves one book at the root — URLs do not nest. HLD chapters live at hld.handbook.academy/curriculum/..., not handbook.academy/hld/.... Same for DSA.

What's in this repo

This repository is the canonical content source for both books. PRs to either handbook are welcome.

content/hld/ — 159 HLD chapters across 12 parts + 22 trade-off decision pages. The chapter template is intro → first principles → diagrams → worked example → trade-offs → production gotchas → references. See STYLE_GUIDE.md.
content/dsa/ — 120 DSA chapters across 15 parts. The chapter template is practice-first: cheat-sheet table, deep dive on the structure or pattern, prompt cards for representative LeetCode problems, and <details> solution + common-mistakes blocks. Code samples live as sibling files (sol.py, sol.java, sol.cpp, sol.go) under each chapter's directory. See STYLE_GUIDE.md § DSA-specific deviations and content/dsa/part-1-linear-data-structures/00-arrays.md as the canonical example.
content/dsa/editorials/ — long-form LeetCode editorials covering hard problems with multiple approaches.
content/dsa/patterns/ — pattern decision references that compare interchangeable approaches (e.g. recursion vs iteration, BFS vs DFS, sliding window vs prefix sum).
content/dsa/widgets/ — 46 YAML specs for the interactive widgets that render on the DSA website. On GitHub these are callouts that link to the spec; on the website the widget renders.
content/dsa/_problem-registry.yml — canonical LC-NNN ID registry; the website source-of-truth for every LeetCode problem the curriculum touches.

Why this exists

The open-source curricula for system design and DSA have a shape, and that shape is frustrating:

Link dumps — curated lists that point at dozens of scattered blog posts, LeetCode discuss threads, and talks. Great for discovery, useless as a learning path. You end up with 60 tabs open and no through-line.
Teaser-and-redirect repos — READMEs with a few hundred words on each topic that nudge you toward a paid course hosted elsewhere. The GitHub repo is the marketing page; the actual teaching is behind a $150-$400+/year paywall.
Monolithic READMEs frozen in time — a single 5,000-line README.md that was great in 2021 but hasn't kept up with the modern stack. No LLM serving, no CRDTs, no post-quantum crypto, no FinOps. And nobody wants to review a PR against a 5,000-line file.
Surface-level outlines — bullet-point summaries that tell you what exists without explaining why you'd pick one approach over another. You learn the vocabulary without the judgment.
Interview-only prep — focused on passing a specific 45-minute screen, not on actually operating systems at scale or actually understanding the algorithm.

These handbooks fix all of that:

100% inline content. Every chapter is a full teaching article written from scratch, living in this repo as plain Markdown. Nothing is a stub. Nothing redirects you elsewhere to learn.
Progressive curricula. HLD has 159 teaching chapters across 12 parts plus a 22-page Trade-offs Library, sequenced from prerequisites (Part 0) to Staff+ topics. DSA has 120 chapters across 15 parts, from arrays to bitmask DP, with each part introducing a tighter pattern family. Each chapter declares its prerequisites, learning objectives, estimated reading time, and difficulty tier.
Research-backed. Every HLD chapter ends with a Further Reading & References section citing primary sources — SIGMOD and OSDI papers, RFCs, IETF drafts, engineering postmortems, official docs, canonical books. 3,100+ citations across HLD. DSA chapters cite the original papers behind each algorithm where they exist (Floyd, Tarjan, Knuth-Morris-Pratt, Aho-Corasick).
Opinionated. Every topic picks a recommended approach and explains why. Where reasonable people disagree, the trade-offs are made explicit — HLD has a dedicated 22-page Trade-offs Library; DSA has 37 pattern decision pages that compare interchangeable approaches (recursion vs iteration, BFS vs DFS, sliding window vs prefix sum, and so on).
Modern (2025+). HLD covers LLM serving, RAG pipelines, AI agents, multi-agent orchestration, CRDTs, edge computing, FinOps, post-quantum cryptography, platform engineering, local-first software, differential privacy, MCP. DSA covers the modern competitive-programming toolkit: monotonic stacks/deques, Morris traversal, suffix arrays, Aho-Corasick, bitmask DP, randomised algorithms.
Practice-first for DSA. Every DSA chapter ships with sibling code samples in Python, Java, C++, and Go for the canonical LeetCode problems it teaches. Python is inlined in the chapter; the other three languages are one click away. 155 LeetCode problems are covered in this format.
Interactive on the website. The DSA book's 46 widget specs render as live, animated visualisations on dsa.handbook.academy (sliding windows that you can drag, monotonic stacks that animate the pop sequence, etc.). On GitHub the widget callouts link to the YAML spec.

Who this is for

These handbooks are written for:

SDE1 engineers preparing for SDE2 interviews. Read DSA Parts 0-9 for coding-round fluency, then HLD Part 0 + Part 1 + Part 11, plus 8-10 targeted case studies from HLD Part 8, plus the top-5 most-cited trade-off pages.
SDE2s preparing for Senior/Staff. The full HLD Parts 3-7 are the core "you need to know this to design anything meaningful" material. Parts 6 (Reliability) and 7 (Security) separate Senior candidates from Staff candidates. DSA Part 14 covers narration and trade-off articulation in the algorithmic round.
Senior/Staff engineers refreshing or filling gaps. HLD Part 9 (AI/ML systems) and the Trade-offs Library are valuable even if you've been doing this for 15 years. Modern LLM serving and vector search weren't a thing when most of us learned backend.
Self-taught engineers without a CS degree who want the vocabulary and the reasoning that bootcamps and YouTube don't teach. HLD Part 0 is explicit prerequisites: TCP/IP, the OS contract, database internals, API design. DSA Part 0 is the foundations — Big-O, recursion, bit manipulation — before pattern-matching becomes useful.
Career switchers moving from adjacent roles (frontend → backend, backend → infra, SWE → MLE) who need to build system-level intuition fast.
Competitive programmers and ICPC/Codeforces contestants who want a structured reference for the algorithmic toolkit (KMP, Aho-Corasick, suffix arrays, segment trees, Dijkstra, Bellman-Ford, DP variants).
Teachers and course creators who want to build curriculum without writing thousands of pages from scratch. The CC BY-SA 4.0 license explicitly allows this.
Anyone operating a production system at non-trivial scale who wants a reference shelf that covers the full stack — from packet-level networking to multi-tenant SaaS isolation models.

These handbooks are NOT for:

Absolute beginners with no programming experience. HLD Part 0 assumes you can read code and have built at least one CRUD app. DSA Part 0 assumes you can write a for loop and a recursive function. If you're brand new, start with The Odin Project or CS50, then come back.
People who want low-level language-specific tutorials. We don't teach Rust syntax or Go's goroutines — we teach concepts that apply across languages.

How these handbooks compare

Feature	This repo (HLD + DSA)	Typical OSS system-design repos	Typical OSS DSA repos	Paid courses ($150-$400+/yr)	O'Reilly-style books
Free and open-source	Yes	Yes	Yes	No	No
100% inline content (no external redirects)	Yes	No	No	Yes	Yes
Both HLD + DSA in one curriculum	Yes	No	No	Rarely	No
150+ HLD chapters, 100+ DSA chapters	Yes	No	No	Only their sliver	Usually one topic
56 end-to-end HLD case studies	Yes	5-15 typical	—	15-25 typical	Varies
Sibling code in Python / Java / C++ / Go	Yes	—	Rarely all 4	Sometimes	Rarely
Interactive widgets on the DSA site	Yes	—	No	Sometimes	No
Dedicated Trade-offs Library (22 pages)	Yes	No	—	No	No
37 DSA pattern decision pages	Yes	—	No	No	No
Covers LLMs, RAG, agents, multimodal AI	Yes	Rarely modern	—	Sometimes	Rarely
Opinionated decisions, not just options	Yes	No	No	Sometimes	Varies
3,100+ primary-source citations	Yes	No	—	No	Yes
Community-editable, PRs welcome	Yes	Yes	Yes	No	No
Updated continuously	Yes	Often frozen	Often frozen	Varies	Every 3-5 years

Start reading

For the best reading experience, visit hld.handbook.academy for HLD or dsa.handbook.academy for DSA — free, no sign-up, full search, dark mode, per-chapter diagram zoom, and the live DSA widgets. The links below open the source on GitHub, where Mermaid diagrams also render natively.

Pick one:

I want a quick HLD taste. Read these three in order — they give you the vocabulary and the first real worked example:

I want a quick DSA taste. Read these three to see the chapter shape and the sibling-code workflow:

I'm preparing for an SDE2 interview in the next 6 weeks. Follow the SDE1 → SDE2 study plan.

I'm a Senior engineer refreshing my distributed-systems foundations. Read HLD Part 3 — Distributed Systems Theory straight through.

I want to learn AI-systems design. Read HLD Part 9 — AI & ML System Design + the 8 AI case studies in Part 8 (chapters 30-37).

I want to grind interview algorithms. Open the DSA curriculum. Read Parts 0-2 in order, then jump into the pattern parts (3-9) as the problems you encounter call for them.

I want to read both books cover-to-cover. Start at HLD Part 0 for the systems track and DSA Part 0 for the algorithms track. At one 25-minute chapter per day, the whole repo is roughly a year of reading.
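The "roughly a year" figure can be sanity-checked from the page counts quoted in this README. The sketch below assumes that every chapter, trade-off page, pattern decision page, and editorial counts as one day's reading; that grouping is an assumption, and widget specs are left out because they are reference material.

```python
# Page counts as quoted in this README.
PAGES = {
    "hld_chapters": 159,
    "hld_trade_off_pages": 22,
    "dsa_chapters": 120,
    "dsa_pattern_pages": 37,
    "dsa_editorials": 5,
}


def days_to_read(pages_per_day=1):
    """Days needed to read everything at a fixed number of pages per day."""
    total = sum(PAGES.values())
    return -(-total // pages_per_day)  # ceiling division


print(sum(PAGES.values()), "pages,", days_to_read(), "days at one page per day")
```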

The full HLD curriculum

159 chapters across 12 parts plus a 22-page Trade-offs Library (181 pages in total). Each part below is collapsed by default; click to expand the chapter list. Linking to a specific part from elsewhere in the README auto-expands it on GitHub.

#	Part	Chapters	Difficulty	Reading time
0	Prerequisites	5	Beginner	~3 hrs
1	Core Fundamentals	7	Beginner-Intermediate	~4 hrs
2	Building Blocks	16	Intermediate	~10 hrs
3	Distributed Systems Theory	11	Intermediate-Advanced	~9 hrs
4	Data Systems	10	Intermediate-Advanced	~8 hrs
5	Architecture Patterns	11	Intermediate	~8 hrs
6	Reliability & Operations	11	Intermediate-Advanced	~8 hrs
7	Security at Scale	10	Intermediate-Advanced	~7 hrs
8	Case Studies	56	Intermediate-Advanced	~45 hrs
9	AI & ML System Design	15	Intermediate-Advanced	~11 hrs
10	Emerging Patterns	1	Intermediate-Advanced	~45 min
11	Interview Framework	6	Intermediate	~4 hrs
T	Trade-offs Library	22	Intermediate	~8 hrs

Part 0 — Prerequisites (5 chapters) — networking, OS, data structures, databases, API design

Audience: engineers without a CS degree, or anyone who wants to confirm their foundations before moving on. Difficulty: Beginner. Total reading time: ~3 hours.

Foundational topics that the rest of the handbook assumes. If you can explain TCP's three-way handshake, the difference between a process and a thread, a B-tree's internal structure, and why idempotent PUT is better than non-idempotent POST for a retry-prone endpoint — you can skip this part.

Networking Fundamentals for System Design — OSI and TCP/IP layers, TCP vs UDP, HTTP/1.1 vs HTTP/2 vs HTTP/3, DNS resolution, TLS handshake, what "the network is unreliable" really means in practice.
Operating System Essentials for System Design — Processes vs threads, context switching cost, virtual memory, page cache, filesystem I/O, epoll/kqueue/IOCP, why O_DIRECT matters for databases.
Data Structures for Distributed Systems — Hash tables, B-trees, LSM-trees, skip lists, Bloom filters, tries, HyperLogLog, Count-Min Sketch, and when each shows up in real systems.
Database Fundamentals for System Design — Transactions, isolation levels (read-committed to serializable), indexes, query planners, joins, and why your ORM hides things from you that you need to see.
API Design Basics: REST, GraphQL, gRPC, and the Hard Parts — Resource modeling, idempotency, versioning, pagination, rate-limit headers, error envelopes, HATEOAS in theory vs practice.
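To make the PUT-versus-POST retry point concrete, here is a minimal sketch using a toy in-memory store. The class, method names, and the retry helper are hypothetical and only model the semantics; they are not an HTTP implementation and are not taken from the chapter.

```python
import itertools


class OrderStore:
    """Toy in-memory resource store (hypothetical, for illustration only)."""

    def __init__(self):
        self.orders = {}
        self._ids = itertools.count(1)

    def post(self, body):
        # POST: the server picks a new ID on every call, so a retry creates a duplicate.
        order_id = next(self._ids)
        self.orders[order_id] = dict(body)
        return order_id

    def put(self, order_id, body):
        # PUT: the client names the resource, so repeating the call leaves the same state.
        self.orders[order_id] = dict(body)
        return order_id


def send_with_retry(call, attempts=2):
    """Simulate a client that resends because the first response was lost."""
    result = None
    for _ in range(attempts):
        result = call()
    return result
```

Retrying `post` twice leaves two orders in the store, while retrying `put` with the same client-chosen ID leaves one.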

Part 1 — Core Fundamentals (7 chapters) — scalability, latency, availability, consistency, estimation, trade-off thinking

Audience: everybody — read this part even if you know the topic, because the vocabulary here is used for the rest of the book. Difficulty: Beginner-Intermediate. Total reading time: ~4 hours.

The vocabulary and the reasoning habits that every later chapter assumes. "Scalability," "consistency," "trade-off," and "back-of-envelope" get defined here rigorously so they mean something specific when we use them later.

Part 2 — Building Blocks (16 chapters) — load balancers, caches, queues, CDNs, databases, sharding, pub/sub, rate limiting, blob storage, geo, edge

Audience: anyone who'll assemble backend systems. This part is the Lego brick inventory. Difficulty: Intermediate. Total reading time: ~10 hours.

Deep dives on the pieces you assemble to build real systems: load balancers, caches, queues, CDNs, databases (SQL and NoSQL), partitioning, replication, pub/sub, rate limiters, service discovery, blob storage, geospatial indexes, and edge compute.

Part 3 — Distributed Systems Theory (11 chapters) — consensus, consistency, clocks, CRDTs, transactions

Audience: SDE2+ preparing for Senior/Staff. If you haven't internalized linearizability vs serializability, read this. Difficulty: Intermediate-Advanced. Total reading time: ~9 hours.

The theory that makes distributed systems distributed: consensus (Raft/Paxos), the full consistency spectrum, CAP/PACELC in a 2025 framing, logical clocks, CRDTs, distributed transactions (2PC/Saga), exactly-once delivery, failure detection, consistent hashing, and Merkle-tree anti-entropy.
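As one small example from this list, consistent hashing can be sketched as a sorted ring of virtual nodes. This is a minimal illustration under stated assumptions (SHA-256 as the hash, 100 virtual nodes per server, hypothetical class and method names), not the chapter's implementation.

```python
import bisect
import hashlib


def _hash(key):
    # Stable 64-bit hash; Python's built-in hash() is randomized per process.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each server owns many points on the ring to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key):
        # First virtual node at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (_hash(key),))
        return self._ring[idx % len(self._ring)][1]
```

The useful property is that removing a server only remaps the keys that server owned; every other key keeps its assignment.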

Part 4 — Data Systems (10 chapters) — storage engines, OLTP/OLAP, warehouses/lakes, streams, search, time-series, graph, vector, KV

Audience: anyone who owns a data pipeline or picks a database. Difficulty: Intermediate-Advanced. Total reading time: ~8 hours.

Every flavor of data system you might pick, and when each actually fits. Storage engines (B-tree vs LSM), OLTP vs OLAP, warehouses/lakes/lakehouses, streaming vs batch, CDC, search, time-series, graph, vector, and key-value.

Part 5 — Architecture Patterns (11 chapters) — monolith vs micro, event-driven, CQRS, ES, serverless, BFF, strangler, hex, multi-region, multi-tenant, CRDT apps

Audience: engineers making architecture-level decisions or leading service migrations. Difficulty: Intermediate. Total reading time: ~8 hours.

The architectural shapes you choose between when you design anything bigger than a single service: monolith vs microservices, event-driven, CQRS, event sourcing, serverless, BFF, strangler fig, hexagonal/clean, multi-region, multi-tenancy, CRDT-based apps.

Part 6 — Reliability & Operations (11 chapters) — observability, SLOs, resilience, scaling, deploys, chaos, incidents, FinOps, platform

Audience: anyone on-call, anyone who signs SLAs, anyone paying a cloud bill. Difficulty: Intermediate-Advanced. Total reading time: ~8 hours.

The engineering that separates "it compiles" from "it runs reliably at 3 a.m. on a long weekend": observability, SLOs, resilience patterns, auto-scaling, deployments, chaos engineering, incident response, health checks, FinOps, platform engineering.

Part 7 — Security at Scale (10 chapters) — AuthN/Z, OAuth/OIDC, JWT, mTLS, secrets, DDoS/WAF, compliance, supply chain, privacy, PQC

Audience: Senior/Staff engineers, platform teams, security-adjacent builders. Difficulty: Intermediate-Advanced. Total reading time: ~7 hours.

Security architecture for real systems, not a CISSP crib sheet: AuthN vs AuthZ, OAuth2/OIDC, JWT (and why you probably shouldn't), mTLS, secrets management, DDoS/WAF, compliance (GDPR/DPDP/CCPA), software supply chain, privacy-preserving systems, post-quantum cryptography.

Part 8 — Case Studies (56 chapters) — 56 end-to-end designs, grouped by theme; the centerpiece of interview prep

Audience: interview prep candidates, engineers building adjacent systems, anyone who learns best from worked examples. Difficulty: Intermediate-Advanced. Total reading time: ~45 hours.

56 end-to-end system designs, each following a consistent structure: requirements (functional + non-functional), back-of-envelope numbers, high-level architecture, data model, deep-dive components, scalability and reliability considerations, and real-world references. Grouped thematically below for easier navigation.

Core primitives (chapters 00-03) — the four designs you see in every interview

Messaging & social (chapters 04-09) — notifications, chat, feeds, photos, crawlers, autocomplete

Media & consumer products (chapters 10-17) — video, ride-hailing, maps, file sync, editing, cache, recommenders

Commerce & financial (chapters 18-21) — ticketing, payments, stock exchange, food delivery

Data & infrastructure (chapters 22-29) — metrics, ad-click, logs, proximity, leaderboards, IDs, hotels, schedulers

AI systems (chapters 30-37) — ChatGPT, RAG, coding agents, AI search, voice, moderation, semantic cache, model routing

Infra services (chapters 38-39) — feature flags, DNS

Consumer products II (chapters 40-49) — dating, auctions, SaaS, video conf, email, live comments, fraud, fitness, online judge, price tracking

Developer & ops platforms (chapters 50-55) — API gateway, CI/CD, observability, search engine, brokerage, chat-at-scale

Part 9 — AI & ML System Design (15 chapters) — LLM serving, RAG, vector search, agents, LLMOps, safety, recommenders, multimodal

Audience: anyone building with or around LLMs, agents, or production ML. Difficulty: Intermediate-Advanced. Total reading time: ~11 hours.

Modern AI-systems architecture, treated with the same rigor as Part 3: LLM serving, RAG, vector search, agent architectures, multi-agent orchestration, LLM evaluation, LLMOps, cost optimization, safety, ML fundamentals, feature stores, recommenders, real-time AI, multimodal, and the data infra underneath all of it.

Part 10 — Emerging Patterns (1 chapter) — green/sustainable computing; growing list of forward-looking topics

Audience: Staff+ engineers and architects thinking past 2026. Difficulty: Intermediate-Advanced. Total reading time: ~45 minutes.

Forward-looking topics that are adjacent to everything else. Currently one chapter, with more planned (WebAssembly at the edge, unikernels, confidential computing, on-device AI).

Green Computing (Carbon-Aware Scheduling, PUE, Sustainable Systems)

Part 11 — Interview Framework (6 chapters) — RESHADED/PEDALS/ADEPT, requirements, diagrams, trade-offs, company flavors, RFCs

Audience: anyone preparing for or giving system-design interviews. Difficulty: Intermediate. Total reading time: ~4 hours.

How to run a 45-minute system-design interview, from both sides of the whiteboard. Compares RESHADED / PEDALS / ADEPT frameworks, teaches requirements scoping, diagramming, trade-off articulation, company-specific flavors, and RFC/design-doc authoring for Staff-level work.

Trade-offs Library (22 pages) — the canonical "X vs Y" decision pages cross-referenced from every part

Audience: everyone — these pages are referenced from every other part. Difficulty: Intermediate. Total reading time: ~8 hours.

The 22 most-asked architectural-choice questions, each answered in a dedicated decision-comparison page: flowchart, comparison table, "when to pick A" vs "when to pick B" sections, real-world examples, and citations.

The full DSA curriculum

120 chapters across 15 parts, plus 37 pattern decision pages, 5 long-form editorials, and 46 interactive widget specs. Each chapter is centered on a specific data structure or algorithmic pattern, taught with worked LeetCode problems. Sibling sol.py / sol.java / sol.cpp / sol.go files live under each problem directory; the Python solution is inlined in the chapter, the others are linked. Editorials, pattern decision diagrams, and widget YAML specs are siblings under content/dsa/.

Each part below is collapsed by default — click to expand the chapter list. Linking to a specific part from elsewhere in the README auto-expands it on GitHub.

#	Part	Chapters	Difficulty	Reading time
0	Foundations	7	Beginner	~3 hrs
1	Linear Data Structures	8	Beginner	~5 hrs
2	Search & Sort	8	Beginner-Intermediate	~5 hrs
3	Two Pointers, Sliding Window, Prefix Sums	6	Intermediate	~4 hrs
4	Stack & Queue Patterns	5	Intermediate	~3 hrs
5	Linked Lists	6	Intermediate	~4 hrs
6	Trees & Heaps	11	Intermediate-Advanced	~7 hrs
7	Recursion & Backtracking	7	Intermediate-Advanced	~5 hrs
8	Graphs	13	Intermediate-Advanced	~9 hrs
9	Dynamic Programming	15	Advanced	~10 hrs
10	Greedy	5	Intermediate-Advanced	~3 hrs
11	Bit Manipulation	4	Intermediate	~2 hrs
12	Strings & Pattern Matching	6	Intermediate-Advanced	~4 hrs
13	Design the Data Structure	7	Intermediate-Advanced	~5 hrs
14	Interview Framework	12	All levels	~5 hrs
P	Pattern decision pages	37	Intermediate	~10 hrs
E	Long-form editorials	5	Intermediate	~2 hrs
W	Interactive widget specs	46	—	reference

Part 0 — Foundations (7 chapters) — Big-O, recursion, bit ops, interview math, language idioms, choosing your language

Audience: anyone starting interview prep, or returning after a long pause. Difficulty: Beginner. Total reading time: ~3 hours.

The mental models and language fluency that everything else assumes. Big-O, the recursion mental model, bit-manipulation primer, the math you actually need for interviews, language idioms across Python / Java / C++ / Go, and how to pick which language to interview in.

Part 1 — Linear Data Structures (8 chapters) — arrays, dynamic-array internals, strings, hash maps, stacks, queues, matrices

Audience: everyone — these are the structures that show up in 70% of all interview problems. Difficulty: Beginner. Total reading time: ~5 hours.

The contiguous-memory and bucket-based data structures: arrays (static / dynamic / multi-dimensional), the amortized-O(1) doubling rule that makes a vector work, strings as encoded byte arrays, hash maps and hash sets, the load-factor / collision math behind them, stacks and queues with the call-stack analogy, and matrix manipulation tricks (rotate-in-place, spiral, transpose).
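The doubling rule mentioned above can be checked with a small instrumented array. This is a sketch with a hypothetical class name and a `copies` counter added only for illustration; real vectors and Python lists use their own growth factors.

```python
class DynamicArray:
    """Append-only array that doubles its capacity when full."""

    def __init__(self):
        self._capacity = 1
        self._size = 0
        self._slots = [None] * self._capacity
        self.copies = 0  # elements moved during resizes (for the analysis)

    def append(self, value):
        if self._size == self._capacity:
            self._grow()
        self._slots[self._size] = value
        self._size += 1

    def _grow(self):
        # Doubling means each resize pays for itself over the next appends.
        self._capacity *= 2
        new_slots = [None] * self._capacity
        for i in range(self._size):
            new_slots[i] = self._slots[i]
            self.copies += 1
        self._slots = new_slots

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._slots[index]
```

After n appends the total number of copied elements stays below 2n, which is why the per-append cost is O(1) amortized even though a single resize is O(n).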

Part 2 — Search & Sort (8 chapters) — linear/binary search, comparison & linear-time sorts, heap sort, quickselect

Audience: every interviewee — binary search alone shows up in roughly a quarter of all problems. Difficulty: Beginner-Intermediate. Total reading time: ~5 hours.

How to find things and how to order things. Linear search and when it's actually the right answer; the canonical binary search written without off-by-one bugs; the lower_bound / upper_bound / peak / rotated-array variants; the comparison-sort family (insertion, merge, quicksort with the production hybrids like Timsort and Introsort); heap sort and why n log n is the comparison-sort lower bound; the linear-time sorts (counting, radix, bucket); and quickselect for the top-k problems.
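One common way to avoid off-by-one bugs is to write binary search over a half-open range and derive exact-match search from lower_bound. The sketch below is one such formulation, not necessarily the one the chapter uses.

```python
def lower_bound(nums, target):
    """Index of the first element >= target in sorted nums (len(nums) if none)."""
    lo, hi = 0, len(nums)  # search the half-open range [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if nums[mid] < target:
            lo = mid + 1  # everything up to mid is too small
        else:
            hi = mid  # mid may be the answer, keep it in range
    return lo


def binary_search(nums, target):
    """Index of target in sorted nums, or -1 if it is absent."""
    i = lower_bound(nums, target)
    return i if i < len(nums) and nums[i] == target else -1
```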

Part 3 — Two Pointers, Sliding Window, Prefix Sums (6 chapters) — the array-walking patterns that turn O(n²) into O(n)

Audience: anyone whose nested-loop solutions keep timing out. Difficulty: Intermediate. Total reading time: ~4 hours.

Three closely related patterns that all amount to "walk the array smarter": opposite-ends two pointers (Container With Most Water, 3Sum), same-direction two pointers, fixed and variable sliding windows, prefix sums and difference arrays, and the prefix-sum + hash-map combo for "subarray sum equals K"-shaped problems.
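The prefix-sum plus hash-map combination can be sketched as follows for the "subarray sum equals K" shape. The function name is illustrative.

```python
from collections import defaultdict


def count_subarrays_with_sum(nums, k):
    """Count contiguous subarrays whose elements sum to k in O(n) time."""
    seen = defaultdict(int)  # prefix sum -> how many times it has occurred
    seen[0] = 1  # the empty prefix
    prefix = 0
    count = 0
    for x in nums:
        prefix += x
        # A subarray ending here sums to k when an earlier prefix equals prefix - k.
        count += seen[prefix - k]
        seen[prefix] += 1
    return count
```

Unlike a sliding window, this works when the array contains negative numbers, because it never relies on the window sum growing monotonically.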

Part 4 — Stack & Queue Patterns (5 chapters) — monotonic stack/deque, min/max stack, expression parsing, queue-from-stacks

Audience: anyone struggling with Next-Greater-Element and sliding-window-maximum problems. Difficulty: Intermediate. Total reading time: ~3 hours.

Stack and queue as algorithmic patterns, not just data structures. Monotonic stacks (Daily Temperatures, Largest Rectangle in Histogram), monotonic deques (Sliding Window Maximum), min/max stacks (O(1) min query under push/pop), expression parsing (Shunting-Yard / RPN), and the queue-from-stacks amortization argument.
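A minimal sketch of the monotonic-stack pattern, using Daily Temperatures as the example: the stack holds indices of days still waiting for a warmer day.

```python
def daily_temperatures(temps):
    """For each day, the number of days until a warmer one (0 if none)."""
    answer = [0] * len(temps)
    stack = []  # indices whose temperatures are non-increasing from bottom to top
    for i, t in enumerate(temps):
        # Today resolves every earlier day that was colder.
        while stack and temps[stack[-1]] < t:
            j = stack.pop()
            answer[j] = i - j
        stack.append(i)
    return answer
```

Each index is pushed once and popped at most once, so the whole scan is O(n).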

Part 5 — Linked Lists (6 chapters) — pointer rewiring, reversal, k-group reversal, Floyd's cycle, merging, LRU cache

Audience: anyone who's drawn boxes-and-arrows on a whiteboard and gotten lost. Difficulty: Intermediate. Total reading time: ~4 hours.

Linked lists are pointer surgery. Sentinel-node patterns, the canonical iterative reversal (and the recursive cousin), reverse-in-groups-of-k, Floyd's tortoise-and-hare cycle detection (and why the cycle-start formula works), merging sorted lists, and the LRU cache as the canonical hash-map + doubly-linked-list combo.
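Floyd's tortoise-and-hare, including the cycle-start step, might look like the sketch below. The `ListNode` class is a minimal stand-in defined here for the example.

```python
class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def find_cycle_start(head):
    """Return the node where the cycle begins, or None if the list is acyclic."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            # The distance from head to the cycle start equals the distance
            # from the meeting point to the cycle start (modulo the cycle length),
            # so two pointers moving one step at a time meet at the start.
            slow = head
            while slow is not fast:
                slow = slow.next
                fast = fast.next
            return slow
    return None
```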

Part 6 — Trees & Heaps (11 chapters) — traversals, Morris, BFS, heaps, BST, AVL/RB, tries, tree DP, segment trees

Audience: anyone past the linear-data-structure phase of prep. Difficulty: Intermediate-Advanced. Total reading time: ~7 hours.

The hierarchical structures and the priority queue. Binary tree fundamentals, the three depth-first traversals (pre/in/post — both recursive and iterative), Morris traversal for O(1) space, level-order BFS, heaps and priority queues (heapify in O(n)), binary search trees, AVL rotations, a red-black overview, tries, the tree-DP primer (post-order with side state), and an introduction to segment trees for range queries.
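Morris traversal reaches O(1) extra space by temporarily threading each in-order predecessor back to its successor. The sketch below shows an in-order version; the `TreeNode` class is a minimal stand-in defined for the example.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def morris_inorder(root):
    """In-order traversal without recursion or an explicit stack."""
    result = []
    node = root
    while node:
        if node.left is None:
            result.append(node.val)
            node = node.right
        else:
            # Find the in-order predecessor: rightmost node of the left subtree.
            pred = node.left
            while pred.right and pred.right is not node:
                pred = pred.right
            if pred.right is None:
                pred.right = node  # create a temporary thread back to node
                node = node.left
            else:
                pred.right = None  # second visit: remove the thread
                result.append(node.val)
                node = node.right
    return result
```

The tree is modified during the walk but restored by the time the function returns.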

Part 7 — Recursion & Backtracking (7 chapters) — recursion patterns, the template, subsets/perms, N-Queens, Sudoku, randomized algos

Audience: anyone whose subset/permutation/combination solutions feel like guesswork. Difficulty: Intermediate-Advanced. Total reading time: ~5 hours.

Backtracking is just DFS with state-restoration discipline. The recursion patterns (linear, tree, divide-and-conquer); the backtracking template you can adapt to any constraint problem; subsets, combinations, and permutations; N-Queens with pruning; Sudoku with constraint propagation and forward checking; word search on a grid; and Fisher-Yates, reservoir sampling, and rejection sampling for randomized algorithms.
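The "DFS with state restoration" idea can be sketched with the subsets problem: choose, explore, then undo the choice. This is one common shape of the template, not necessarily the chapter's exact wording.

```python
def subsets(nums):
    """All subsets of nums, generated by backtracking."""
    result = []
    path = []  # the partial subset shared across the whole recursion

    def backtrack(start):
        result.append(path[:])  # record a copy of the current state
        for i in range(start, len(nums)):
            path.append(nums[i])  # choose
            backtrack(i + 1)      # explore
            path.pop()            # un-choose: restore state for the next branch

    backtrack(0)
    return result
```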

Part 8 — Graphs (13 chapters) — BFS/DFS, components, topo, cycles, bipartite, union-find, Dijkstra, Bellman-Ford, MSTs

Audience: anyone preparing for FAANG-tier interviews — graph problems are the differentiator. Difficulty: Intermediate-Advanced. Total reading time: ~9 hours.

The graph chapters are interview-grade end-to-end. Adjacency-list vs adjacency-matrix representations; BFS and DFS as orthogonal traversal templates; connected components and flood fill (Number of Islands); topological sort via Kahn's queue and via DFS post-order reverse; cycle detection on directed and undirected graphs; bipartite checking; union-find with path compression and union-by-rank; Dijkstra's algorithm; Bellman-Ford and negative cycles; and minimum spanning trees via both Kruskal and Prim.
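As an example from this part, union-find with path compression and union-by-rank can be sketched as below. The `components` counter is an addition for illustration.

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n
        self.components = n  # number of disjoint sets

    def find(self, x):
        # First pass: locate the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, point every node on the path at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        # Union by rank: attach the shallower tree under the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        self.components -= 1
        return True
```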

Part 9 — Dynamic Programming (15 chapters) — memo↔tab, 1D/grid/interval/tree/bitmask, knapsack, LCS, edit distance, LIS, palindromes

Audience: anyone for whom "DP problems just don't click." This is the part most candidates fear. Difficulty: Advanced. Total reading time: ~10 hours.

The most demanding part of the handbook. Build DP up from recursion (top-down memoization) and re-derive the bottom-up tabulation; 1D-state DPs (Climbing Stairs, House Robber, decision DP); the string-prefix decision DPs (Decode Ways, Word Break); 0/1 and unbounded knapsacks; longest-common-subsequence and edit distance; LIS in O(n²) and the patience-sort O(n log n) variant; palindrome DP; interval DP (matrix chain, burst balloons); grid DP; tree DP; and the bitmask DP family for "all subsets" problems.
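The O(n log n) LIS variant can be sketched with the standard library's bisect module. Here `tails[i]` holds the smallest possible tail of a strictly increasing subsequence of length i + 1.

```python
from bisect import bisect_left


def length_of_lis(nums):
    """Length of the longest strictly increasing subsequence."""
    tails = []
    for x in nums:
        i = bisect_left(tails, x)  # first pile whose top is >= x
        if i == len(tails):
            tails.append(x)  # x extends the longest subsequence found so far
        else:
            tails[i] = x  # x is a smaller tail for subsequences of length i + 1
    return len(tails)
```

Note that `tails` is not itself a valid subsequence; only its length is meaningful.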

Part 10 — Greedy (5 chapters) — when local choices win, intervals, activity selection, Huffman, jump games

Audience: anyone who's been burned by a greedy that "looked right" and got Wrong Answer. Difficulty: Intermediate-Advanced. Total reading time: ~3 hours.

When local choices yield global optima, and how to prove it. Greedy thinking framed against DP; interval scheduling (the sorting comparator is the algorithm); activity selection and the task-scheduler family; Huffman encoding; and Jump Games / Gas Station as canonical "scan once, maintain a running invariant" problems.
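
"The sorting comparator is the algorithm" can be seen in a few lines. The sketch below assumes intervals are `(start, end)` tuples and that an interval ending at `t` does not conflict with one starting at `t`; the function name is hypothetical.

```python
from typing import Iterable, Tuple


def max_non_overlapping(intervals: Iterable[Tuple[int, int]]) -> int:
    """Greedy interval scheduling: always keep the interval that ends earliest."""
    count = 0
    last_end = float("-inf")
    # Sorting by end time is the entire insight; the scan after it is bookkeeping.
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:         # compatible with everything kept so far
            count += 1
            last_end = end
    return count


print(max_non_overlapping([(1, 3), (2, 4), (3, 5)]))    # prints 2
print(max_non_overlapping([(1, 10), (2, 3), (4, 5)]))   # prints 2
```

Sorting by start time or by length instead yields the kind of greedy that "looks right" and then fails: in the second example, picking the earliest start takes `(1, 10)` and blocks both shorter intervals.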

Part 11 — Bit Manipulation (4 chapters) — the bit-ops cookbook, XOR patterns, bitmask techniques, performance tricks

Audience: anyone preparing for low-level / performance-oriented interviews (HFT, embedded, kernel, GPU). Difficulty: Intermediate. Total reading time: ~2 hours.

The bit-ops cookbook (set/clear/toggle, lowest-set-bit, popcount); XOR patterns (Single Number I/II/III, Missing Number); bitmask techniques as compact subset state; and bit-level performance tricks for the "your code is correct, just make it 10× faster" interview round.
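
A few of the cookbook entries, sketched in Python for non-negative integers. The helper names are illustrative assumptions, and `popcount` uses Kernighan's clear-the-lowest-bit loop rather than any particular built-in.

```python
from typing import Iterable


def set_bit(x: int, i: int) -> int:
    return x | (1 << i)


def clear_bit(x: int, i: int) -> int:
    return x & ~(1 << i)


def toggle_bit(x: int, i: int) -> int:
    return x ^ (1 << i)


def lowest_set_bit(x: int) -> int:
    """Isolate the lowest set bit (two's-complement trick); 0 stays 0."""
    return x & -x


def popcount(x: int) -> int:
    """Count set bits; each iteration clears the lowest set bit."""
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def single_number(nums: Iterable[int]) -> int:
    """XOR pattern: values that appear twice cancel, leaving the one that appears once."""
    acc = 0
    for n in nums:
        acc ^= n
    return acc


print(bin(set_bit(0b1000, 1)))         # prints 0b1010
print(bin(lowest_set_bit(0b10100)))    # prints 0b100
print(popcount(0b1011))                # prints 3
print(single_number([4, 1, 2, 1, 2]))  # prints 4
```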

Part 12 — Strings & Pattern Matching (6 chapters) — naive matching, Rabin-Karp, KMP, Z-array, Aho-Corasick, suffix arrays

Audience: anyone interviewing where strings are a focus area (search infra, NLP infra, IDEs, compiler tooling). Difficulty: Intermediate-Advanced. Total reading time: ~4 hours.

Substring search beyond the naive O(n·m) baseline. Rabin-Karp with rolling hashes; KMP and the failure function (and why it generalizes the prefix-function); the Z-algorithm; Aho-Corasick for matching many patterns in one pass; and an introduction to suffix arrays for "all substring" queries.
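
For a flavour of how KMP avoids the naive rescan, here is a compact sketch of the failure function and the search loop built on it. The function names are assumptions made for this overview; the code follows the textbook formulation rather than quoting the chapter.

```python
from typing import List


def failure_function(pattern: str) -> List[int]:
    """fail[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    fail = [0] * len(pattern)
    k = 0                                  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]                # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text: str, pattern: str) -> List[int]:
    """Return the start index of every (possibly overlapping) match, in O(n + m)."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    matches: List[int] = []
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]                # reuse what already matched; never move i back
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]                # keep going to find overlapping matches
    return matches


print(failure_function("ababaca"))   # prints [0, 0, 1, 2, 3, 0, 1]
print(kmp_search("abababa", "aba"))  # prints [0, 2, 4]
```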

Part 13 — Design the Data Structure (7 chapters) — LRU/LFU, min stacks, hit counter, trie autocomplete, Twitter feed, game state

Audience: SDE2+ candidates — "design X" is the staple senior-coding-round question type. Difficulty: Intermediate-Advanced. Total reading time: ~5 hours.

The bridge between DSA and HLD: how to compose primitives into APIs that meet a per-operation complexity contract. The full LRU treatment with concurrency and multi-tier framing; LFU via frequency-bucketed doubly-linked lists; min stacks and max-frequency stacks; hit counters and rate limiters; trie-backed autocomplete; Twitter-feed design; and a game-state design vignette.
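
The "per-operation complexity contract" idea is easiest to see on LRU: both `get` and `put` must be O(1). The sketch below is a single-threaded illustration that swaps in Python's `collections.OrderedDict` for a hand-written hash map plus doubly-linked list; it does not attempt the concurrency or multi-tier framing mentioned above, and the `-1` miss value simply mirrors the usual LeetCode convention.

```python
from collections import OrderedDict
from typing import Hashable


class LRUCache:
    """Least-recently-used cache with O(1) get and put (single-threaded sketch)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, int]" = OrderedDict()  # oldest entry first

    def get(self, key: Hashable) -> int:
        if key not in self._items:
            return -1
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key: Hashable, value: int) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))   # prints 1
cache.put(3, 3)       # evicts key 2
print(cache.get(2))   # prints -1
```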

Part 14 — Interview Framework (12 chapters) — pattern recognition, clarifying questions, communicating, complexity, mocks, per-company tracks

Audience: anyone within a month of an actual interview window. Difficulty: All levels. Total reading time: ~5 hours.

The meta-skills that turn a correct solution into a passing one. Pattern-recognition drills; the first-five-minutes clarifying-questions script; how to narrate while you code; how to discuss complexity convincingly; running productive mock interviews; the most common pitfalls; Amazon Leadership Principles (brief and narration); the Meta AI-assisted round (format, prompting tactics, common failures); and the per-company tracks index that points at company-specific reading lists.

DSA pattern decision pages (37 pages)

Audience: anyone deciding "which pattern fits this problem?" mid-interview. Difficulty: Intermediate. Total reading time: ~10 hours; each page is a 15-30 min decision aid.

Each page is a focused decision-comparison: a Mermaid decision tree, a "pick A vs pick B" table, archetype problems on each side, and references.

DSA long-form editorials (5 pages)

Audience: anyone who's solved an "easy" problem and wants the senior-engineer-level analysis behind it. Difficulty: Intermediate. Total reading time: ~2 hours.

Five canonical interview problems treated as full editorial-style essays: every approach (brute force → optimized → optimal), the proof of correctness, the edge cases, the production framing, and the follow-up variants.

All 5 long-form editorials

LC-001 — Two Sum (full editorial under LC-001-two-sum/)
LC-003 — Longest Substring Without Repeating Characters (full editorial under LC-003-longest-substring-without-repeating-characters/)
LC-005 — Longest Palindromic Substring (full editorial under LC-005-longest-palindromic-substring/)
LC-011 — Container With Most Water (full editorial under LC-011-container-with-most-water/)
LC-015 — 3Sum (full editorial under LC-015-3sum/)

DSA interactive widget specs (46 YAML files)

Audience: contributors authoring or editing the website's interactive widgets. Format: YAML; rendered as interactive animations on dsa.handbook.academy.

Each widget is described as keyframe data plus narration in YAML. The website's renderer turns each spec into an interactive (play/pause/scrub/step) animation alongside the chapter prose.

Editorial-tied widgets (5)

Pattern widgets (41)

Widgets are catalogued in content/dsa/widgets/_widget-registry.yml; see content/dsa/widgets/README.md for authoring conventions.

Study plans

You don't need one — pick any chapter and start reading. These are here if you prefer a pre-built route, typically because you're preparing for a specific interview window or filling a specific gap.

SDE1 → SDE2 — 6-week interview prep

Goal: pass a 45-60 minute system-design screen at a mid-to-senior level. Roughly 8-10 hours of reading per week.

| Week | Focus | Chapters |
| --- | --- | --- |
| 1 | Foundations | Part 0 + Part 1 (12 chapters, ~7 hrs) |
| 2 | Building blocks I | Part 2 chapters 0-7: load balancers, proxies, CDN, cache, SQL, NoSQL, partitioning, replication |
| 3 | Building blocks II | Part 2 chapters 8-15: queues, pub/sub, real-time, rate limiting, service mesh, blob storage, geo, edge |
| 4 | Case studies (core) | Pick 5 from Part 8 chapters 0-9: URL shortener, rate limiter, chat, feed, web crawler, autocomplete |
| 5 | Case studies (your target) | Pick 5 more from Part 8 relevant to your target company (see Company-Specific Flavors) |
| 6 | Interview mechanics | Part 11 + top 5 most-cited pages in Trade-offs Library |

SDE2 → Senior — 3-month deep dive

Goal: operate at Senior level, own cross-team architecture, pass loops at Senior+ bars.

| Phase | Focus | Duration | Chapters |
| --- | --- | --- | --- |
| 1 | Foundations + building blocks | 3 weeks | Parts 0-2 (28 chapters) |
| 2 | Distributed theory + data + architecture | 4 weeks | Parts 3, 4, 5 (32 chapters) |
| 3 | Case studies (all 56) | 4 weeks | Part 8 |
| 4 | Reliability, security, AI, frontier, interview | 2 weeks | Parts 6, 7, 9, 10, 11 + Trade-offs Library |

Full curriculum — Staff+ preparation (6 months)

Read everything in order. Use the end-of-chapter questions for active recall. Averaging one 25-minute chapter per day gets you through the whole thing in about 6 months, with buffer for harder chapters and re-reading.

AI/ML-only track (4 weeks)

If you already know the fundamentals and want to become fluent specifically in AI-systems design:

| Week | Focus | Chapters |
| --- | --- | --- |
| 1 | LLM serving + RAG | Part 9 chapters 0-3 |
| 2 | Agents + evaluation | Part 9 chapters 3-9 |
| 3 | AI case studies | Part 8 chapters 30-37 (ChatGPT, RAG, coding agent, Perplexity, voice, moderation, semantic cache, model router) |
| 4 | ML fundamentals | Part 9 chapters 9-14 + Recommendation System case study |

Interview-triage track (one weekend)

If you have a loop on Monday:

Saturday morning: How to Approach a System Design Question + Back-of-the-Envelope Estimation + Requirements Scoping + Diagramming Skills
Saturday afternoon: 3 case studies in your domain
Sunday morning: 3 more case studies + Trade-off Articulation
Sunday evening: Company-Specific Interview Flavors for your target

Project statistics

| Metric | HLD Handbook | DSA Handbook | Repo total |
| --- | --- | --- | --- |
| Parts | 12 + Trade-offs Library | 15 | 27 + library |
| Teaching chapters | 159 | 120 | 279 |
| Decision/pattern pages | 22 trade-offs | 37 patterns | 59 |
| Long-form editorials | n/a | 5 | 5 |
| Interactive widget specs | n/a | 46 | 46 |
| Sibling code samples (per language) | n/a | 155 problems × 4 langs ≈ 620 files | 620 |
| Total Markdown pages | 181 | 162 | 343 |
| Total words | ~773,000 | ~470,000 | ~1,240,000 |
| Mermaid diagrams | 719 | 226 | 945 |
| Citations to primary sources | 3,100+ | hundreds | 3,500+ |
| Estimated total reading time | ~110 hours | ~60 hours | ~170 hours |
| Equivalent printed book page count | ~2,400 pages | ~1,500 pages | ~3,900 pages |

For comparison, the HLD book alone is longer than Designing Data-Intensive Applications (~600 pages) + Alex Xu's System Design Interview Vol. 1 (~300 pages) + Vol. 2 (~340 pages) combined. The DSA book is comparable in length to Cracking the Coding Interview (~700 pages) plus Elements of Programming Interviews (~480 pages).

Quality standards

Every chapter in this repository passes 7 automated CI checks on every PR. The validators run against both books and dispatch on a per-book schema:

markdownlint — Markdown style and structure conformance.
typos — source-code spell-check with a project-specific allowlist for technical terms.
Citation integrity — every [^1]-style footnote has a corresponding [^1]: source definition; every citation is a real URL; no orphan citations.
Frontmatter validation — HLD chapters declare title, difficulty, prerequisites, date_created, date_updated, reading_time_minutes, tags (canonical taxonomy), and technologies (curated allowlist). DSA chapters declare title, slug, part, chapter, difficulty, languages (subset of python/java/cpp/go), canonical_test, widgets, and ladder (referencing LC-NNN IDs in _problem-registry.yml). Editorials and pattern decision pages have their own thinner schemas. See scripts/check-frontmatter.mjs for the per-book branches.
Mermaid diagram validation — all 945 diagrams across both books must parse with @mermaid-js/mermaid-cli on CI so broken syntax doesn't ship.
Vale — prose-style linter for voice, passive voice, weasel words, and banned phrases. Custom rules in .vale.ini.
lychee — external link checker run weekly; flags rotted URLs so citations stay valid.

CI configuration lives in .github/workflows/content-ci.yml. Validator scripts are in scripts/.
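
To illustrate what the citation-integrity check verifies, here is a small Python sketch that pairs `[^id]` references with `[^id]:` definitions. It is only an approximation written for this README: the real validator is `scripts/check-citations.mjs`, whose logic may differ, and the function name, regexes, and sample text below are all assumptions. The sketch does not verify that each definition is a real URL, and it will miss an inline reference that is immediately followed by a colon.

```python
import re
from typing import Dict, List

# An inline reference such as [^1]; the lookahead skips definition lines.
REF = re.compile(r"\[\^([^\]\s]+)\](?!:)")
# A definition such as "[^1]: https://..." at the start of a line.
DEF = re.compile(r"^\[\^([^\]\s]+)\]:", re.MULTILINE)


def check_footnotes(markdown: str) -> Dict[str, List[str]]:
    """Report references without a definition and definitions nobody references."""
    defined = set(DEF.findall(markdown))
    referenced = set(REF.findall(markdown))
    return {
        "missing_definitions": sorted(referenced - defined),
        "orphan_definitions": sorted(defined - referenced),
    }


sample = (
    "Raft needs a majority quorum.[^1] Paxos is older.[^2]\n"
    "\n"
    "[^1]: https://example.com/raft-paper\n"
    "[^3]: https://example.com/unused\n"
)
print(check_footnotes(sample))
# prints {'missing_definitions': ['2'], 'orphan_definitions': ['3']}
```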

Beyond automation, every chapter is reviewed for:

Internal consistency (terminology matches Part 1 definitions)
Difficulty calibration (a "Beginner" chapter doesn't assume Staff-level context)
Diagram quality (Mermaid, not screenshots; captioned; accessibility-tagged)
Citation quality (primary sources preferred over summarizing blog posts)

Contributing

Contributions of all sizes are welcome, from a typo fix to a full chapter. Read CONTRIBUTING.md for the full workflow. A short summary:

Contribution paths

| Time | What you can do | Issue required? |
| --- | --- | --- |
| 5 min | Fix a typo or dead link | No |
| 15 min | Add a missing citation or update an out-of-date number | No |
| 30 min | Add a real-world example or clarify a confusing paragraph | No |
| 1 hour | Create a Mermaid diagram for an existing chapter | Optional |
| 2-4 hours | Review a chapter for technical accuracy and leave feedback | Yes |
| 4-8 hours | Write a full chapter from an outline | Yes, required |
| Translator | Translate one chapter or the full handbook into your language | Yes |

How to contribute

Read the STYLE_GUIDE.md for voice, structure, diagram conventions, and citation format.
Open an issue first for anything bigger than 30 minutes of work — this ensures you don't duplicate in-flight work.
Fork, branch, edit. Use descriptive branch names like fix/raft-quorum-math or add/mcp-protocol-chapter.
Run validators locally (optional):
```
npm install
npm run check:all
```
If you skip this, CI will run them on your PR anyway and tell you what to fix.
Submit a pull request using the PR template. Small fix? A one-sentence description is fine. Full chapter? Describe the pedagogical approach and list your primary sources.
Respond to review. A maintainer will review within 7 days (usually faster). For content chapters, expect at least one round of technical review.

What makes a great contribution

Correctness. If you're citing a number, cite the primary source. If you're claiming a property (like "Raft guarantees linearizability"), cite the paper.
Clarity. Write for the difficulty tier declared in the chapter's frontmatter. Don't introduce Part-7 concepts in a Part-0 chapter.
Opinion with evidence. If you think the chapter should recommend something different, make the case with citations, not vibes.
Pedagogical structure. Intro → first principles → diagram → worked example → trade-offs → production gotchas → references. Deviate only when the topic genuinely demands it.

Reviewers wanted

If you have operated one of the systems we cover in Part 8 (payment processing, real-time chat, feeds, video streaming, etc.), your review is worth more than a month of solo research. Open an issue with your expertise area and which chapter you'd like to review.

Private/sensitive concerns

Security issues in validator scripts or CI: open a private security advisory.
Code of conduct concerns: email hello@handbook.academy.
Legal or licensing questions: email hello@handbook.academy.
Direct contact: @invincible04.

Project structure

engineering-handbook/
├── content/                           # Two open-source books (CC BY-SA 4.0)
│   ├── hld/                           # HLD Handbook — 181 pages
│   │   ├── part-0-prerequisites/          # 5 chapters
│   │   ├── part-1-core-fundamentals/      # 7 chapters
│   │   ├── part-2-building-blocks/        # 16 chapters
│   │   ├── part-3-distributed-systems-theory/  # 11 chapters
│   │   ├── part-4-data-systems/           # 10 chapters
│   │   ├── part-5-architecture-patterns/  # 11 chapters
│   │   ├── part-6-reliability-and-operations/  # 11 chapters
│   │   ├── part-7-security-at-scale/      # 10 chapters
│   │   ├── part-8-case-studies/           # 56 chapters
│   │   ├── part-9-ai-ml-system-design/    # 15 chapters
│   │   ├── part-10-emerging-patterns/     # 1 chapter
│   │   ├── part-11-interview-framework/   # 6 chapters
│   │   └── trade-offs/                    # 22 decision pages
│   └── dsa/                           # DSA Handbook — 120 chapters
│       ├── part-0-foundations/        # 7 chapters
│       ├── part-1-linear-data-structures/  # 8 chapters
│       ├── part-2-search-sort/        # 8 chapters
│       ├── part-3-pointers-window-prefix/  # 6 chapters
│       ├── part-4-stack-queue-patterns/  # 5 chapters
│       ├── part-5-linked-lists/       # 6 chapters
│       ├── part-6-trees-heaps/        # 11 chapters
│       ├── part-7-recursion-backtracking/  # 7 chapters
│       ├── part-8-graphs/             # 13 chapters
│       ├── part-9-dynamic-programming/  # 15 chapters
│       ├── part-10-greedy/            # 5 chapters
│       ├── part-11-bit-manipulation/  # 4 chapters
│       ├── part-12-strings-pattern-matching/  # 6 chapters
│       ├── part-13-design-the-data-structure/  # 7 chapters
│       ├── part-14-interview-framework/  # 12 chapters
│       ├── editorials/                # Per-LC long-form editorials
│       ├── patterns/                  # Pattern decision references
│       ├── widgets/                   # YAML specs for interactive widgets
│       ├── _problem-registry.yml      # Canonical LC-NNN registry
│       └── _widget-registry.yml       # Canonical w-NN/e-LCNNN registry
│
├── writing-guides/                    # Author handbooks (CC BY-SA 4.0)
│   ├── case-study-template.md         # Skeleton for Part 8 case studies
│   └── trade-off-template.md          # Skeleton for trade-off decision pages
│
├── scripts/                           # Content validators used by CI
│   ├── check-citations.mjs            # Footnote integrity
│   ├── check-frontmatter.mjs          # YAML frontmatter schema (per-book)
│   ├── check-mermaid.mjs              # Mermaid syntax validator
│   ├── content-stats.mjs              # Word / diagram / citation counts
│   ├── technologies.json              # Curated technology name taxonomy (HLD)
│   └── update-frontmatter-dates.mjs   # Bulk-touch date_updated on edited files
│
├── .github/
│   ├── workflows/
│   │   ├── content-ci.yml             # PR validation pipeline (both books)
│   │   └── stale.yml                  # Issue/PR hygiene
│   ├── ISSUE_TEMPLATE/                # Correction / Request / Writing templates
│   ├── PULL_REQUEST_TEMPLATE.md
│   ├── CODEOWNERS
│   ├── FUNDING.yml
│   ├── dependabot.yml
│   └── release.yml
│
├── STYLE_GUIDE.md                     # Voice, structure, diagram conventions
├── CONTRIBUTING.md                    # Full contribution workflow
├── CODE_OF_CONDUCT.md                 # Contributor Covenant v2.1
├── CONTRIBUTORS.md                    # All-contributors list
├── SECURITY.md                        # Security disclosure process
├── CITATION.cff                       # Citation metadata (Zenodo, academic tools)
├── LICENSE                            # CC BY-SA 4.0 (everything in this repo)
├── package.json                       # Dev dependencies for validators
├── lychee.toml                        # Link-check configuration
├── .vale.ini                          # Prose style rules
├── .markdownlint.json                 # Markdown style rules
└── .typos.toml                        # Spell-check allowlist

Development setup

You don't need anything installed to contribute content — edit Markdown, push, let CI validate. But if you want to run the checks locally before pushing:

Prerequisites

Node.js 20+ (the .nvmrc file pins the exact version)
npm (ships with Node)
Optional: Vale for prose-style linting
Optional: lychee for link checking

Setup

git clone https://github.com/handbook-academy/engineering-handbook.git
cd engineering-handbook
nvm use                    # switches to the Node version pinned in .nvmrc
npm install                # installs markdownlint, @mermaid-js/mermaid-cli, etc.

Common commands

# Run all 7 content checks (same as CI)
npm run check:all

# Individual checks
npm run lint               # markdownlint
npm run check:citations    # footnote integrity
npm run check:frontmatter  # YAML schema
npm run check:mermaid      # Mermaid diagram syntax
npm run check:typos        # spell-check
npm run check:links        # lychee link check (slow; runs weekly)

# Optional
npm run prose              # Vale prose-style linter (requires Vale installed)
npm run stats              # word / diagram / citation counts

Editor

Use Markdownlab (source) for editing chapters. It's a browser-based Markdown editor with live preview, Mermaid rendering, and the same callouts/footnotes/admonitions this handbook uses — no install, no setup.

Open any chapter directly in the editor by prefixing its GitHub URL with https://markdownlab.vercel.app/:

https://markdownlab.vercel.app/https://github.com/handbook-academy/engineering-handbook/blob/main/content/hld/part-1-core-fundamentals/00-scalability.md

That URL opens the chapter in Markdownlab pre-loaded and ready to edit. Paste your edits back into a fork → branch → PR.

The repo includes .editorconfig, .nvmrc, .markdownlint.json, .vale.ini, and .typos.toml so any editor that reads them picks up project settings automatically.

Governance

Benevolent maintainer model. The project is currently solo-maintained by @invincible04. Major architectural decisions (curriculum scope, licensing, CI strategy) are made by the maintainer after consultation with active contributors via Issues/Discussions.

Decision process:

Typo / dead link / factual correction: any maintainer can merge.
New chapter or major rewrite: requires an issue with pedagogical rationale and at least one maintainer approval.
Curriculum scope change (new part, renamed part, reordering): requires a Discussion thread open for at least 7 days before merge.
License change: requires consent from all contributors; CC BY-SA 4.0 for everything in this repository is a stable commitment.

Becoming a maintainer. Sustained, high-quality contributions over 3+ months can result in a maintainer invitation. This includes merge rights on your areas of expertise and a vote on curriculum-scope decisions.

See CODEOWNERS for the current maintainer routing.

FAQ

Is this really free?

Yes, and specifically: everything in this repository (both books' content, writing guides, style guide, validator scripts) is released under CC BY-SA 4.0. Anyone (including us) can share and adapt it, but adaptations must carry the same license. That means the chapter text cannot be locked behind a paywall by anybody: not by us, not by a future company, not by anyone who forks this repo. Even if someone sells a copy, every recipient keeps the right to share it freely under the same license. You can always read both books free at handbook.academy.

Can I use this to build a paid course?

Yes. CC BY-SA 4.0 allows commercial use provided you:

Attribute the original ("Adapted from The Engineering Handbook, CC BY-SA 4.0, https://github.com/handbook-academy/engineering-handbook")
License your derivative under the same CC BY-SA 4.0 terms
Indicate what you changed

You can teach courses, run boot camps, build YouTube channels, or sell books derived from this content — under share-alike. What you can't do is take the content, strip the attribution, and relicense it as proprietary.

Can I translate it into my language?

Please do. Translations are explicitly welcomed. Open an issue saying which language you'd like to translate into and which chapters you plan to start with. Translations live in parallel directories (e.g. content-hi/ for Hindi, content-zh/ for Chinese) under the same repository, or as a linked sister repo — your call.

How is this different from `donnemartin/system-design-primer`?

system-design-primer is a link-dump README that points at scattered blog posts and talks. It's great as a discovery tool. The HLD Handbook in this repo is inline content — every topic is taught fully within these pages. They're complementary: system-design-primer for breadth of resources, this handbook for depth of teaching.

How is this different from Alex Xu's books?

Alex Xu's books are excellent and we cite them in nearly every case study. The differences:

Scope: ~640 pages (Vols 1+2) vs ~2,400 equivalent pages of HLD here.
Freshness: Xu Vol. 2 was published in 2022 and doesn't cover LLMs, RAG, agents, CRDTs, post-quantum crypto, MCP, etc. This handbook does.
Format: Xu is case-study-focused. This handbook has 56 case studies plus 103 foundational HLD chapters that teach the building blocks.
License: Xu's books are copyrighted and paid. This handbook is CC BY-SA 4.0.
Community: Xu's books are one-author; this handbook accepts PRs.
Bonus: this repo also includes a 120-chapter DSA curriculum.

How is the DSA book different from NeetCode / LeetCode editorials / "Cracking the Coding Interview"?

Structured by pattern, not by problem. Every DSA chapter is a teaching chapter on a structure or pattern (e.g. monotonic stack, sliding-window-variable, segment tree). LeetCode problems show up as worked examples inside the chapter, not as the unit of study.
Four languages, side by side. Each canonical problem ships with sol.py, sol.java, sol.cpp, sol.go siblings. You read the chapter in Python, then click through to your interview language without re-deriving the logic.
37 pattern decision pages. When a problem could be solved by recursion or iteration, BFS or DFS, sliding window or prefix sum — there's a pattern decision page that compares them and tells you which to reach for first. This is the part NeetCode skips.
Interactive widgets. On dsa.handbook.academy, the relevant chapters render live, animated visualisations for sliding windows, monotonic stacks, Morris traversal, quickselect partitioning, and so on. Useful when the words on the page aren't enough.
Open and editable. CTCI is paywalled and frozen at one author's voice. This is CC BY-SA 4.0 and accepts PRs.

Is this the same as the websites at hld.handbook.academy and dsa.handbook.academy?

The content is identical. The websites add a polished reading UI: full-text search, dark mode, syntax-highlighted code blocks, per-chapter diagram zoom, social cards, OG images, fast client-side navigation, and — on the DSA site — the live interactive widgets. Same content, better reading experience — and still free, with no sign-up.

Why don't you include an "Awesome" list of external resources?

We cite primary sources inline where they're relevant. A separate "awesome" list would duplicate that work and go stale faster. If a resource is worth reading, it's cited in a chapter's References section.

I want to submit a chapter on [topic X]. How?

Open an issue first with the writing template explaining:

Which part this chapter belongs in
Why it belongs (not already covered? modern emerging topic? missing from Part X?)
Proposed outline (1-2 paragraphs)
Your primary sources

A maintainer will respond with scope feedback within a week. Once approved, fork → draft → PR. For Part 8 chapters use writing-guides/case-study-template.md; for trade-off pages use writing-guides/trade-off-template.md. All other chapters follow the skeleton documented in STYLE_GUIDE.md.

How do I cite this in an academic paper?

Use the BibTeX entry in the Citation section below, or the CITATION.cff file (which academic tools like Zotero and Zenodo can parse directly).

License

Everything in this repository (the 343 Markdown pages across both books, covering chapters, decision pages, and editorials, plus writing guides, style guide, validator scripts, templates, and configuration) is licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).

What CC BY-SA 4.0 means in practice:

You can:

Read it, share it, print it, screenshot it — forever.
Build courses, boot camps, YouTube channels, or books derived from it.
Translate it into any language.
Use it commercially.

You must:

Attribute the original (link back to this repo).
Release derivatives under the same CC BY-SA 4.0 license (share-alike).
Not relicense the content as proprietary or impose additional restrictions.

Nobody can paywall the chapter text itself under CC BY-SA 4.0. You can build anything on top of it, as long as the text itself stays open. Read both handbooks free at handbook.academy — no sign-up, no paywall, ever.

Citation

If you reference this project in academic work, blog posts, books, or courses:

BibTeX

@misc{engineering-handbook,
  author       = {Soni, Aayush},
  title        = {The Engineering Handbook: Open-Source Engineering Curricula (HLD + DSA)},
  year         = {2026},
  publisher    = {GitHub},
  url          = {https://github.com/handbook-academy/engineering-handbook},
  note         = {HLD: 159 chapters + 22 trade-off pages, ~773K words. DSA: 120 chapters across 15 parts, ~470K words, 155 LeetCode problems with 4-language solutions. CC BY-SA 4.0.}
}

Prose

"The Engineering Handbook (HLD + DSA)" by Aayush Soni, https://github.com/handbook-academy/engineering-handbook, CC BY-SA 4.0.

Machine-readable

The CITATION.cff file in this repository is a Citation File Format descriptor. GitHub's "Cite this repository" button, Zenodo, Zotero, and academic reference managers can parse it directly.

Acknowledgments

This project stands on the shoulders of many, and we cite primary sources liberally throughout. A few that shaped the handbook most:

Canonical books — HLD:

Martin Kleppmann — Designing Data-Intensive Applications is the foundation underneath most of HLD Parts 3-4.
Alex Xu — System Design Interview Volumes 1 and 2 defined the interview case-study format we build on in HLD Part 8.
Chip Huyen — Designing Machine Learning Systems and AI Engineering shape HLD Part 9.
Betsy Beyer et al. (Google) — Site Reliability Engineering and The SRE Workbook are behind HLD Part 6.
Mark Richards & Neal Ford — Fundamentals of Software Architecture and Software Architecture: The Hard Parts for HLD Part 5.

Canonical books — DSA:

Cormen, Leiserson, Rivest, Stein (CLRS) — Introduction to Algorithms is the bedrock for proofs, complexity arguments, and the algorithms in DSA Parts 2, 8, 9, and 12.
Robert Sedgewick & Kevin Wayne — Algorithms, 4th Edition shapes the data-structure exposition in DSA Parts 1, 5, 6, and 8.
Jon Bentley — Programming Pearls is the spirit behind the prompt cards.
Steven & Felix Halim — Competitive Programming for the contest-driven techniques in DSA Parts 9, 11, and 12.
Adnan Aziz, Tsung-Hsien Lee, Amit Prakash — Elements of Programming Interviews for the problem-shaping that influenced the DSA prompt cards.
Gayle Laakmann McDowell — Cracking the Coding Interview for setting the bar on interview-style explanations.

Open-source references:

Donne Martin — system-design-primer.
NeetCode — neetcode.io and the NeetCode 150 problem list for the DSA practice ladder template.
LeetCode community — for the canonical problem corpus and the editorial conventions we adapt.
The Kubernetes, CNCF, and OpenTelemetry communities — for cloud-native and observability standards.
The Apache projects (Kafka, Cassandra, Hadoop, Spark, Flink) — for foundational distributed-systems code and docs.

Papers, RFCs, and postmortems:

SIGMOD / VLDB / OSDI / SOSP / NSDI papers — the bedrock of distributed systems.
IETF / W3C / IANA — for networking and web standards.
Every engineer who has written a public postmortem. Real outages taught us everything about graceful degradation, blast radius, and the limits of testing.

We teach the same concepts in our own words, with our own diagrams, and we cite the originals.

Community

GitHub Discussions — design questions, feedback, sharing what you're building.
GitHub Issues — corrections, content requests, chapter proposals.
Email — hello@handbook.academy for private or sensitive matters.
Maintainer — @invincible04.

Star history

If you find this useful, star the repo. Stars help signal to new contributors that the project is worth their time.

Start reading

HLD: Networking Fundamentals · Scalability · URL Shortener Case Study · Trade-offs Library

DSA: Arrays · Two pointers · Sliding window (variable) · Patterns

Or skim the full HLD curriculum or the full DSA curriculum above.
