handbook-academy/engineering-handbook

Handbook Academy

The Engineering Handbook

Open-source engineering curricula under one CC BY-SA 4.0 license. HLD and DSA today; more to come.

HLD Handbook: 159 chapters + 22 trade-off pages · 181 pages · ~773,000 words · 719 Mermaid diagrams · 3,100+ citations

DSA Handbook: 120 chapters across 15 parts · 37 pattern decision pages · 5 long-form editorials · ~470,000 words · 226 Mermaid diagrams · sibling Python / Java / C++ / Go solutions for 155 LeetCode problems · 46 interactive widget specs

Read free at handbook.academy — landing page links to both books · HLD at hld.handbook.academy · DSA at dsa.handbook.academy (public beta)

Start reading · HLD curriculum · DSA curriculum · Trade-offs · Contributing


What this is

This repository hosts two sibling open-source engineering handbooks under content/, sharing the same CC BY-SA 4.0 license, the same contribution workflow, and the same CI quality bar:

  • The HLD Handbook (content/hld/) — an opinionated, end-to-end textbook on high-level software design, distributed systems, and modern infrastructure. 159 teaching chapters across 12 parts plus a 22-page Trade-offs Library — 181 pages, ~773,000 words, 719 Mermaid diagrams, 3,100+ citations. Equivalent in scope to a ~2,400-page book — longer than Designing Data-Intensive Applications and Alex Xu's System Design Interview volumes 1+2 combined. Covers TCP/IP and the OS contract through to LLM serving, multi-agent orchestration, and post-quantum crypto.
  • The DSA Handbook (content/dsa/) — a practice-first data-structures-and-algorithms curriculum aimed at coding interviews and competitive programming. 120 chapters across 15 parts plus 37 pattern decision pages, 5 long-form LC editorials, and 46 interactive widget specs. ~470,000 words, 226 Mermaid diagrams. Each chapter teaches a structure or pattern, then walks the canonical LeetCode problems, with sibling sol.py / sol.java / sol.cpp / sol.go files for 155 problems. Python is inlined in the chapter; the other three languages are linked.

More curricula are planned. This repo is structured as an umbrella so additional handbooks (e.g. operating systems, databases, ML systems) can land alongside HLD and DSA over time, sharing the same workflow and license.

Every concept in either handbook is taught inline as a full-length article: introduction, first-principles explanation, diagrams, worked examples, trade-offs, production gotchas, citations to primary sources. No stubs. No "coming soon." No external blog redirects. No bullet outlines that tell you to go somewhere else to actually learn.

Every .md file under content/ also renders natively on GitHub — all 945 Mermaid diagrams display in the GitHub UI, all footnotes work, all cross-references resolve. The websites add search, dark mode, per-chapter diagram zoom, social cards, OG images, the interactive DSA widgets, and fast client-side navigation.

Both handbooks are continuously updated. Every chapter's frontmatter declares date_created and date_updated, so you can see exactly how fresh each page is. We care about numbers being right today, not right in 2022.
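
As an illustration of how that freshness metadata could be read, here is a minimal sketch. It assumes the frontmatter is a `---`-delimited block of simple `key: value` lines with ISO-format dates. The sample chapter text, the `read_dates` helper, and the date values are hypothetical; only the two field names come from this README, and the repo's real frontmatter may carry more fields or a different date format.

```python
from datetime import date

# Hypothetical chapter text; only the two date field names come from the README.
SAMPLE = """---
title: Example chapter
date_created: 2025-01-15
date_updated: 2025-06-01
---
Body text starts here.
"""

def read_dates(markdown_text):
    """Return (date_created, date_updated) parsed from a '---' frontmatter block."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("no frontmatter block found")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the block
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    created = date.fromisoformat(fields["date_created"])
    updated = date.fromisoformat(fields["date_updated"])
    return created, updated

created, updated = read_dates(SAMPLE)
days_since_creation = (updated - created).days
print(created, updated, days_since_creation)
```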

Reading the handbooks online

Each book has its own subdomain. The umbrella landing page links to both:

| Site | URL | What it serves |
| --- | --- | --- |
| Landing page | handbook.academy | Two cards: pick HLD or DSA. |
| HLD Handbook | hld.handbook.academy | The full HLD curriculum. |
| DSA Handbook | dsa.handbook.academy | The full DSA curriculum, with interactive widgets. |

Each subdomain serves one book at the root — URLs do not nest. HLD chapters live at hld.handbook.academy/curriculum/..., not handbook.academy/hld/.... Same for DSA.

What's in this repo

This repository is the canonical content source for both books. PRs to either handbook are welcome.

  • content/hld/ — 159 HLD chapters across 12 parts + 22 trade-off decision pages. The chapter template is intro → first principles → diagrams → worked example → trade-offs → production gotchas → references. See STYLE_GUIDE.md.
  • content/dsa/ — 120 DSA chapters across 15 parts. The chapter template is practice-first: cheat-sheet table, deep dive on the structure or pattern, prompt cards for representative LeetCode problems, and <details> solution + common-mistakes blocks. Code samples live as sibling files (sol.py, sol.java, sol.cpp, sol.go) under each chapter's directory. See STYLE_GUIDE.md § DSA-specific deviations and content/dsa/part-1-linear-data-structures/00-arrays.md as the canonical example.
  • content/dsa/editorials/ — long-form LeetCode editorials covering hard problems with multiple approaches.
  • content/dsa/patterns/ — pattern decision references that compare interchangeable approaches (e.g. recursion vs iteration, BFS vs DFS, sliding window vs prefix sum).
  • content/dsa/widgets/ — 46 YAML specs for the interactive widgets that render on the DSA website. On GitHub these are callouts that link to the spec; on the website the widget renders.
  • content/dsa/_problem-registry.yml — canonical LC-NNN ID registry; the website source-of-truth for every LeetCode problem the curriculum touches.

Why this exists

The open-source curricula for system design and DSA have a shape, and that shape is frustrating:

  • Link dumps — curated lists that point at dozens of scattered blog posts, LeetCode discuss threads, and talks. Great for discovery, useless as a learning path. You end up with 60 tabs open and no through-line.
  • Teaser-and-redirect repos — READMEs with a few hundred words on each topic that nudge you toward a paid course hosted elsewhere. The GitHub repo is the marketing page; the actual teaching is behind a $150-$400+/year paywall.
  • Monolithic READMEs frozen in time — a single 5,000-line README.md that was great in 2021 but hasn't kept up with the modern stack. No LLM serving, no CRDTs, no post-quantum crypto, no FinOps. And nobody wants to review a PR against a 5,000-line file.
  • Surface-level outlines — bullet-point summaries that tell you what exists without explaining why you'd pick one approach over another. You learn the vocabulary without the judgment.
  • Interview-only prep — focused on passing a specific 45-minute screen, not on actually operating systems at scale or actually understanding the algorithm.

These handbooks fix all of that:

  • 100% inline content. Every chapter is a full teaching article written from scratch, living in this repo as plain Markdown. Nothing is a stub. Nothing redirects you elsewhere to learn.
  • Progressive curricula. HLD has 159 teaching chapters across 12 parts plus a 22-page Trade-offs Library, sequenced from prerequisites (Part 0) to Staff+ topics. DSA has 120 chapters across 15 parts, from arrays to bitmask DP, with each part introducing a tighter pattern family. Each chapter declares its prerequisites, learning objectives, estimated reading time, and difficulty tier.
  • Research-backed. Every HLD chapter ends with a Further Reading & References section citing primary sources — SIGMOD and OSDI papers, RFCs, IETF drafts, engineering postmortems, official docs, canonical books. 3,100+ citations across HLD. DSA chapters cite the original papers behind each algorithm where they exist (Floyd, Tarjan, Knuth-Morris-Pratt, Aho-Corasick).
  • Opinionated. Every topic picks a recommended approach and explains why. Where reasonable people disagree, the trade-offs are made explicit — HLD has a dedicated 22-page Trade-offs Library; DSA has 37 pattern decision pages that compare interchangeable approaches (recursion vs iteration, BFS vs DFS, sliding window vs prefix sum, and so on).
  • Modern (2025+). HLD covers LLM serving, RAG pipelines, AI agents, multi-agent orchestration, CRDTs, edge computing, FinOps, post-quantum cryptography, platform engineering, local-first software, differential privacy, MCP. DSA covers the modern competitive-programming toolkit: monotonic stacks/deques, Morris traversal, suffix arrays, Aho-Corasick, bitmask DP, randomised algorithms.
  • Practice-first for DSA. Every DSA chapter ships with sibling code samples in Python, Java, C++, and Go for the canonical LeetCode problems it teaches. Python is inlined in the chapter; the other three languages are one click away. 155 LeetCode problems are covered in this format.
  • Interactive on the website. The DSA book's 46 widget specs render as live, animated visualisations on dsa.handbook.academy (sliding windows that you can drag, monotonic stacks that animate the pop sequence, etc.). On GitHub the widget callouts link to the YAML spec.

Who this is for

These handbooks are written for:

  • SDE1 engineers preparing for SDE2 interviews. Read DSA Parts 0-9 for coding-round fluency, then HLD Parts 0, 1, and 11, followed by 8-10 targeted case studies from HLD Part 8 and the five most-cited trade-off pages.
  • SDE2s preparing for Senior/Staff. The full HLD Parts 3-7 are the core "you need to know this to design anything meaningful" material. Parts 6 (Reliability) and 7 (Security) separate Senior candidates from Staff candidates. DSA Part 14 covers narration and trade-off articulation in the algorithmic round.
  • Senior/Staff engineers refreshing or filling gaps. HLD Part 9 (AI/ML systems) and the Trade-offs Library are valuable even if you've been doing this for 15 years. Modern LLM serving and vector search weren't a thing when most of us learned backend.
  • Self-taught engineers without a CS degree who want the vocabulary and the reasoning that bootcamps and YouTube don't teach. HLD Part 0 is explicit prerequisites: TCP/IP, the OS contract, database internals, API design. DSA Part 0 is the foundations — Big-O, recursion, bit manipulation — before pattern-matching becomes useful.
  • Career switchers moving from adjacent roles (frontend → backend, backend → infra, SWE → MLE) who need to build system-level intuition fast.
  • Competitive programmers and ICPC/Codeforces contestants who want a structured reference for the algorithmic toolkit (KMP, Aho-Corasick, suffix arrays, segment trees, Dijkstra, Bellman-Ford, DP variants).
  • Teachers and course creators who want to build curriculum without writing thousands of pages from scratch. The CC BY-SA 4.0 license explicitly allows this.
  • Anyone operating a production system at non-trivial scale who wants a reference shelf that covers the full stack — from packet-level networking to multi-tenant SaaS isolation models.

These handbooks are NOT for:

  • Absolute beginners with no programming experience. HLD Part 0 assumes you can read code and have built at least one CRUD app. DSA Part 0 assumes you can write a for loop and a recursive function. If you're brand new, start with The Odin Project or CS50, then come back.
  • People who want low-level language-specific tutorials. We don't teach Rust syntax or Go's goroutines — we teach concepts that apply across languages.

How these handbooks compare

This repo (HLD + DSA) Typical OSS system-design repos Typical OSS DSA repos Paid courses ($150-$400+/yr) O'Reilly-style books
Free and open-source Yes Yes Yes No No
100% inline content (no external redirects) Yes No No Yes Yes
Both HLD + DSA in one curriculum Yes No No Rarely No
150+ HLD chapters, 100+ DSA chapters Yes No No Only their sliver Usually one topic
56 end-to-end HLD case studies Yes 5-15 typical 15-25 typical Varies
Sibling code in Python / Java / C++ / Go Yes Rarely all 4 Sometimes Rarely
Interactive widgets on the DSA site Yes No Sometimes No
Dedicated Trade-offs Library (22 pages) Yes No No No
37 DSA pattern decision pages Yes No No No
Covers LLMs, RAG, agents, multimodal AI Yes Rarely modern Sometimes Rarely
Opinionated decisions, not just options Yes No No Sometimes Varies
3,100+ primary-source citations Yes No No Yes
Community-editable, PRs welcome Yes Yes Yes No No
Updated continuously Yes Often frozen Often frozen Varies Every 3-5 years

Start reading

For the best reading experience, visit hld.handbook.academy for HLD or dsa.handbook.academy for DSA — free, no sign-up, full search, dark mode, per-chapter diagram zoom, and the live DSA widgets. The links below open the source on GitHub, where Mermaid diagrams also render natively.

Pick one:

I want a quick HLD taste. Read these three in order — they give you the vocabulary and the first real worked example:

  1. Scalability: Growing a System Without Breaking It
  2. Back-of-the-Envelope Estimation
  3. Design a URL Shortener (TinyURL / bit.ly)

I want a quick DSA taste. Read these three to see the chapter shape and the sibling-code workflow:

  1. Arrays: static, dynamic, multi-dimensional
  2. Two pointers: opposite ends
  3. Sliding window: variable size

I'm preparing for an SDE2 interview in the next 6 weeks. Follow the SDE1 → SDE2 study plan.

I'm a Senior engineer refreshing my distributed-systems foundations. Read HLD Part 3 — Distributed Systems Theory straight through.

I want to learn AI-systems design. Read HLD Part 9 (AI & ML System Design), then the 8 AI case studies in Part 8 (chapters 30-37).

I want to grind interview algorithms. Open the DSA curriculum. Read Parts 0-2 in order, then jump into the pattern parts (3-9) as the problems you encounter call for them.

I want to read both books cover-to-cover. Start at HLD Part 0 for the systems track and DSA Part 0 for the algorithms track. At one 25-minute chapter per day, the whole repo is roughly a year of reading.
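
The "roughly a year" estimate can be checked with simple arithmetic. The page counts below are the ones quoted in this README; counting each pattern page and editorial as one daily sitting is an assumption made for this sketch.

```python
# Page counts quoted in this README.
hld_chapters = 159
hld_tradeoff_pages = 22
dsa_chapters = 120
dsa_pattern_pages = 37
dsa_editorials = 5

MINUTES_PER_SITTING = 25  # one 25-minute chapter per day

core_pages = hld_chapters + hld_tradeoff_pages + dsa_chapters
all_pages = core_pages + dsa_pattern_pages + dsa_editorials

# One page per day, so the page count is also the number of days.
print(f"chapters and trade-off pages: {core_pages} days")
print(f"including pattern pages and editorials: {all_pages} days")
print(f"total reading time: {all_pages * MINUTES_PER_SITTING / 60:.0f} hours")
```

That works out to 301 days for the chapters and trade-off pages alone, and 343 days once the pattern pages and editorials are included.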


The full HLD curriculum

181 pages: 159 chapters across 12 parts plus a 22-page Trade-offs Library. Each part below is collapsed by default; click to expand the chapter list. Linking to a specific part from elsewhere in the README auto-expands it on GitHub.

| # | Part | Chapters | Difficulty | Reading time |
| --- | --- | --- | --- | --- |
| 0 | Prerequisites | 5 | Beginner | ~3 hrs |
| 1 | Core Fundamentals | 7 | Beginner-Intermediate | ~4 hrs |
| 2 | Building Blocks | 16 | Intermediate | ~10 hrs |
| 3 | Distributed Systems Theory | 11 | Intermediate-Advanced | ~9 hrs |
| 4 | Data Systems | 10 | Intermediate-Advanced | ~8 hrs |
| 5 | Architecture Patterns | 11 | Intermediate | ~8 hrs |
| 6 | Reliability & Operations | 11 | Intermediate-Advanced | ~8 hrs |
| 7 | Security at Scale | 10 | Intermediate-Advanced | ~7 hrs |
| 8 | Case Studies | 56 | Intermediate-Advanced | ~45 hrs |
| 9 | AI & ML System Design | 15 | Intermediate-Advanced | ~11 hrs |
| 10 | Emerging Patterns | 1 | Intermediate-Advanced | ~45 min |
| 11 | Interview Framework | 6 | Intermediate | ~4 hrs |
| T | Trade-offs Library | 22 | Intermediate | ~8 hrs |
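
As a consistency check on the table above, the per-part chapter counts can be summed and compared with the headline figures. The numbers are copied from the table; the 45-minute entry for Part 10 is written as 0.75 hours, and the variable names are illustrative.

```python
# (part, chapters, approximate reading hours), copied from the table above.
parts = [
    ("0", 5, 3), ("1", 7, 4), ("2", 16, 10), ("3", 11, 9),
    ("4", 10, 8), ("5", 11, 8), ("6", 11, 8), ("7", 10, 7),
    ("8", 56, 45), ("9", 15, 11), ("10", 1, 0.75), ("11", 6, 4),
]
tradeoff_pages, tradeoff_hours = 22, 8

chapters = sum(count for _, count, _ in parts)
pages = chapters + tradeoff_pages
hours = sum(h for _, _, h in parts) + tradeoff_hours

print(chapters, pages, hours)
```

The sums come to 159 chapters and 181 pages, matching the figures quoted elsewhere in this README, with roughly 126 hours of estimated reading in total.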

Part 0 — Prerequisites (5 chapters)

Audience: engineers without a CS degree, or anyone who wants to confirm their foundations before moving on. Difficulty: Beginner. Total reading time: ~3 hours.

Foundational topics that the rest of the handbook assumes. If you can explain TCP's three-way handshake, the difference between a process and a thread, a B-tree's internal structure, and why idempotent PUT is better than non-idempotent POST for a retry-prone endpoint — you can skip this part.

  1. Networking Fundamentals for System Design — OSI and TCP/IP layers, TCP vs UDP, HTTP/1.1 vs HTTP/2 vs HTTP/3, DNS resolution, TLS handshake, what "the network is unreliable" really means in practice.
  2. Operating System Essentials for System Design — Processes vs threads, context switching cost, virtual memory, page cache, filesystem I/O, epoll/kqueue/IOCP, why O_DIRECT matters for databases.
  3. Data Structures for Distributed Systems — Hash tables, B-trees, LSM-trees, skip lists, Bloom filters, tries, HyperLogLog, Count-Min Sketch, and when each shows up in real systems.
  4. Database Fundamentals for System Design — Transactions, isolation levels (read-committed to serializable), indexes, query planners, joins, and why your ORM hides things from you that you need to see.
  5. API Design Basics: REST, GraphQL, gRPC, and the Hard Parts — Resource modeling, idempotency, versioning, pagination, rate-limit headers, error envelopes, HATEOAS in theory vs practice.
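
To make the idempotency point from the introduction to this part concrete, here is a minimal in-memory sketch of why a retried PUT is safe while a retried POST is not. The `OrderStore` class, its method names, and the order payload are hypothetical stand-ins for a real HTTP service, not code from the handbook.

```python
import itertools

class OrderStore:
    """Hypothetical in-memory stand-in for a service behind an HTTP API."""

    def __init__(self):
        self.orders = {}
        self._next_id = itertools.count(1)

    def post(self, payload):
        # POST /orders: the server picks a new ID on every call,
        # so a retried request creates a second order.
        order_id = next(self._next_id)
        self.orders[order_id] = payload
        return order_id

    def put(self, order_id, payload):
        # PUT /orders/{id}: the client names the resource,
        # so repeating the request leaves the same final state.
        self.orders[order_id] = payload
        return order_id

payload = {"item": "book", "qty": 1}

post_store = OrderStore()
post_store.post(payload)
post_store.post(payload)  # client retries after a timeout

put_store = OrderStore()
put_store.put("order-42", payload)
put_store.put("order-42", payload)  # same retry, same key

print(len(post_store.orders))  # 2
print(len(put_store.orders))   # 1
```

The retried POST leaves two orders behind, while the retried PUT leaves one, which is the property a retry-prone endpoint needs.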

Part 1 — Core Fundamentals (7 chapters)

Audience: everybody — read this part even if you know the topic, because the vocabulary here is used for the rest of the book. Difficulty: Beginner-Intermediate. Total reading time: ~4 hours.

The vocabulary and the reasoning habits that every later chapter assumes. "Scalability," "consistency," "trade-off," and "back-of-envelope" get defined here rigorously so they mean something specific when we use them later.

  1. Scalability: Growing a System Without Breaking It
  2. Latency and Throughput: The Two Numbers That Matter
  3. Availability and Reliability: Nines, SLOs, and Staying Up
  4. Consistency Models: What Readers Actually See
  5. Back-of-the-Envelope Estimation
  6. How to Approach a System Design Question
  7. Trade-off Thinking

Part 2 — Building Blocks (16 chapters)

Audience: anyone who'll assemble backend systems. This part is the Lego brick inventory. Difficulty: Intermediate. Total reading time: ~10 hours.

Deep dives on the pieces you assemble to build real systems: load balancers, caches, queues, CDNs, databases (SQL and NoSQL), partitioning, replication, pub/sub, rate limiters, service discovery, blob storage, geospatial indexes, and edge compute.

  1. Load Balancers: Spreading Traffic, Absorbing Failure
  2. Reverse Proxies and API Gateways: The Smart Edge
  3. Content Delivery Networks: Moving Bytes Closer to Users
  4. Caching: From Browser to Database
  5. SQL Databases: The Boring Technology That Wins
  6. NoSQL Databases: Picking the Right Non-Relational Tool
  7. Database Partitioning and Sharding: When One Node Is Not Enough
  8. Database Replication: Keeping Copies in Sync
  9. Message Queues and Streaming: Decoupling at Scale
  10. Pub/Sub: Fan-Out and Event-Driven Systems
  11. Real-Time Communication: WebSockets, SSE, and Long Polling
  12. Rate Limiting: Protecting Systems from Themselves
  13. Service Discovery and Service Mesh: Finding and Talking to Services
  14. Blob and Object Storage: Storing the Big Stuff
  15. Geospatial Indexing: Geohash, Quadtree, R-tree, S2, and H3
  16. Edge Computing (Cloudflare Workers, Lambda@Edge, Deno Deploy)

Part 3 — Distributed Systems Theory (11 chapters)

Audience: SDE2+ preparing for Senior/Staff. If you haven't internalized linearizability vs serializability, read this. Difficulty: Intermediate-Advanced. Total reading time: ~9 hours.

The theory that makes distributed systems distributed: consensus (Raft/Paxos), the full consistency spectrum, CAP/PACELC in its 2025 framing, logical clocks, CRDTs, distributed transactions (2PC/Saga), exactly-once delivery, failure detection, consistent hashing, and Merkle-tree anti-entropy.

  1. Consensus Protocols: How Distributed Systems Agree
  2. Consistency Deep Dive: Linearizability, Serializability, and the Spectrum Between
  3. Quorums and Replication: The Math of R + W > N
  4. CAP and PACELC: The Tradeoff That Keeps Confusing People
  5. Clocks and Ordering: Lamport, Vector, and Hybrid Logical Clocks
  6. CRDTs: Conflict-Free Replicated Data Types
  7. Distributed Transactions: 2PC, Saga, and When to Avoid Both
  8. Idempotency and Exactly-Once: The Honest Truth About Delivery Guarantees
  9. Failure Detection: Deciding a Node Is Dead
  10. Consistent Hashing: Keys to Nodes Without Global Reshuffles
  11. Merkle Trees and Anti-Entropy: Keeping Replicas in Sync Cheaply
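
The quorum condition named in the title of chapter 3 can be shown in a few lines. This is a minimal sketch of the standard overlap rule, not code from the handbook; the function name `quorums_overlap` and the sample configurations are illustrative.

```python
def quorums_overlap(n, r, w):
    """True if every read quorum of size r shares at least one replica
    with every write quorum of size w, out of n replicas (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("quorum sizes must be between 1 and n")
    return r + w > n

# Sample configurations for N = 3 replicas.
for r, w in [(2, 2), (1, 3), (1, 1), (1, 2)]:
    print(f"N=3 R={r} W={w} overlap={quorums_overlap(3, r, w)}")
```

With N = 3, choosing R = W = 2 guarantees that any read quorum and any write quorum share at least one replica, while R = 1, W = 2 does not.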

Part 4 — Data Systems (10 chapters)

Audience: anyone who owns a data pipeline or picks a database. Difficulty: Intermediate-Advanced. Total reading time: ~8 hours.

Every flavor of data system you might pick, and when each actually fits. Storage engines (B-tree vs LSM), OLTP vs OLAP, warehouses/lakes/lakehouses, streaming vs batch, CDC, search, time-series, graph, vector, and key-value.

  1. Storage Engines: B-Trees, LSM-Trees, and Why Your Database Feels the Way It Does
  2. OLTP vs OLAP: Row Stores, Column Stores, and Matching Shape to Workload
  3. Data Warehouses and Data Lakes: Structure, Schema, and the Lakehouse
  4. Stream vs Batch Processing: Lambda, Kappa, and the End of That Debate
  5. Change Data Capture: Streaming the Database's Inner Monologue
  6. Search Systems: Inverted Indexes, BM25, and Running Elasticsearch in Production
  7. Time-Series Databases: Metrics, Events, and Retention at Scale
  8. Graph Databases: Property Graphs, Cypher, and When Joins Are the Problem
  9. Vector Databases: Embeddings, ANN Indexes, and the Retrieval Layer for AI
  10. Key-Value Stores: Redis, Memcached, DynamoDB, and Picking the Right Hash Table

Part 5 — Architecture Patterns (11 chapters)

Audience: engineers making architecture-level decisions or leading service migrations. Difficulty: Intermediate. Total reading time: ~8 hours.

The architectural shapes you choose between when you design anything bigger than a single service: monolith vs microservices, event-driven, CQRS, event sourcing, serverless, BFF, strangler fig, hexagonal/clean, multi-region, multi-tenancy, CRDT-based apps.

  1. Monolith vs Microservices: Team Topology, Conway's Law, and the Distributed System Tax
  2. Event-Driven Architecture: Notifications, State Transfer, and Choreography
  3. CQRS: Separating Reads from Writes Without Losing Your Mind
  4. Event Sourcing: Events as the Source of Truth
  5. Serverless: Functions, Cold Starts, and When FaaS Actually Saves Money
  6. Backend for Frontend: Per-Client API Aggregation Done Right
  7. Strangler Fig: Incremental Migration Without a Big Bang
  8. Hexagonal and Clean Architecture: Keeping Business Logic Independent
  9. Multi-Region Architecture: Active-Passive, Active-Active, and CRDTs
  10. Multi-Tenancy: Silo, Pool, and the SaaS Isolation Spectrum
  11. CRDT Applications (Yjs, Automerge, Local-First Software)

Part 6 — Reliability & Operations (11 chapters)

Audience: anyone on-call, anyone who signs SLAs, anyone paying a cloud bill. Difficulty: Intermediate-Advanced. Total reading time: ~8 hours.

The engineering that separates "it compiles" from "it runs reliably at 3 a.m. on a long weekend": observability, SLOs, resilience patterns, auto-scaling, deployments, chaos engineering, incident response, health checks, FinOps, platform engineering.

  1. Observability: Metrics, Logs, Traces, and the OpenTelemetry Standard
  2. SLI, SLO, SLA, and Error Budgets: Making Reliability Quantitative
  3. Resilience Patterns: Timeouts, Retries, Circuit Breakers, and Bulkheads
  4. Graceful Degradation: When Partial Service Beats No Service
  5. Auto-Scaling and Capacity Planning: From HPA to Predictive Scaling
  6. Deployment Strategies: Blue-Green, Canary, Rolling, and Feature Flags
  7. Chaos Engineering: Breaking Things on Purpose
  8. Incident Management: From Detection to Blameless Postmortem
  9. Health Checks and Readiness: Telling the Truth About Whether You're Up
  10. Cost Optimization and FinOps
  11. Platform Engineering: IDPs, Golden Paths, and DX

Part 7 — Security at Scale (10 chapters)

Audience: Senior/Staff engineers, platform teams, security-adjacent builders. Difficulty: Intermediate-Advanced. Total reading time: ~7 hours.

Security architecture for real systems, not a CISSP crib sheet: AuthN vs AuthZ, OAuth2/OIDC, JWT (and why you probably shouldn't), mTLS, secrets management, DDoS/WAF, compliance (GDPR/DPDP/CCPA), software supply chain, privacy-preserving systems, post-quantum cryptography.

  1. Authentication vs Authorization: Identity, Permissions, and Access Models
  2. OAuth 2.0 and OpenID Connect: Delegated Authorization and Identity Done Right
  3. JWT Deep Dive: Signed Tokens, Claims, and the Revocation Problem
  4. mTLS and Service-to-Service Authentication: SPIFFE, Service Mesh, and Zero Trust
  5. Secrets Management: Vault, KMS, and the End of Secrets in Config Files
  6. DDoS Protection and WAFs: Mitigating Volumetric and Application Attacks
  7. Data Residency and Compliance Architecture (GDPR, DPDP, CCPA, Right-to-Erasure)
  8. Supply Chain Security: SBOM, SLSA, Sigstore, and Defending Against xz-utils
  9. Privacy-Preserving Systems (Differential Privacy, Federated Learning)
  10. Post-Quantum Cryptography: Migrating to ML-KEM, ML-DSA, and a Crypto-Agile Future

Part 8 — Case Studies (56 chapters)

Audience: interview prep candidates, engineers building adjacent systems, anyone who learns best from worked examples. Difficulty: Intermediate-Advanced. Total reading time: ~45 hours.

56 end-to-end system designs, each following a consistent structure: requirements (functional + non-functional), back-of-envelope numbers, high-level architecture, data model, deep-dive components, scalability and reliability considerations, and real-world references. Grouped thematically below for easier navigation.

Core primitives (chapters 00-03) — the four designs you see in every interview
  1. Design a URL Shortener (TinyURL / bit.ly)
  2. Design a Pastebin (Paste Sharing Service)
  3. Design a Distributed Rate Limiter
  4. Design a Distributed Key-Value Store (Dynamo / Cassandra / Riak)
Messaging & social (chapters 04-09) — notifications, chat, feeds, photos, crawlers, autocomplete
  1. Design a Notification System (Push, SMS, Email at Scale)
  2. Design a Chat System (WhatsApp / Messenger / Signal)
  3. Design a Social Media Feed (Twitter / Instagram / LinkedIn)
  4. Design a Photo Sharing Service (Instagram)
  5. Design a Web Crawler (Googlebot-style)
  6. Design Search Autocomplete (Typeahead Suggestions)
Media & consumer products (chapters 10-17) — video, ride-hailing, maps, file sync, editing, cache, recommenders
  1. Design a Video Streaming Service (YouTube / Twitch / TikTok)
  2. Design Netflix (End-to-End)
  3. Design a Ride-Hailing Service (Uber / Lyft)
  4. Design Google Maps (Routing and Tile Rendering)
  5. Design a File Sync Service (Dropbox / Google Drive)
  6. Design Collaborative Editing (Google Docs / Figma / Notion)
  7. Design a Distributed Cache (Memcached / Redis Cluster)
  8. Design a Recommendation System (Netflix / YouTube / TikTok)
Commerce & financial (chapters 18-21) — ticketing, payments, stock exchange, food delivery
  1. Design a Ticketing System (BookMyShow / Ticketmaster)
  2. Design a Payment System (Stripe / PayPal)
  3. Design a Stock Exchange (Matching Engine)
  4. Design a Food Delivery Service (DoorDash / Swiggy)
Data & infrastructure (chapters 22-29) — metrics, ad-click, logs, proximity, leaderboards, IDs, hotels, schedulers
  1. Design a Metrics Pipeline (Prometheus / InfluxDB / Thanos)
  2. Design Ad-Click Aggregation (Real-Time Stream Processing)
  3. Design a Logging Platform (ELK / Loki / Splunk)
  4. Design a Proximity Service (Nearby Friends / Yelp)
  5. Design a Real-Time Leaderboard
  6. Design a Unique ID Generator (Snowflake, ULID, TSID, UUIDv7)
  7. Design a Hotel Reservation System (Booking.com / Airbnb)
  8. Design a Distributed Job Scheduler (Airflow / Temporal / Distributed Cron)
AI systems (chapters 30-37) — ChatGPT, RAG, coding agents, AI search, voice, moderation, semantic cache, model routing
  1. Design ChatGPT (Conversational AI at Scale)
  2. Design an Enterprise RAG System
  3. Design a Coding Agent (Claude Code / GitHub Copilot / Cursor)
  4. Design Perplexity (AI Search with Citations)
  5. Design a Voice Agent (Alexa / Siri-Class Realtime)
  6. Design a Content Moderation System at Scale
  7. Design a Semantic Cache for LLM Applications
  8. Design a Model Router and Gateway (OpenRouter / LiteLLM)
Infra services (chapters 38-39) — feature flags, DNS
  1. Design a Feature Flag Service (LaunchDarkly / Harness FME / Unleash)
  2. Design a DNS Service (Cloudflare 1.1.1.1 / Google 8.8.8.8)
Consumer products II (chapters 40-49) — dating, auctions, SaaS, video conf, email, live comments, fraud, fitness, online judge, price tracking
  1. Design a Dating App (Tinder / Hinge / Bumble)
  2. Design an Online Auction (eBay / Catawiki)
  3. Design a Multi-Tenant SaaS Platform
  4. Design a Video Conferencing System (Zoom / Google Meet)
  5. Design an Email Service at Gmail Scale (1.8B Users, 300B Messages/Day)
  6. Design Live Comments at Scale (FB Live / YouTube Live / Twitch Chat)
  7. Design a Fraud Detection System (Stripe Radar / PayPal / Feedzai)
  8. Design a Fitness Tracking Service (Strava / MapMyRun)
  9. Design an Online Judge (LeetCode / Codeforces / HackerEarth)
  10. Design a Price Tracking Service (CamelCamelCamel / Honey / Keepa)
Developer & ops platforms (chapters 50-55) — API gateway, CI/CD, observability, search engine, brokerage, chat-at-scale
  1. Design an API Gateway at Scale (Kong / AWS API Gateway / Apigee / Envoy)
  2. Design a CI/CD Platform (GitHub Actions / GitLab CI / CircleCI)
  3. Design an Observability Platform (Datadog / New Relic / Honeycomb)
  4. Design a Search Engine (Google-Scale / Brave Search)
  5. Design a Brokerage Platform (Robinhood / E*TRADE / Interactive Brokers)
  6. Design Channel-Scale Chat (Discord / Slack)

Part 9 — AI & ML System Design (15 chapters)

Audience: anyone building with or around LLMs, agents, or production ML. Difficulty: Intermediate-Advanced. Total reading time: ~11 hours.

Modern AI-systems architecture, treated with the same rigor as Part 3: LLM serving, RAG, vector search, agent architectures, multi-agent orchestration, LLM evaluation, LLMOps, cost optimization, safety, ML fundamentals, feature stores, recommenders, real-time AI, multimodal, and the data infra underneath all of it.

  1. LLM Serving Architecture (vLLM, TGI, TensorRT-LLM)
  2. RAG Pipelines (Retrieval-Augmented Generation)
  3. Vector Search at Scale (HNSW, IVF-PQ, DiskANN)
  4. AI Agent Architectures (ReAct, Reflection, Planning, Tool Use, Memory)
  5. Multi-Agent Orchestration (LangGraph, OpenAI Agents SDK, AutoGen, Swarm)
  6. LLM Evaluation and Observability (Ragas, LangSmith, TruLens, LLM-as-Judge)
  7. LLMOps and Prompt Engineering (Versioning, Guardrails, Red-Teaming)
  8. LLM Cost Optimisation (Semantic Cache, Model Routing, Cascading, Prompt Caching)
  9. LLM Safety and Guardrails (OWASP LLM Top 10, Prompt Injection, PII, Jailbreaks)
  10. ML System Design Fundamentals
  11. Feature Stores and Model Serving (Feast, Tecton, KServe, BentoML, MLflow)
  12. Recommendation Systems Deep Dive (DLRM, Two-Tower, Embedding Retrieval, Cold Start)
  13. Realtime AI and Voice Agents (Streaming Inference, WebRTC, LiveKit, Deepgram)
  14. Multimodal AI Systems (CLIP, Whisper, LayoutLM, Document AI)
  15. Data Infrastructure for AI (Embedding Pipelines, Chunking, Unstructured ETL, MCP)

Part 10 — Emerging Patterns (1 chapter)

Audience: Staff+ engineers and architects thinking past 2026. Difficulty: Intermediate-Advanced. Total reading time: ~45 minutes.

Forward-looking topics that are adjacent to everything else. Currently one chapter, with more planned (WebAssembly at the edge, unikernels, confidential computing, on-device AI).

  1. Green Computing (Carbon-Aware Scheduling, PUE, Sustainable Systems)
Part 11 — Interview Framework (6 chapters) — RESHADED/PEDALS/ADEPT, requirements, diagrams, trade-offs, company flavors, RFCs

Part 11 — Interview Framework (6 chapters)

Audience: anyone preparing for or giving system-design interviews. Difficulty: Intermediate. Total reading time: ~4 hours.

How to run a 45-minute system-design interview, from both sides of the whiteboard. Compares RESHADED / PEDALS / ADEPT frameworks, teaches requirements scoping, diagramming, trade-off articulation, company-specific flavors, and RFC/design-doc authoring for Staff-level work.

  1. Interview Frameworks Compared (RESHADED, PEDALS, ADEPT)
  2. Requirements Scoping: Functional, Non-Functional, and MoSCoW
  3. Diagramming Skills for System Design Interviews
  4. Trade-off Articulation: Saying 'It Depends' Well
  5. Company-Specific Interview Flavors (Amazon, Google, Meta, Netflix)
  6. Design Doc Authoring: RFCs, ADRs, and the Staff Engineer's Written Output
Trade-offs Library (22 pages) — the canonical "X vs Y" decision pages cross-referenced from every part

Trade-offs Library (22 pages)

Audience: everyone — these pages are referenced from every other part. Difficulty: Intermediate. Total reading time: ~8 hours.

The 22 most-asked architectural-choice questions, each answered in a dedicated decision-comparison page: flowchart, comparison table, "when to pick A" vs "when to pick B" sections, real-world examples, and citations.

  1. Strong vs Eventual Consistency
  2. ACID vs BASE
  3. SQL vs NoSQL
  4. Latency vs Throughput
  5. CAP and PACELC Applied
  6. Cache Strategies: Cache-Aside vs Write-Through vs Write-Behind
  7. Batch vs Stream Processing
  8. Load Balancer vs Reverse Proxy vs API Gateway
  9. REST vs gRPC vs GraphQL
  10. Polling vs Long-Polling vs SSE vs WebSockets vs Webhooks
  11. Rate Limiting Algorithms: Token Bucket vs Sliding Window
  12. Optimistic vs Pessimistic Concurrency Control
  13. Partitioning Schemes: Range, Hash, Consistent Hash, Directory
  14. B-tree vs LSM-tree Storage
  15. Monolith vs Microservices
  16. Replication Topologies: Leader-Follower, Multi-Leader, Leaderless
  17. Distributed Transactions: 2PC vs Saga vs TCC
  18. Push vs Pull (Fan-out, Messaging, Feed)
  19. Lambda vs Kappa Architecture
  20. Vertical vs Horizontal Scaling
  21. Normalization vs Denormalization
  22. Single-Region vs Multi-Region Deployment

The full DSA curriculum

120 chapters across 15 parts, plus 37 pattern decision pages, 5 long-form editorials, and 46 interactive widget specs. Each chapter is centered on a specific data structure or algorithmic pattern, taught with worked LeetCode problems. Sibling sol.py / sol.java / sol.cpp / sol.go files live under each problem directory; the Python solution is inlined in the chapter, the others are linked. Editorials, pattern decision diagrams, and widget YAML specs are siblings under content/dsa/.

Each part below is collapsed by default — click to expand the chapter list. Linking to a specific part from elsewhere in the README auto-expands it on GitHub.

| # | Part | Chapters | Difficulty | Reading time |
|---|---|---|---|---|
| 0 | Foundations | 7 | Beginner | ~3 hrs |
| 1 | Linear Data Structures | 8 | Beginner | ~5 hrs |
| 2 | Search & Sort | 8 | Beginner-Intermediate | ~5 hrs |
| 3 | Two Pointers, Sliding Window, Prefix Sums | 6 | Intermediate | ~4 hrs |
| 4 | Stack & Queue Patterns | 5 | Intermediate | ~3 hrs |
| 5 | Linked Lists | 6 | Intermediate | ~4 hrs |
| 6 | Trees & Heaps | 11 | Intermediate-Advanced | ~7 hrs |
| 7 | Recursion & Backtracking | 7 | Intermediate-Advanced | ~5 hrs |
| 8 | Graphs | 13 | Intermediate-Advanced | ~9 hrs |
| 9 | Dynamic Programming | 15 | Advanced | ~10 hrs |
| 10 | Greedy | 5 | Intermediate-Advanced | ~3 hrs |
| 11 | Bit Manipulation | 4 | Intermediate | ~2 hrs |
| 12 | Strings & Pattern Matching | 6 | Intermediate-Advanced | ~4 hrs |
| 13 | Design the Data Structure | 7 | Intermediate-Advanced | ~5 hrs |
| 14 | Interview Framework | 12 | All levels | ~5 hrs |
| P | Pattern decision pages | 37 | Intermediate | ~10 hrs |
| E | Long-form editorials | 5 | Intermediate | ~2 hrs |
| W | Interactive widget specs | 46 | reference | |

Part 0 — Foundations (7 chapters) — Big-O, recursion, bit ops, interview math, language idioms, choosing your language

Part 0 — Foundations (7 chapters)

Audience: anyone starting interview prep, or returning after a long pause. Difficulty: Beginner. Total reading time: ~3 hours.

The mental models and language fluency that everything else assumes. Big-O, the recursion mental model, bit-manipulation primer, the math you actually need for interviews, language idioms across Python / Java / C++ / Go, and how to pick which language to interview in.

  1. How to use this handbook
  2. Computational complexity and Big-O
  3. The recursion mental model
  4. Bit manipulation primer
  5. Math for interviews
  6. Language idioms across Python, Java, C++, Go
  7. Choosing your interview language
Part 1 — Linear Data Structures (8 chapters) — arrays, dynamic-array internals, strings, hash maps, stacks, queues, matrices

Part 1 — Linear Data Structures (8 chapters)

Audience: everyone — these are the structures that show up in 70% of all interview problems. Difficulty: Beginner. Total reading time: ~5 hours.

The contiguous-memory and bucket-based data structures: arrays (static / dynamic / multi-dimensional), the amortized-O(1) doubling rule that makes a vector work, strings as encoded byte arrays, hash maps and hash sets, the load-factor / collision math behind them, stacks and queues with the call-stack analogy, and matrix manipulation tricks (rotate-in-place, spiral, transpose).

  1. Arrays: static, dynamic, multi-dimensional
  2. Dynamic array internals
  3. Strings: encoding, immutability, builders
  4. Hash maps and hash sets
  5. Hash collisions and the load factor
  6. Stacks and the call stack analogy
  7. Queues, deques, and circular buffers
  8. Matrix manipulation
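
As a rough illustration of the amortized-O(1) doubling rule mentioned above, the sketch below counts how many element copies a toy growable array performs while doubling its capacity. The class name `DoublingArray` and its `copies` counter are hypothetical names chosen for this example; they are not taken from the handbook's chapters.

```python
class DoublingArray:
    """Toy growable array that doubles capacity when full (illustrative only)."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None]
        self.copies = 0  # total elements moved during resizes

    def append(self, value):
        if self.size == self.capacity:
            # Full: allocate twice the space and copy every element over.
            self.capacity *= 2
            new_slots = [None] * self.capacity
            for i in range(self.size):
                new_slots[i] = self.slots[i]
            self.copies += self.size
            self.slots = new_slots
        self.slots[self.size] = value
        self.size += 1


arr = DoublingArray()
n = 1000
for i in range(n):
    arr.append(i)

# Resizes copy 1 + 2 + 4 + ... + 512 = 1023 elements in total, which is
# below 2n, so the average cost per append stays constant.
print(arr.copies, arr.copies < 2 * n)  # prints: 1023 True
```
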
Part 2 — Search & Sort (8 chapters) — linear/binary search, comparison & linear-time sorts, heap sort, quickselect

Part 2 — Search & Sort (8 chapters)

Audience: every interviewee — binary search alone shows up in roughly a quarter of all problems. Difficulty: Beginner-Intermediate. Total reading time: ~5 hours.

How to find things and how to order things. Linear search and when it's actually the right answer; the canonical binary search written without off-by-one bugs; the lower_bound / upper_bound / peak / rotated-array variants; the comparison-sort family (insertion, merge, quicksort with the production hybrids like Timsort and Introsort); heap sort and why n log n is the comparison-sort lower bound; the linear-time sorts (counting, radix, bucket); and quickselect for the top-k problems.

  1. Linear search and what it's good for
  2. Binary search: the canonical version
  3. Binary search variants: lower_bound, upper_bound, peaks, and rotated arrays
  4. Comparison sorts I: insertion sort and merge sort
  5. Comparison sorts II: quicksort, partition, and the production hybrids
  6. Heap sort and the n log n lower bound
  7. Linear-time sorts: counting, radix, bucket
  8. Quickselect: linear-time selection
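
One common way to sidestep off-by-one bugs in binary search is to work on a half-open range. The sketch below shows a `lower_bound` written that way; it is a minimal example under that convention and may differ from the version the chapter presents.

```python
def lower_bound(nums, target):
    """Return the first index i with nums[i] >= target, or len(nums) if none.

    Assumes nums is sorted in non-decreasing order.
    """
    lo, hi = 0, len(nums)  # search the half-open range [lo, hi)
    while lo < hi:
        mid = lo + (hi - lo) // 2
        if nums[mid] < target:
            lo = mid + 1  # nums[mid] is too small, so discard it
        else:
            hi = mid      # nums[mid] may be the answer, so keep it in range
    return lo
```

Because the loop invariant is "the answer lies in [lo, hi]", the function also behaves sensibly on an empty list and on targets larger than every element.
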
Part 3 — Two Pointers, Sliding Window, Prefix Sums (6 chapters) — the array-walking patterns that turn O(n²) into O(n)

Part 3 — Two Pointers, Sliding Window, Prefix Sums (6 chapters)

Audience: anyone whose nested-loop solutions keep timing out. Difficulty: Intermediate. Total reading time: ~4 hours.

Three closely related patterns that all amount to "walk the array smarter": opposite-ends two pointers (Container With Most Water, 3Sum), same-direction two pointers, fixed and variable sliding windows, prefix sums and difference arrays, and the prefix-sum + hash-map combo for "subarray sum equals K"-shaped problems.

  1. Two pointers: opposite ends
  2. Two pointers: same direction
  3. Sliding window: fixed size
  4. Sliding window: variable size
  5. Prefix sums and difference arrays
  6. The prefix-sum + hash-map combo
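
The prefix-sum + hash-map combo can be sketched for the "subarray sum equals K" shape as follows. The function name `count_subarrays_with_sum` is an assumption made for this example, not an identifier from the handbook.

```python
def count_subarrays_with_sum(nums, k):
    """Count contiguous subarrays whose elements sum to k in one pass."""
    seen = {0: 1}  # prefix-sum value -> how many prefixes had that sum so far
    prefix = 0
    count = 0
    for x in nums:
        prefix += x
        # A subarray ending here sums to k exactly when an earlier prefix
        # equals prefix - k.
        count += seen.get(prefix - k, 0)
        seen[prefix] = seen.get(prefix, 0) + 1
    return count
```

Unlike a sliding window, this approach stays correct when the input contains negative numbers, which is usually the reason to reach for it.
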
Part 4 — Stack & Queue Patterns (5 chapters) — monotonic stack/deque, min/max stack, expression parsing, queue-from-stacks

Part 4 — Stack & Queue Patterns (5 chapters)

Audience: anyone struggling with Next-Greater-Element and sliding-window-maximum problems. Difficulty: Intermediate. Total reading time: ~3 hours.

Stack and queue as algorithmic patterns, not just data structures. Monotonic stacks (Daily Temperatures, Largest Rectangle in Histogram), monotonic deques (Sliding Window Maximum), min/max stacks (O(1) min query under push/pop), expression parsing (Shunting-Yard / RPN), and the queue-from-stacks amortization argument.

  1. Monotonic stack
  2. Monotonic deque
  3. Min and max stacks
  4. Expression evaluation and parsing
  5. Queue from stacks
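
To make the monotonic-stack idea concrete, here is a minimal sketch for a Daily Temperatures-style problem: for each day, how many days until a warmer one. It is one plausible formulation, not necessarily the chapter's.

```python
def daily_temperatures(temps):
    """For each index, return the distance to the next strictly warmer day (0 if none)."""
    answer = [0] * len(temps)
    stack = []  # indices still waiting for a warmer day; their temps are non-increasing
    for i, t in enumerate(temps):
        # Today resolves every waiting day that is colder than today.
        while stack and temps[stack[-1]] < t:
            j = stack.pop()
            answer[j] = i - j
        stack.append(i)
    return answer


print(daily_temperatures([73, 74, 75, 71, 69, 72, 76, 73]))
# prints: [1, 1, 4, 2, 1, 1, 0, 0]
```

Each index is pushed once and popped at most once, so the whole pass is O(n) despite the nested loop.
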
Part 5 — Linked Lists (6 chapters) — pointer rewiring, reversal, k-group reversal, Floyd's cycle, merging, LRU cache

Part 5 — Linked Lists (6 chapters)

Audience: anyone who's drawn boxes-and-arrows on a whiteboard and gotten lost. Difficulty: Intermediate. Total reading time: ~4 hours.

Linked lists are pointer surgery. Sentinel-node patterns, the canonical iterative reversal (and the recursive cousin), reverse-in-groups-of-k, Floyd's tortoise-and-hare cycle detection (and why the cycle-start formula works), merging sorted lists, and the LRU cache as the canonical hash-map + doubly-linked-list combo.

  1. Linked list fundamentals: sentinels, pointer rewiring, doubly-linked design
  2. Reversal patterns
  3. Reverse in groups of k
  4. Cycle detection (Floyd's tortoise and hare)
  5. Merging linked lists
  6. LRU cache: hash map plus doubly linked list
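
Floyd's tortoise-and-hare, including the cycle-start step, can be sketched as below. The `Node` class is a hypothetical minimal list node defined only so the example is self-contained.

```python
class Node:
    """Minimal singly linked list node (illustrative)."""

    def __init__(self, val):
        self.val = val
        self.next = None


def cycle_start(head):
    """Return the first node of the cycle, or None if the list is acyclic."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            # Phase 2: one pointer restarts at head, the other stays at the
            # meeting point. Moving both one step at a time, they meet at the
            # node where the cycle begins.
            probe = head
            while probe is not slow:
                probe = probe.next
                slow = slow.next
            return probe
    return None
```

The function uses O(1) extra space, in contrast to the alternative of storing visited nodes in a hash set.
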
Part 6 — Trees & Heaps (11 chapters) — traversals, Morris, BFS, heaps, BST, AVL/RB, tries, tree DP, segment trees

Part 6 — Trees & Heaps (11 chapters)

Audience: anyone past the linear-data-structure phase of prep. Difficulty: Intermediate-Advanced. Total reading time: ~7 hours.

The hierarchical structures and the priority queue. Binary tree fundamentals, the three depth-first traversals (pre/in/post — both recursive and iterative), Morris traversal for O(1) space, level-order BFS, heaps and priority queues (heapify in O(n)), binary search trees, AVL rotations, a red-black overview, tries, the tree-DP primer (post-order with side state), and an introduction to segment trees for range queries.

  1. Binary tree fundamentals
  2. Tree traversals: pre, in, post
  3. Morris traversal: O(1)-space inorder by threading
  4. Level-order traversal: BFS on trees
  5. Heaps and priority queues
  6. Binary search trees
  7. AVL trees and rotations
  8. Red-black trees: an overview
  9. Tries
  10. Tree DP primer: post-order with side state
  11. Segment trees
Part 7 — Recursion & Backtracking (7 chapters) — recursion patterns, the template, subsets/perms, N-Queens, Sudoku, randomized algos

Part 7 — Recursion & Backtracking (7 chapters)

Audience: anyone whose subset/permutation/combination solutions feel like guesswork. Difficulty: Intermediate-Advanced. Total reading time: ~5 hours.

Backtracking is just DFS with state-restoration discipline. The recursion patterns (linear, tree, divide-and-conquer); the backtracking template you can adapt to any constraint problem; subsets, combinations, and permutations; N-Queens with pruning; Sudoku with constraint propagation and forward checking; word search on a grid; and Fisher-Yates, reservoir sampling, and rejection sampling for randomized algorithms.

  1. Recursion patterns: linear, tree, and divide-and-conquer
  2. The backtracking template
  3. Subsets, combinations, permutations
  4. N-Queens: pruning and constraint propagation
  5. Sudoku solver: constraint propagation and forward checking
  6. Word search and grid backtracking
  7. Randomized algorithms: Fisher-Yates, reservoir sampling, rejection sampling
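
The "DFS with state-restoration discipline" framing can be illustrated with a small subsets generator. This is a generic choose / explore / un-choose sketch; the handbook's own template may be structured differently.

```python
def subsets(items):
    """Return every subset of items, built by backtracking."""
    result = []
    path = []  # the choices made on the current DFS branch

    def dfs(start):
        result.append(path[:])          # record a copy of the current choices
        for i in range(start, len(items)):
            path.append(items[i])       # choose
            dfs(i + 1)                  # explore
            path.pop()                  # un-choose: restore state for the next branch

    dfs(0)
    return result


print(subsets([1, 2, 3]))
# prints: [[], [1], [1, 2], [1, 2, 3], [1, 3], [2], [2, 3], [3]]
```
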
Part 8 — Graphs (13 chapters) — BFS/DFS, components, topo, cycles, bipartite, union-find, Dijkstra, Bellman-Ford, MSTs

Part 8 — Graphs (13 chapters)

Audience: anyone preparing for FAANG-tier interviews — graph problems are the differentiator. Difficulty: Intermediate-Advanced. Total reading time: ~9 hours.

The graph chapters are interview-grade end-to-end. Adjacency-list vs adjacency-matrix representations; BFS and DFS as orthogonal traversal templates; connected components and flood fill (Number of Islands); topological sort via Kahn's queue and via DFS post-order reverse; cycle detection on directed and undirected graphs; bipartite checking; union-find with path compression and union-by-rank; Dijkstra's algorithm; Bellman-Ford and negative cycles; and minimum spanning trees via both Kruskal and Prim.

  1. Graph representation
  2. Breadth-first search
  3. Depth-first search
  4. Connected components and flood fill
  5. Topological sort: Kahn's algorithm
  6. Topological sort: DFS post-order reverse
  7. Cycle detection in graphs
  8. Bipartite check
  9. Union-Find: parent forests, path compression, and union by rank
  10. Dijkstra's shortest-path algorithm
  11. Bellman-Ford and negative cycles
  12. Minimum spanning tree: Kruskal's algorithm
  13. Minimum spanning trees: Prim's algorithm
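
As an example of one technique from this part, the sketch below shows union-find with path compression and union by rank. The class layout and the `components` counter are assumptions made for illustration.

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # upper bound on each root's tree height
        self.components = n

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra           # attach the shallower tree under the deeper one
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        self.components -= 1
        return True
```

A `union` call that returns False means the edge would close a cycle, which is the check Kruskal's algorithm relies on.
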
Part 9 — Dynamic Programming (15 chapters) — memo↔tab, 1D/grid/interval/tree/bitmask, knapsack, LCS, edit distance, LIS, palindromes

Part 9 — Dynamic Programming (15 chapters)

Audience: anyone for whom "DP problems just don't click"; this is the part most candidates fear. Difficulty: Advanced. Total reading time: ~10 hours.

The most demanding part of the handbook. Build DP up from recursion (top-down memoization) and re-derive the bottom-up tabulation; 1D-state DPs (Climbing Stairs, House Robber, decision DP); the string-prefix decision DPs (Decode Ways, Word Break); 0/1 and unbounded knapsacks; longest-common-subsequence and edit distance; LIS in O(n²) and the patience-sort O(n log n) variant; palindrome DP; interval DP (matrix chain, burst balloons); grid DP; tree DP; and the bitmask DP family for "all subsets" problems.

  1. Dynamic Programming: From Recursion to Memoization
  2. DP: bottom-up tabulation
  3. Dynamic programming on a 1D state
  4. Decode Ways and Word Break: string-prefix decision DP
  5. 0/1 knapsack
  6. Unbounded knapsack: when items can be picked over and over
  7. Longest common subsequence
  8. Edit distance
  9. Longest Increasing Subsequence: the quadratic DP
  10. LIS: patience sort
  11. Palindrome DP
  12. Interval DP: matrix chain and burst balloons
  13. Grid DP: forward fills, backward survives
  14. Tree DP: states that travel up the call stack
  15. Bitmask DP
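
The O(n log n) LIS variant can be sketched with the standard library's `bisect` module. This minimal version returns only the length and assumes a strictly increasing subsequence is wanted.

```python
from bisect import bisect_left


def lis_length(nums):
    """Length of the longest strictly increasing subsequence in O(n log n)."""
    # tails[k] is the smallest possible tail of an increasing
    # subsequence of length k + 1 seen so far; tails stays sorted.
    tails = []
    for x in nums:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[i] = x      # x gives a smaller tail for length i + 1
    return len(tails)


print(lis_length([10, 9, 2, 5, 3, 7, 101, 18]))  # prints: 4
```

Note that `tails` is not itself a valid subsequence in general; recovering the actual sequence needs extra bookkeeping.
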
Part 10 — Greedy (5 chapters) — when local choices win, intervals, activity selection, Huffman, jump games

Part 10 — Greedy (5 chapters)

Audience: anyone who's been burned by a greedy that "looked right" and got Wrong Answer. Difficulty: Intermediate-Advanced. Total reading time: ~3 hours.

When local choices yield global optima, and how to prove it. Greedy thinking framed against DP; interval scheduling (the sorting comparator is the algorithm); activity selection and the task-scheduler family; Huffman encoding; and Jump Games / Gas Station as canonical "scan once, maintain a running invariant" problems.

  1. Greedy thinking: when local choices win, and when they don't
  2. Interval scheduling: the comparator is the algorithm
  3. Activity selection and the task-scheduler family
  4. Huffman encoding
  5. Jump games and gas station
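
The claim that "the comparator is the algorithm" can be seen in a short interval-scheduling sketch: sort by end time, then take every interval that starts at or after the last chosen end. The example assumes intervals are `(start, end)` pairs and that touching endpoints do not count as overlapping.

```python
def max_non_overlapping(intervals):
    """Return the maximum number of pairwise non-overlapping intervals."""
    count = 0
    last_end = float("-inf")
    # Sorting by end time is the whole trick: the interval that finishes
    # earliest leaves the most room for the rest.
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:
            count += 1
            last_end = end
    return count


print(max_non_overlapping([(1, 10), (2, 3), (4, 5)]))  # prints: 2
```

Sorting by start time instead would pick `(1, 10)` first here and return 1, which is the kind of "looked right" greedy the part warns about.
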
Part 11 — Bit Manipulation (4 chapters) — the bit-ops cookbook, XOR patterns, bitmask techniques, performance tricks

Part 11 — Bit Manipulation (4 chapters)

Audience: anyone preparing for low-level / performance-oriented interviews (HFT, embedded, kernel, GPU). Difficulty: Intermediate. Total reading time: ~2 hours.

The bit-ops cookbook (set/clear/toggle, lowest-set-bit, popcount); XOR patterns (Single Number I/II/III, Missing Number); bitmask techniques as compact subset state; and bit-level performance tricks for the "your code is correct, just make it 10× faster" interview round.

  1. Bit operations cookbook
  2. XOR patterns
  3. Bitmask techniques
  4. Bit tricks for performance
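
A few of the cookbook entries named above can be sketched in a handful of lines. The function names are illustrative, and `popcount` assumes a non-negative integer because Python integers are unbounded.

```python
def lowest_set_bit(x):
    """Isolate the lowest set bit of x, e.g. 0b1100 -> 0b0100."""
    return x & -x


def popcount(x):
    """Count set bits of a non-negative integer by clearing the lowest one each step."""
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count


def single_number(nums):
    """Return the element that appears once when every other element appears twice."""
    acc = 0
    for v in nums:
        acc ^= v  # equal values cancel because v ^ v == 0
    return acc


print(lowest_set_bit(12), popcount(0b1011), single_number([4, 1, 2, 1, 2]))
# prints: 4 3 4
```
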
Part 12 — Strings & Pattern Matching (6 chapters) — naive matching, Rabin-Karp, KMP, Z-array, Aho-Corasick, suffix arrays

Part 12 — Strings & Pattern Matching (6 chapters)

Audience: anyone interviewing where strings are a focus area (search infra, NLP infra, IDEs, compiler tooling). Difficulty: Intermediate-Advanced. Total reading time: ~4 hours.

Substring search beyond the naive O(n·m) baseline. Rabin-Karp with rolling hashes; KMP and the failure function (and why it generalizes the prefix-function); the Z-algorithm; Aho-Corasick for matching many patterns in one pass; and an introduction to suffix arrays for "all substring" queries.

  1. Naive string matching
  2. Rabin-Karp and rolling hashes
  3. KMP and the failure function
  4. Z-algorithm
  5. Aho-Corasick: many patterns, one pass
  6. Suffix arrays: a sorted index of every suffix
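
To show how the failure function lets KMP avoid re-scanning text, here is a compact sketch. The names `failure_function` and `find_all` are chosen for this example and are not taken from the chapter.

```python
def failure_function(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1] that is also its suffix."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(text, pattern):
    """Return the start index of every (possibly overlapping) match of pattern in text."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    matches, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep the longest border so overlaps are found
    return matches


print(failure_function("ababaca"))   # prints: [0, 0, 1, 2, 3, 0, 1]
print(find_all("abababa", "aba"))    # prints: [0, 2, 4]
```
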
Part 13 — Design the Data Structure (7 chapters) — LRU/LFU, min stacks, hit counter, trie autocomplete, Twitter feed, game state

Part 13 — Design the Data Structure (7 chapters)

Audience: SDE2+ candidates — "design X" is the staple senior-coding-round question type. Difficulty: Intermediate-Advanced. Total reading time: ~5 hours.

The bridge between DSA and HLD: how to compose primitives into APIs that meet a per-operation complexity contract. The full LRU treatment with concurrency and multi-tier framing; LFU via frequency-bucketed doubly-linked lists; min stacks and max-frequency stacks; hit counters and rate limiters; trie-backed autocomplete; Twitter-feed design; and a game-state design vignette.

  1. LRU cache: design framing, concurrency, and multi-tier deployment
  2. LFU cache: hash map plus frequency-bucketed doubly linked lists
  3. Min stack and max-frequency stack
  4. Hit counter and rate limiter
  5. Trie autocomplete
  6. Twitter feed design
  7. Game state design
Part 14 — Interview Framework (12 chapters) — pattern recognition, clarifying questions, communicating, complexity, mocks, per-company tracks

Part 14 — Interview Framework (12 chapters)

Audience: anyone within a month of an actual interview window. Difficulty: All levels. Total reading time: ~5 hours.

The meta-skills that turn a correct solution into a passing one. Pattern-recognition drills; the first-five-minutes clarifying-questions script; how to narrate while you code; how to discuss complexity convincingly; running productive mock interviews; the most common pitfalls; Amazon Leadership Principles (brief and narration); the Meta AI-assisted round (format, prompting tactics, common failures); and the per-company tracks index that points at company-specific reading lists.

  1. Pattern recognition: the fastest skill to develop
  2. The first five minutes: clarifying questions
  3. Communicating during the interview
  4. Complexity discussions
  5. The mock interview process
  6. Common pitfalls
  7. Amazon Leadership Principles, briefly
  8. Amazon Leadership Principles narration
  9. Meta's AI round: the format
  10. Meta's AI round: prompting tactics
  11. Meta's AI round: common failures
  12. Per-company tracks: index and how to use

DSA pattern decision pages (37 pages)

Audience: anyone deciding "which pattern fits this problem?" mid-interview. Difficulty: Intermediate. Total reading time: ~10 hours; each page is a 15-30 min decision aid.

Each page is a focused decision-comparison: a Mermaid decision tree, a "pick A vs pick B" table, archetype problems on each side, and references. Click to expand the full list.

All 37 pattern decision pages
  1. Recursion vs Iteration
  2. Memoization vs Tabulation
  3. Hash Map vs Sorted Map
  4. Array vs Linked List
  5. Stack vs Queue vs Deque
  6. Heap vs Sorted Array vs BST
  7. Two-Heap Median Archetypes
  8. k-Way Merge: Cursor Heap
  9. BFS vs DFS on Graphs
  10. Shortest-Paths Decision Tree
  11. Kruskal vs Prim
  12. Quicksort vs Mergesort vs Heapsort
  13. Binary Search vs Linear Scan
  14. Two Pointers vs Sliding Window
  15. Iterative vs Morris vs Recursive Traversal
  16. Trie vs Hash Map vs Sorted Set
  17. Union-Find vs DFS
  18. Topological Sort: Kahn vs DFS
  19. Greedy vs DP
  20. Backtracking vs DP
  21. Recursion vs DP
  22. When to Use Bit Manipulation
  23. BFS vs DFS on Trees
  24. Quickselect vs Heap for Top-K
  25. Sliding Window Archetypes
  26. Two-Pointer Archetypes
  27. Prefix-Sum Archetypes
  28. Monotonic Stack Archetypes
  29. Monotonic Deque Archetypes
  30. In-Place Linked-List Reversal
  31. Cyclic-Sort Archetypes
  32. Top-K: Heap or Quickselect
  33. Merge-Intervals Archetypes
  34. Tree-DP Archetypes
  35. Bitmask-DP Archetypes
  36. Backtracking Template & Pruning
  37. Sweep-Line Archetypes

DSA long-form editorials (5 pages)

Audience: anyone who's solved an "easy" problem and wants the senior-engineer-level analysis behind it. Difficulty: Intermediate. Total reading time: ~2 hours.

Five canonical interview problems treated as full editorial-style essays: every approach (brute force → optimized → optimal), the proof of correctness, the edge cases, the production framing, and the follow-up variants.

All 5 long-form editorials
  1. LC-001 — Two Sum (full editorial under LC-001-two-sum/)
  2. LC-003 — Longest Substring Without Repeating Characters (full editorial under LC-003-longest-substring-without-repeating-characters/)
  3. LC-005 — Longest Palindromic Substring (full editorial under LC-005-longest-palindromic-substring/)
  4. LC-011 — Container With Most Water (full editorial under LC-011-container-with-most-water/)
  5. LC-015 — 3Sum (full editorial under LC-015-3sum/)

DSA interactive widget specs (46 YAML files)

Audience: contributors authoring or editing the website's interactive widgets. Format: YAML; rendered as interactive animations on dsa.handbook.academy.

Each widget is described as keyframe data plus narration in YAML. The website's renderer turns each spec into an interactive (play/pause/scrub/step) animation alongside the chapter prose. Click to expand the full list.

Editorial-tied widgets (5)
  1. e-LC001 — Two Sum walkthrough
  2. e-LC003 — Longest Substring walkthrough
  3. e-LC005 — Longest Palindrome walkthrough
  4. e-LC011 — Container With Most Water walkthrough
  5. e-LC015 — 3Sum walkthrough
Pattern widgets (41)
  1. w-01 — Recursion call stack
  2. w-02 — Hash table
  3. w-03 — Matrix rotation
  4. w-04 — Binary search
  5. w-05 — Sorting visualizer
  6. w-06 — Quicksort partition
  7. w-07 — Quickselect
  8. w-08 — Two-pointer 3Sum
  9. w-09 — Sliding window expansion
  10. w-10 — Prefix-sum cumulative
  11. w-11 — Monotonic stack
  12. w-12 — Amortized queue via stacks
  13. w-12 — Monotonic deque
  14. w-13 — Linked-list pointer rewiring
  15. w-14 — Floyd cycle
  16. w-15 — LRU cache
  17. w-16 — Tree traversal animator
  18. w-17 — Morris thread
  19. w-18 — Heap operations
  20. w-19 — BST rotations
  21. w-20 — Trie
  22. w-21 — Segment tree
  23. w-22 — Backtracking tree
  24. w-23 — N-Queens
  25. w-24 — Sudoku grid
  26. w-25 — Graph BFS
  27. w-26 — Graph DFS
  28. w-27 — Topological sort (Kahn)
  29. w-28 — Union-Find
  30. w-29 — Dijkstra
  31. w-30 — MST: Kruskal & Prim
  32. w-31 — DP table fill
  33. w-32 — Knapsack fill
  34. w-33 — LIS via patience
  35. w-34 — Bitmask DP
  36. w-35 — Prefix-sum + hash combo
  37. w-36 — Interval scheduling
  38. w-37 — Bellman-Ford
  39. w-38 — KMP
  40. w-39 — Huffman encoding
  41. w-40 — Z-array

Widgets are catalogued in content/dsa/widgets/_widget-registry.yml; see content/dsa/widgets/README.md for authoring conventions.


Study plans

You don't need one — pick any chapter and start reading. These are here if you prefer a pre-built route, typically because you're preparing for a specific interview window or filling a specific gap.

SDE1 → SDE2 — 6-week interview prep

Goal: pass a 45-60 minute system-design screen at a mid-to-senior level. Roughly 8-10 hours of reading per week.

| Week | Focus | Chapters |
|---|---|---|
| 1 | Foundations | Part 0 + Part 1 (12 chapters, ~7 hrs) |
| 2 | Building blocks I | Part 2 chapters 0-7: load balancers, proxies, CDN, cache, SQL, NoSQL, partitioning, replication |
| 3 | Building blocks II | Part 2 chapters 8-15: queues, pub/sub, real-time, rate limiting, service mesh, blob storage, geo, edge |
| 4 | Case studies (core) | Pick 5 from Part 8 chapters 0-9: URL shortener, rate limiter, chat, feed, web crawler, autocomplete |
| 5 | Case studies (your target) | Pick 5 more from Part 8 relevant to your target company (see Company-Specific Flavors) |
| 6 | Interview mechanics | Part 11 + top 5 most-cited pages in Trade-offs Library |

SDE2 → Senior — 3-month deep dive

Goal: operate at Senior level, own cross-team architecture, pass loops at Senior+ bars.

| Phase | Focus | Duration |
|---|---|---|
| 1 | Foundations + building blocks | 3 weeks: Parts 0-2 (28 chapters) |
| 2 | Distributed theory + data + architecture | 4 weeks: Parts 3, 4, 5 (32 chapters) |
| 3 | Case studies (all 56) | 4 weeks: Part 8 |
| 4 | Reliability, security, AI, frontier, interview | 2 weeks: Parts 6, 7, 9, 10, 11 + Trade-offs Library |

Full curriculum — Staff+ preparation (6 months)

Read everything in order. Use the end-of-chapter questions for active recall. Averaging one 25-minute chapter per day gets you through the whole thing in about 6 months, with buffer for harder chapters and re-reading.

AI/ML-only track (4 weeks)

If you already know the fundamentals and want to become fluent specifically in AI-systems design:

| Week | Focus | Chapters |
|---|---|---|
| 1 | LLM serving + RAG | Part 9 chapters 0-2 |
| 2 | Agents + evaluation | Part 9 chapters 3-8 |
| 3 | AI case studies | Part 8 chapters 30-37 (ChatGPT, RAG, coding agent, Perplexity, voice, moderation, semantic cache, model router) |
| 4 | ML fundamentals | Part 9 chapters 9-14 + Recommendation System case study |

Interview-triage track (one weekend)

If you have a loop on Monday:


Project statistics

| Metric | HLD Handbook | DSA Handbook | Repo total |
|---|---|---|---|
| Parts | 12 + Trade-offs Library | 15 | 27 + library |
| Teaching chapters | 159 | 120 | 279 |
| Decision/pattern pages | 22 trade-offs | 37 patterns | 59 |
| Long-form editorials | | 5 | 5 |
| Interactive widget specs | | 46 | 46 |
| Sibling code samples (per language) | | 155 problems × 4 langs ≈ 620 files | 620 |
| Total Markdown pages | 181 | 162 | 343 |
| Total words | ~773,000 | ~470,000 | ~1,240,000 |
| Mermaid diagrams | 719 | 226 | 945 |
| Citations to primary sources | 3,100+ | hundreds | 3,500+ |
| Estimated total reading time | ~110 hours | ~60 hours | ~170 hours |
| Equivalent printed book page count | ~2,400 pages | ~1,500 pages | ~3,900 pages |

For comparison, the HLD book alone is longer than Designing Data-Intensive Applications (~600 pages) + Alex Xu's System Design Interview Vol. 1 (~300 pages) + Vol. 2 (~340 pages) combined. The DSA book is comparable in length to Cracking the Coding Interview (~700 pages) plus Elements of Programming Interviews (~480 pages).

Quality standards

Every chapter in this repository passes 7 automated CI checks on every PR. The validators run against both books and dispatch on a per-book schema:

  1. markdownlint — Markdown style and structure conformance.
  2. typos — source-code spell-check with a project-specific allowlist for technical terms.
  3. Citation integrity — every [^1]-style footnote has a corresponding [^1]: source definition; every citation is a real URL; no orphan citations.
  4. Frontmatter validation — HLD chapters declare title, difficulty, prerequisites, date_created, date_updated, reading_time_minutes, tags (canonical taxonomy), and technologies (curated allowlist). DSA chapters declare title, slug, part, chapter, difficulty, languages (subset of python/java/cpp/go), canonical_test, widgets, and ladder (referencing LC-NNN IDs in _problem-registry.yml). Editorials and pattern decision pages have their own thinner schemas. See scripts/check-frontmatter.mjs for the per-book branches.
  5. Mermaid diagram validation — all 945 diagrams across both books must parse with @mermaid-js/mermaid-cli on CI so broken syntax doesn't ship.
  6. Vale — prose-style linter for voice, passive voice, weasel words, and banned phrases. Custom rules in .vale.ini.
  7. lychee — external link checker run weekly; flags rotted URLs so citations stay valid.
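
To give a feel for what the citation-integrity check in item 3 verifies, here is a minimal Python sketch of footnote matching. It is not the repository's validator (that lives in scripts/check-citations.mjs and is written in JavaScript); the regular expressions and the `check_footnotes` name are assumptions made for illustration, and URL validation is left out.

```python
import re

# An inline reference such as [^1] that is not immediately followed by a colon.
FOOTNOTE_REF = re.compile(r"\[\^([^\]]+)\](?!:)")
# A definition such as "[^1]: https://..." at the start of a line.
FOOTNOTE_DEF = re.compile(r"^\[\^([^\]]+)\]:", re.MULTILINE)


def check_footnotes(markdown):
    """Report footnote labels that are referenced but undefined, or defined but unused."""
    defined = set(FOOTNOTE_DEF.findall(markdown))
    referenced = set(FOOTNOTE_REF.findall(markdown))
    return {
        "missing_definitions": sorted(referenced - defined),
        "orphan_definitions": sorted(defined - referenced),
    }


doc = (
    "Raft elects a leader.[^1] Paxos came first.[^2]\n"
    "\n"
    "[^1]: https://example.com/raft\n"
    "[^3]: https://example.com/unused\n"
)
print(check_footnotes(doc))
# prints: {'missing_definitions': ['2'], 'orphan_definitions': ['3']}
```

A sketch like this would miss an inline reference that happens to be followed directly by a colon in running prose, so a production check needs a stricter parser.
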

CI configuration lives in .github/workflows/content-ci.yml. Validator scripts are in scripts/.

Beyond automation, every chapter is reviewed for:

  • Internal consistency (terminology matches Part 1 definitions)
  • Difficulty calibration (a "Beginner" chapter doesn't assume Staff-level context)
  • Diagram quality (Mermaid, not screenshots; captioned; accessibility-tagged)
  • Citation quality (primary sources preferred over summarizing blog posts)

Contributing

Contributions of all sizes are welcome, from a typo fix to a full chapter. Read CONTRIBUTING.md for the full workflow. A short summary:

Contribution paths

| Time | What you can do | Issue required? |
|---|---|---|
| 5 min | Fix a typo or dead link | No |
| 15 min | Add a missing citation or update an out-of-date number | No |
| 30 min | Add a real-world example or clarify a confusing paragraph | No |
| 1 hour | Create a Mermaid diagram for an existing chapter | Optional |
| 2-4 hours | Review a chapter for technical accuracy and leave feedback | Yes |
| 4-8 hours | Write a full chapter from an outline | Yes, required |
| Translator | Translate one chapter or the full handbook into your language | Yes |

How to contribute

  1. Read the STYLE_GUIDE.md for voice, structure, diagram conventions, and citation format.

  2. Open an issue first for anything bigger than 30 minutes of work — this ensures you don't duplicate in-flight work.

  3. Fork, branch, edit. Use descriptive branch names like fix/raft-quorum-math or add/mcp-protocol-chapter.

  4. Run validators locally (optional):

    npm install
    npm run check:all

    If you skip this, CI will run them on your PR anyway and tell you what to fix.

  5. Submit a pull request using the PR template. Small fix? A one-sentence description is fine. Full chapter? Describe the pedagogical approach and list your primary sources.

  6. Respond to review. A maintainer will review within 7 days (usually faster). For content chapters, expect at least one round of technical review.

What makes a great contribution

  • Correctness. If you're citing a number, cite the primary source. If you're claiming a property (like "Raft guarantees linearizability"), cite the paper.
  • Clarity. Write for the difficulty tier declared in the chapter's frontmatter. Don't introduce Part-7 concepts in a Part-0 chapter.
  • Opinion with evidence. If you think the chapter should recommend something different, make the case with citations, not vibes.
  • Pedagogical structure. Intro → first principles → diagram → worked example → trade-offs → production gotchas → references. Deviate only when the topic genuinely demands it.

Reviewers wanted

If you have operated one of the systems we cover in Part 8 (payment processing, real-time chat, feeds, video streaming, etc.), your review is worth more than a month of solo research. Open an issue with your expertise area and which chapter you'd like to review.

Private/sensitive concerns


Project structure

engineering-handbook/
├── content/                           # Two open-source books (CC BY-SA 4.0)
│   ├── hld/                           # HLD Handbook — 181 pages
│   │   ├── part-0-prerequisites/          # 5 chapters
│   │   ├── part-1-core-fundamentals/      # 7 chapters
│   │   ├── part-2-building-blocks/        # 16 chapters
│   │   ├── part-3-distributed-systems-theory/  # 11 chapters
│   │   ├── part-4-data-systems/           # 10 chapters
│   │   ├── part-5-architecture-patterns/  # 11 chapters
│   │   ├── part-6-reliability-and-operations/  # 11 chapters
│   │   ├── part-7-security-at-scale/      # 10 chapters
│   │   ├── part-8-case-studies/           # 56 chapters
│   │   ├── part-9-ai-ml-system-design/    # 15 chapters
│   │   ├── part-10-emerging-patterns/     # 1 chapter
│   │   ├── part-11-interview-framework/   # 6 chapters
│   │   └── trade-offs/                    # 22 decision pages
│   └── dsa/                           # DSA Handbook — 120 chapters
│       ├── part-0-foundations/        # 7 chapters
│       ├── part-1-linear-data-structures/  # 8 chapters
│       ├── part-2-search-sort/        # 8 chapters
│       ├── part-3-pointers-window-prefix/  # 6 chapters
│       ├── part-4-stack-queue-patterns/  # 5 chapters
│       ├── part-5-linked-lists/       # 6 chapters
│       ├── part-6-trees-heaps/        # 11 chapters
│       ├── part-7-recursion-backtracking/  # 7 chapters
│       ├── part-8-graphs/             # 13 chapters
│       ├── part-9-dynamic-programming/  # 15 chapters
│       ├── part-10-greedy/            # 5 chapters
│       ├── part-11-bit-manipulation/  # 4 chapters
│       ├── part-12-strings-pattern-matching/  # 6 chapters
│       ├── part-13-design-the-data-structure/  # 7 chapters
│       ├── part-14-interview-framework/  # 12 chapters
│       ├── editorials/                # Per-LC long-form editorials
│       ├── patterns/                  # Pattern decision references
│       ├── widgets/                   # YAML specs for interactive widgets
│       ├── _problem-registry.yml      # Canonical LC-NNN registry
│       └── _widget-registry.yml       # Canonical w-NN/e-LCNNN registry
│
├── writing-guides/                    # Author handbooks (CC BY-SA 4.0)
│   ├── case-study-template.md         # Skeleton for Part 8 case studies
│   └── trade-off-template.md          # Skeleton for trade-off decision pages
│
├── scripts/                           # Content validators used by CI
│   ├── check-citations.mjs            # Footnote integrity
│   ├── check-frontmatter.mjs          # YAML frontmatter schema (per-book)
│   ├── check-mermaid.mjs              # Mermaid syntax validator
│   ├── content-stats.mjs              # Word / diagram / citation counts
│   ├── technologies.json              # Curated technology name taxonomy (HLD)
│   └── update-frontmatter-dates.mjs   # Bulk-touch date_updated on edited files
│
├── .github/
│   ├── workflows/
│   │   ├── content-ci.yml             # PR validation pipeline (both books)
│   │   └── stale.yml                  # Issue/PR hygiene
│   ├── ISSUE_TEMPLATE/                # Correction / Request / Writing templates
│   ├── PULL_REQUEST_TEMPLATE.md
│   ├── CODEOWNERS
│   ├── FUNDING.yml
│   ├── dependabot.yml
│   └── release.yml
│
├── STYLE_GUIDE.md                     # Voice, structure, diagram conventions
├── CONTRIBUTING.md                    # Full contribution workflow
├── CODE_OF_CONDUCT.md                 # Contributor Covenant v2.1
├── CONTRIBUTORS.md                    # All-contributors list
├── SECURITY.md                        # Security disclosure process
├── CITATION.cff                       # Citation metadata (Zenodo, academic tools)
├── LICENSE                            # CC BY-SA 4.0 (everything in this repo)
├── package.json                       # Dev dependencies for validators
├── lychee.toml                        # Link-check configuration
├── .vale.ini                          # Prose style rules
├── .markdownlint.json                 # Markdown style rules
└── .typos.toml                        # Spell-check allowlist
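
To sanity-check the per-part chapter counts annotated in the tree against a local clone, a short script along these lines might help. It is a minimal sketch, not one of the repo's validators: the `chapters_per_part` helper is hypothetical, and it assumes every `.md` file directly inside a `part-*` directory is one chapter, which may not hold if a part also contains index or README files.

```python
from pathlib import Path


def chapters_per_part(book_dir: Path) -> dict[str, int]:
    """Map each part-* directory name to the number of .md files inside it."""
    counts = {}
    for part in sorted(book_dir.glob("part-*")):
        if part.is_dir():
            # Assumption: each .md file directly inside a part is one chapter.
            counts[part.name] = sum(1 for _ in part.glob("*.md"))
    return counts
```

Called from the repo root as `chapters_per_part(Path("content/hld"))`, it returns one entry per part. Directories that do not start with `part-` (such as `trade-offs/` or `patterns/`) are skipped on purpose, so decision pages are not counted as chapters.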

Development setup

You don't need anything installed to contribute content — edit Markdown, push, let CI validate. But if you want to run the checks locally before pushing:

Prerequisites

  • Node.js 20+ (the .nvmrc file pins the exact version)
  • npm (ships with Node)
  • Optional: Vale for prose-style linting
  • Optional: lychee for link checking

Setup

git clone https://github.com/handbook-academy/engineering-handbook.git
cd engineering-handbook
nvm use                    # switches to the Node version pinned in .nvmrc
npm install                # installs markdownlint, @mermaid-js/mermaid-cli, etc.

Common commands

# Run all 7 content checks (same as CI)
npm run check:all

# Individual checks
npm run lint               # markdownlint
npm run check:citations    # footnote integrity
npm run check:frontmatter  # YAML schema
npm run check:mermaid      # Mermaid diagram syntax
npm run check:typos        # spell-check
npm run check:links        # lychee link check (slow; runs weekly)

# Optional
npm run prose              # Vale prose-style linter (requires Vale installed)
npm run stats              # word / diagram / citation counts
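
For a sense of what "footnote integrity" means in practice, here is a minimal sketch of that kind of check. It is not the code in scripts/check-citations.mjs, which is JavaScript and may apply different rules. It assumes standard Markdown footnote syntax, where `[^id]` is a reference and a line starting with `[^id]:` is its definition. The name `footnote_problems` is made up for illustration.

```python
import re

# Assumed syntax: "[^id]:" at the start of a line defines a footnote,
# and any other "[^id]" is a reference to one.
DEFINITION = re.compile(r"^\[\^([^\]\s]+)\]:", re.MULTILINE)
REFERENCE = re.compile(r"\[\^([^\]\s]+)\]")


def footnote_problems(markdown: str) -> dict[str, set[str]]:
    """Return footnote ids that are referenced but undefined, or defined but unused."""
    defined = set(DEFINITION.findall(markdown))
    # Drop the definition markers so they are not counted as references.
    body = DEFINITION.sub("", markdown)
    referenced = set(REFERENCE.findall(body))
    return {
        "undefined": referenced - defined,
        "unused": defined - referenced,
    }
```

A chapter passes this sketch when both sets come back empty. One known limitation: it does not skip fenced code blocks, so a regex character class such as `[^abc]` inside a code sample would be misread as a footnote reference.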

Editor

Use Markdownlab (source) for editing chapters. It's a browser-based Markdown editor with live preview, Mermaid rendering, and the same callouts/footnotes/admonitions this handbook uses — no install, no setup.

Open any chapter directly in the editor by prefixing its GitHub URL with https://markdownlab.vercel.app/:

https://markdownlab.vercel.app/https://github.com/handbook-academy/engineering-handbook/blob/main/content/hld/part-1-core-fundamentals/00-scalability.md

That URL opens the chapter in Markdownlab pre-loaded and ready to edit. Paste your edits back into a fork → branch → PR.
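
If you open chapters this way often, the prefixing rule is easy to script. The helper below is a small sketch of it: `markdownlab_url` is a hypothetical name, and the check that the input is a GitHub URL is an added assumption rather than a documented Markdownlab requirement.

```python
EDITOR_PREFIX = "https://markdownlab.vercel.app/"
GITHUB_PREFIX = "https://github.com/"


def markdownlab_url(github_url: str) -> str:
    """Return the Markdownlab URL that opens the given GitHub file URL."""
    # Assumption: only github.com URLs are meaningful here, so reject anything else early.
    if not github_url.startswith(GITHUB_PREFIX):
        raise ValueError(f"expected a GitHub URL, got {github_url!r}")
    return EDITOR_PREFIX + github_url
```

Passing the chapter's GitHub `blob/main/...` URL returns the same address shown above, ready to paste into a browser.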

The repo includes .editorconfig, .nvmrc, .markdownlint.json, .vale.ini, and .typos.toml so any editor that reads them picks up project settings automatically.


Governance

Benevolent maintainer model. The project is currently solo-maintained by @invincible04. Major architectural decisions (curriculum scope, licensing, CI strategy) are made by the maintainer after consultation with active contributors via Issues/Discussions.

Decision process:

  • Typo / dead link / factual correction: any maintainer can merge.
  • New chapter or major rewrite: requires an issue with pedagogical rationale and at least one maintainer approval.
  • Curriculum scope change (new part, renamed part, reordering): requires a Discussion thread open for at least 7 days before merge.
  • License change: requires consent from all contributors; CC BY-SA 4.0 for everything in this repository is a stable commitment.

Becoming a maintainer. Sustained, high-quality contributions over 3+ months can result in a maintainer invitation. The maintainer role includes merge rights on your areas of expertise and a vote on curriculum-scope decisions.

See CODEOWNERS for the current maintainer routing.


FAQ

Is this really free?

Yes, and specifically: everything in this repository (both books' content, writing guides, style guide, validator scripts) is released under CC BY-SA 4.0. Anyone (including us) can share and adapt it, but adaptations must carry the same license. That means nobody can lock the chapter text behind an exclusive paywall: whoever receives a copy or an adaptation may share it freely under the same terms. That applies to us, to a future company, and to anyone who forks this repo. You can always read both books free at handbook.academy.

Can I use this to build a paid course?

Yes. CC BY-SA 4.0 allows commercial use provided you:

  1. Attribute the original ("Adapted from The Engineering Handbook, CC BY-SA 4.0, https://github.com/handbook-academy/engineering-handbook")
  2. License your derivative under the same CC BY-SA 4.0 terms
  3. Indicate what you changed

You can teach courses, run boot camps, build YouTube channels, or sell books derived from this content — under share-alike. What you can't do is take the content, strip the attribution, and relicense it as proprietary.

Can I translate it into my language?

Please do. Translations are explicitly welcomed. Open an issue saying which language you'd like to translate into and which chapters you plan to start with. Translations live in parallel directories (e.g. content-hi/ for Hindi, content-zh/ for Chinese) under the same repository, or as a linked sister repo — your call.

How is this different from `donnemartin/system-design-primer`?

system-design-primer is a link-dump README that points at scattered blog posts and talks. It's great as a discovery tool. The HLD Handbook in this repo is inline content — every topic is taught fully within these pages. They're complementary: system-design-primer for breadth of resources, this handbook for depth of teaching.

How is this different from Alex Xu's books?

Alex Xu's books are excellent and we cite them in nearly every case study. The differences:

  • Scope: ~640 pages (Vols 1+2) vs ~2,400 equivalent pages of HLD here.
  • Freshness: Xu Vol. 2 was published in 2022 and doesn't cover LLMs, RAG, agents, CRDTs, post-quantum crypto, MCP, etc. This handbook does.
  • Format: Xu is case-study-focused. This handbook has 56 case studies plus 103 foundational HLD chapters that teach the building blocks.
  • License: Xu's books are copyrighted and paid. This handbook is CC BY-SA 4.0.
  • Community: Xu's books are one-author; this handbook accepts PRs.
  • Bonus: this repo also includes a 120-chapter DSA curriculum.

How is the DSA book different from NeetCode / LeetCode editorials / "Cracking the Coding Interview"?

  • Structured by pattern, not by problem. Every DSA chapter is a teaching chapter on a structure or pattern (e.g. monotonic stack, sliding-window-variable, segment tree). LeetCode problems show up as worked examples inside the chapter, not as the unit of study.
  • Four languages, side by side. Each canonical problem ships with sol.py, sol.java, sol.cpp, sol.go siblings. You read the chapter in Python, then click through to your interview language without re-deriving the logic.
  • 37 pattern decision pages. When a problem could be solved by recursion or iteration, BFS or DFS, sliding window or prefix sum — there's a pattern decision page that compares them and tells you which to reach for first. This is the part NeetCode skips.
  • Interactive widgets. On dsa.handbook.academy, the relevant chapters render live, animated visualisations for sliding windows, monotonic stacks, Morris traversal, quickselect partitioning, and so on. Useful when the words on the page aren't enough.
  • Open and editable. CTCI is paywalled and frozen at one author's voice. This is CC BY-SA 4.0 and accepts PRs.

Is this the same as the websites at hld.handbook.academy and dsa.handbook.academy?

The content is identical. The websites add a polished reading UI: full-text search, dark mode, syntax-highlighted code blocks, per-chapter diagram zoom, social cards, OG images, fast client-side navigation, and — on the DSA site — the live interactive widgets. Same content, better reading experience — and still free, with no sign-up.

Why don't you include an "Awesome" list of external resources?

We cite primary sources inline where they're relevant. A separate "awesome" list would duplicate that work and go stale faster. If a resource is worth reading, it's cited in a chapter's References section.

I want to submit a chapter on [topic X]. How?

Open an issue first, using the Writing issue template, and explain:

  1. Which part this chapter belongs in
  2. Why it belongs (not already covered? modern emerging topic? missing from Part X?)
  3. Proposed outline (1-2 paragraphs)
  4. Your primary sources

A maintainer will respond with scope feedback within a week. Once approved, fork → draft → PR. For Part 8 chapters use writing-guides/case-study-template.md; for trade-off pages use writing-guides/trade-off-template.md. All other chapters follow the skeleton documented in STYLE_GUIDE.md.

How do I cite this in an academic paper?

Use the BibTeX entry in the Citation section below, or the CITATION.cff file (which academic tools like Zotero and Zenodo can parse directly).


License

Everything in this repository (the 343 chapters, decision pages, and editorials across both books, plus the writing guides, style guide, validator scripts, templates, and configuration) is licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).

What CC BY-SA 4.0 means in practice:

You can:

  • Read it, share it, print it, screenshot it — forever.
  • Build courses, boot camps, YouTube channels, or books derived from it.
  • Translate it into any language.
  • Use it commercially.

You must:

  • Attribute the original (link back to this repo).
  • Release derivatives under the same CC BY-SA 4.0 license (share-alike).
  • Not relicense the content as proprietary or impose additional restrictions.

Nobody can lock the chapter text itself behind an exclusive paywall under CC BY-SA 4.0: anyone who receives it may keep sharing it under the same license. You can build anything on top of it, as long as the text itself stays open. Read both handbooks free at handbook.academy, with no sign-up and no paywall, ever.


Citation

If you reference this project in academic work, blog posts, books, or courses:

BibTeX

@misc{engineering-handbook,
  author       = {Soni, Aayush},
  title        = {The Engineering Handbook: Open-Source Engineering Curricula (HLD + DSA)},
  year         = {2026},
  publisher    = {GitHub},
  url          = {https://github.com/handbook-academy/engineering-handbook},
  note         = {HLD: 159 chapters + 22 trade-off pages, ~773K words. DSA: 120 chapters across 15 parts, ~470K words, 155 LeetCode problems with 4-language solutions. CC BY-SA 4.0.}
}

Prose

"The Engineering Handbook (HLD + DSA)" by Aayush Soni, https://github.com/handbook-academy/engineering-handbook, CC BY-SA 4.0.

Machine-readable

The CITATION.cff file in this repository is a Citation File Format descriptor. GitHub's "Cite this repository" button, Zenodo, Zotero, and academic reference managers can parse it directly.


Acknowledgments

This project stands on the shoulders of many, and we cite primary sources liberally throughout. A few that shaped the handbook most:

Canonical books — HLD:

Canonical books — DSA:

Open-source references:

  • Donne Martin's system-design-primer.
  • NeetCode's neetcode.io and the NeetCode 150 problem list, for the DSA practice ladder template.
  • LeetCode community — for the canonical problem corpus and the editorial conventions we adapt.
  • The Kubernetes, CNCF, and OpenTelemetry communities — for cloud-native and observability standards.
  • The Apache projects (Kafka, Cassandra, Hadoop, Spark, Flink) — for foundational distributed-systems code and docs.

Papers, RFCs, and postmortems:

  • SIGMOD / VLDB / OSDI / SOSP / NSDI papers — the bedrock of distributed systems.
  • IETF / W3C / IANA — for networking and web standards.
  • Every engineer who has written a public postmortem. Real outages taught us everything about graceful degradation, blast radius, and the limits of testing.

We teach the same concepts in our own words, with our own diagrams, and we cite the originals.


Community

Star history

If you find this useful, star the repo. Stars help signal to new contributors that the project is worth their time.



