This repository is a collection of resources and explanations covering various aspects of system design.

rvigneshwaran/system-design

System Design Basics

Overview

This repository is a collection of resources and explanations covering various aspects of system design. Whether you're a beginner or an experienced developer, this repository aims to provide insights into fundamental system design concepts. The content is split by topics and examples are provided in each section.

Part 1: Basics

  1. What is System Design?
  2. Horizontal vs. Vertical Scaling
  3. What is Capacity Estimation?
  4. What is HTTP?
  5. What is the Internet TCP/IP stack?
  6. What happens when you enter Google.com?
  7. What are Relational Databases?
  8. What are Database Indexes?
  9. What are NoSQL databases?
  10. What is a Cache?
  11. What is Thrashing?
  12. What are Threads?

Part 2: Load Balancing

  1. What is Load Balancing?
  2. What is Consistent Hashing?
  3. What is Sharding?

Part 3: Data Stores

  1. What are Bloom Filters?
  2. What is Data Replication?
  3. How are NoSQL databases optimized?
  4. What are Location-based Databases?
  5. Database Migrations

Part 4: Consistency vs. Availability

  1. What is Data Consistency?
  2. Data Consistency Levels
  3. Transaction Isolation Levels

Part 5: Message Queues

  1. What is a Message Queue?
  2. What is the publisher-subscriber model?
  3. What are event-driven systems?
  4. Database as a Message Queue

Part 6: DevOps Concepts

  1. What is a Single Point of Failure?
  2. What are Containers?
  3. What is Service Discovery and Heartbeats?
  4. How to avoid Cascading Failures?
  5. Anomaly Detection in Distributed Systems
  6. Distributed Rate Limiting

Part 7: Caching

  1. What is Distributed Caching?
  2. What are Content Delivery Networks?
  3. Write Policies
  4. Replacement Policies

Part 8: Microservices

  1. Microservices vs. Monoliths
  2. How monoliths are migrated

Part 9: API Gateways

  1. How are APIs designed?
  2. What are asynchronous APIs?

Part 10: Authentication Mechanisms

  1. OAuth
  2. Token Based Auth
  3. Access Control Lists and Rule Engines

Part 11: System Design Tradeoffs

  1. Pull vs. Push
  2. Memory vs. Latency
  3. Throughput vs. Latency
  4. Consistency vs. Availability
  5. Latency vs. Accuracy
  6. SQL vs. NoSQL databases

What is System Design?

System design involves creating the architecture of a complex software system to meet specified requirements.

Horizontal vs. Vertical Scaling

Horizontal scaling involves adding more machines, while vertical scaling involves increasing the resources of a single machine.

What is Capacity Estimation?

Capacity estimation is the process of estimating the resources a system needs, such as servers, storage, and bandwidth, to handle its expected load.
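
As a rough illustration, a back-of-the-envelope estimate can be written as plain arithmetic. Every input below (daily users, requests per user, payload size, peak factor) is a made-up assumption, not a measurement, and `estimate_capacity` is a hypothetical helper.

```python
# Back-of-the-envelope capacity estimate; every input is a hypothetical assumption.
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_capacity(daily_users, requests_per_user, bytes_per_request, peak_factor=2.0):
    """Return average QPS, peak QPS, and daily storage in GB for the assumed load."""
    daily_requests = daily_users * requests_per_user
    avg_qps = daily_requests / SECONDS_PER_DAY
    return {
        "avg_qps": avg_qps,
        "peak_qps": avg_qps * peak_factor,
        "storage_gb_per_day": daily_requests * bytes_per_request / 1e9,
    }


estimate = estimate_capacity(
    daily_users=1_000_000, requests_per_user=10, bytes_per_request=1_000
)
print(estimate)
```

With these assumed inputs the average works out to roughly 116 requests per second and 10 GB of new data per day; changing any input changes the result proportionally.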

What is HTTP?

HTTP (Hypertext Transfer Protocol) is the foundation of data communication on the World Wide Web.

What is the Internet TCP/IP stack?

The TCP/IP stack is the suite of communication protocols that enable network connectivity on the Internet.

What happens when you enter Google.com?

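When a user enters an address such as Google.com, the browser resolves the domain name through DNS, opens a TCP connection to the server (with a TLS handshake for HTTPS), sends an HTTP request, and renders the response.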

What are Relational Databases?

Relational databases organize data into tables and use SQL for querying and managing data.

What are Database Indexes?

Database indexes improve the speed of data retrieval operations on a database.
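
A small sketch with Python's built-in `sqlite3` module can make the effect visible. The `users` table, the `idx_users_email` index, and the `query_plan` helper are hypothetical names chosen for this example; the exact wording of the query plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT name FROM users WHERE email = ?"
plan_before = query_plan(lookup, ("user42@example.com",))  # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, ("user42@example.com",))  # index search
print(plan_before)
print(plan_after)
```

Before the index exists the plan reports a scan of the whole table; afterwards it reports a search that uses `idx_users_email`.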

What are NoSQL databases?

NoSQL databases are non-relational databases designed for scalability, flexibility, and performance.

What is a Cache?

A cache is a high-speed data storage layer that stores frequently accessed data.
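
The idea can be sketched as a tiny in-memory cache that sits in front of a slower data source. `ReadThroughCache` and `slow_lookup` are hypothetical names; `slow_lookup` stands in for a database or remote call, and the 60-second time-to-live is an arbitrary assumption.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory and fall back to a loader on a miss."""

    def __init__(self, loader, ttl_seconds=60.0):
        self._loader = loader
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (value, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            self.hits += 1
            return entry[0]
        # Miss or expired entry: load from the slow source and remember it.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (value, now + self._ttl)
        return value


def slow_lookup(key):
    # Stand-in for an expensive database query or network call.
    return key.upper()


cache = ReadThroughCache(slow_lookup)
first = cache.get("alice")   # miss: calls slow_lookup
second = cache.get("alice")  # hit: served from memory
print(first, second, cache.hits, cache.misses)
```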

What is Thrashing?

Thrashing occurs when a system spends more time swapping pages between memory and disk than doing useful work, so its performance deteriorates sharply.

What are Threads?

Threads are lightweight units of execution within a process that share its memory and can run concurrently.
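
A minimal sketch using Python's `threading` module shows several threads updating shared state. The counter, the worker count, and the `add_many` function are illustrative assumptions; the lock is there because the threads share the same memory.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(times):
    """Increment the shared counter, taking the lock for each update."""
    global counter
    for _ in range(times):
        with lock:
            counter += 1


workers = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(counter)  # 40000
```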

What is Load Balancing?

Load balancing distributes incoming network traffic across multiple servers to ensure optimal resource utilization.
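
One of the simplest distribution strategies is round robin, sketched below. `RoundRobinBalancer` and the server names are hypothetical; a production balancer would typically also track server health and load.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation, one per incoming request."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = itertools.cycle(servers)

    def next_server(self):
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.next_server() for _ in range(6)]
print(assignments)
```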

What is Consistent Hashing?

Consistent hashing is a technique for distributing data across a network in a way that minimizes reorganization when nodes are added or removed.
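
The sketch below shows one common way to build this: a hash ring with virtual nodes. `HashRing`, the node names, and the choice of 100 virtual nodes per server are assumptions made for illustration, not a reference implementation.

```python
import bisect
import hashlib


def _hash(key):
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Map keys to nodes by walking clockwise to the next point on the ring."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node owns several points so that load spreads more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.get_node(key) for key in keys}
ring.add_node("node-d")
after = {key: ring.get_node(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys moved")
```

Only the keys that now belong to the new node are reassigned; every other key stays where it was, which is the property that distinguishes this scheme from plain `hash(key) % node_count`.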

What is Sharding?

Sharding involves dividing a database into smaller, more manageable pieces called shards.

What are Bloom Filters?

Bloom filters are space-efficient probabilistic data structures used to test whether an element is a member of a set.
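
A minimal sketch of the structure follows. The bit-array size, the number of hash functions, and the use of seeded SHA-256 digests are all assumptions chosen for readability rather than performance.

```python
import hashlib


class BloomFilter:
    """Answer "definitely not present" or "possibly present" for set membership."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self._size = size_bits
        self._num_hashes = num_hashes
        self._bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by hashing the item with different seeds.
        for seed in range(self._num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self._size

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means the item was never added; True may be a false positive.
        return all(
            self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("alice")
seen.add("bob")
print(seen.might_contain("alice"), seen.might_contain("bob"))
```

An item that was added is always reported as possibly present. An item that was never added is usually reported as absent, but can occasionally collide with bits set by other items.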

What is Data Replication?

Data replication involves copying data to multiple locations to improve reliability and fault tolerance.

How are NoSQL databases optimized?

NoSQL databases are optimized for specific use cases, such as horizontal scaling and flexible data models.

What are Location-based Databases?

Location-based databases store and retrieve data based on geographic locations.

Database Migrations

Database migrations involve transferring data from one database to another while preserving data integrity.

What is Data Consistency?

Data consistency means that every part of the system sees the same, valid view of the data.

Data Consistency Levels

Consistency levels range from strong consistency, where every read returns the latest write, to eventual consistency, where replicas converge over time.

Transaction Isolation Levels

Transaction isolation levels define the degree to which transactions are isolated from each other.

What is a Message Queue?

A message queue is a communication method that allows applications to communicate asynchronously.
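
Within a single process, the pattern can be sketched with Python's `queue.Queue` and a worker thread. The message names and the `None` sentinel used to stop the worker are assumptions; a real deployment would normally use a separate broker process.

```python
import queue
import threading

tasks = queue.Queue()
results = []


def worker():
    """Consume messages until the None sentinel arrives."""
    while True:
        message = tasks.get()
        try:
            if message is None:
                break
            results.append(message.upper())  # stand-in for real processing
        finally:
            tasks.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# The producer enqueues work and carries on without waiting for each result.
for message in ["signup", "payment", "email"]:
    tasks.put(message)
tasks.put(None)

tasks.join()
consumer.join()
print(results)
```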

What is the publisher-subscriber model?

The publisher-subscriber model involves communication between publishers and subscribers through a message broker.
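
An in-memory sketch of the broker's role is shown below. `Broker`, the topic names, and the callback-based subscribers are hypothetical; real brokers add persistence, delivery guarantees, and network transport.

```python
from collections import defaultdict


class Broker:
    """Route each published message to every subscriber of its topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of subscribers that received the message


broker = Broker()
email_inbox, audit_inbox, billing_inbox = [], [], []
broker.subscribe("orders", email_inbox.append)
broker.subscribe("orders", audit_inbox.append)
broker.subscribe("billing", billing_inbox.append)

delivered = broker.publish("orders", {"order_id": 1})
print(delivered, email_inbox, audit_inbox, billing_inbox)
```

The publisher never references its subscribers directly, so new consumers can be added without changing the publishing code.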

What are event-driven systems?

Event-driven systems respond to and handle events, triggering actions based on specific occurrences.

Database as a Message Queue

A database can serve as a message queue between components: producers insert rows, and consumers poll for new rows and process them.

What is a Single Point of Failure?

A single point of failure is a component that, if it fails, will cause the entire system to fail.

What are Containers?

Containers encapsulate applications and their dependencies, providing a consistent and isolated environment.

What is Service Discovery and Heartbeats?

Service discovery involves automatically finding and connecting to services, and heartbeats are signals indicating the health of a service.
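
The two ideas can be combined in a small registry sketch: services report heartbeats, and clients look up only the services heard from recently. `ServiceRegistry`, the addresses, the five-second timeout, and the fake clock used in the demo are all assumptions made for illustration.

```python
import time


class ServiceRegistry:
    """Track the last heartbeat per service and report the ones still alive."""

    def __init__(self, timeout_seconds=5.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._services = {}  # name -> (address, last_seen)

    def heartbeat(self, name, address):
        self._services[name] = (address, self._clock())

    def healthy(self):
        now = self._clock()
        return {
            name: address
            for name, (address, last_seen) in self._services.items()
            if now - last_seen <= self._timeout
        }


# A fake clock keeps the demo deterministic; real code would use the default.
fake_now = [0.0]
registry = ServiceRegistry(timeout_seconds=5.0, clock=lambda: fake_now[0])
registry.heartbeat("orders", "10.0.0.1:8080")
registry.heartbeat("billing", "10.0.0.2:8080")

fake_now[0] = 4.0
registry.heartbeat("orders", "10.0.0.1:8080")  # orders keeps reporting in

fake_now[0] = 8.0  # billing has now been silent for 8 seconds
healthy = registry.healthy()
print(healthy)
```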

How to avoid Cascading Failures?

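A cascading failure occurs when the failure of one component overloads others; strategies such as timeouts, circuit breakers, and load shedding limit its spread.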
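
One widely used strategy is a circuit breaker, which stops calling a failing dependency for a while instead of piling more requests onto it. The sketch below is a simplified version under stated assumptions: `CircuitBreaker`, `CircuitOpenError`, the thresholds, and the always-failing `flaky` function are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_seconds=30.0, clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_seconds = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_seconds:
                raise CircuitOpenError("circuit open; failing fast")
            # Reset period elapsed: allow one trial call through.
            self._opened_at = None
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # a success closes the circuit again
        return result


calls = []


def flaky():
    calls.append("attempt")
    raise RuntimeError("backend down")


breaker = CircuitBreaker(max_failures=2, reset_seconds=30.0)
outcomes = []
for _ in range(4):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except RuntimeError:
        outcomes.append("failed")
print(outcomes, len(calls))
```

After two failures the breaker opens, so the last two calls are rejected immediately and never reach the struggling backend.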

Anomaly Detection in Distributed Systems

Anomaly detection identifies abnormal behavior or performance in a distributed system, such as unusual latency or error rates.

Distributed Rate Limiting

Distributed rate limiting enforces request limits consistently across multiple components in a distributed system.
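
A common building block is the token bucket, sketched here for a single process. `TokenBucket`, the rate, the capacity, and the fake clock are illustrative assumptions; a distributed limiter would have to keep the bucket state somewhere all nodes can reach, which this sketch does not attempt.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` and a sustained rate of `rate_per_second`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self._rate = rate_per_second
        self._capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self):
        now = self._clock()
        # Refill in proportion to the time elapsed, capped at the capacity.
        elapsed = now - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# A fake clock keeps the demo deterministic; real code would use the default.
fake_now = [0.0]
bucket = TokenBucket(rate_per_second=1, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0  # one second later, one token has been refilled
after_wait = bucket.allow()
print(burst, after_wait)
```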

What is Distributed Caching?

Distributed caching involves caching data across multiple nodes to improve performance.

What are Content Delivery Networks?

Content Delivery Networks (CDNs) distribute content geographically for faster and more reliable delivery.

Write Policies

Write policies determine how data is written to a cache.
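
Two common policies, write-through and write-back, are contrasted in the sketch below. The class names are hypothetical, and plain dictionaries stand in for the cache and the backing store.

```python
class WriteThroughCache:
    """Write to the cache and the backing store in the same operation."""

    def __init__(self, store):
        self.cache = {}
        self.store = store

    def write(self, key, value):
        self.cache[key] = value
        self.store[key] = value


class WriteBackCache:
    """Write to the cache only and persist dirty entries later."""

    def __init__(self, store):
        self.cache = {}
        self.store = store
        self._dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self._dirty.add(key)

    def flush(self):
        for key in self._dirty:
            self.store[key] = self.cache[key]
        self._dirty.clear()


through_store, back_store = {}, {}
through = WriteThroughCache(through_store)
back = WriteBackCache(back_store)

through.write("a", 1)
back.write("a", 1)
store_before_flush = dict(back_store)  # still empty: the write is only cached
back.flush()
print(through_store, store_before_flush, back_store)
```

Write-through keeps the store current at the cost of slower writes; write-back makes writes fast but can lose unflushed data if the cache fails.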

Replacement Policies

Replacement policies determine which items are removed from a cache when space is needed.
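
Least recently used (LRU) is one such policy. The sketch below implements it with `collections.OrderedDict`; `LRUCache` and the capacity of two are assumptions chosen to keep the example small.

```python
from collections import OrderedDict


class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the oldest entry

    def keys(self):
        return list(self._items)


lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")     # "a" is now more recently used than "b"
lru.put("c", 3)  # capacity exceeded, so "b" is evicted
print(lru.keys())  # ['a', 'c']
```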

Microservices vs. Monoliths

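A monolith packages all functionality into a single deployable unit, while a microservices architecture splits it into independently deployable services that communicate over the network.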

How monoliths are migrated

Monoliths are usually migrated to microservices incrementally, by extracting one service at a time rather than rewriting the whole system at once.

How are APIs designed?

API design covers principles such as clear resource naming, consistent request and response formats, versioning, and error handling.

What are asynchronous APIs?

Asynchronous APIs allow communication between components without waiting for an immediate response.

OAuth

OAuth is an authorization framework that lets an application obtain limited access to a user's resources without handling the user's credentials.

Token Based Auth

Token-based authentication issues a signed token after login, which the client then sends with each request to prove its identity.
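
The sketch below shows the general shape of issuing and verifying a signed token with an HMAC. It is a simplified illustration, not JWT or any standard format: `issue_token`, `verify_token`, the token layout, and the hard-coded `SECRET` are all assumptions, and real systems would normally rely on a vetted library and a securely stored key.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # hypothetical key; never hard-code a real secret


def _sign(body):
    return hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(user_id, ttl_seconds=3600, now=None):
    """Return "<base64 payload>.<signature>" for the given user."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": now + ttl_seconds})
    body = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")
    return f"{body}.{_sign(body)}"


def verify_token(token, now=None):
    """Return the user id if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    body, sep, signature = token.rpartition(".")
    if not sep:
        return None
    expected = _sign(body)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["exp"] < now:
        return None
    return claims["sub"]


token = issue_token("alice")
valid_user = verify_token(token)
tampered_user = verify_token(token + "x")
expired_user = verify_token(issue_token("alice", ttl_seconds=60, now=0), now=10_000)
print(valid_user, tampered_user, expired_user)
```

The server does not need to store session state: any node that knows the key can check the signature and the expiry time.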

Access Control Lists and Rule Engines

Access control lists and rule engines manage permissions and access in a system.

System Design Tradeoffs

Pull vs. Push

In pull (request-based) communication, clients ask for data when they need it; in push (notification-based) communication, the server sends updates as they occur.

Memory vs. Latency

Spending more memory, for example on caches or precomputed results, can make a system more responsive, so memory usage must be balanced against latency.

Throughput vs. Latency

Techniques that raise throughput, such as batching, often increase latency, so a system optimizes for one or the other depending on its requirements.

Consistency vs. Availability

During a network partition, a distributed system must choose between keeping data consistent and remaining available.

Latency vs. Accuracy

Approximate or slightly stale results can often be returned faster than exact ones, so response time must be balanced against accuracy.

SQL vs. NoSQL databases

SQL databases typically offer fixed schemas, joins, and ACID transactions, while NoSQL databases typically favor flexible schemas and horizontal scaling.
