# 🚀 HIGH-LEVEL DESIGN (HLD) MASTERY NOTEBOOK

## 🎯 **SYSTEM ARCHITECTURE EXCELLENCE - YOUR PATH TO SENIOR+ ROLES!**

### 🔥 What You'll Master:
- **Distributed Systems Architecture** - Scale to millions of users
- **Microservices Design** - Build resilient, maintainable systems
- **Database Design** - SQL vs NoSQL, sharding, replication
- **Caching Strategies** - Redis, Memcached, CDN optimization
- **Load Balancing** - Horizontal scaling techniques
- **Message Queues** - Async processing, event-driven architecture
- **API Design** - REST, GraphQL, gRPC best practices
- **Security** - Authentication, authorization, encryption
- **Monitoring** - Observability, logging, metrics
- **Cloud Architecture** - AWS, Azure, GCP patterns

### 💰 **CAREER TRANSFORMATION:**
- **📈 $50K+ salary increase** potential for senior engineers
- **🎯 System Design interviews** - Pass them at FAANG/unicorn companies
- **🏆 Technical leadership** - Lead architecture decisions
- **🚀 Principal Engineer** - Design systems for millions
- **💼 Engineering Manager** - Technical strategy and vision

### 🎪 **Real-World Systems You'll Design:**
- Social Media Platform (Instagram/Twitter scale)
- Video Streaming Service (Netflix/YouTube)
- E-commerce Platform (Amazon/eBay)
- Chat Application (WhatsApp/Slack)
- Ride-sharing Service (Uber/Lyft)
- Search Engine (Google scale)
- Payment System (PayPal/Stripe)

---

### 🔧 **Technologies Covered:**
```
Databases:     PostgreSQL, MongoDB, Cassandra, DynamoDB
Caching:       Redis, Memcached, CDN (e.g., CloudFront)
Message Queue: Kafka, RabbitMQ, SQS, Pub/Sub
Load Balancer: NGINX, HAProxy, AWS ALB, Cloudflare
Monitoring:    Prometheus, Grafana, ELK Stack, Datadog
Cloud:         AWS, Azure, GCP, Kubernetes, Docker
```

### 📚 **Learning Path:**
1. **Fundamentals** - Scalability, reliability, consistency
2. **Components** - Databases, caches, load balancers
3. **Patterns** - Microservices, event-driven, CQRS
4. **Real Systems** - End-to-end design challenges
5. **Advanced Topics** - Consistency, CAP theorem, consensus

### 🎯 **Interview Success Formula:**
```
✅ Requirements Gathering (5 min)
✅ Capacity Estimation (5 min)
✅ High-Level Design (15 min)
✅ Detailed Design (15 min)
✅ Scale & Optimize (15 min)
```

---

**🚀 Ready to become a system design expert? Let's build systems that scale!**

## Chapter 1: System Design Fundamentals ⭐⭐⭐
> **The Foundation** - Master these concepts to design systems that scale to millions

### 🎯 Core Concepts:
- **Scalability**: Handle increasing load gracefully
- **Reliability**: System continues working despite failures
- **Availability**: System remains operational over time
- **Consistency**: All nodes see the same data simultaneously
- **Partition Tolerance**: System continues despite network failures
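Availability targets are usually quoted as a percentage ("nines"). As a rough illustration, the sketch below converts a target into a yearly downtime budget. The function name `allowed_downtime_seconds`, the 365-day year, and the sample targets are assumptions made for this example, not part of the notebook.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Return the yearly downtime budget, in seconds, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


# Illustrative targets only; real SLAs define their own measurement windows.
for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_seconds(target) / 3600
    print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

Each extra nine cuts the downtime budget by a factor of ten, which is why higher targets tend to require redundancy rather than just careful operation of a single machine.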

### 🚀 Key Principles:
- **Horizontal vs Vertical Scaling**
- **Load Distribution Strategies** 
- **Data Partitioning Techniques**
- **Caching Mechanisms**
- **Database Design Patterns**
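As one possible illustration of a load distribution strategy, here is a minimal round-robin sketch. The `round_robin` helper and the server names are hypothetical. Production load balancers such as NGINX or HAProxy also offer weighted and least-connections policies that this sketch does not model.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in a fixed rotation, one per incoming request."""
    if not servers:
        raise ValueError("at least one server is required")
    return cycle(servers)


# Hypothetical pool of three application servers behind one balancer.
picker = round_robin(["app-1", "app-2", "app-3"])
assignments = [next(picker) for _ in range(5)]
print(assignments)
```

Round robin spreads requests evenly only when servers and requests are roughly uniform. This is also what makes horizontal scaling straightforward: adding a machine to the pool adds capacity without changing the clients.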

In [None]:
# SYSTEM DESIGN FUNDAMENTALS - SCALE TO MILLIONS!

print("🎯 SYSTEM DESIGN FUNDAMENTALS - CAREER TRANSFORMATION!")
print("=" * 70)

import math
from typing import Dict, List, Tuple
from enum import Enum

# ===============================================================
# 1. SCALABILITY CONCEPTS - Handle Growing Load
# ===============================================================
print("\n🔥 1. SCALABILITY - HANDLE MILLIONS OF USERS")
print("The ability to handle increased load gracefully")

class ScalingType(Enum):
    VERTICAL = "vertical"    # Scale UP - bigger machine
    HORIZONTAL = "horizontal"  # Scale OUT - more machines

class LoadCalculator:
    """Calculate system capacity and scaling needs"""
    
    @staticmethod
    def estimate_servers_needed(requests_per_second: int, server_capacity: int) -> int:
        """Calculate number of servers needed for given RPS"""
        return math.ceil(requests_per_second / server_capacity)
    
    @staticmethod
    def estimate_storage_needed(users: int, data_per_user_mb: float) -> Tuple[float, str]:
        """Calculate storage needs, returned as (amount, unit) with unit "GB" or "TB" """
        total_mb = users * data_per_user_mb
        total_gb = total_mb / 1024
        total_tb = total_gb / 1024
        
        if total_tb < 1:
            return total_gb, "GB"
        else:
            return total_tb, "TB"
    
    @staticmethod
    def estimate_bandwidth(requests_per_second: int, avg_response_kb: float) -> float:
        """Calculate bandwidth needed in Mbps (megabits per second)"""
        kb_per_second = requests_per_second * avg_response_kb
        mb_per_second = kb_per_second / 1024
        mbps = mb_per_second * 8  # Convert megabytes to megabits (8 bits per byte)
        return mbps

# Demonstrate scaling calculations
print("\n📊 SCALING CALCULATIONS:")
print("Scenario: Social Media Platform")

# User load estimation
daily_active_users = 10_000_000  # 10M DAU
peak_concurrent_users = int(daily_active_users * 0.1)  # 10% concurrent peak; int so it prints as 1,000,000
requests_per_user_per_minute = 2
peak_rps = int((peak_concurrent_users * requests_per_user_per_minute) / 60)

print(f"  Daily Active Users: {daily_active_users:,}")
print(f"  Peak Concurrent Users: {peak_concurrent_users:,}")
print(f"  Peak Requests/Second: {peak_rps:,}")

# Server capacity planning
server_capacity_rps = 1000  # Each server handles 1000 RPS
servers_needed = LoadCalculator.estimate_servers_needed(peak_rps, server_capacity_rps)
print(f"  Servers Needed: {servers_needed}")

# Storage estimation
data_per_user_mb = 50  # 50MB per user (posts, images, metadata)
storage, unit = LoadCalculator.estimate_storage_needed(daily_active_users, data_per_user_mb)
print(f"  Storage Required: {storage:.2f} {unit}")

# Bandwidth calculation
avg_response_kb = 100  # 100KB average response
bandwidth_mbps = LoadCalculator.estimate_bandwidth(peak_rps, avg_response_kb)
print(f"  Bandwidth Required: {bandwidth_mbps:.2f} Mbps")

# ===============================================================
# 2. CAP THEOREM - The Fundamental Trade-off
# ===============================================================
print("\n\n🔥 2. CAP THEOREM - CHOOSE YOUR TRADE-OFFS")
print("During a network partition you must choose between Consistency and Availability; Partition Tolerance is not optional in a distributed system")

class CAPChoice(Enum):
    CP = "CP"  # Consistency + Partition Tolerance (sacrifice Availability)
    AP = "AP"  # Availability + Partition Tolerance (sacrifice Consistency)
    CA = "CA"  # Consistency + Availability (sacrifice Partition Tolerance - single node only)

class SystemExample:
    def __init__(self, name: str, cap_choice: CAPChoice, use_case: str, trade_off: str):
        self.name = name
        self.cap_choice = cap_choice
        self.use_case = use_case
        self.trade_off = trade_off

# Real-world CAP theorem examples
cap_examples = [
    SystemExample(
        "Banking System", 
        CAPChoice.CP,
        "Financial transactions, account balances",
        "System goes down rather than show wrong balance"
    ),
    SystemExample(
        "Social Media Feed", 
        CAPChoice.AP,
        "Facebook/Twitter timeline, likes, comments",
        "Show slightly stale data rather than error page"
    ),
    SystemExample(
        "Traditional RDBMS", 
        CAPChoice.CA,
        "Single-node PostgreSQL/MySQL",
        "Consistent and available on one node, but that node is a single point of failure"
    ),
    SystemExample(
        "DNS System", 
        CAPChoice.AP,
        "Domain name resolution",
        "Always available, eventual consistency is fine"
    ),
    SystemExample(
        "MongoDB (default)", 
        CAPChoice.CP,
        "Document database with strong consistency",
        "Becomes read-only if it can't reach a majority of nodes"
    )
]

print("\n📚 REAL-WORLD CAP EXAMPLES:")
for example in cap_examples:
    print(f"  {example.name} ({example.cap_choice.value}):")
    print(f"    Use Case: {example.use_case}")
    print(f"    Trade-off: {example.trade_off}")
    print()

# ===============================================================
# 3. CONSISTENCY MODELS - Data Consistency Patterns
# ===============================================================
print("\n🔥 3. CONSISTENCY MODELS - DATA SYNCHRONIZATION")
print("Different levels of data consistency across distributed systems")

class ConsistencyLevel(Enum):
    STRONG = "strong"           # All nodes have same data immediately
    EVENTUAL = "eventual"       # All nodes will eventually have same data
    WEAK = "weak"              # No guarantees about when data will be consistent
    CAUSAL = "causal"          # Causally related operations are seen in order

class ConsistencyExample:
    def __init__(self, level: ConsistencyLevel, description: str, use_case: str, latency: str):
        self.level = level
        self.description = description
        self.use_case = use_case
        self.latency = latency

consistency_examples = [
    ConsistencyExample(
        ConsistencyLevel.STRONG,
        "All reads receive the most recent write immediately",
        "Banking transactions, inventory management",
        "High latency, lower availability"
    ),
    ConsistencyExample(
        ConsistencyLevel.EVENTUAL,
        "System will become consistent over time",
        "Social media feeds, DNS updates, email delivery",
        "Low latency, high availability"
    ),
    ConsistencyExample(
        ConsistencyLevel.WEAK,
        "No guarantees about consistency timing",
        "Real-time gaming, live streaming metrics",
        "Very low latency"
    ),
    ConsistencyExample(
        ConsistencyLevel.CAUSAL,
        "Related operations are seen in correct order",
        "Chat applications, collaborative editing",
        "Moderate latency"
    )
]

print("\n📊 CONSISTENCY MODELS:")
for example in consistency_examples:
    print(f"  {example.level.value.upper()} CONSISTENCY:")
    print(f"    Definition: {example.description}")
    print(f"    Use Case: {example.use_case}")
    print(f"    Trade-off: {example.latency}")
    print()

# ===============================================================
# 4. PARTITIONING STRATEGIES - Divide and Conquer Data
# ===============================================================
print("\n🔥 4. DATA PARTITIONING - DIVIDE TO SCALE")
print("Split large datasets across multiple machines")

class PartitionStrategy(Enum):
    HORIZONTAL = "horizontal"   # Sharding - split rows
    VERTICAL = "vertical"       # Split columns/tables
    FUNCTIONAL = "functional"   # Split by feature/service

import hashlib  # stable hashing for consistent_hash_partition

class PartitioningCalculator:
    @staticmethod
    def hash_partition(user_id: int, num_shards: int) -> int:
        """Simple hash-based partitioning"""
        return user_id % num_shards
    
    @staticmethod
    def range_partition(user_id: int, ranges: List[Tuple[int, int]]) -> int:
        """Range-based partitioning"""
        for i, (start, end) in enumerate(ranges):
            if start <= user_id <= end:
                return i
        return -1  # Not found
    
    @staticmethod
    def consistent_hash_partition(key: str, num_virtual_nodes: int = 100) -> int:
        """Simplified consistent hashing: map a key to a virtual-node slot.

        Uses MD5 rather than the built-in hash(), which is salted per process
        for strings and would send the same key to different slots across runs.
        """
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % num_virtual_nodes

# Demonstrate partitioning strategies
print("\n📊 PARTITIONING EXAMPLES:")

# Hash partitioning example
num_shards = 4
sample_user_ids = [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008]

print("  HASH PARTITIONING:")
for user_id in sample_user_ids:
    shard = PartitioningCalculator.hash_partition(user_id, num_shards)
    print(f"    User {user_id} → Shard {shard}")

# Range partitioning example
ranges = [(1, 2500), (2501, 5000), (5001, 7500), (7501, 10000)]
print("\n  RANGE PARTITIONING:")
test_users = [1500, 3000, 6000, 8500]
for user_id in test_users:
    shard = PartitioningCalculator.range_partition(user_id, ranges)
    print(f"    User {user_id} → Shard {shard}")

# ===============================================================
# 5. REPLICATION STRATEGIES - Data Redundancy for Reliability
# ===============================================================
print("\n\n🔥 5. REPLICATION - RELIABILITY & PERFORMANCE")
print("Keep multiple copies of data for fault tolerance and faster reads")

class ReplicationType(Enum):
    MASTER_SLAVE = "master_slave"     # One writer, multiple readers
    MASTER_MASTER = "master_master"   # Multiple writers
    PEER_TO_PEER = "peer_to_peer"     # No distinguished master

class ReplicationExample:
    def __init__(self, type_: ReplicationType, pros: List[str], cons: List[str], use_case: str):
        self.type = type_
        self.pros = pros
        self.cons = cons
        self.use_case = use_case

replication_strategies = [
    ReplicationExample(
        ReplicationType.MASTER_SLAVE,
        ["Simple to implement", "Strong consistency on the master", "Clear data flow"],
        ["Single point of failure", "Write bottleneck", "Read-only slaves"],
        "Traditional databases, read-heavy workloads"
    ),
    ReplicationExample(
        ReplicationType.MASTER_MASTER,
        ["High availability", "Write scalability", "Load distribution"],
        ["Conflict resolution needed", "Complex synchronization", "Consistency challenges"],
        "Global applications, high-write workloads"
    ),
    ReplicationExample(
        ReplicationType.PEER_TO_PEER,
        ["Highly available", "Decentralized", "Fault tolerant"],
        ["Complex consensus", "Network overhead", "Eventual consistency"],
        "Blockchain, distributed file systems"
    )
]

print("\n📊 REPLICATION STRATEGIES:")
for strategy in replication_strategies:
    print(f"  {strategy.type.value.upper().replace('_', '-')}:")
    print(f"    Pros: {', '.join(strategy.pros)}")
    print(f"    Cons: {', '.join(strategy.cons)}")
    print(f"    Use Case: {strategy.use_case}")
    print()

# ===============================================================
# SYSTEM DESIGN FUNDAMENTALS SUMMARY
# ===============================================================
print("=" * 70)
print("🏆 SYSTEM DESIGN FUNDAMENTALS MASTERY")
print("=" * 70)

fundamentals = {
    "Scalability": "Design systems that grow with demand",
    "CAP Theorem": "Understand trade-offs in distributed systems",
    "Consistency": "Choose appropriate data consistency model",
    "Partitioning": "Split data to scale beyond single machine",
    "Replication": "Ensure availability and fault tolerance"
}

print("\n📚 CONCEPTS MASTERED:")
for concept, description in fundamentals.items():
    print(f"✅ {concept}: {description}")

print("\n💼 INTERVIEW IMPACT:")
print("🏆 Foundation for all system design questions")
print("🏆 Demonstrates understanding of trade-offs")
print("🏆 Shows ability to design scalable systems")
print("🏆 Required for senior+ engineering roles")
print("🏆 Gateway to system architect positions")

print("\n🚀 NEXT STEPS:")
print("✅ Master these fundamentals before diving into specific components")
print("✅ Practice applying these concepts to real-world scenarios")
print("✅ Study how major tech companies implement these patterns")
print("=" * 70)
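
The `consistent_hash_partition` helper in the cell above only maps a key to a slot number. A fuller sketch of consistent hashing, where virtual nodes sit on a sorted ring and map back to physical nodes, might look like the following. The `HashRing` class, the `stable_hash` helper, the node names, and the choice of MD5 are all assumptions made for illustration; real systems differ in hash function and replica placement.

```python
import bisect
import hashlib
from typing import Dict, List


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (built-in hash() is salted per process for str)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: List[str], virtual_nodes: int = 100) -> None:
        self.virtual_nodes = virtual_nodes
        self._ring: Dict[int, str] = {}   # ring position -> physical node
        self._positions: List[int] = []   # ring positions, kept sorted
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `virtual_nodes` points for this node on the ring."""
        for i in range(self.virtual_nodes):
            position = stable_hash(f"{node}#{i}")
            if position in self._ring:  # extremely unlikely collision: keep the first owner
                continue
            self._ring[position] = node
            bisect.insort(self._positions, position)

    def get_node(self, key: str) -> str:
        """Return the owner of the first ring position clockwise from the key."""
        if not self._positions:
            raise ValueError("ring is empty")
        index = bisect.bisect_right(self._positions, stable_hash(key)) % len(self._positions)
        return self._ring[self._positions[index]]


# Compare key placement before and after adding a fourth (hypothetical) node.
keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["db-a", "db-b", "db-c"])
before = {key: ring.get_node(key) for key in keys}
ring.add_node("db-d")
after = {key: ring.get_node(key) for key in keys}
moved = sum(1 for key in keys if before[key] != after[key])
print(f"{moved} of {len(keys)} keys moved after adding a fourth node")
```

With `user_id % num_shards`, changing the shard count remaps most keys. On a ring, the only keys that move are those whose next clockwise position now belongs to the new node, so every moved key lands on `db-d`.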