# System Design Case Studies

This notebook walks through three system design case studies, each with requirements, capacity estimates, architecture diagrams, and implementation sketches.

| Case Study | Key Concepts | Complexity |
|------------|--------------|------------|
| URL Shortener | Hashing, Base62, Caching | Medium |
| Rate Limiter | Token Bucket, Sliding Window, Redis | Medium |
| News Feed | Fan-out, Ranking, Timeline | High |

---
## 1. URL Shortener (TinyURL)

### Functional Requirements
- Given a long URL, generate a short unique alias
- Redirect short URL to original URL
- Custom aliases (optional)
- Link expiration (optional)
- Analytics (click count, location)

### Non-Functional Requirements
- High availability (99.9%)
- Low latency redirects (<100ms)
- Short URLs should be unpredictable
- Scale: 100M URLs/day creation, 10:1 read/write ratio

In [None]:
# Capacity Estimation - URL Shortener

# Traffic estimates
urls_per_day = 100_000_000  # 100M new URLs/day
read_write_ratio = 10
reads_per_day = urls_per_day * read_write_ratio  # 1B reads/day

# QPS calculations
seconds_per_day = 86400
write_qps = urls_per_day / seconds_per_day
read_qps = reads_per_day / seconds_per_day

print(f"Write QPS: {write_qps:,.0f}")
print(f"Read QPS: {read_qps:,.0f}")
print(f"Peak QPS (2x): {read_qps * 2:,.0f}")

# Storage estimates (5 years)
years = 5
total_urls = urls_per_day * 365 * years
avg_url_size = 500  # bytes (original URL)
short_url_size = 7  # bytes
metadata_size = 100  # bytes (timestamps, user_id, etc.)
record_size = avg_url_size + short_url_size + metadata_size

total_storage_bytes = total_urls * record_size
total_storage_tb = total_storage_bytes / (1024**4)

print(f"\nTotal URLs (5 years): {total_urls:,.0f}")
print(f"Storage needed: {total_storage_tb:.1f} TB")

### Architecture Diagram

```
                                    ┌─────────────────┐
                                    │   Load Balancer │
                                    └────────┬────────┘
                                             │
                    ┌────────────────────────┼────────────────────────┐
                    │                        │                        │
              ┌─────▼─────┐            ┌─────▼─────┐            ┌─────▼─────┐
              │ App Server│            │ App Server│            │ App Server│
              └─────┬─────┘            └─────┬─────┘            └─────┬─────┘
                    │                        │                        │
                    └────────────────────────┼────────────────────────┘
                                             │
                    ┌────────────────────────┼────────────────────────┐
                    │                        │                        │
              ┌─────▼─────┐            ┌─────▼─────┐            ┌─────▼─────┐
              │   Cache   │            │   Cache   │            │   Cache   │
              │  (Redis)  │            │  (Redis)  │            │  (Redis)  │
              └───────────┘            └───────────┘            └───────────┘
                                             │
                    ┌────────────────────────┼────────────────────────┐
                    │                        │                        │
              ┌─────▼─────┐            ┌─────▼─────┐            ┌─────▼─────┐
              │  DB Shard │            │  DB Shard │            │  DB Shard │
              │  (Hash)   │            │  (Hash)   │            │  (Hash)   │
              └───────────┘            └───────────┘            └───────────┘
```

### Database Schema

```sql
-- Main URL table (sharded by short_url hash)
CREATE TABLE urls (
    id              BIGINT PRIMARY KEY,
    short_url       VARCHAR(7) UNIQUE NOT NULL,
    original_url    VARCHAR(2048) NOT NULL,
    user_id         BIGINT,
    created_at      TIMESTAMP DEFAULT NOW(),
    expires_at      TIMESTAMP,
    click_count     BIGINT DEFAULT 0,
    -- The UNIQUE constraint on short_url already creates an index for lookups
    INDEX idx_user_id (user_id)
);

-- Analytics table (async writes)
CREATE TABLE url_analytics (
    id          BIGINT PRIMARY KEY,
    short_url   VARCHAR(7),
    clicked_at  TIMESTAMP,
    ip_address  VARCHAR(45),
    user_agent  VARCHAR(512),
    referrer    VARCHAR(2048)
);
```
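
The schema comment says the `urls` table is sharded by a hash of `short_url`. A minimal sketch of that routing step, assuming a fixed shard count (`NUM_SHARDS` and `shard_for` are illustrative names, not part of the schema above). A cryptographic digest is used only because it is stable across processes, unlike Python's built-in `hash()` for strings.

```python
import hashlib

NUM_SHARDS = 64  # assumed shard count, chosen only for illustration


def shard_for(short_url: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a short URL to a shard index using a stable hash."""
    digest = hashlib.sha256(short_url.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce
    # it modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


for alias in ("0004c92", "abc1234", "zzzzzzz"):
    print(f"{alias} -> shard {shard_for(alias)}")
```

With plain modulo routing, changing the shard count remaps most keys. Consistent hashing is the usual way to limit that movement.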

In [None]:
# URL Shortening Algorithms
import hashlib
import string

# Base62 encoding (0-9, a-z, A-Z)
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_encode(num: int) -> str:
    """Convert number to base62 string."""
    if num == 0:
        return BASE62[0]
    result = []
    while num:
        result.append(BASE62[num % 62])
        num //= 62
    return ''.join(reversed(result))

def base62_decode(s: str) -> int:
    """Convert base62 string to number."""
    num = 0
    for char in s:
        num = num * 62 + BASE62.index(char)
    return num

# Method 1: Counter-based (requires distributed counter)
def generate_short_url_counter(counter_id: int) -> str:
    return base62_encode(counter_id).zfill(7)

# Method 2: MD5 hash (first 48 bits, reduced modulo 62^7 -> 7 base62 chars)
def generate_short_url_hash(long_url: str) -> str:
    hash_bytes = hashlib.md5(long_url.encode()).digest()
    hash_int = int.from_bytes(hash_bytes[:6], 'big')  # First 48 bits
    # Reduce into the 7-char space and pad so the result is always 7 chars
    return base62_encode(hash_int % 62**7).zfill(7)

# Capacity: 62^7 = 3.5 trillion unique URLs
print(f"Base62 with 7 chars capacity: {62**7:,} URLs")
print(f"\nExamples:")
print(f"Counter 1000000: {generate_short_url_counter(1000000)}")
print(f"Hash 'https://example.com': {generate_short_url_hash('https://example.com')}")
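
A hash-based code can collide with an existing one, so the write path needs a collision check. The following is a minimal sketch, assuming an in-memory dict stands in for the `urls` table and that appending an attempt counter to the input is an acceptable way to derive the next candidate. The names `short_code`, `shorten`, and `max_attempts` are illustrative.

```python
import hashlib
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def short_code(long_url: str, attempt: int = 0, length: int = 7) -> str:
    """Derive a fixed-length base62 code from MD5(long_url + attempt)."""
    digest = hashlib.md5(f"{long_url}#{attempt}".encode()).digest()
    num = int.from_bytes(digest[:8], "big") % (62 ** length)
    chars = []
    for _ in range(length):
        num, rem = divmod(num, 62)
        chars.append(BASE62[rem])
    return "".join(reversed(chars))


def shorten(long_url: str, store: dict, max_attempts: int = 5) -> str:
    """Store long_url under a free code; `store` maps code -> long URL."""
    for attempt in range(max_attempts):
        code = short_code(long_url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url
            return code
        if existing == long_url:
            return code  # this URL was already shortened
        # Otherwise the code belongs to a different URL: try the next attempt.
    raise RuntimeError("no free code found within max_attempts")


store = {}
print(shorten("https://example.com", store))
```

Against a real database the check-then-insert above is racy. Attempting the insert and relying on the `UNIQUE` constraint to reject duplicates is the safer variant.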

### Key Design Decisions & Trade-offs

| Decision | Option A | Option B | Recommendation |
|----------|----------|----------|----------------|
| **ID Generation** | Counter (ZooKeeper) | MD5 Hash + collision check | Counter for predictable performance (sequential IDs must be obfuscated to meet the unpredictability requirement) |
| **Database** | SQL (PostgreSQL) | NoSQL (DynamoDB) | NoSQL for simpler horizontal scaling |
| **Caching** | Write-through | Write-around + TTL | Write-around (most URLs read once) |
| **Sharding** | Range-based | Hash-based | Hash-based (even distribution) |
| **Expiration** | Lazy deletion | Background job | Lazy + periodic cleanup |
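
The caching and expiration rows combine on the redirect path: writes bypass the cache, reads fill it with a TTL, and expired links are removed when they are next requested. A minimal sketch under those assumptions, with dicts standing in for Redis and the database (`UrlResolver` and its fields are illustrative names):

```python
import time


class UrlResolver:
    """Read path sketch: cache lookup, DB fallback, lazy expiry check."""

    def __init__(self, db: dict, cache_ttl: float = 3600.0):
        self.db = db          # short_url -> {"original_url", "expires_at"}
        self.cache = {}       # short_url -> (original_url, cache_deadline)
        self.cache_ttl = cache_ttl

    def create(self, short_url, original_url, expires_at=None):
        # Write-around: only the database is written; the cache fills on read.
        self.db[short_url] = {"original_url": original_url,
                              "expires_at": expires_at}

    def resolve(self, short_url, now=None):
        now = time.time() if now is None else now
        hit = self.cache.get(short_url)
        if hit and hit[1] > now:
            return hit[0]
        self.cache.pop(short_url, None)

        row = self.db.get(short_url)
        if row is None:
            return None
        if row["expires_at"] is not None and row["expires_at"] <= now:
            del self.db[short_url]  # lazy deletion on access
            return None

        # Never cache an entry for longer than the link itself lives.
        deadline = now + self.cache_ttl
        if row["expires_at"] is not None:
            deadline = min(deadline, row["expires_at"])
        self.cache[short_url] = (row["original_url"], deadline)
        return row["original_url"]
```

Links that are never requested again stay in the database under lazy deletion alone, which is why the table pairs it with a periodic cleanup job.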

---
## 2. Rate Limiter

### Functional Requirements
- Limit requests per user/IP/API key
- Different limits for different APIs
- Return appropriate headers (X-RateLimit-*)
- Distributed rate limiting across servers
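
The `X-RateLimit-*` headers are a widely used convention rather than a formal standard, and services differ on whether the reset value is an epoch timestamp or a number of seconds. The sketch below assumes an epoch timestamp and adds the standard `Retry-After` header (in seconds) only on denied requests. The function name and parameters are illustrative.

```python
import math


def rate_limit_headers(limit: int, remaining: int, reset_epoch: float,
                       now: float, allowed: bool) -> dict:
    """Build response headers describing the caller's rate-limit state."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(math.ceil(reset_epoch)),
    }
    if not allowed:
        # Sent with a 429 response: seconds until the client may retry.
        headers["Retry-After"] = str(max(1, math.ceil(reset_epoch - now)))
    return headers


print(rate_limit_headers(100, 0, reset_epoch=1060.2, now=1000.0, allowed=False))
```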

### Non-Functional Requirements
- Low latency (<1ms overhead)
- High availability
- Accurate limiting (no false positives)
- Memory efficient

### Architecture Diagram

```
    ┌──────────┐     ┌──────────────────┐     ┌──────────────┐
    │  Client  │────▶│  API Gateway /   │────▶│  Application │
    └──────────┘     │  Load Balancer   │     │   Servers    │
                     └────────┬─────────┘     └──────────────┘
                              │
                              │ Check Rate Limit
                              ▼
                     ┌──────────────────┐
                     │  Rate Limiter    │
                     │    Service       │
                     └────────┬─────────┘
                              │
                              ▼
                     ┌──────────────────┐
                     │  Redis Cluster   │
                     │  (Distributed    │
                     │   Counters)      │
                     └──────────────────┘
                              │
            ┌─────────────────┼─────────────────┐
            ▼                 ▼                 ▼
      ┌──────────┐      ┌──────────┐      ┌──────────┐
      │  Rules   │      │  Rules   │      │  Rules   │
      │  Config  │      │  Config  │      │  Config  │
      └──────────┘      └──────────┘      └──────────┘
```
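
The rules config in the diagram is what lets different APIs carry different limits. One possible shape for it, as a sketch: a prefix-keyed table with a default. The paths, numbers, and the names `Rule`, `RULES`, and `rule_for` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    limit: int
    window_seconds: int


# Hypothetical rules; paths and limits are for illustration only.
RULES = {
    "/api/login": Rule(limit=5, window_seconds=60),
    "/api/search": Rule(limit=100, window_seconds=60),
}
DEFAULT_RULE = Rule(limit=1000, window_seconds=3600)


def rule_for(path: str) -> Rule:
    """Longest-prefix match, so /api/search/images inherits /api/search."""
    matches = [p for p in RULES if path == p or path.startswith(p + "/")]
    return RULES[max(matches, key=len)] if matches else DEFAULT_RULE


print(rule_for("/api/search/images"))
print(rule_for("/healthz"))
```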

In [None]:
# Rate Limiting Algorithms
import time
from collections import deque
from dataclasses import dataclass, field

@dataclass
class TokenBucket:
    """Token Bucket Algorithm - smooth rate limiting."""
    capacity: int           # Max tokens
    refill_rate: float      # Tokens per second
    tokens: float = field(init=False)       # Current tokens (starts full)
    last_refill: float = field(init=False)  # Monotonic time of last refill
    
    def __post_init__(self):
        self.tokens = float(self.capacity)
        # Monotonic clock: unaffected by wall-clock adjustments
        self.last_refill = time.monotonic()
    
    def allow_request(self, tokens_needed: int = 1) -> bool:
        self._refill()
        if self.tokens >= tokens_needed:
            self.tokens -= tokens_needed
            return True
        return False
    
    def _refill(self):
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

# Demo
bucket = TokenBucket(capacity=10, refill_rate=2)  # 10 max, 2/sec refill
print("Token Bucket (capacity=10, refill=2/sec):")
for i in range(12):
    result = bucket.allow_request()
    print(f"  Request {i+1}: {'✓ Allowed' if result else '✗ Denied'} (tokens: {bucket.tokens:.1f})")

In [None]:
# Sliding Window Log Algorithm
class SlidingWindowLog:
    """Precise but memory-intensive rate limiting."""
    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.requests = deque()  # Timestamps of requests
    
    def allow_request(self) -> bool:
        now = time.time()
        window_start = now - self.window_seconds
        
        # Remove old requests outside window
        while self.requests and self.requests[0] < window_start:
            self.requests.popleft()
        
        if len(self.requests) < self.max_requests:
            self.requests.append(now)
            return True
        return False

# Sliding Window Counter (hybrid approach)
class SlidingWindowCounter:
    """Memory-efficient approximation."""
    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
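        self.prev_count = 0
        self.curr_count = 0
        now = int(time.time())
        # Align the window start to a multiple of window_seconds
        self.curr_window_start = now - (now % window_seconds)
    
    def allow_request(self) -> bool:
        now = int(time.time())
        window_start = now - (now % self.window_seconds)
        
        # Rotate windows if needed
        if window_start != self.curr_window_start:
            # Carry the count over only when the old window is directly adjacent;
            # after a longer idle gap the previous window was empty
            if window_start - self.curr_window_start == self.window_seconds:
                self.prev_count = self.curr_count
            else:
                self.prev_count = 0
            self.curr_count = 0
            self.curr_window_start = window_start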
        
        # Weighted count
        elapsed_ratio = (now % self.window_seconds) / self.window_seconds
        weighted_count = self.prev_count * (1 - elapsed_ratio) + self.curr_count
        
        if weighted_count < self.max_requests:
            self.curr_count += 1
            return True
        return False

print("Algorithm Comparison:")
print("┌─────────────────────┬──────────────┬─────────────────┐")
print("│ Algorithm           │ Memory       │ Accuracy        │")
print("├─────────────────────┼──────────────┼─────────────────┤")
print("│ Token Bucket        │ O(1)         │ Allows bursts   │")
print("│ Sliding Window Log  │ O(n)         │ Precise         │")
print("│ Sliding Window Ctr  │ O(1)         │ ~99% accurate   │")
print("│ Fixed Window        │ O(1)         │ Edge spikes     │")
print("└─────────────────────┴──────────────┴─────────────────┘")
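
The comparison lists a fixed window counter with "edge spikes", but no implementation of it appears above. A minimal sketch (the class name and the explicit `now` parameter are illustrative choices) shows the spike: a client that times its requests around a window boundary gets roughly twice the limit within about one second.

```python
class FixedWindowCounter:
    """One counter per fixed window; resets when the window index changes."""

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.window_index = None
        self.count = 0

    def allow_request(self, now: float) -> bool:
        index = int(now // self.window_seconds)
        if index != self.window_index:
            self.window_index = index
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False


limiter = FixedWindowCounter(max_requests=5, window_seconds=60)
# 5 requests at t=59s and 5 more at t=60s fall into two different windows.
allowed = sum(limiter.allow_request(59.0) for _ in range(5))
allowed += sum(limiter.allow_request(60.0) for _ in range(5))
print(f"Allowed within ~1 second: {allowed}")  # 10, twice the limit
```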

### Redis Implementation Patterns

```python
# Token Bucket in Redis (Lua script for atomicity)
TOKEN_BUCKET_SCRIPT = """
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local requested = tonumber(ARGV[4])

local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now

-- Refill tokens
local elapsed = now - last_refill
tokens = math.min(capacity, tokens + elapsed * refill_rate)

local allowed = 0
if tokens >= requested then
    tokens = tokens - requested
    allowed = 1
end

redis.call('HSET', key, 'tokens', tokens, 'last_refill', now)
-- EXPIRE requires an integer number of seconds
redis.call('EXPIRE', key, math.ceil(capacity / refill_rate * 2))

return {allowed, tokens}
"""

# Sliding Window Log in Redis (sorted set of request timestamps)
SLIDING_WINDOW_SCRIPT = """
local key = KEYS[1]
local window = tonumber(ARGV[1])
local limit = tonumber(ARGV[2])
local now = tonumber(ARGV[3])

-- Remove old entries
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)

local count = redis.call('ZCARD', key)
if count < limit then
    redis.call('ZADD', key, now, now .. math.random())
    redis.call('EXPIRE', key, window)
    return {1, limit - count - 1}  -- allowed, remaining
end
return {0, 0}  -- denied, 0 remaining
"""
```
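
A thin wrapper can hide the argument order these scripts expect. The sketch below assumes `script` is a callable with the interface of the object returned by redis-py's `register_script`, that is `script(keys=[...], args=[...])`. The wrapper name and the stand-in `fake_script` are illustrative, so the example runs without a Redis server.

```python
import time


def check_token_bucket(script, key, capacity, refill_rate, requested=1, now=None):
    """Invoke the token-bucket script and unpack its [allowed, tokens] reply."""
    now = time.time() if now is None else now
    allowed, tokens = script(keys=[key],
                             args=[capacity, refill_rate, now, requested])
    return bool(allowed), float(tokens)


def fake_script(keys, args):
    # Stand-in for the Lua script: always starts from a full bucket.
    capacity, _refill_rate, _now, requested = args
    if requested <= capacity:
        return [1, capacity - requested]
    return [0, capacity]


print(check_token_bucket(fake_script, "rl:user:42", capacity=10,
                         refill_rate=2, now=0.0))

# With a live server (not executed here):
#   import redis
#   script = redis.Redis().register_script(TOKEN_BUCKET_SCRIPT)
#   check_token_bucket(script, "rl:user:42", capacity=10, refill_rate=2)
```

Redis converts a Lua number in a script's return value to an integer reply, so the token count a client receives has its fractional part dropped.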

### Trade-offs & Decisions

| Aspect | Consideration |
|--------|---------------|
| **Location** | API Gateway (centralized) vs Application (distributed) - Gateway preferred for consistency |
| **Granularity** | Per-user, per-IP, per-API key, global - Usually combination |
| **Failure Mode** | Fail-open (allow) vs Fail-closed (deny) - Depends on security requirements |
| **Response** | 429 Too Many Requests + Retry-After header |
| **Sync** | Eventual consistency OK for most cases (slight over-limit acceptable) |
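
The failure-mode row can be made explicit in code. A minimal sketch, assuming the limiter check is any callable that may raise when its backend (for example Redis) is unreachable; `allow_with_fallback` and the `fail_open` flag are illustrative names.

```python
import logging

logger = logging.getLogger("rate_limiter")


def allow_with_fallback(check, key: str, fail_open: bool = True) -> bool:
    """Run check(key); if the limiter backend errors, apply the failure mode."""
    try:
        return bool(check(key))
    except Exception:
        # Broad catch on purpose: client libraries raise their own error types.
        logger.warning("rate limiter unavailable for %s (fail_open=%s)",
                       key, fail_open)
        return fail_open


def broken_backend(key):
    raise ConnectionError("limiter backend down")


print(allow_with_fallback(broken_backend, "user:42"))                   # True
print(allow_with_fallback(broken_backend, "user:42", fail_open=False))  # False
```

Fail-open keeps the API available during a limiter outage at the cost of temporarily unenforced limits. Fail-closed is the stricter choice for endpoints such as login.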

---
## 3. News Feed System

### Functional Requirements
- Users can create posts (text, images, videos)
- Users follow other users
- Home feed shows posts from followed users
- Feed is sorted by relevance/time
- Support likes, comments, shares

### Non-Functional Requirements
- Feed generation: <500ms
- Highly available (99.99%)
- Scale: 500M DAU, avg 200 follows per user
- Eventually consistent (few seconds delay OK)

In [None]:
# Capacity Estimation - News Feed

dau = 500_000_000  # 500M daily active users
avg_follows = 200
posts_per_user_per_day = 2
feed_refreshes_per_day = 10

# Post creation traffic
posts_per_day = dau * posts_per_user_per_day
post_write_qps = posts_per_day / 86400

# Feed read traffic
feed_reads_per_day = dau * feed_refreshes_per_day
feed_read_qps = feed_reads_per_day / 86400

print(f"Posts created/day: {posts_per_day:,.0f}")
print(f"Post write QPS: {post_write_qps:,.0f}")
print(f"Feed read QPS: {feed_read_qps:,.0f}")

# Fan-out estimation (averaged over all users, followers per user equals follows per user)
avg_followers = avg_follows
fanout_writes_per_second = post_write_qps * avg_followers
print(f"\nFan-out writes/sec (push model): {fanout_writes_per_second:,.0f}")

# Storage (posts for 30 days)
post_size_kb = 5  # Average including metadata
posts_30_days = posts_per_day * 30
storage_tb = (posts_30_days * post_size_kb * 1024) / (1024**4)
print(f"\nPost storage (30 days): {storage_tb:.1f} TB")

### High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                              CLIENTS                                     │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                           LOAD BALANCER                                  │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Post Service  │    │   Feed Service  │    │  User Service   │
│   (Write Path)  │    │   (Read Path)   │    │  (Graph/Follow) │
└────────┬────────┘    └────────┬────────┘    └────────┬────────┘
         │                      │                      │
         │                      ▼                      │
         │             ┌─────────────────┐             │
         │             │  Feed Ranking   │             │
         │             │    Service      │             │
         │             └────────┬────────┘             │
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                            CACHE LAYER                                   │
│    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐                │
│    │ Post Cache  │    │ Feed Cache  │    │ User Cache  │                │
│    │   (Redis)   │    │   (Redis)   │    │   (Redis)   │                │
│    └─────────────┘    └─────────────┘    └─────────────┘                │
└─────────────────────────────────────────────────────────────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         ▼                       ▼                       ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Posts DB      │    │   Feed Store    │    │  Social Graph   │
│  (Cassandra)    │    │    (Redis)      │    │   (Neo4j/SQL)   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

### Fan-out Strategies

```
PUSH (Fan-out on Write)                    PULL (Fan-out on Read)
========================                    =====================

User A posts                               User B requests feed
     │                                          │
     ▼                                          ▼
┌─────────────┐                           ┌─────────────┐
│ Get all     │                           │ Get followed│
│ followers   │                           │ users list  │
└──────┬──────┘                           └──────┬──────┘
       │                                         │
       ▼                                         ▼
┌─────────────┐                           ┌─────────────┐
│ Write post  │                           │ Fetch posts │
│ to each     │                           │ from each   │
│ follower's  │                           │ followed    │
│ feed cache  │                           │ user        │
└─────────────┘                           └──────┬──────┘
                                                 │
                                                 ▼
                                          ┌─────────────┐
                                          │ Merge & Rank│
                                          │ in memory   │
                                          └─────────────┘

✓ Fast reads                              ✓ No wasted writes
✓ Pre-computed feed                       ✓ Fresh content
✗ Slow writes for celebrities             ✗ Slow reads
✗ Wasted storage                          ✗ High read amplification
```
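
The "Merge & Rank" step of the pull model is essentially a k-way merge of per-author timelines. A minimal sketch, assuming each author's posts are already sorted newest first and represented as `(timestamp, post_id)` tuples (`pull_feed` and the sample data are illustrative):

```python
import heapq
from itertools import islice


def pull_feed(following, posts_by_author, limit=20):
    """Merge per-author timelines (each newest first) into one feed."""
    timelines = (posts_by_author.get(author, []) for author in following)
    # heapq.merge streams the inputs lazily, so only `limit` items are consumed.
    merged = heapq.merge(*timelines, key=lambda post: post[0], reverse=True)
    return list(islice(merged, limit))


posts_by_author = {
    "alice": [(105, "a3"), (101, "a2"), (90, "a1")],
    "bob": [(103, "b2"), (100, "b1")],
}
print(pull_feed(["alice", "bob", "carol"], posts_by_author, limit=4))
# [(105, 'a3'), (103, 'b2'), (101, 'a2'), (100, 'b1')]
```

The read amplification noted above shows up here as one timeline fetch per followed user on every feed request.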

In [None]:
# Hybrid Fan-out Strategy Implementation

CELEBRITY_THRESHOLD = 10000  # Followers

class HybridFanout:
    """Push for regular users, Pull for celebrities."""
    
    def __init__(self):
        self.feed_cache = {}      # user_id -> list of post_ids
        self.posts_db = {}        # post_id -> post_data
        self.followers = {}       # user_id -> set of follower_ids
        self.celebrities = set()  # Users with many followers
    
    def create_post(self, user_id: str, post: dict) -> None:
        post_id = f"post_{len(self.posts_db)}"
        post['author_id'] = user_id
        self.posts_db[post_id] = post
        
        if user_id in self.celebrities:
            # Celebrity: Don't fan out, will be pulled on read
            print(f"Celebrity post {post_id} - stored, no fan-out")
        else:
            # Regular user: Push to all followers' caches
            followers = self.followers.get(user_id, set())
            for follower in followers:
                if follower not in self.feed_cache:
                    self.feed_cache[follower] = []
                self.feed_cache[follower].insert(0, post_id)
            print(f"Regular post {post_id} - pushed to {len(followers)} followers")
    
    def get_feed(self, user_id: str, following: list) -> list:
        # Get pre-computed feed (from push)
        feed_posts = self.feed_cache.get(user_id, [])[:50]
        
        # Pull celebrity posts
        celebrity_posts = []
        for celeb in [u for u in following if u in self.celebrities]:
            celeb_posts = [pid for pid, p in self.posts_db.items() 
                          if p['author_id'] == celeb]
            # Keep the 10 most recent posts (dicts preserve insertion order)
            celebrity_posts.extend(celeb_posts[-10:])
        
        # Merge and rank
        all_posts = list(set(feed_posts + celebrity_posts))
        return sorted(all_posts, key=lambda x: self.posts_db[x].get('timestamp', 0), 
                     reverse=True)[:20]

# Demo
fanout = HybridFanout()
fanout.followers['user_a'] = {'user_b', 'user_c'}  # Regular user
fanout.followers['celeb'] = set(f'user_{i}' for i in range(15000))  # Celebrity
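# Flag accounts whose follower count exceeds the threshold
for uid, follower_ids in fanout.followers.items():
    if len(follower_ids) > CELEBRITY_THRESHOLD:
        fanout.celebrities.add(uid)

fanout.create_post('user_a', {'content': 'Hello!', 'timestamp': 1})
fanout.create_post('celeb', {'content': 'Celeb post!', 'timestamp': 2})

# user_b follows both accounts: one post arrives via push, the other via pull
print(fanout.get_feed('user_b', following=['user_a', 'celeb']))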

### Feed Ranking

```python
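# Simplified ranking score calculation
from datetime import datetime, timezone
from math import log

def get_interaction_score(user_id, author_id):
    # Placeholder: a real system would look up how often user_id
    # engages with author_id; 1.0 means neutral affinity
    return 1.0

def calculate_score(post, user_id):
    # Engagement signals (log-damped so very popular posts don't dominate)
    likes_score = log(post.likes + 1) * 2
    comments_score = log(post.comments + 1) * 3
    shares_score = log(post.shares + 1) * 4
    
    # Recency decay (hyperbolic: the factor drops to 0.5 at 6 hours)
    # Assumes post.created_at is a timezone-aware datetime
    age = datetime.now(timezone.utc) - post.created_at
    age_hours = age.total_seconds() / 3600
    recency_score = 1 / (1 + age_hours / 6)
    
    # Affinity (user-author relationship)
    affinity = get_interaction_score(user_id, post.author_id)
    
    # Content type boost
    type_boost = {'video': 1.5, 'image': 1.2, 'text': 1.0}[post.type]
    
    return (likes_score + comments_score + shares_score) \
           * recency_score * affinity * type_boost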
```

### Database Choices

| Data | Storage | Rationale |
|------|---------|----------|
| **Posts** | Cassandra / DynamoDB | High write throughput, time-series partitioning |
| **Feed Cache** | Redis (Sorted Sets) | Fast reads, automatic ordering, TTL |
| **Social Graph** | Neo4j / TAO (FB) | Efficient traversal, friend-of-friend queries |
| **User Profiles** | PostgreSQL | Strong consistency, ACID for auth |
| **Media** | S3 + CDN | Blob storage, global distribution |

### Key Trade-offs Summary

| Decision | Trade-off |
|----------|----------|
| **Push vs Pull** | Hybrid: Push for regular users (fast reads), Pull for celebrities (avoid hot partitions) |
| **Consistency** | Eventual consistency is acceptable; users won't notice a 1-2 second delay |
| **Feed Length** | Cache last 800 posts per user, older posts fetched from DB |
| **Ranking** | Real-time ranking expensive - pre-compute + lightweight re-ranking |
| **Sharding** | Shard by user_id for feed cache, by post_id for posts DB |
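
The 800-post cap also bounds feed cache memory. A rough estimate, assuming each cached entry is just a post ID plus a score (16 bytes is an assumed figure that ignores data-structure overhead, which in Redis sorted sets is substantial), together with a sketch of trimming on push. The "TB" figure divides by 1024**4, matching the earlier capacity cells.

```python
dau = 500_000_000            # from the capacity estimate
cached_posts_per_user = 800  # from the trade-offs table
bytes_per_entry = 16         # assumed: 8-byte post ID + 8-byte score

feed_cache_bytes = dau * cached_posts_per_user * bytes_per_entry
print(f"Feed cache (raw IDs only): {feed_cache_bytes / 1024**4:.1f} TB")


def push_to_feed(feed: list, post_id: str,
                 max_len: int = cached_posts_per_user) -> None:
    """Prepend a post and drop anything beyond the cap."""
    feed.insert(0, post_id)
    del feed[max_len:]


feed = []
for i in range(5):
    push_to_feed(feed, f"post_{i}", max_len=3)
print(feed)  # ['post_4', 'post_3', 'post_2']
```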

---
## Quick Reference: Common Patterns

| Pattern | Use Case | Example |
|---------|----------|--------|
| **Write-behind cache** | High write volume | News feed cache |
| **Read-through cache** | Frequent reads | URL shortener lookups |
| **Sharding** | Horizontal scale | User ID hash-based |
| **Async processing** | Decoupling | Fan-out via message queue |
| **Circuit breaker** | Fault tolerance | Rate limiter fallback |
| **Bloom filter** | Existence check | "Have I seen this URL?" |
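
The Bloom filter row can be illustrated with a minimal sketch. It answers "definitely not seen" or "possibly seen", never the reverse. The bit-array size, hash count, and double-hashing scheme below are assumptions chosen for brevity, not tuned values.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
seen.add("https://example.com/a")
print(seen.might_contain("https://example.com/a"))  # True
print(seen.might_contain("https://example.com/b"))
```

A negative answer lets the URL shortener skip the database lookup entirely. A positive answer still needs to be confirmed against the database because false positives are possible.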