# Overview (Example: Twitter)

## Requirement clarification

- Do users post tweets and follow other people?
- Do users search tweets?
- Need to display user timeline?
- Need to push notification on new tweets?
- Need to display trending topics?
- Do tweets contain photos and videos?
- Are we discussing backend only or frontend too?

## System estimation

- Number of tweets? Number of users?
- Size of storage?
- Network bandwidth?

## System interface

- What APIs are expected?

## Data model

- User (UserID, Name, Email, CreationDate, LastLogin, etc)
- Tweet (TweetID, Content, TimeStamp, etc)
- UserFollow (UserID1, UserID2)

## High-level design

- Multiple app servers to do read/write with LoadBalancer in front.
- Efficient DB to store all tweets and support large number of reads.
- Distributed file storage to store photos and videos.

## Detailed design

- How to partition data and distribute it to multiple DBs.
- How to handle "hot" users.
- At which layer should we introduce cache. 

## Identify bottleneck

- Is there a single point of failure?
- Do we have replicas of data?
- Do we have redundant app servers (an availability set)?
- Do we monitor servers and get alerts?

# TinyURL

- Creates a short link (alias) for a long URL, which saves space. The short URL must not be guessable.
- When users click the short link, they are redirected to the original URL.

## Functional requirements

- Given a URL, generate a short version.
- When users click the short link, redirect them to the original link.
- Links should expire after a certain time.

## Non-functional requirements

- High availability.
- Redirection should have minimum latency.
- Short link must not be guessable.

## Extended requirements

- Service should be accessible via REST API

## Design

Encode the URL
- Generate a short key that is appended to the end of the service's base URL.
- Assume base64 encoding.
    - 6 letters: 64^6 = 68.7 billion possible strings.
    - 8 letters: 64^8 = 281 trillion possible strings.
- Duplication: if multiple users enter the same URL, they get the same shortened URL, which is undesirable.
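
As a rough illustration of the encoding idea, the sketch below hashes the URL and keeps the first 6 URL-safe base64 characters. The choice of SHA-256, the `short_key` name, and the per-user salt used to avoid the duplication issue are assumptions made for this example, not part of the notes.

```python
import base64
import hashlib

ALPHABET_SIZE = 64  # URL-safe base64: A-Z, a-z, 0-9, '-', '_'


def short_key(original_url: str, user_id: str = "", length: int = 6) -> str:
    """Hash the URL (plus an optional per-user salt) and keep `length` base64 characters."""
    digest = hashlib.sha256((user_id + original_url).encode("utf-8")).digest()
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    return encoded[:length]


# Size of the key space for 6- and 8-character keys.
keyspace_6 = ALPHABET_SIZE ** 6  # 68,719,476,736 (about 68.7 billion)
keyspace_8 = ALPHABET_SIZE ** 8  # 281,474,976,710,656 (about 281 trillion)

same_url = "https://example.com/some/long/path"
key_alice = short_key(same_url, user_id="alice")  # salted with a hypothetical user ID
key_bob = short_key(same_url, user_id="bob")
```

Truncating a hash can produce collisions, so a real service would still need to check whether a key is already taken before using it.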
         
Key generating service
- Generates random 6-letter strings and stores them in a key DB.
- What if two or more servers try to use the same key?
    - Keep one table for keys not in use and one table for keys in use.
    - Load some keys into memory to hand out to the servers, and at the same time move them to the "Used" table.
        - If a server dies, we lose the keys it was holding. That is okay given we have enough keys to cover shortening requests.
- DB size: 6 (characters per key) * 68.7 billion (unique keys) = 412GB
- Single point of failure? Use replicas.
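
The two-table idea can be sketched in memory as follows. `KeyService`, `take_batch`, and the use of Python sets in place of DB tables are illustrative assumptions; the lock stands in for whatever mechanism keeps two servers from claiming the same key.

```python
import secrets
import string
import threading

ALPHABET = string.ascii_letters + string.digits + "-_"  # 64 characters


class KeyService:
    """In-memory stand-in for the key DB with an 'unused' and a 'used' table."""

    def __init__(self, pool_size, key_length=6):
        self._lock = threading.Lock()
        self.unused = set()
        self.used = set()
        # Pre-generate random keys; the set silently drops any duplicates.
        while len(self.unused) < pool_size:
            self.unused.add("".join(secrets.choice(ALPHABET) for _ in range(key_length)))

    def take_batch(self, count):
        """Move up to `count` keys to 'used' and hand them to one app server."""
        with self._lock:  # only one caller at a time can claim keys
            batch = [self.unused.pop() for _ in range(min(count, len(self.unused)))]
            self.used.update(batch)
            return batch


service = KeyService(pool_size=100)
batch_a = service.take_batch(10)  # keys cached by a first app server
batch_b = service.take_batch(10)  # keys cached by a second app server
```

Because keys are marked as used the moment they are handed out, a server that crashes simply wastes its batch, which matches the trade-off described above.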
     
## Capacity estimate

Traffic
- 500M URL shortening per month.
- 100:1 read:write ratio.
- Number of redirections per month: 100 * 500M = 50B
- Queries per second (new URLs): 500M / (30 days * 24 hrs * 3600 seconds) ≈ 200 URLs/s
- URL redirections per second: 100 * 200 URLs/s = 20k URLs/s

Storage
- Assume we store each URL shortening request for 5 yrs.
- Total number of objects to store: 500M * 5 yrs * 12 months = 30 billion
- Assume each object is around 500 bytes.
- Total storage: 30 billion * 500 bytes = 15TB

Bandwidth
- Write (Queries per second is 200 URLs): 200 * 500 bytes = 100 kB/s
- Read (URL redirections per second is 20k URLs): 20k * 500 bytes = 10 MB/s

Memory
- Assume 20% of URLs generate 80% of traffic. (Hot URLs)
- Requests per day: 20k URLs/s * 3600 seconds * 24 hrs ≈ 1.7 billion
- To cache 20% of these requests: 0.2 * 1.7 billion * 500 bytes = 170 GB
    - There will be duplicate requests for the same URLs, so the actual required memory will be less.
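
The estimates above can be re-derived with a few lines of arithmetic. This is only a check of the numbers in this section, using decimal units (1 TB = 10^12 bytes); the variable names are made up for the example.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600

new_urls_per_month = 500_000_000
read_write_ratio = 100
object_bytes = 500
retention_months = 5 * 12

writes_per_s = new_urls_per_month / SECONDS_PER_MONTH   # ~193, rounded to 200 above
reads_per_s = read_write_ratio * 200                    # 20,000, using the rounded figure
total_objects = new_urls_per_month * retention_months   # 30 billion
storage_tb = total_objects * object_bytes / 1e12        # 15 TB
write_bw_kb = 200 * object_bytes / 1e3                  # 100 KB/s
read_bw_mb = reads_per_s * object_bytes / 1e6           # 10 MB/s
requests_per_day = reads_per_s * SECONDS_PER_DAY        # 1.728 billion
cache_gb = 0.2 * requests_per_day * object_bytes / 1e9  # ~173 GB, rounded to 170 GB above
```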
    
## API

- createURL(api_dev_key, original_url, expire_date)
    - api_dev_key: API developer key of registered account.
    - original_url: URL to be shortened.
    - expire_date: optional. If not specified, default to some value. 
- deleteURL(api_dev_key, short_url)
    - short_url: shortened URL.
- To prevent abuse, limit each "api_dev_key" to a certain number of creations and redirections per time period.
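
A minimal in-memory sketch of the two calls is shown below. The snake_case function names, the one-year default expiry, and the tiny lifetime quota (instead of a per-time-period quota) are simplifying assumptions for illustration only.

```python
import secrets
from datetime import datetime, timedelta, timezone

DEFAULT_TTL = timedelta(days=365)  # assumed default when expire_date is omitted
MAX_CREATIONS_PER_KEY = 3          # assumed quota, kept tiny for the demo

urls = {}       # short_url -> (original_url, expire_date, api_dev_key)
creations = {}  # api_dev_key -> number of URLs created so far


def create_url(api_dev_key, original_url, expire_date=None):
    """Store a new mapping and return its short key, enforcing the quota."""
    if creations.get(api_dev_key, 0) >= MAX_CREATIONS_PER_KEY:
        raise PermissionError("creation quota exceeded for this api_dev_key")
    if expire_date is None:
        expire_date = datetime.now(timezone.utc) + DEFAULT_TTL
    short_url = secrets.token_urlsafe(6)[:6]  # random 6-character key
    urls[short_url] = (original_url, expire_date, api_dev_key)
    creations[api_dev_key] = creations.get(api_dev_key, 0) + 1
    return short_url


def delete_url(api_dev_key, short_url):
    """Delete a mapping; only the key that created it may do so."""
    record = urls.get(short_url)
    if record is None or record[2] != api_dev_key:
        return False
    del urls[short_url]
    return True
```

A production version would take keys from the key generating service rather than generating them inline, and would reset the counters per time window.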
   
## DB

- Billions of records.
- Each object is small (about 500 bytes).
- Read heavy.
- Since there are no relationships between records other than which user created a URL, a NoSQL store is a reasonable choice.
- Schema "URL"
    - Hash(varchar 16, PK)
    - OriginalURL(varchar)
    - CreationDate(datetime)
    - ExpirationDate(datetime)
    - UserID(int) 
- Schema "User"
    - UserID(int)
    - Name(varchar)
    - Email(varchar)
    - CreationDate(datetime)
    - LastLogin(datetime)

## Data partitioning

- Consistent hashing
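
One common way to implement consistent hashing is a hash ring with virtual nodes, sketched below. The `HashRing` class, the MD5 hash, the 100 virtual nodes per server, and the `db1`..`db4` names are all assumptions for the example.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring `replicas` times (virtual nodes).
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        # First virtual node clockwise from the key's position, wrapping around.
        index = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._ring[index][1]


keys = [f"key{i}" for i in range(1000)]
before = HashRing(["db1", "db2", "db3"])
after = HashRing(["db1", "db2", "db3", "db4"])
# Keys whose partition changed after adding db4.
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
```

The point of the technique is that adding `db4` only moves keys onto `db4`; keys never shuffle between the existing partitions.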

## Caching

- LRU
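
A small sketch of an LRU cache built on `collections.OrderedDict` follows. The `LRUCache` class and the sample keys are illustrative, not taken from the notes.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, short_url):
        if short_url not in self._items:
            return None
        self._items.move_to_end(short_url)  # mark as most recently used
        return self._items[short_url]

    def put(self, short_url, original_url):
        self._items[short_url] = original_url
        self._items.move_to_end(short_url)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.put("aaa111", "https://example.com/1")
cache.put("bbb222", "https://example.com/2")
cache.get("aaa111")  # touch aaa111, so bbb222 becomes the oldest entry
cache.put("ccc333", "https://example.com/3")  # evicts bbb222
```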

## Load balancing

- Round robin
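
Round robin simply hands each new request to the next server in a fixed order. A minimal sketch, with hypothetical server names:

```python
import itertools

servers = ["app1", "app2", "app3"]      # hypothetical app server names
next_server = itertools.cycle(servers)  # endless iterator over the servers, in order

# Assign seven incoming requests to servers in turn.
assignments = [next(next_server) for _ in range(7)]
```

Plain round robin ignores server load, so a slow server keeps receiving the same share of requests as a fast one.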

## DB cleanup

- Lazy cleanup
    - When a user accesses an expired link, delete it from the DB and return an error to the user.
    - Put the key back into the key DB.
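
Lazy cleanup can be sketched as a check performed on every lookup. The dictionaries below stand in for the URL table and the key DB; `resolve` and the sample keys are names invented for this example.

```python
from datetime import datetime, timedelta, timezone

url_table = {}       # short key -> (original_url, expiration_date)
unused_keys = set()  # stand-in for the "not in use" table of the key DB


def resolve(short_key, now=None):
    """Return the original URL, or None if the link is missing or expired."""
    now = now or datetime.now(timezone.utc)
    record = url_table.get(short_key)
    if record is None:
        return None
    original_url, expiration_date = record
    if expiration_date <= now:
        del url_table[short_key]    # delete the expired link on access
        unused_keys.add(short_key)  # put the key back for reuse
        return None                 # the caller turns this into an error response
    return original_url


now = datetime.now(timezone.utc)
url_table["abc123"] = ("https://example.com/old", now - timedelta(days=1))
url_table["xyz789"] = ("https://example.com/new", now + timedelta(days=1))
```

Links that are never accessed again are never deleted by this scheme, so it is usually paired with an occasional background sweep.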

# Design Pastebin

- Store texts and access the data using URLs.

## Capacity estimate

- There will be more reads than new paste creations. Assume a 5:1 ratio.
- Traffic: assume 1 million pastes per day (then, 5 million reads per day)
    - New pastes per second: 1M / (24 hrs * 3600 seconds) = 12 pastes/s
    - Paste reads per second: 5M / (24 hrs * 3600 seconds) = 58 pastes/s
- Storage: users can upload max 10MB data. Assume on average 10KB data.
    - We store 1M * 10KB = 10GB per day.
    - Storing this data for 10 yrs requires about 36TB.
    - With a 70% capacity model (we don't use more than 70% of capacity at any point), we need 51.4TB.
- Bandwidth: 
    - With 12 pastes/s writes, we need 12/s * 10KB = 120KB/s ingress.
    - With 58 pastes/s reads, we need 58/s * 10KB = 580KB/s egress.
- Memory: assume 20/80 rule 
    - 0.2 * 5M * 10KB = 10GB of memory needed for caching.
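
The same numbers can be re-derived in a few lines, using decimal units (1 GB = 10^6 KB). Variable names are made up for this check.

```python
SECONDS_PER_DAY = 24 * 3600
pastes_per_day = 1_000_000
reads_per_day = 5 * pastes_per_day
avg_paste_kb = 10

writes_per_s = pastes_per_day / SECONDS_PER_DAY            # ~11.6, rounded to 12
reads_per_s = reads_per_day / SECONDS_PER_DAY              # ~57.9, rounded to 58
storage_per_day_gb = pastes_per_day * avg_paste_kb / 1e6   # 10 GB
storage_10y_tb = storage_per_day_gb * 365 * 10 / 1e3       # 36.5 TB
provisioned_tb = 36 / 0.7                                  # ~51.4 TB, from the rounded 36 TB
cache_gb = 0.2 * reads_per_day * avg_paste_kb / 1e6        # 10 GB
```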
    
## System API

- addPaste(api_dev_key, paste_data)
    - api_dev_key: API developer key of registered account
    - paste_data: text data to paste.
- getPaste(api_dev_key, api_paste_key)
    - api_paste_key: string representing the paste key of paste to be retrieved.
- deletePaste(api_dev_key, api_paste_key)

## Database

- Billions of records.
- Each object is medium sized (max 10MB)
- Read heavy
- No relations
- Schema "Paste": Hash(varchar 16, PK), ContentKey(varchar), CreationDate(datetime), ExpirationDate(datetime), UserID(int)
- Schema "User": UserID(int), Name(varchar), Email(varchar), CreationDate(datetime), LastLogin(datetime)

# Design Instagram

## Capacity estimate

- Assume 
    - 500M total users with 1M daily active users.
    - 2M new photos every day (23 new photos / s)
    - Average photo size 200KB
- Space for 1 day's worth of photos: 2M * 200KB = 400GB
- For 10 yrs: 400GB * 365 * 10 = 1,460,000GB, which is about 1425TB at 1TB = 1024GB

## Database

- Schema "Photo": PhotoID(int, PK), UserID(int), PhotoPath(varchar), PhotoLatitude(int), PhotoLongitude(int), UserLatitude(int), UserLongitude(int), CreationDate(datetime)
- Schema "User": UserID(int, PK), Name(varchar), Email(varchar), CreationDate(datetime), LastLogin(datetime)
- Schema "UserFollow": FollowerID(int, PK), FolloweeID(int, PK)

## Data size

- Assume "int" and "datetime" are 4 bytes.
- User 
    - UserID(4 bytes) + Name(20 bytes) + Email(32 bytes) + DateOfBirth(4 bytes) + CreationDate(4 bytes) + LastLogin(4 bytes) = 68 bytes
    - With 500M users, we need 68 * 500M = 32GB
- Photo
    - PhotoID(4 bytes) + UserID(4 bytes) + PhotoPath(256 bytes) + PhotoLatitude(4 bytes) + PhotoLongitude(4 bytes) + UserLatitude(4 bytes) + UserLongitude(4 bytes) + CreationDate(4 bytes) = 284 bytes
    - With 2M photos every day, we need 284 * 2M = 0.5GB per day
    - For 10 yrs, we need 1.88TB
- UserFollow
    - Assume each user follows 500 other users and each row in UserFollow table is 8 bytes: 500M * 500 * 8 bytes = 1.82TB
- Total space: 32GB + 1.88TB + 1.82TB = 3.7TB
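
The figures in this section line up when sizes are converted with binary units (1 GB = 2^30 bytes, 1 TB = 2^40 bytes) and the UserFollow table is sized for all 500M users. The short check below works under those two assumptions; the variable names are invented for the example.

```python
GIB = 2 ** 30
TIB = 2 ** 40

user_row = 4 + 20 + 32 + 4 + 4 + 4           # 68 bytes
photo_row = 4 + 4 + 256 + 4 + 4 + 4 + 4 + 4  # 284 bytes
follow_row = 8                               # two 4-byte IDs

users = 500_000_000
photos_per_day = 2_000_000
follows_per_user = 500

user_gib = user_row * users / GIB                            # ~31.7, rounded to 32GB
photo_day_gib = photo_row * photos_per_day / GIB             # ~0.53, rounded to 0.5GB
photo_10y_tib = photo_row * photos_per_day * 365 * 10 / TIB  # ~1.89
follow_tib = follow_row * follows_per_user * users / TIB     # ~1.82
total_tib = user_gib / 1024 + photo_10y_tib + follow_tib     # ~3.7
```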

## Component design

- Split read and write services so that slow writes such as "uploads" don't hog the system.

# Design Dropbox

- Store data on remote servers.
- Read and write volumes will both be huge. (assume they are about equal)
- Files will be stored in small chunks (assume 4MB)

## Capacity estimate

- Assume 500M total users, 100M daily active users.
- Assume each user connects from three different devices.
- Assume each user has 200 files/photos, so we have 500M * 200 = 100 billion total files.
- Assume average file size is 100KB. We have 100B * 100KB = 10PB
- Assume 1M active connections per minute.

## High-level design

- Need to store file metadata (name, size, path, who it is shared with, etc)

## Component design

- Client
    - Internal metadata DB: keeps metadata in the client to save round-trips to update remote metadata.
    - Chunker: splits files into small pieces.
    - Watcher: monitors the workspace and notifies the Indexer of any user actions. Also listens for changes made by other clients.
    - Indexer: processes events from the Watcher and updates the internal metadata DB. Also updates the remote DB by talking to the remote sync service.
- Metadata DB
- Sync service
- Message queue
    - Handles communication between clients and sync service.
- Block storage
    - Stores chunks of files.
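
A rough sketch of what the Chunker might do is shown below. Hashing each chunk with SHA-256 to detect which chunks changed is an assumption added for illustration (the notes only say files are split into pieces), and the function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB


def chunk_bytes(data, chunk_size=CHUNK_SIZE):
    """Yield (sha256_hex, chunk) pairs for consecutive chunks of data."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield hashlib.sha256(chunk).hexdigest(), chunk


def changed_chunks(old, new, chunk_size=CHUNK_SIZE):
    """Return indexes of chunks in `new` whose hash differs from the same index in `old`."""
    old_hashes = [h for h, _ in chunk_bytes(old, chunk_size)]
    changed = []
    for index, (h, _) in enumerate(chunk_bytes(new, chunk_size)):
        if index >= len(old_hashes) or old_hashes[index] != h:
            changed.append(index)
    return changed


# Tiny demo with 10-byte chunks so the data stays readable.
original = b"a" * 10 + b"b" * 10 + b"c" * 5
edited = b"a" * 10 + b"X" * 10 + b"c" * 5
changed = changed_chunks(original, edited, chunk_size=10)
```

With per-chunk hashes, the client only needs to upload the chunks that changed rather than the whole file.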

# Design Facebook messenger

## Functional requirement

- Support 1-on-1 conversation between users.
- Tracks online/offline status of users.
- Persists chat history.

## Non-functional requirement

- Minimum latency when chatting.
- Consistency: same chat history from all devices.
- High availability

## Capacity estimate

- Assume 500M daily active users.
- Assume each user sends 40 messages per day.
- 500M * 40 messages = 20B messages per day.

Storage
- Assume 100 bytes per message.
- 20B messages per day * 100 bytes = 2TB per day.

Bandwidth
- 2TB / 86400s ≈ 23MB/s

## High-level design

When user A sends a message to user B:
- Server receives the message and sends an ack back to A.
- Server stores the message into DB and sends message to B.
- B receives the message and sends ack back to the server.
- Server notifies A that the message has been delivered.
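
The four steps can be traced with a toy, single-process model. `ChatServer` and its event names are illustrative only; a real system would perform these steps asynchronously over network connections.

```python
class ChatServer:
    """Toy stand-in for the chat server; all names are illustrative."""

    def __init__(self):
        self.history = []  # stand-in for the message DB
        self.inboxes = {}  # recipient -> list of (sender, text)
        self.events = []   # ordered log of acks and notifications

    def send(self, sender, recipient, text):
        self.events.append(("server_ack", sender))          # 1. server acks A
        self.history.append((sender, recipient, text))      # 2. store the message
        self.inboxes.setdefault(recipient, []).append((sender, text))  # and send it to B
        self.events.append(("recipient_ack", recipient))    # 3. B acks the server
        self.events.append(("delivered", sender))           # 4. server notifies A


server = ChatServer()
server.send("A", "B", "hello")
```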

## Detailed design

# Design Twitter

## Functional requirement

- Users post new tweets.
- Users follow other users.
- Users mark tweets as favorites (favs).
- Display the user's timeline with top tweets.
- Tweets contain photos and videos.

## Non-functional requirement

- High availability.
- 200 ms latency for timeline generation.

## Capacity estimate

- Assume 
    - 1B total users.
    - 200M daily active users.
    - 100M new tweets every day.
    - Each user follows 200 people.
    - Each user favs 5 tweets per day.
- 200M * 5 favs = 1B favs per day.

## API

- tweet(api_dev_key, tweet_data, tweet_location, user_location, media_ids)

## High-level design

- Client -> load balancer -> app servers -> DB & file storage

## DB

Tweet
- TweetID (int, pk)
- UserID (int)
- Content (varchar)
- CreationDate (datetime)
- NumFavs (int)

User
- UserID (int, pk)
- Name (varchar)
- Email (varchar)
- DateOfBirth (datetime)
- CreationDate (datetime)
- LastLogin (datetime)

UserFollow
- UserID1 (int, pk)
- UserID2 (int, pk)

Favorite
- TweetID (int, pk)
- UserID (int, pk)
- CreationDate (datetime)

# Design Youtube

## Functional requirement

- Users upload videos.
- Users share and view videos.
- Users search videos.
- Tracks likes/dislikes, number of views.
- Users add and view comments.

## Non-functional requirement

- High availability.
- High reliability.
- Users should not feel lag watching videos.

## Capacity estimate

- Assume
    - 1.5B total users.
    - 800M daily active users.
    - Users watch 5 videos per day.
    - Ratio of upload:view is 1:200. 
- Video views per second: 800M * 5 / 86400s = 46K videos/s.
- Video uploads per second: 46K / 200 = 230 videos/s.

## API

- uploadVideo(api_dev_key, video_title, video_description, tags[], category_id, default_language, recording_details, video_contents)
- searchVideo(api_dev_key, search_query, user_location, max_videos_to_return)
- streamVideo(api_dev_key, video_id, offset, codec, resolution)

## High-level design

- Client <-> web server <-> app server <-> DB / video storage.

## DB

# Design Typeahead Suggestion

## Functional requirement

- Suggests top 10 terms starting with whatever user has typed.
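
One possible data structure for this is a trie keyed by character, with a frequency stored for each term. The sketch below assumes "top" means highest frequency and that terms never contain the `$` marker character; the class and method names are invented for the example.

```python
import heapq


class Typeahead:
    def __init__(self):
        self._root = {}

    def add(self, term, frequency):
        node = self._root
        for ch in term:
            node = node.setdefault(ch, {})
        node["$"] = (term, frequency)  # "$" marks the end of a term

    def suggest(self, prefix, limit=10):
        node = self._root
        for ch in prefix:
            if ch not in node:
                return []
            node = node[ch]
        # Collect every term under the prefix, then keep the most frequent ones.
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            for key, child in current.items():
                if key == "$":
                    found.append(child)
                else:
                    stack.append(child)
        best = heapq.nlargest(limit, found, key=lambda item: item[1])
        return [term for term, _ in best]


index = Typeahead()
for term, freq in [("cat", 5), ("car", 9), ("card", 7), ("dog", 3)]:
    index.add(term, freq)
suggestions = index.suggest("ca")
```

Walking the whole subtree on every keystroke is too slow at scale; a common refinement is to precompute and store the top terms at each trie node.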

## Non-functional requirement

- Users see suggestions within 200ms.

# Design API Rate Limiter

- Prevents DoS attacks, brute-force password attempts, etc.
- Throttling: controls the usage of the API. When the throttle limit is reached, the server returns "429 Too Many Requests".

## Functional requirement

- Limit the number of requests (e.g. 15 requests per second)
- APIs are accessible through a cluster.

## Non-functional requirement

- High availability.
- Low latency.

## High-level design

- Client <-> web server <-> rate limiter
- Client <-> web server <-> API server
- Web server first asks rate limiter whether request should be served or throttled.
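
The decision the rate limiter makes can be sketched with a fixed-window counter, which is one of several possible algorithms and is chosen here only for brevity. The class name, the per-client dictionary, and the injectable `clock` parameter are assumptions for the example.

```python
import time


class FixedWindowRateLimiter:
    def __init__(self, limit=15, window_seconds=1.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        self._windows = {}  # client_id -> (window_start, request_count)

    def allow(self, client_id):
        """Return True to serve the request, False to throttle it."""
        now = self._clock()
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # a new window begins
        if count >= self.limit:
            self._windows[client_id] = (start, count)
            return False           # the web server responds with HTTP 429
        self._windows[client_id] = (start, count + 1)
        return True


limiter = FixedWindowRateLimiter(limit=15, window_seconds=1.0)
first_decision = limiter.allow("client-1")
```

Since the APIs sit behind a cluster, the counters would have to live in a store shared by all web servers rather than in one process's memory.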

# Design Twitter Search

- Search over all user tweets.

## Capacity estimate

- Assume
    - 1.5B total users, 800M daily active users.
    - 400M tweets per day.
    - Each tweet is 300 bytes.
    - 500M searches per day.
    - Search query contains multiple words.
- Storage: (400M tweets/day) * (300 bytes/tweet) = 120GB/day = 1.38MB/s

## API

- search(api_dev_key, search_terms, max_results_to_return)

## High-level design

- Client <-> app server <-> DB server

# Design Web Crawler

## High-level design

- Pick a URL from unvisited URL list.
- Determine the IP address of its host.
- Download the document.
- Parse document contents to look for new URLs.
- Add new URLs to unvisited URL list.
- Process downloaded document.
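
The loop above can be sketched against a fake in-memory "web" so that it runs without network access. `FAKE_WEB`, the regex-based link extraction, and the `crawl` function are stand-ins invented for this example; the address lookup step is left as a comment.

```python
import re
from collections import deque

# Hypothetical in-memory "web": URL -> HTML body, so the sketch needs no network.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">b</a> <a href="http://c.example/">c</a>',
    "http://b.example/": '<a href="http://a.example/">a</a>',
    "http://c.example/": "no links here",
}
LINK_PATTERN = re.compile(r'href="([^"]+)"')


def crawl(seed_url, fetch=FAKE_WEB.get):
    unvisited = deque([seed_url])
    seen = {seed_url}
    processed = []
    while unvisited:
        url = unvisited.popleft()  # pick a URL from the unvisited list
        # A real crawler would look up the host's address here before connecting.
        document = fetch(url)      # download the document
        if document is None:
            continue
        for link in LINK_PATTERN.findall(document):  # parse for new URLs
            if link not in seen:
                seen.add(link)
                unvisited.append(link)  # add new URLs to the unvisited list
        processed.append(url)      # "process" the downloaded document
    return processed


crawl_order = crawl("http://a.example/")
```

The `seen` set keeps the crawler from revisiting pages that link back to each other, as the first two pages do here.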

# Design Facebook Newsfeed

## Functional requirement

- Feed is generated from posts by the people and pages that the user follows.
- Feed may contain images, videos, texts.
- Supports adding new posts as they arrive.

## Non-functional requirement

- Generates feeds in real-time, with latency less than 2s.

## Capacity estimate

- Assume
    - User has 300 friends.
    - User follows 200 pages.
    - 300M daily active users.
    - User fetches timeline 5 times a day.
- Traffic
    - 300M * 5 = 1.5B feed requests per day, or about 17,400 requests per second.
- Storage

## API

- getUserFeed(api_dev_key, user_id)

## DB

# Design Yelp

# Design Uber 

# Design Ticketmaster