# Designing Instagram
Let's design a photo-sharing service like IG, where users upload photos to share them with other users.

Instagram enables its users to upload and share their photos and videos with other users. Users can choose to share information publicly or privately. Anything shared publicly can be seen by any other user, whereas privately shared content can only be accessed by a specified set of people.

We plan to design a simpler version of Instagram, where a user can share photos and can also follow other users. 

## 1. Requirements and Goals of the System

#### Functional requirements
1. Users should be able to upload/download/view photos
2. Users can perform searches based on photo/video titles
3. Users can follow other users
4. The system should generate a News Feed consisting of the top photos from all the people the user follows

#### Non-functional requirements
1. The service needs to be highly available
2. The acceptable latency is 200ms for News Feed generation
3. The system should be highly reliable; any uploaded photo/video should never be lost.

## 2. Capacity Estimation and Constraints
The system would be read-heavy, so we'll focus on building a system that can retrieve photos quickly.

- Assume we have 500M total users, with 1M daily active users.
- 2M new photos every day, or about 23 new photos per second.
- Average photo file size ~= 200KB
- Total space required for 1 day of photos =>
    ```
    2M * 200KB => 400GB
    ```
- Total space for 10 years:
    ```
    400GB * 365 days * 10 years ~= 1425 TB => 1.4 Petabytes
    ```
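
The arithmetic above can be reproduced with a short script. This is only a sketch using the assumptions listed in this section; note that the ~1425 TB figure comes from dividing by 1024 when converting GB to TB, while dividing by 1000 would give 1460 TB.

```python
# Back-of-envelope storage estimate; every input is an assumption from the list above.
PHOTOS_PER_DAY = 2_000_000
AVG_PHOTO_KB = 200
SECONDS_PER_DAY = 24 * 60 * 60

photos_per_second = PHOTOS_PER_DAY / SECONDS_PER_DAY

# Decimal conversion (1 GB = 10^6 KB), which yields the 400GB/day figure.
daily_gb = PHOTOS_PER_DAY * AVG_PHOTO_KB / 1_000_000
ten_year_gb = daily_gb * 365 * 10

# Binary conversion (divide by 1024) for the TB and PB figures.
ten_year_tb = ten_year_gb / 1024
ten_year_pb = ten_year_tb / 1024

print(f"{photos_per_second:.0f} photos/sec")
print(f"{daily_gb:.0f} GB/day")
print(f"{ten_year_tb:.0f} TB over 10 years (~{ten_year_pb:.1f} PB)")
```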

## 3. High Level System Design
At a high level, we need to support two scenarios: uploading photos and viewing/searching photos.

We need object storage servers to store the photos and also some DB servers to store metadata about the photos.

![](images/instagram_high_level_design.png)
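
To make the two scenarios concrete, here is a minimal in-memory sketch. The class `PhotoService`, its method names, and the path layout are all hypothetical; plain dictionaries stand in for the object storage and the metadata DB.

```python
import itertools
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhotoService:
    """Toy model of the two stores: object storage for bytes, a DB for metadata."""
    object_store: dict = field(default_factory=dict)  # photo path -> image bytes
    metadata_db: dict = field(default_factory=dict)   # photo_id -> metadata row
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def upload(self, user_id: int, title: str, data: bytes) -> int:
        photo_id = next(self._ids)
        path = f"photos/{user_id}/{photo_id}.jpg"  # hypothetical path layout
        self.object_store[path] = data             # the file goes to object storage
        self.metadata_db[photo_id] = {"user_id": user_id, "title": title, "path": path}
        return photo_id

    def view(self, photo_id: int) -> bytes:
        path = self.metadata_db[photo_id]["path"]  # look up the metadata first
        return self.object_store[path]             # then fetch the bytes

    def search(self, text: str) -> List[int]:
        # Naive title scan; a real system would use a search index instead.
        needle = text.lower()
        return [pid for pid, m in self.metadata_db.items() if needle in m["title"].lower()]


service = PhotoService()
pid = service.upload(user_id=7, title="Sunset at the beach", data=b"jpeg bytes")
print(service.view(pid) == b"jpeg bytes", service.search("sunset") == [pid])
```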

## 4. Database Schema
> Defining the DB schema helps us understand the data flow among various components and later guides data partitioning.

We need to store data about users, the photos they upload, and the people they follow.


> | Photo |
> | --- |
> | PhotoID: int (PK) |
> | UserID: int |
> | PhotoPath: varchar(256) |
> | PhotoLatitude: int |
> | PhotoLongitude: int |
> | UserLatitude: int |
> | UserLongitude: int |
> | CreationDate: datetime |


> | User |
> | --- |
> | UserID: int (PK) |
> | Name: varchar(20) |
> | Email: varchar(32) |
> | DOB: datetime |
> | CreatedAt: datetime |
> | LastLogin: datetime |

> | UserFollow | |
> | --- | --- |
> | PK | UserID1: int |
> | PK | UserID2: int |
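
As a rough illustration of how these tables relate, the sketch below models `Photo` and `UserFollow` rows as Python dataclasses and performs the join in memory. The helper names `followees` and `photos_by_users` are hypothetical, the `photo_path` field is taken from the size estimate in section 5, and treating `UserID1` as the follower and `UserID2` as the followed user is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass
class Photo:
    photo_id: int            # PhotoID (PK)
    user_id: int             # UserID of the uploader
    photo_path: str          # PhotoPath, as counted in the size estimate of section 5
    photo_latitude: int
    photo_longitude: int
    user_latitude: int
    user_longitude: int
    creation_date: datetime


@dataclass(frozen=True)
class UserFollow:
    user_id1: int            # assumed here to be the follower
    user_id2: int            # assumed here to be the user being followed


def followees(follows: Iterable[UserFollow], user_id: int) -> List[int]:
    """Return the IDs of the users that `user_id` follows, in ascending order."""
    return sorted(f.user_id2 for f in follows if f.user_id1 == user_id)


def photos_by_users(photos: Iterable[Photo], user_ids: Iterable[int]) -> List[Photo]:
    """Join-like lookup: photos uploaded by the given users, newest first."""
    wanted = set(user_ids)
    return sorted((p for p in photos if p.user_id in wanted),
                  key=lambda p: p.creation_date, reverse=True)


follows = {UserFollow(1, 2), UserFollow(1, 3), UserFollow(2, 3)}
photos = [
    Photo(10, 2, "photos/2/10.jpg", 0, 0, 0, 0, datetime(2024, 1, 1)),
    Photo(11, 3, "photos/3/11.jpg", 0, 0, 0, 0, datetime(2024, 1, 2)),
    Photo(12, 1, "photos/1/12.jpg", 0, 0, 0, 0, datetime(2024, 1, 3)),
]
recent = photos_by_users(photos, followees(follows, 1))
print([p.photo_id for p in recent])
```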


We could use an RDBMS like MySQL since we require joins. But relational databases come with their own challenges, especially when we need to scale them. So we'll store the schema in a distributed wide-column NoSQL datastore like [Cassandra](https://en.wikipedia.org/wiki/Apache_Cassandra).
All the photo metadata can go to a table where the 'key' is the `PhotoID` and the 'value' is an object containing the photo's details.
Cassandra, and key-value stores in general, maintain a certain number of replicas to offer reliability. Also, in such data stores deletes are not applied instantly; data is typically retained for a few days (to support undeleting) before being removed permanently.
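
The deferred-delete behaviour can be illustrated with a toy, single-node key-value store. This is not how Cassandra is implemented; the class `SoftDeleteStore`, its methods, and the three-day `GRACE_SECONDS` window are assumptions made for the example, and replication is left out to keep it short.

```python
from typing import Any, Dict, Optional, Tuple

GRACE_SECONDS = 3 * 24 * 60 * 60  # assumed retention window ("a few days")


class SoftDeleteStore:
    """Toy key-value store keyed by PhotoID where deletes are only marked at first."""

    def __init__(self) -> None:
        # key -> (value, deleted_at); deleted_at is None while the row is live
        self._rows: Dict[int, Tuple[Any, Optional[float]]] = {}

    def put(self, key: int, value: Any) -> None:
        self._rows[key] = (value, None)

    def get(self, key: int) -> Optional[Any]:
        row = self._rows.get(key)
        if row is None or row[1] is not None:
            return None                      # missing, or marked as deleted
        return row[0]

    def delete(self, key: int, now: float) -> None:
        if key in self._rows:
            self._rows[key] = (self._rows[key][0], now)   # mark, do not remove

    def undelete(self, key: int) -> bool:
        row = self._rows.get(key)
        if row is None or row[1] is None:
            return False                     # already purged, or never deleted
        self._rows[key] = (row[0], None)
        return True

    def purge(self, now: float) -> None:
        """Permanently remove rows whose grace period has expired."""
        expired = [k for k, (_, d) in self._rows.items()
                   if d is not None and now - d >= GRACE_SECONDS]
        for k in expired:
            del self._rows[k]


store = SoftDeleteStore()
store.put(42, {"user_id": 7, "path": "photos/7/42.jpg"})
store.delete(42, now=0)
print(store.get(42), store.undelete(42), store.get(42))
```

Timestamps are passed in explicitly so the behaviour is deterministic; a real store would read the clock itself.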

We can store the actual photos in a distributed file storage system like [Hadoop](https://en.wikipedia.org/wiki/Apache_Hadoop) (HDFS) or [S3](https://en.wikipedia.org/wiki/Amazon_S3).





## 5. Data Size Estimation

Let's estimate how much storage we'll need for the next 10 years.

### User
Assuming each int and datetime takes 4 bytes, each row in the User table will have:
```
UserID(4) + Name(20 bytes) + Email(32 bytes) + DOB(4 bytes) + 
CreatedAt(4) + LastLogin(4) = 68 bytes
```
We have 500 million users:
```
500 million * 68 ~= 32 GB
```

### Photo
Each row in the Photo table will have:
```
PhotoID (4 bytes) + UserID (4 bytes) + PhotoPath (256 bytes) + PhotoLatitude (4 bytes) +
PhotoLongitude (4 bytes) + UserLatitude (4 bytes) + UserLongitude (4 bytes) + CreationDate (4 bytes) = 284 bytes
```
We get 2M photos every day, so for one day we need:
```
2 M * 284 bytes ~= 0.5 GB per day

For 10 years we'll need:
0.5GB per day * 365 days * 10 years => 1.88 TB
```

### UserFollow
Each row will have 8 bytes. Assuming that, on average, each user follows 500 users, we would need 1.82 TB of storage for the UserFollow table:
```
8 bytes * 500 followed users * 500M users ~= 1.82 TB
```
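
The per-table estimates in this section can be checked with a few lines of Python. This is a sketch under the section's own assumptions (4-byte ints and datetimes, 500M users, 2M photos per day, 500 follows per user). It converts with binary units (1 TB = 1024^4 bytes), which is what reproduces the figures above to within rounding.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

USERS = 500_000_000
PHOTOS_PER_DAY = 2_000_000
DAYS = 365 * 10
FOLLOWS_PER_USER = 500

# Row sizes in bytes, field by field, as listed above.
user_row = 4 + 20 + 32 + 4 + 4 + 4            # UserID, Name, Email, DOB, CreatedAt, LastLogin
photo_row = 4 + 4 + 256 + 4 + 4 + 4 + 4 + 4   # PhotoID, UserID, PhotoPath, 4 coordinates, date
follow_row = 4 + 4                            # UserID1, UserID2

user_bytes = USERS * user_row
photo_bytes = PHOTOS_PER_DAY * photo_row * DAYS
follow_bytes = USERS * FOLLOWS_PER_USER * follow_row
total_bytes = user_bytes + photo_bytes + follow_bytes

print(f"User:       {user_bytes / GIB:.1f} GB")
print(f"Photo:      {photo_bytes / TIB:.2f} TB")
print(f"UserFollow: {follow_bytes / TIB:.2f} TB")
print(f"Total:      {total_bytes / TIB:.1f} TB")
```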
Total space required for the DB tables will be:
```
32 GB + 1.88 TB + 1.82 TB ~= 3.7 TB
```

## 6. Component Design
Photo uploads (or writes) can be slow as they have to go to the disk, while reads will be faster, especially if they are being served from cache.

Uploading is a slow process, so uploading users can consume all available connections, which means reads can't be served while the system is busy with write requests. Keep in mind that every web server has a connection limit. If we assume that a web server can hold a maximum of 500 connections at any time, then it can't serve more than 500 concurrent reads and uploads combined. To handle this bottleneck, we can split reads and writes into separate services: dedicated servers for reads and different servers for uploads, so that uploads don't hog the system.

> Also, separating reads from writes will allow us to scale and optimize each operation independently.

![](images/ig_read_writes.png)
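
The effect of the split can be shown with a small simulation. The classes `ServerPool` and `Router` are hypothetical and model a server only by its connection limit, using the 500-connection figure assumed above.

```python
class ServerPool:
    """A group of web servers modeled only by its total connection limit."""

    def __init__(self, name: str, max_connections: int) -> None:
        self.name = name
        self.max_connections = max_connections
        self.active = 0

    def try_acquire(self) -> bool:
        if self.active >= self.max_connections:
            return False              # saturated: the request must wait or be rejected
        self.active += 1
        return True

    def release(self) -> None:
        self.active = max(0, self.active - 1)


class Router:
    """Sends each request kind ("read" or "upload") to its assigned pool."""

    def __init__(self, read_pool: ServerPool, upload_pool: ServerPool) -> None:
        self.pools = {"read": read_pool, "upload": upload_pool}

    def route(self, kind: str) -> bool:
        return self.pools[kind].try_acquire()


# Shared pool: 500 slow uploads occupy every connection, so a read is refused.
shared = ServerPool("shared", 500)
shared_router = Router(shared, shared)
for _ in range(500):
    shared_router.route("upload")
read_ok_shared = shared_router.route("read")

# Separate pools: the same 500 uploads leave the read servers untouched.
split_router = Router(ServerPool("reads", 500), ServerPool("uploads", 500))
for _ in range(500):
    split_router.route("upload")
read_ok_split = split_router.route("read")

print(read_ok_shared, read_ok_split)
```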

## 7. Reliability and Redundancy
Losing files is not an option for our service. 

We'll store multiple copies of each file so that if one storage server dies, we can retrieve another copy from a different storage server.

This principle also applies to the rest of the system. If we want high availability, we need to run multiple replicas of each service, so that if a few instances go down, the system remains available and running.

> Redundancy removes single points of failure in the system: a redundant copy takes over when the primary fails.

If two instances of the same service are running in production and one fails, the system can fail over to the healthy copy. Failover can happen automatically or be done manually.

![](images/ig_redundancy.png)
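
A minimal sketch of automatic failover for file reads, assuming every file is copied to each live storage server. `StorageServer` and `ReplicatedStorage` are hypothetical names, and a crash is simulated by flipping an `alive` flag.

```python
from typing import Dict, List


class StorageServer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.files: Dict[str, bytes] = {}
        self.alive = True

    def read(self, key: str) -> bytes:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.files[key]


class ReplicatedStorage:
    """Writes a copy to every live server; reads fail over to the next copy."""

    def __init__(self, servers: List[StorageServer]) -> None:
        self.servers = servers

    def write(self, key: str, data: bytes) -> None:
        for server in self.servers:
            if server.alive:
                server.files[key] = data

    def read(self, key: str) -> bytes:
        for server in self.servers:
            try:
                return server.read(key)
            except (ConnectionError, KeyError):
                continue              # this copy is unavailable; try the next one
        raise LookupError(f"no healthy replica holds {key!r}")


servers = [StorageServer("storage-1"), StorageServer("storage-2")]
storage = ReplicatedStorage(servers)
storage.write("photos/7/1.jpg", b"jpeg bytes")

servers[0].alive = False                   # simulate a crashed server
data = storage.read("photos/7/1.jpg")      # answered by the remaining copy
print(data == b"jpeg bytes")
```

With only one copy, the same crash would make the file unreadable, which is the single point of failure that redundancy removes.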