# Facebook Newsfeed

## Functional
- Feed is generated from posts by the people, pages, and groups that the user follows.
- A user can have many friends and follow many pages and groups.
- Feed items can contain images, videos, and text.
- New posts are added to the feed as they arrive.

## Non-functional
- Generates feeds in real-time, with latency less than 2s.
- A new post takes less than 5s to reach its followers' feeds.

## Capacity

### Assume
- Each user has 300 friends.
- Each user follows 200 pages.
- 300M daily active users.
- Each user fetches their feed 5 times a day.
- Each user's feed has 500 posts in memory.
- Each post is 1KB.

### Traffic
- 300M users x 5 fetches = 1.5B feed requests per day, or about 17,400 requests per second.

### Storage
- 500 posts x 1KB = 500KB of in-memory feed for each user.
- 300M users x 500KB = 150TB for all users.
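
The traffic and storage figures follow directly from the assumptions above. A minimal sketch of the arithmetic, assuming decimal units (1KB = 1,000 bytes) and load spread evenly over the day; the variable names are illustrative only:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumptions copied from the "Assume" list.
daily_active_users = 300_000_000
fetches_per_user_per_day = 5
posts_per_feed = 500
post_size_bytes = 1_000  # 1KB, decimal units (assumption)

# Traffic: total feed requests, averaged evenly over a day.
requests_per_day = daily_active_users * fetches_per_user_per_day
requests_per_second = requests_per_day / SECONDS_PER_DAY

# Storage: in-memory feed per user, then for all daily active users.
feed_bytes_per_user = posts_per_feed * post_size_bytes
total_feed_bytes = feed_bytes_per_user * daily_active_users

print(f"{requests_per_day:,} requests/day")
print(f"{requests_per_second:,.0f} requests/second")
print(f"{feed_bytes_per_user / 1e3:.0f} KB per user")
print(f"{total_feed_bytes / 1e12:.0f} TB for all users")
```

This is an average rate; peak traffic would be higher, and the notes do not give a peak factor.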

## API
- getUserFeed(api_dev_key, user_id, count)
    - Returns JSON containing a list of feed items.
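
A minimal sketch of what this call could look like on the server side. The in-memory `FEEDS` store, the `VALID_KEYS` set, and the JSON field names are all hypothetical stand-ins; the notes only specify the three parameters and that JSON is returned.

```python
import json

# Hypothetical stand-in for the precomputed feed store (newest first).
FEEDS = {
    42: [
        {"feed_item_id": 3, "content": "newest post"},
        {"feed_item_id": 2, "content": "older post"},
        {"feed_item_id": 1, "content": "oldest post"},
    ],
}

# Hypothetical set of registered developer keys.
VALID_KEYS = {"demo-key"}


def get_user_feed(api_dev_key: str, user_id: int, count: int) -> str:
    """Return up to `count` feed items for `user_id` as a JSON string."""
    if api_dev_key not in VALID_KEYS:
        raise PermissionError("unknown api_dev_key")
    if count < 0:
        raise ValueError("count must be non-negative")
    items = FEEDS.get(user_id, [])[:count]
    return json.dumps({"user_id": user_id, "items": items})


response = json.loads(get_user_feed("demo-key", 42, 2))
print([item["feed_item_id"] for item in response["items"]])
```

The `api_dev_key` check is where per-developer throttling could be attached, although the notes do not describe it.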

## DB
- Relational database.
- Assume only users can create FeedItem.
- FeedItem can optionally have EntityID pointing to page or group.

### User
- UserID (int, pk)
- Name (varchar)
- Email (varchar)
- CreationDate (datetime)
- LastLogin (datetime)

### Entity
- EntityID (int, pk)
- Name (varchar)
- Type (tinyint)
- Description (varchar)
- CreationDate (datetime)

### UserFollow
- UserID (int)
- EntityOrFriendID (int)

### FeedItem
- FeedItemID (int, pk)
- UserID (int)
- EntityID (int)
- Content (varchar)
- CreationDate (datetime)
- Numlikes (int)

### FeedMedia
- FeedItemID (int)
- MediaID (int)

### Media
- MediaID (int, pk)
- Type (smallint)
- Description (varchar)
- CreationDate (datetime)
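
To illustrate how these tables fit together, here is a sketch that loads two of them into an in-memory SQLite database and selects feed candidates for one user. It rests on assumptions the notes do not state: user IDs and entity IDs never collide (otherwise `EntityOrFriendID` would need a type column), and a post made inside a page or group is shown only to followers of that page or group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE UserFollow (UserID INTEGER, EntityOrFriendID INTEGER);
CREATE TABLE FeedItem (
    FeedItemID   INTEGER PRIMARY KEY,
    UserID       INTEGER,   -- author
    EntityID     INTEGER,   -- NULL unless posted to a page or group
    Content      TEXT,
    CreationDate TEXT,
    Numlikes     INTEGER
);
""")

# User 1 follows friend 2 and page 1001 (sample data, made up).
conn.executemany("INSERT INTO UserFollow VALUES (?, ?)", [(1, 2), (1, 1001)])
conn.executemany("INSERT INTO FeedItem VALUES (?, ?, ?, ?, ?, ?)", [
    (10, 2, None, "post by friend 2", "2024-01-01 10:00:00", 5),
    (11, 3, 1001, "post in page 1001", "2024-01-01 11:00:00", 2),
    (12, 3, None, "post by stranger 3", "2024-01-01 12:00:00", 9),
])

# Match page/group posts on EntityID, and plain posts on the author.
CANDIDATES_SQL = """
SELECT DISTINCT f.FeedItemID, f.Content, f.CreationDate
FROM FeedItem AS f
JOIN UserFollow AS uf
  ON uf.EntityOrFriendID = f.EntityID
  OR (f.EntityID IS NULL AND uf.EntityOrFriendID = f.UserID)
WHERE uf.UserID = ?
ORDER BY f.CreationDate DESC
LIMIT ?
"""

rows = conn.execute(CANDIDATES_SQL, (1, 20)).fetchall()
for feed_item_id, content, created in rows:
    print(feed_item_id, content, created)
```

Item 12 is excluded because user 1 follows neither its author nor an entity it belongs to.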

<img src="img/facebook-newsfeed1.png" style="width:800px;height:600px;">

## Design

### Feed generation
- Retrieve the IDs of all users and entities that the user follows.
- Retrieve the latest, most popular, most relevant posts for those IDs.
- Rank these posts.
- Store this feed into cache.
- Return top 20 posts to the user.
- Repeat these steps every 5 minutes to check for new posts.
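
The steps above can be sketched as a single function. The scoring formula (likes decayed by age) is a hypothetical placeholder, since the notes do not define how posts are ranked; `Post`, `score`, and `generate_feed` are illustrative names, and plain dicts stand in for the DB and cache.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author_id: int
    created_at: float  # Unix time in seconds
    num_likes: int


def score(post: Post, now: float) -> float:
    """Hypothetical ranking: popularity divided by age in hours."""
    age_hours = max(now - post.created_at, 0.0) / 3600.0
    return (post.num_likes + 1) / (age_hours + 1.0)


def generate_feed(user_id, follows, posts, cache, now, page_size=20):
    # 1. IDs of everything the user follows.
    followed = follows.get(user_id, set())
    # 2. Candidate posts from those IDs.
    candidates = [p for p in posts if p.author_id in followed]
    # 3. Rank the candidates, best first.
    ranked = sorted(candidates, key=lambda p: score(p, now), reverse=True)
    # 4. Store the full ranked feed in the cache.
    cache[user_id] = ranked
    # 5. Return only the first page.
    return ranked[:page_size]


now = 1_700_000_000.0
follows = {1: {2, 3}}
posts = [
    Post(100, 2, now - 3600, 9),   # score 10 / 2 = 5.0
    Post(101, 3, now - 7200, 2),   # score 3 / 3 = 1.0
    Post(102, 4, now, 50),         # author not followed, skipped
    Post(103, 3, now, 3),          # score 4 / 1 = 4.0
]
cache = {}
top = generate_feed(1, follows, posts, cache, now)
print([p.post_id for p in top])
```

A scheduler would call `generate_feed` again every 5 minutes to pick up new posts; that loop is omitted here.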

### Web server
- Maintain connection with users.

### Application server
- Store new posts in DB servers.
- Retrieve and push feeds to users. 

### Metadata DB and cache
- Store metadata about users, pages, groups.

### Post DB and cache
- Store metadata about posts.

### Video and photo storage, and media cache
- Blob storage to store all media.

### Newsfeed generation service
- Generate feeds offline: dedicated servers continuously generate users' feeds and store them in memory.

### Feed notification service
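- Pull: users pull their feed at regular intervals.
- Push: a post is pushed to all followers whenever it is published.
- Celebrity users (those with very many followers) should use the pull model, since pushing every post to all of their followers is too expensive.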
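
A minimal sketch of this hybrid model, assuming a follower-count cutoff decides who counts as a celebrity. The `FanoutService` class and `CELEBRITY_THRESHOLD` are hypothetical (the threshold is tiny so the example triggers it); the per-user limit of 500 posts comes from the capacity assumptions.

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff, far larger in practice
FEED_SIZE = 500          # posts kept in memory per user


class FanoutService:
    def __init__(self, followers):
        self.followers = followers  # author_id -> set of follower IDs
        self.following = defaultdict(set)  # follower_id -> author IDs
        for author, fans in followers.items():
            for fan in fans:
                self.following[fan].add(author)
        self.inbox = defaultdict(lambda: deque(maxlen=FEED_SIZE))
        self.celebrity_posts = defaultdict(list)

    def is_celebrity(self, author_id):
        return len(self.followers.get(author_id, ())) >= CELEBRITY_THRESHOLD

    def publish(self, author_id, post_id):
        if self.is_celebrity(author_id):
            # Pull model: store once, followers fetch at read time.
            self.celebrity_posts[author_id].append(post_id)
        else:
            # Push model: write into every follower's in-memory feed.
            for fan in self.followers.get(author_id, ()):
                self.inbox[fan].appendleft(post_id)

    def read_feed(self, user_id):
        pushed = list(self.inbox[user_id])
        pulled = [
            post_id
            for author in sorted(self.following[user_id])
            if self.is_celebrity(author)
            for post_id in self.celebrity_posts[author]
        ]
        return pushed + pulled


service = FanoutService({10: {1, 2}, 99: {1, 2, 3}})
service.publish(10, 500)  # regular user: pushed to followers 1 and 2
service.publish(99, 501)  # celebrity: stored for pull
print(service.read_feed(1))
print(service.read_feed(3))
```

Here `read_feed` simply concatenates pushed and pulled posts; a real service would merge and rank them before returning a page.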