- Given a URL, the system returns a shortened one.
- When a short link is accessed, the user is redirected to the original URL.
- Short links expire after 5 years.
- Users can customise and change the short link alias; by default it is a random, unique 6-character string.
- Users must be logged in to customise an alias.
- Each user has a history of their short links.
- A single long URL may have multiple aliases (custom or system-generated).
- Short links must be highly available.
- URL redirection should have minimal latency.
- Assumptions
- The system is read-heavy: redirections far outnumber new short links.
- Assume a 100:1 ratio between reads and writes.
- Traffic estimates
- Assume 1B reads and 10M newly shortened URLs per month.
- URL shortenings per second
- 10M / (30 * 24 * 3600) ≈ 4 URLs/s
- URL redirections per second
- 1B / (30 * 24 * 3600) ≈ 386/s, rounded to 390/s
- Storage estimates
- Assume every shortening request is stored for 5 years and each object takes 500 bytes.
- Total objects: 10M * 12 * 5 = 600M
- Total storage: 600M * 500 bytes = 300 GB
- Bandwidth estimates
- Write: 4 URLs/s * 500 bytes/URL = 2 KB/s
- Read: 390/s * 500 bytes/URL = 195 KB/s
- Cache memory estimates
- Using the 80-20 rule (20% of URLs generate 80% of traffic), cache the 20% hottest URLs.
- Requests per day: 390 * 3600 * 24 ≈ 33.7 million/day
- Cache 20%: 0.2 * 33.7 million * 500 bytes ≈ 3.37 GB
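The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name `estimate` and the dictionary keys are made up for illustration, and it reuses the rounded rates (4/s and 390/s) where the notes do.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, as assumed in the notes


def estimate(reads_per_month=1_000_000_000,
             writes_per_month=10_000_000,
             object_bytes=500,
             retention_years=5):
    """Recompute the traffic, storage, bandwidth and cache estimates."""
    writes_per_sec = writes_per_month / SECONDS_PER_MONTH  # about 3.86
    reads_per_sec = reads_per_month / SECONDS_PER_MONTH    # about 385.8

    total_objects = writes_per_month * 12 * retention_years
    storage_gb = total_objects * object_bytes / 1e9

    # The notes round the rates to 4/s and 390/s before going further.
    write_kb_per_sec = 4 * object_bytes / 1e3
    read_kb_per_sec = 390 * object_bytes / 1e3

    requests_per_day = 390 * 3600 * 24
    cache_gb = 0.2 * requests_per_day * object_bytes / 1e9

    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": reads_per_sec,
        "total_objects": total_objects,
        "storage_gb": storage_gb,
        "write_kb_per_sec": write_kb_per_sec,
        "read_kb_per_sec": read_kb_per_sec,
        "requests_per_day": requests_per_day,
        "cache_gb": cache_gb,
    }


if __name__ == "__main__":
    for name, value in estimate().items():
        print(f"{name}: {value:,.2f}")
```

Changing any of the keyword arguments (for example a different object size) shows how sensitive the storage and cache figures are to the initial assumptions.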
- POST /shortener
- Parameters
- original_url - string
- custom_alias - optional, string
- Return
- id
- short_link - string
- expiry_date - string
- GET /shortener
- Parameters
- short_link - string
- Return
- short_link - string
- original_url - string
- DELETE /shortener
- Parameters
- short_link - string
- Return
- HTTP 204 No Content on success
- POST /user
- Parameters
- email - string
- password - string
- username - string
- Return
- id - string
- email - string
- POST /auth
- Parameters
- email - string
- password - string
- Return
- authentication_token - string
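To make the `/shortener` contract more concrete, here is a minimal in-memory sketch of the three operations, with no HTTP layer. Everything in it is an assumption made for illustration: the class name `ShortenerService`, returning the bare alias as `short_link`, using `user_id=None` to stand for an anonymous caller, and approximating 5 years as 5 * 365 days.

```python
import secrets
import string
from datetime import datetime, timedelta, timezone

ALPHABET = string.ascii_letters + string.digits
LIFETIME = timedelta(days=5 * 365)  # "5 years", ignoring leap days


class ShortenerService:
    """In-memory stand-in for POST / GET / DELETE /shortener."""

    def __init__(self):
        self._links = {}   # alias -> stored record
        self._next_id = 1

    def _random_alias(self, length=6):
        # Retry until the random alias is not already taken.
        while True:
            alias = "".join(secrets.choice(ALPHABET) for _ in range(length))
            if alias not in self._links:
                return alias

    def create(self, original_url, custom_alias=None, user_id=None):
        """POST /shortener: returns id, short_link and expiry_date."""
        if custom_alias is None:
            alias = self._random_alias()
        elif user_id is None:
            raise PermissionError("custom aliases require a logged-in user")
        elif custom_alias in self._links:
            raise ValueError("alias already taken")
        else:
            alias = custom_alias
        expiry = datetime.now(timezone.utc) + LIFETIME
        record = {"id": self._next_id, "original_url": original_url,
                  "user_id": user_id, "expiry": expiry}
        self._links[alias] = record
        self._next_id += 1
        return {"id": record["id"], "short_link": alias,
                "expiry_date": expiry.isoformat()}

    def get(self, short_link):
        """GET /shortener: returns short_link and original_url."""
        record = self._links.get(short_link)
        if record is None or record["expiry"] <= datetime.now(timezone.utc):
            raise KeyError(short_link)
        return {"short_link": short_link,
                "original_url": record["original_url"]}

    def delete(self, short_link):
        """DELETE /shortener: returns the 204 status code on success."""
        del self._links[short_link]  # raises KeyError for unknown links
        return 204
```

Because each alias is stored as its own record, the same `original_url` can be shortened several times and end up with several aliases, which matches the one-long-URL-to-many-aliases requirement.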
- To speed up short URL creation, keys should be generated ahead of time.
- A separate service with its own database (the Key Generation Service) handles this.
- The Key Generation Service keeps an in-memory database so it can hand keys to the main service quickly.
- Once keys are moved into the in-memory database, they can be marked as used or deleted.
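The flow described above (pre-generate keys, move a batch into memory, mark them as used, hand them out) could look roughly like the sketch below. It is a single-process illustration only: the class and method names are hypothetical, plain Python sets stand in for the Key Generation Service database, and a `deque` stands in for the in-memory database.

```python
import secrets
import string
from collections import deque

ALPHABET = string.ascii_letters + string.digits


class KeyGenerationService:
    """Toy Key Generation Service: pre-generates keys and serves them from memory."""

    def __init__(self, key_length=6):
        self.key_length = key_length
        self._unused = set()   # stand-in for unused keys in the KGS database
        self._used = set()     # keys already moved to memory, never issued again
        self._pool = deque()   # stand-in for the in-memory database

    def pregenerate(self, count):
        """Generate `count` new unique keys ahead of time."""
        while count > 0:
            key = "".join(secrets.choice(ALPHABET)
                          for _ in range(self.key_length))
            if key in self._unused or key in self._used:
                continue  # collision with an existing key, try again
            self._unused.add(key)
            count -= 1

    def load_into_memory(self, batch_size):
        """Move a batch of keys to memory and mark them as used right away."""
        for _ in range(min(batch_size, len(self._unused))):
            key = self._unused.pop()
            self._used.add(key)
            self._pool.append(key)

    def next_key(self):
        """Hand one key to the main service, refilling the pool if needed."""
        if not self._pool:
            self.load_into_memory(100)
        if not self._pool:
            raise RuntimeError("no pre-generated keys left")
        return self._pool.popleft()
```

Marking keys as used at the moment they are loaded means a crash can waste the keys still sitting in memory, but it can never hand out the same key twice, which is the safer of the two failure modes here.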
- Observation
- Need to store hundreds of millions of records
- Read-heavy workload
- One-to-many relationship between an original URL and its aliases
- Schema for Main Service
- short_url
- id - PK
- alias - varchar(8) unique
- user_id - FK
- original_url - varchar(2048)
- created_date - datetime
- user
- id - PK
- email - varchar(64) unique
- created_date - datetime
- password - varchar(50)
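As a rough illustration, the main-service schema can be written as DDL and exercised with Python's built-in `sqlite3` module. SQLite is used here only because it needs no setup; the `NOT NULL` constraints, the default timestamps, the nullable `user_id` (for anonymous links) and the helper names `open_db` and `aliases_for` are assumptions that are not part of the schema above. SQLite also does not enforce `VARCHAR` lengths.

```python
import sqlite3

SCHEMA = """
CREATE TABLE "user" (
    id           INTEGER PRIMARY KEY,
    email        VARCHAR(64) UNIQUE NOT NULL,
    password     VARCHAR(50) NOT NULL,
    created_date DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE short_url (
    id           INTEGER PRIMARY KEY,
    alias        VARCHAR(8) UNIQUE NOT NULL,
    user_id      INTEGER REFERENCES "user"(id),  -- NULL for anonymous links
    original_url VARCHAR(2048) NOT NULL,
    created_date DATETIME DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db():
    """Create an in-memory database with both tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def aliases_for(conn, original_url):
    """Return every alias that points at the given original URL."""
    rows = conn.execute(
        "SELECT alias FROM short_url WHERE original_url = ? ORDER BY alias",
        (original_url,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_db()
    conn.execute('INSERT INTO "user" (email, password) VALUES (?, ?)',
                 ("a@example.com", "not-a-real-hash"))
    conn.executemany(
        "INSERT INTO short_url (alias, user_id, original_url) VALUES (?, ?, ?)",
        [("abc123", None, "https://example.com/long"),
         ("promo", 1, "https://example.com/long")],
    )
    print(aliases_for(conn, "https://example.com/long"))
```

The `UNIQUE` constraint sits on `alias` rather than on `original_url`, which is what allows one long URL to carry several aliases while each alias still resolves to exactly one URL.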
- Schema for Key Generation Service
- key_alias
- id - PK Auto increment
- key - varchar(8)
- is_used - boolean
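One possible way to use the `key_alias` table is to claim a batch of unused keys and flip `is_used` on them before handing them out. The sketch below again uses `sqlite3` for convenience; the helper names `open_kgs_db` and `claim_keys`, the `UNIQUE` constraint on `key`, and storing `is_used` as 0/1 are assumptions. With several concurrent writers, a real deployment would also need row locking or an equivalent guard, which this single-connection example does not show.

```python
import sqlite3


def open_kgs_db(keys):
    """Create an in-memory key_alias table seeded with pre-generated keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE key_alias ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        ' "key" VARCHAR(8) UNIQUE NOT NULL,'
        " is_used INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany('INSERT INTO key_alias ("key") VALUES (?)',
                     [(k,) for k in keys])
    conn.commit()
    return conn


def claim_keys(conn, batch_size):
    """Mark up to `batch_size` unused keys as used and return them."""
    with conn:  # commits the is_used updates on success, rolls back on error
        rows = conn.execute(
            'SELECT id, "key" FROM key_alias'
            " WHERE is_used = 0 ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        conn.executemany("UPDATE key_alias SET is_used = 1 WHERE id = ?",
                         [(row[0],) for row in rows])
    return [row[1] for row in rows]


if __name__ == "__main__":
    conn = open_kgs_db(["aaaaaa", "bbbbbb", "cccccc"])
    print(claim_keys(conn, 2))
    print(claim_keys(conn, 2))
```

Each call returns only keys that were still unused, so a second batch never overlaps with the first, and an exhausted table simply yields an empty list.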