# API rate limiter

## Functional
- Limits the number of requests within a time window.
- Throttled users receive a clear error message (for example, HTTP 429 Too Many Requests).

## Non-functional
- High availability.
- Low latency.

## Design
- Web server asks rate limiter whether the request should be served or throttled.

<img src="img/api-rate-limiter1.png" style="width:500px;height:500px;">

1. Fixed window
- Use a hash table where the key is UserId and the value is {count, start time as epoch time}.
- Start time is always normalized to the start of the minute, so the count resets at the end of every minute.
- Compute the difference between the current time and the start time. If it is less than 1 minute, check the count to decide whether to allow or reject the request; otherwise, start a new window with the count reset.
- Drawback: bursts at window boundaries. Assume the limit per minute is 3. A user can send 3 requests in the last second of one minute and another 3 requests in the first second of the next minute. This results in 6 requests within two seconds, twice the intended limit.
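
The fixed-window steps above can be sketched as follows. This is a minimal in-memory illustration, not a production design: the names `FixedWindowLimiter`, `Window`, and `allow`, and the injectable `now` parameter, are assumptions introduced for the example.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional

WINDOW_SECONDS = 60  # one-minute window, as in the text


@dataclass
class Window:
    count: int
    start: int  # epoch seconds, aligned to the start of a minute


class FixedWindowLimiter:
    """Hypothetical in-memory fixed-window limiter keyed by user id."""

    def __init__(self, limit: int):
        self.limit = limit
        self.windows: Dict[str, Window] = {}

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Normalize the current time to the start of its minute.
        start = int(now) - int(now) % WINDOW_SECONDS
        window = self.windows.get(user_id)
        if window is None or window.start != start:
            # First request, or the previous window has ended: reset the count.
            window = Window(count=0, start=start)
            self.windows[user_id] = window
        if window.count >= self.limit:
            return False
        window.count += 1
        return True
```

With `limit=3`, three calls at `now=119` and three more at `now=120` are all allowed, which reproduces the boundary burst described above.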

2. Sliding window
- Use a hash table where the key is UserId and the value is a sorted set of epoch timestamps.
- Remove all timestamps older than current time - 1 minute.
- Check the size of the sorted set to decide whether to allow or reject the request.
- If the request is accepted, add the current time to the sorted set.
- Storing one timestamp per request requires much more memory than the previous approach.
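
A rough sketch of the sliding-window steps above. It swaps the sorted set for a `collections.deque`, which behaves the same way as long as timestamps arrive in non-decreasing order (an assumption of this example). The class and method names are hypothetical, and a timestamp exactly one minute old is treated as expired.

```python
import time
from collections import deque
from typing import Deque, Dict, Optional

WINDOW_SECONDS = 60  # one-minute window, as in the text


class SlidingWindowLogLimiter:
    """Hypothetical in-memory sliding-window limiter keyed by user id."""

    def __init__(self, limit: int):
        self.limit = limit
        self.logs: Dict[str, Deque[float]] = {}

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        log = self.logs.setdefault(user_id, deque())
        cutoff = now - WINDOW_SECONDS
        # Drop timestamps that are one minute old or older.
        while log and log[0] <= cutoff:
            log.popleft()
        # The remaining entries are the requests inside the window.
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True
```

Unlike the fixed window, three requests at `now=119` cause a request at `now=120` to be rejected, because the window moves with the current time.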

3. Sliding window with counter
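- For each user, use a hash table where the key is a bucket timestamp (the start of each minute) and the value is the request count for that bucket.
- Sum the counters of the buckets that fall inside the window to decide whether to allow or reject the request.
- Stores one counter per bucket instead of one timestamp per request, so it needs less memory than the sliding window, at the cost of bucket-level precision.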
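
One way to read the counter-based approach is sketched below. The source does not state the window length, so this example assumes the limit applies to a rolling hour made up of one-minute buckets. `WINDOW_SECONDS`, the class name, and the `now` parameter are all assumptions for illustration.

```python
import time
from typing import Dict, Optional

BUCKET_SECONDS = 60    # one counter per minute, as in the text
WINDOW_SECONDS = 3600  # assumed: the limit applies to a rolling hour


class SlidingWindowCounterLimiter:
    """Hypothetical limiter that sums per-minute counters per user."""

    def __init__(self, limit: int):
        self.limit = limit
        # user id -> {bucket start (epoch seconds) -> request count}
        self.buckets: Dict[str, Dict[int, int]] = {}

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        bucket = int(now) - int(now) % BUCKET_SECONDS
        # Oldest bucket start that still belongs to the window.
        oldest = bucket - WINDOW_SECONDS + BUCKET_SECONDS
        counters = self.buckets.setdefault(user_id, {})
        # Discard buckets that have slid out of the window.
        for start in [s for s in counters if s < oldest]:
            del counters[start]
        if sum(counters.values()) >= self.limit:
            return False
        counters[bucket] = counters.get(bucket, 0) + 1
        return True
```

Memory per user is bounded by the number of buckets (60 here) regardless of the request rate, but the window only advances in whole-minute steps.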

Rate limit by IP or user?
- IP only
    - What if multiple users share a single public IP?
    - An attacker can easily obtain many IPv6 addresses and appear as many different clients.
- User only
    - Works only on APIs that require a valid authentication token.
    - Cannot rate limit the login API itself, because the user is not authenticated yet.
- We should rate limit both per IP and per user.
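
A small sketch of combining the two checks, under one possible reading of the recommendation: a request passes only if its IP is under the IP limit and, when authenticated, its user is under the user limit. The limits, the `allow` function, and the plain counters are hypothetical, and window resets are left out to keep the example short.

```python
from collections import Counter
from typing import Optional

IP_LIMIT = 100   # hypothetical per-IP limit per window
USER_LIMIT = 20  # hypothetical per-user limit per window

ip_counts: Counter = Counter()
user_counts: Counter = Counter()


def allow(ip: str, user_id: Optional[str]) -> bool:
    """Allow only if the IP, and the user when authenticated, are under limit."""
    if ip_counts[ip] >= IP_LIMIT:
        return False
    if user_id is not None and user_counts[user_id] >= USER_LIMIT:
        return False
    # Count the request against both keys only once it is accepted.
    ip_counts[ip] += 1
    if user_id is not None:
        user_counts[user_id] += 1
    return True
```

Unauthenticated requests such as login pass `user_id=None` and are limited by IP alone, while users sharing one public IP are each still held to their own user limit.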
   
## Data partitioning
- Shard based on UserId.
- Use consistent hashing so that adding or removing a node remaps only a small share of users, which helps with fault tolerance and replication.
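
To illustrate the sharding idea, here is a minimal consistent-hash ring. It is a sketch under assumptions: the `HashRing` class, the shard names, the use of SHA-256, and 100 virtual nodes per shard are choices made for this example, not details from the design above.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def _hash(key: str) -> int:
    # Map a string to a position on the ring (first 8 bytes of SHA-256).
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 100):
        self.vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node owns several points so load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, user_id: str) -> str:
        # First point clockwise from the key's position, wrapping around.
        idx = bisect.bisect_right(self._ring, (_hash(user_id), ""))
        return self._ring[idx % len(self._ring)][1]
```

For example, `HashRing(["shard-a", "shard-b", "shard-c"]).node_for("user-42")` returns the shard that owns that user. After `remove("shard-b")`, only users that were on `shard-b` move to another shard; everyone else keeps the same owner.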

## Caching
- Cache recently active users.
- Use a write-back cache: update counters in the cache first and persist them to the backing store later.
- Use an LRU eviction policy.
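
The caching bullets could look roughly like the sketch below: counters are updated in an LRU cache and written to the backing store only on eviction or on an explicit flush. The `WriteBackLRUCache` class is hypothetical, a plain `dict` stands in for the persistent store, and the capacity is assumed to be at least 1.

```python
from collections import OrderedDict
from typing import Dict, Set


class WriteBackLRUCache:
    """Hypothetical write-back LRU cache of per-user counters."""

    def __init__(self, capacity: int, store: Dict[str, int]):
        self.capacity = capacity
        self.store = store  # stand-in for the persistent store
        self.entries: "OrderedDict[str, int]" = OrderedDict()
        self.dirty: Set[str] = set()  # keys changed since last write-back

    def get(self, user_id: str) -> int:
        if user_id not in self.entries:
            self._insert(user_id, self.store.get(user_id, 0))
        self.entries.move_to_end(user_id)  # mark as most recently used
        return self.entries[user_id]

    def increment(self, user_id: str) -> int:
        value = self.get(user_id) + 1
        self.entries[user_id] = value
        self.dirty.add(user_id)  # persisted later, not on every write
        return value

    def _insert(self, user_id: str, value: int) -> None:
        self.entries[user_id] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry, persisting it if dirty.
            victim, victim_value = self.entries.popitem(last=False)
            if victim in self.dirty:
                self.store[victim] = victim_value
                self.dirty.discard(victim)

    def flush(self) -> None:
        # Persist every dirty entry, e.g. on a timer or at shutdown.
        for user_id in self.dirty:
            self.store[user_id] = self.entries[user_id]
        self.dirty.clear()
```

The trade-off of write-back is that counters not yet flushed are lost if the cache node fails, so a rate limiter using it may briefly undercount after a crash.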