
Add backend API rate limiting #175

Closed
andreaskoepf opened this issue Dec 30, 2022 · 13 comments
@andreaskoepf
Collaborator

Add a throttling system to the backend REST endpoints that tracks and limits the interaction frequency of users/api-clients. This will help us prevent or slow down users/automated systems from flooding our database. Preferably we would like to limit on multiple keys, e.g. user and api_client (each with different thresholds).

  1. Review existing libraries such as laurentS/slowapi & fastapi-limiter and select one (please justify your choice).
  2. Make the necessary changes to the existing backend endpoints
@GraemeHarris
Collaborator

@andreaskoepf Happy to pick this up if free - will add some local load testing and report in the PR

@kiritowu
Collaborator

kiritowu commented Dec 30, 2022

The following are my two cents after reading through the source code and documentation for slowapi and fastapi-limiter:

Storage Options

  • slowapi: Redis (Cluster), Memcached, MongoDB, In-built Memory
  • fastapi-limiter: Redis

Since in-built memory and MongoDB are eliminated due to scalability issues and relatively high latency (compared to memory-based alternatives), the two remaining storage options are Redis vs. Memcached, of which Redis seems to be the better alternative due to its replication capabilities and Lua scripting.

Code Usage

  • slowapi: Python Decorator
  • fastapi-limiter: FastAPI Dependencies

One limitation of slowapi is that the request argument must be explicitly stated and type-hinted as fastapi.Request in the endpoint's parameters, like this:

@limiter.limit("5/minute")
async def myendpoint(request: Request):
    pass

After reviewing the existing backend code: request bodies are currently type-hinted as subclasses of pydantic.BaseModel to benefit from fastapi's automatic body conversion. However, this runs into a problem with slowapi's syntax compatibility, since slowapi checks whether the request argument is a subclass of starlette.Request in these few lines.

Solving this issue would require changes to the parameters of all our current routes, like this:

@router.post("/", response_model=protocol_schema.AnyTask)  # work with Union once more types are added
def request_task(
    *,
    db: Session = Depends(deps.get_db),
    api_key: APIKey = Depends(deps.get_api_key),
    body: protocol_schema.TaskRequest,
    request: Request
) -> Any:
    pass

In contrast, fastapi-limiter's usage is relatively straightforward and more compatible with the current code base:

import redis.asyncio as redis
import uvicorn
from fastapi import Depends, FastAPI

from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter

app = FastAPI()


@app.on_event("startup")
async def startup():
    # Use a name other than `redis`: assigning to a local `redis` would shadow
    # the imported module and raise UnboundLocalError on the from_url call.
    redis_connection = redis.from_url("redis://localhost", encoding="utf-8", decode_responses=True)
    await FastAPILimiter.init(redis_connection)


@app.get("/", dependencies=[Depends(RateLimiter(times=2, seconds=5))])
async def index():
    return {"msg": "Hello World"}


if __name__ == "__main__":
    uvicorn.run("main:app", reload=True)  # newer uvicorn versions no longer accept debug=True

Shared Limits for Multiple Endpoints

  • slowapi: Supported
  • fastapi-limiter: Not supported

Slowapi provides support for shared rate limits across multiple endpoints, whereas fastapi-limiter's way of defining redis keys makes it impossible to share limits across multiple endpoints.

However, for our current use-case, shared limits do not seem to be a must-have feature.

Multiple Key Support

  • slowapi: Supported
  • fastapi-limiter: Supported

Both allow limits on multiple keys with different thresholds for a single endpoint; one configures this via the key_func parameter for slowapi or the identifier parameter for fastapi-limiter.

Summary

Both slowapi and fastapi-limiter offer very similar functionality (and arguably performance, since both use Lua scripts with Redis). My personal preference is fastapi-limiter since the syntax is cleaner and easier to implement, provided Redis is used as the storage backend and shared limits across endpoints are not a required feature. Hope this is helpful 😄

@andreaskoepf
Collaborator Author

@GraemeHarris Thanks for offering help with testing, load test code would be appreciated!

@kiritowu Thanks for the detailed comparison. We will follow your recommendation and use fastapi-limiter. If you want to work on this, could you first please submit a separate PR which adds redis to our docker-compose.yaml.

@kiritowu
Collaborator

Sure, so there will be ~3 PRs for this issue:

  1. redis-in-docker-compose
  2. implement-fastapi-limiter
  3. limiter-load-testing
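For step 1, a minimal sketch of what the redis service in docker-compose.yaml could look like (image tag and healthcheck are assumptions, not the actual contents of the eventual PR):

```yaml
services:
  redis:
    image: redis:7-alpine
    restart: always
    ports:
      - "6379:6379"
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 2s
      timeout: 2s
      retries: 10
```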

@GraemeHarris have you started any code on this? If yes, then feel free to take this issue, as it is around midnight in my timezone and I will probably only have time to work on this ~10 hours later.

@GraemeHarris
Collaborator

@kiritowu thanks for the write-up :). I had only done some of the initial research - but I think I can get started on the redis-in-docker-compose step if that works for you? I'll update here if I get to the next steps. I am free for my evening now. (EU timezone)

@kiritowu
Collaborator

Sgtm!

@andreaskoepf
Collaborator Author

👍 Thanks for the peer coordination!

@GraemeHarris
Collaborator

@kiritowu #187 Basic updates for redis - the containers spin up alright on my local, so hopefully that helps set you up to run when you're back in

@kiritowu
Collaborator

kiritowu commented Jan 1, 2023

@andreaskoepf I have completed a working prototype for the rate limiter. There is one part I wish to clarify, which is the limits-on-multiple-keys portion:

AFAIK, the existing api_client represents any authenticated client that could be interacting with the service as either a user (via the website) or a bot (via discord bots).

When you mention "limit on multiple keys, e.g. user and api_client", does it mean that a database lookup will be performed to determine whether the given api key represents a signed-in user or just a bot?

@kiritowu
Collaborator

kiritowu commented Jan 2, 2023

@andreaskoepf Another part is the OasstErrorCode: will TOO_MANY_REQUESTS = 429 be a suitable error code? (since that is also the HTTP status code for too many requests)

from enum import IntEnum


class OasstErrorCode(IntEnum):
    """
    Error codes of the Open-Assistant backend API.

    Ranges:
         0-1000: general errors
      1000-2000: tasks endpoint
      2000-3000: prompt_repository
    """

    # 0-1000: general errors
    GENERIC_ERROR = 0
    DATABASE_URI_NOT_SET = 1
    API_CLIENT_NOT_AUTHORIZED = 2
    SERVER_ERROR = 3
    TOO_MANY_REQUESTS = 429

@yk
Collaborator

yk commented Jan 2, 2023

just my two cents, since andreas is blocked a bit:

  • 429 is perfect
  • in my opinion, the rate limiter can be per tuple (api_key, user), fully rely on redis, and does not have to go via the sql DB to cross-check for other frontends. If a user is both on the discord bot and the website, it will take us twice as long to rate-limit them, that still seems fine.

@kiritowu
Collaborator

kiritowu commented Jan 2, 2023

@andreaskoepf & @GraemeHarris, the backend rate-limiting PoC is now completed and up for PR.

A few To-Dos that I have identified:

  • Identify the list of endpoints that require API rate limiting and verify their compatibility with the existing rate limiter
  • Perform load testing to determine appropriate parameters for both user and api-client on each endpoint.

Let me know your thoughts. peace out ✌️

@andreaskoepf
Collaborator Author

I will close this as basic rate limiting is working efficiently, and the existing code acts as a nice example of how to add rate limits should that become necessary.
