Add backend API rate limiting #175
Comments
@andreaskoepf Happy to pick this up if it's free - will add some local load testing and report in the PR.
The following are my two cents after reading through the source code and documentation for slowapi and fastapi-limiter:

Storage Options

Since in-built memory and MongoDB are eliminated due to scalability issues and relative latency (compared to memory-based alternatives), the two storage options left are Redis vs Memcached, of which Redis seems to be the better alternative due to its replication capabilities and Lua scripting support.

Code Usage
One limitation of slowapi is that the decorated endpoint must accept a `request: Request` parameter:

```python
@limiter.limit("5/minute")
async def myendpoint(request: Request):
    pass
```

After reviewing the existing backend code, the request body is type-hinted with sub-classes of the protocol schema models. Solving this issue will require some changes to the signatures of all our current routes, like this:

```python
@router.post("/", response_model=protocol_schema.AnyTask)  # work with Union once more types are added
def request_task(
    *,
    db: Session = Depends(deps.get_db),
    api_key: APIKey = Depends(deps.get_api_key),
    body: protocol_schema.TaskRequest,
    request: Request,
) -> Any:
    pass
```

In contrast, fastapi-limiter's usage is relatively straightforward and more compatible with the current code base:

```python
import redis.asyncio as redis
import uvicorn
from fastapi import Depends, FastAPI
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter

app = FastAPI()


@app.on_event("startup")
async def startup():
    # use a distinct variable name so the imported redis module is not shadowed
    redis_connection = redis.from_url("redis://localhost", encoding="utf-8", decode_responses=True)
    await FastAPILimiter.init(redis_connection)


@app.get("/", dependencies=[Depends(RateLimiter(times=2, seconds=5))])
async def index():
    return {"msg": "Hello World"}


if __name__ == "__main__":
    uvicorn.run("main:app", reload=True)
```

Shared Limits for Multiple Endpoints
Slowapi provides support for shared rate-limits across multiple endpoints, whereas fastapi-limiter's way of defining Redis keys makes it impossible to have shared limits across multiple endpoints. However, in our current use-case, shared limits do not seem to be a must-have feature.

Multiple Key Support
Both allow limits on multiple keys with different thresholds for a single endpoint.

Summary

Both slowapi and fastapi-limiter offer very similar functionality (and arguably similar performance, since both use Lua scripts with Redis). My personal preference is fastapi-limiter, since the syntax is cleaner and easier to implement, provided Redis is used as the storage backend and shared limits across endpoints are not a required feature. Hope this is helpful 😄
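For reference, the fixed-window counting that such Redis Lua scripts implement can be sketched in pure Python. This is an illustrative model only, with hypothetical names, not either library's actual code:

```python
import time


class FixedWindowLimiter:
    """Illustrative fixed-window rate limiter: allow `times` requests per `seconds` window."""

    def __init__(self, times, seconds):
        self.times = times
        self.seconds = seconds
        self._windows = {}  # key -> (window_start, count)

    def hit(self, key, now=None):
        """Record a request for `key`; return True if allowed, False if rate-limited."""
        now = time.monotonic() if now is None else now
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.seconds:
            # window expired, start a fresh one
            start, count = now, 0
        count += 1
        self._windows[key] = (start, count)
        return count <= self.times


limiter = FixedWindowLimiter(times=2, seconds=5)
print([limiter.hit("client-1", now=t) for t in (0.0, 1.0, 2.0)])  # [True, True, False]
print(limiter.hit("client-1", now=6.0))  # True - new window
```

In Redis, the same logic collapses to an `INCR` plus an expiry on the per-key counter, which is why both libraries ship it as a single atomic Lua script.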
@GraemeHarris Thanks for offering to help with testing, load-test code would be appreciated! @kiritowu Thanks for the detailed comparison. We will follow your recommendation and use fastapi-limiter. If you want to work on this, could you first please submit a separate PR which adds Redis to our docker-compose.yaml?
Sure, so there will be around 3 PRs for this issue:
@GraemeHarris have you started any code on this? If yes, then feel free to take this issue, as it is around midnight in my timezone and I will probably only have time to work on this about 10 hours later.
@kiritowu thanks for the write-up :). I had only done some of the initial research - but I think I can get started on the redis-in-docker-compose step if that works for you? I'll update here if I get to the next steps. I am free for my evening now. (EU timezone)
Sgtm!
👍 Thanks for the peer coordination!
@andreaskoepf I have completed a working prototype for the rate-limiter. There is one part that I wish to clarify, which is the limits-on-multiple-keys portion: AFAIK, the existing api_client represents any authenticated party that could be interacting with the service, either as a user (via the website) or as a bot (via Discord bots). When you mention "limit on multiple keys, e.g. user and api_client", does it mean that a database lookup will be performed to determine whether the given API key represents a signed-in user or just a bot?
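One possible shape for this: fastapi-limiter accepts a custom `identifier` coroutine via `FastAPILimiter.init`, which produces the Redis key for each request. A hedged sketch of keying on a user when one is known and falling back to the api client - the `X-User-Id` and `X-API-Key` header names here are hypothetical placeholders, not our actual auth scheme:

```python
# Sketch of a custom identifier for FastAPILimiter.init(redis, identifier=...).
# The header names below are hypothetical; the real lookup would go through
# the authenticated api_client (and possibly a database check for the user).
async def user_or_client_identifier(request) -> str:
    user_id = request.headers.get("X-User-Id")
    if user_id:
        # requests from a signed-in user share one per-user bucket
        return f"user:{user_id}"
    api_key = request.headers.get("X-API-Key", "anonymous")
    # otherwise fall back to a per-client bucket
    return f"api_client:{api_key}"
```

Separate thresholds per key type would then come from attaching different `RateLimiter` dependencies whose identifiers resolve to the different buckets.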
@andreaskoepf Another part is the OasstErrorCode definition:

```python
class OasstErrorCode(IntEnum):
    """
    Error codes of the Open-Assistant backend API.

    Ranges:
      0-1000: general errors
      1000-2000: tasks endpoint
      2000-3000: prompt_repository
    """

    # 0-1000: general errors
    GENERIC_ERROR = 0
    DATABASE_URI_NOT_SET = 1
    API_CLIENT_NOT_AUTHORIZED = 2
    SERVER_ERROR = 3
    TOO_MANY_REQUESTS = 429
```
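As a quick sanity check on reusing 429 as the enum value: it lines up exactly with the standard HTTP status for rate limiting, so the same number can serve as both the internal error code and the response status. A stdlib-only sketch:

```python
from enum import IntEnum
from http import HTTPStatus


class OasstErrorCode(IntEnum):
    # excerpt: only the code relevant to rate limiting
    TOO_MANY_REQUESTS = 429


# IntEnum members compare equal to plain ints and to HTTPStatus members,
# so the enum value doubles as the HTTP status of the rate-limit response
assert OasstErrorCode.TOO_MANY_REQUESTS == HTTPStatus.TOO_MANY_REQUESTS == 429
print(int(OasstErrorCode.TOO_MANY_REQUESTS))  # 429
```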
just my two cents, since andreas is blocked a bit:
@andreaskoepf & @GraemeHarris, the backend rate-limiting PoC is now completed and is up for PR. A few To-Dos that I have identified:
Let me know what your thoughts are. peace out ✌️
I will close this as basic rate limiting is working efficiently and the existing code acts as a nice example of how to add rate limits in case that should be necessary.
Add a throttling system to the backend REST endpoints that tracks and limits the interaction frequency of users/api-clients. This will help us prevent or slow down users/automated systems from flooding our database. Preferably we would like to limit on multiple keys, e.g. user and api_client (each with different thresholds).
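Library-agnostically, the multiple-keys requirement amounts to a checker that holds a separate threshold per key type and admits a request only if every key passes. An illustrative sketch with made-up thresholds, not the implemented design:

```python
class MultiKeyThrottle:
    """Illustrative throttle with a separate (times, seconds) threshold per key type."""

    def __init__(self, thresholds):
        # thresholds: e.g. {"user": (times, seconds), "api_client": (times, seconds)}
        self.thresholds = thresholds
        self._counts = {}  # (key_type, key) -> (window_start, count)

    def _allow(self, key_type, key, now):
        times, seconds = self.thresholds[key_type]
        start, count = self._counts.get((key_type, key), (now, 0))
        if now - start >= seconds:
            start, count = now, 0
        count += 1
        self._counts[(key_type, key)] = (start, count)
        return count <= times

    def hit(self, user, api_client, now):
        # a request is allowed only if it passes both the per-user
        # and the per-api-client limit
        user_ok = self._allow("user", user, now)
        client_ok = self._allow("api_client", api_client, now)
        return user_ok and client_ok


# hypothetical thresholds: 2 requests/10s per user, 3 requests/10s per api client
throttle = MultiKeyThrottle({"user": (2, 10), "api_client": (3, 10)})
print([throttle.hit("alice", "discord-bot", now=t) for t in (0, 1, 2)])  # [True, True, False]
```

The tighter of the two limits wins: here the third request is rejected by the per-user threshold even though the api client is still under its own limit.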