Add limit on concurrent requests for doc-qa #64

Merged (2 commits) on Apr 17, 2020

@ghost ghost commented Apr 17, 2020

This PR adds a RequestLimiter class to limit concurrent requests for a given endpoint.

Question answering on very large documents can take several seconds per request. With the default FastAPI/Uvicorn/Gunicorn deployment, requests are processed concurrently on a GPU, which slows all of them down. To provide a consistent user experience, the API can respond with a server-busy error code when the number of in-progress requests exceeds the configured limit.
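
For readers without the diff at hand, here is a minimal sketch of how such a limiter could work. It is not the code merged in this PR. Only the `RequestLimiter` name and the `run()` context manager come from this PR; the semaphore-based internals, the `limit` parameter, and the `ServerBusyError` exception are assumptions made for illustration. In a FastAPI endpoint the rejection would more likely be raised as an `HTTPException` with a server-busy status such as 503.

```python
import threading
from contextlib import contextmanager


class ServerBusyError(Exception):
    """Hypothetical error that an endpoint could map to a server-busy response."""


class RequestLimiter:
    """Sketch: reject work once `limit` requests are already in progress."""

    def __init__(self, limit: int):
        self.limit = limit
        self._semaphore = threading.BoundedSemaphore(limit)

    @contextmanager
    def run(self):
        # Non-blocking acquire: fail fast instead of queueing behind slow requests.
        if not self._semaphore.acquire(blocking=False):
            raise ServerBusyError(f"{self.limit} requests already in progress")
        try:
            yield
        finally:
            # Always free the slot, even if the wrapped request raised.
            self._semaphore.release()


# Hypothetical module-level instance, mirroring the `doc_qa_limiter` name in the diff.
doc_qa_limiter = RequestLimiter(limit=2)
```

Rejecting immediately, rather than blocking until a slot frees up, keeps the latency of accepted requests predictable, which matches the motivation given above.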

@tanaysoni tanaysoni requested a review from tholor April 17, 2020 11:32

resp_time = round(time.time() - t1, 2)
logger.info({"time": resp_time, "request": request.json(), "results": results})
with doc_qa_limiter.run():

It would be nicer to make a decorator out of this, but if this requires bigger changes we can do it later in a separate PR.
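
One possible shape for the decorator suggested here, as a rough sketch rather than a proposal for the actual codebase: `limit_concurrency`, `ServerBusyError`, and the example `doc_qa` function are hypothetical names, and the sketch covers plain synchronous functions only (an `async def` endpoint would need an async wrapper).

```python
import functools
import threading


class ServerBusyError(Exception):
    """Hypothetical error that an endpoint could map to a server-busy response."""


def limit_concurrency(limit: int):
    """Decorator factory: allow at most `limit` concurrent calls of the wrapped function."""
    semaphore = threading.BoundedSemaphore(limit)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Reject immediately when every slot is taken.
            if not semaphore.acquire(blocking=False):
                raise ServerBusyError(f"{limit} requests already in progress")
            try:
                return func(*args, **kwargs)
            finally:
                # Free the slot whether the call succeeded or raised.
                semaphore.release()

        return wrapper

    return decorator


@limit_concurrency(limit=1)
def doc_qa(question: str) -> str:
    # Stand-in for the real question-answering call.
    return f"answer to {question!r}"
```

Each decorated function gets its own semaphore, so limits are per endpoint, in line with the PR description's "for a given endpoint".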

@tanaysoni tanaysoni merged commit b474d21 into master Apr 17, 2020
@tanaysoni tanaysoni deleted the limit-concurrent-request branch July 8, 2020 08:37
masci pushed a commit that referenced this pull request Nov 27, 2023
Remove Component's I/O type checks at run time