Improve response time latency #2603

@marziehoghbaie

Description

Hi there,
I have recently been using FastAPI to serve machine learning predictions. The model is loaded once at server startup, and I communicate with the database only twice, once before and once after the prediction, yet there seems to be a bottleneck in my code. I tried to check the robustness of the server by sending requests from multiple threads. With fewer than 10 concurrent requests everything is fine and the maximum response time is acceptable, but when I increase the number of threads to about 100, the response time grows dramatically and a client can wait up to 200 seconds for a response.

Does anybody have any idea what could cause this?

I use the async and await keywords both when accessing the database and when calling the prediction function. The sequential response time is less than 1 second.

How can I improve the performance of my server when it has to handle a large number of concurrent requests?
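
To make this concrete, here is a simplified sketch of what the endpoint looks like. The names (`MODEL`, `fetch_features`, `run_prediction`, `save_result`, the route path) are placeholders standing in for my real code, and the sleeps just simulate the database queries and the inference time:

```python
import asyncio
import time

from fastapi import FastAPI

app = FastAPI()

# Stand-in for the real model object; in my app it is loaded once at startup.
MODEL = {"weights": [1.0, 2.0, 3.0]}


async def fetch_features(item_id: int) -> list[float]:
    await asyncio.sleep(0.01)  # placeholder for the database query before prediction
    return [float(item_id), 1.0, 2.0]


async def run_prediction(features: list[float]) -> float:
    # Placeholder for the actual model call; the real inference is CPU-bound.
    time.sleep(0.05)
    return sum(w * x for w, x in zip(MODEL["weights"], features))


async def save_result(item_id: int, prediction: float) -> None:
    await asyncio.sleep(0.01)  # placeholder for the database query after prediction


@app.post("/predict/{item_id}")
async def predict(item_id: int) -> dict[str, float]:
    features = await fetch_features(item_id)     # first database access (awaited)
    prediction = await run_prediction(features)  # prediction function (awaited)
    await save_result(item_id, prediction)       # second database access (awaited)
    return {"prediction": prediction}
```

And roughly how I load-test it from the client side (again just a sketch; the URL and route match the placeholder endpoint above, not my real deployment):

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests


def call_server(item_id: int) -> float:
    """Send one request and return the observed response time in seconds."""
    start = time.perf_counter()
    requests.post(f"http://localhost:8000/predict/{item_id}", timeout=300)
    return time.perf_counter() - start


# With max_workers around 10 the max latency stays low; around 100 it explodes.
with ThreadPoolExecutor(max_workers=100) as pool:
    latencies = list(pool.map(call_server, range(100)))

print(f"max response time: {max(latencies):.1f} s")
```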
