-
How can I improve my hashlib performance using async/await if I need to iterate over 100,000 CSV files and run hashlib on each one? Would I expect much improved performance over the standard approach, especially if these CSV files are on remote shared network drives? Win 10/Linux environments.
Replies: 3 comments
-
You can use starlette's `run_in_threadpool` function to run any expensive functions that release the GIL. It's not clear exactly what you are doing, but if the problem is just that hashing so many large files is taking a long time, using `run_in_threadpool` might speed it up -- file IO will release the GIL, and I think the hashing functions might as well (not sure), so you should be able to achieve similar performance to what you'd get through multiprocess-based parallelism. But it will depend a lot on what you are doing. If what you are asking about isn't specific to FastAPI, this question may be better asked somewhere else like Stack Overflow.
-
Thanks @dmontagu! 🙇‍♂️ Yeah, what @dmontagu said. If it's taking a long time, you might want to do it in a background task, possibly in an external system like ARQ or Celery.
-
Assuming the original issue was solved, it will be automatically closed now. But feel free to add more comments or create new issues.