
Sharing model among worker process #2

Closed
mehmetilker opened this issue Aug 9, 2020 · 7 comments
Labels
question Further information is requested

Comments

@mehmetilker

It is a question rather than an issue.
I have gone over your code to see whether there is a way to share the model (nlp, prediction, etc.) among worker processes, so that the model is not loaded separately for every worker, and to make use of async definitions (which is a separate subject/problem), but I could not find one.
Is there anything you can advise or apply in this skeleton?

Thanks.

@eightBEC eightBEC added the question Further information is requested label Aug 11, 2020
@eightBEC
Owner

Hi @mehmetilker,
one approach used in this project is to utilize the app's state to share the model within the app.

When talking about distributed workers, you can load the model as a singleton in each worker. For large models, I recommend prefetching them from object storage (e.g. S3, COS) into memory (e.g. using Redis), so that each worker can load the model quickly at startup.

Does this help?
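The "singleton per worker" idea above can be sketched with a cached loader. This is a minimal illustration, not code from this repo; `build_model` is a hypothetical stand-in for the real, expensive load (e.g. `spacy.load(...)`):

```python
# Per-process model singleton: the model is built at most once per
# worker process, and every request handler in that process reuses it.
from functools import lru_cache

LOAD_COUNT = 0  # instrumentation to show the load happens only once


def build_model():
    # Hypothetical stand-in for the real load, e.g. spacy.load("en_core_web_lg")
    global LOAD_COUNT
    LOAD_COUNT += 1
    return {"weights": [1.5] * 3}


@lru_cache(maxsize=1)
def get_model():
    # First call builds the model; later calls return the cached instance.
    return build_model()


m1 = get_model()
m2 = get_model()
assert m1 is m2  # same object within this worker process
```

Note this only deduplicates loads *within* one process; with N gunicorn workers you still get N copies in memory, which is exactly the concern raised below.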

@mehmetilker
Author

Hi @eightBEC

Using the app's state loads an instance once for the whole application lifetime, but that only helps when the application runs on a single worker.
I am starting my application with the following configuration. With the current approach, the model state is loaded into memory separately for each worker.
With 5 worker processes that means 5 × 1.5 GB.

```
command=/home/xproj/.env/bin/gunicorn
    app.modelsApi.main:app
    -w 5
    -k uvicorn.workers.UvicornWorker
    --name gunicorn_models_api
    --bind 0.0.0.0:9200
```

If I understood you right, your suggested solution (loading from an object store into memory) does not change the situation, I think.

Here is another question in SO:
https://stackoverflow.com/questions/41988915/avoiding-loading-spacy-data-in-each-subprocess-when-multiprocessing

I haven't tried it, but I think this is related:
https://docs.python.org/3/library/multiprocessing.shared_memory.html
"This module provides a class, SharedMemory, for the allocation and management of shared memory to be accessed by one or more processes on a multicore or symmetric multiprocessor (SMP) machine."
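For what it's worth, `SharedMemory` shares raw bytes between processes, not Python objects. A rough sketch of the mechanism, using a hypothetical byte string in place of real serialized model weights:

```python
# Share a byte buffer between processes without copying the data.
# Requires Python 3.8+ for multiprocessing.shared_memory.
from multiprocessing import shared_memory

weights = bytes(range(10))  # hypothetical stand-in for serialized weights

# "Parent" side: allocate a named shared block and write the bytes once.
shm = shared_memory.SharedMemory(create=True, size=len(weights))
shm.buf[:len(weights)] = weights

# "Worker" side: attach to the same block by name; no copy of the data.
worker_view = shared_memory.SharedMemory(name=shm.name)
loaded = bytes(worker_view.buf[:len(weights)])
assert loaded == weights

worker_view.close()
shm.close()
shm.unlink()  # release the block once the last process is done with it
```

The catch is that a spaCy pipeline is a graph of Python objects, so each worker would still have to deserialize it into its own heap unless the library supports loading from a memory-mapped buffer; this helps most for flat arrays of weights.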

@viniciusdsmello

Hi @mehmetilker, have you found a solution for this? I'm facing the same here.

@mehmetilker
Author

@viniciusdsmello no unfortunately...

@H3zi

H3zi commented Feb 17, 2022

You can use gunicorn's preload option to load your models only once; the worker fork happens after the preload.
See this SO answer for more info.
(Haven't tested it with FastAPI.)
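Applied to the command line posted earlier in the thread, the suggestion would look roughly like this (a sketch, untested with FastAPI as noted):

```shell
# --preload imports the app (and any model loaded at import/startup time)
# once in the master process; the 5 workers are forked afterwards and
# share those memory pages via copy-on-write.
gunicorn app.modelsApi.main:app \
    -w 5 \
    -k uvicorn.workers.UvicornWorker \
    --preload \
    --name gunicorn_models_api \
    --bind 0.0.0.0:9200
```

One caveat: the copy-on-write savings can erode over time, because CPython's reference counting writes to otherwise read-only pages and forces them to be copied per worker.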

@codespearhead

@mehmetilker Did the above solution work? If so, can you close this issue?

@mehmetilker
Author

No. I also haven't worked on the project for a long time. Since a few years have passed, I assume there is a better way to solve the problem by now.
