This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
interplay with multiprocessing #841
Comments
Does it also exit if you do it in an `async def` endpoint? Also, my understanding was that you don't need to explicitly terminate the processes in a process pool: when the pool's context block is exited, the processes are terminated automatically. So I would think you could just safely drop the loop over the pool's worker processes.
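For reference, `multiprocessing.Pool` does support the context-manager protocol, and once the `with` block is exited the pool can no longer accept work. A minimal stdlib-only sketch (the task and sleep durations are illustrative):

```python
import time
from multiprocessing import Pool


def sleeper(n):
    time.sleep(n)
    return n


with Pool(processes=2) as pool:
    # submit two short tasks and collect their results inside the block
    results = [pool.apply_async(sleeper, (0.1,)) for _ in range(2)]
    values = [r.get() for r in results]

# after the with-block the pool has been shut down; submitting
# further work raises ValueError
```
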
Yeah, it also exits with an `async def` endpoint:

```
INFO:     Application startup complete.
main process: 6058
PID 6065 has sleep time: 1
PID 6064 has sleep time: 1
PID 6066 has sleep time: 10
terminating pid 6064
terminating pid 6065
terminating pid 6066
exiting at: 10
INFO:     127.0.0.1:49702 - "POST /test-endpoint HTTP/1.1" 200 OK
INFO:     Shutting down
INFO:     Finished server process [6058]
```

and with a `def` endpoint:

```
INFO:     Application startup complete.
main process: 5880
PID 5901 has sleep time: 1
PID 5902 has sleep time: 1
PID 5903 has sleep time: 10
terminating pid 5901
terminating pid 5902
terminating pid 5903
INFO:     Shutting down
INFO:     Finished server process [5880]
exiting at: 10
```

You're right that the process pool terminates its worker processes when exiting the context block, though it waits for them to finish. We want to return as soon as the time limit is exceeded, hence the code to terminate the workers manually.
I see, makes sense; I didn't realize it waits for things to complete, but that obviously makes sense. Hmm, it sounds like the issue can probably be traced back to starlette or, more likely, uvicorn. You might try running under the hypercorn server and see whether the same error happens; that would definitively identify uvicorn as the problem. But I wouldn't necessarily ditch uvicorn even in that case, since uvicorn is very fast. Have you tried calling `pool.terminate()` instead of the `terminate` method on each process? If the problem is the `with` block, maybe try dropping the `with` and manually cleaning up with `pool.terminate()` just to see if it works? (If it does, we can go back and worry about getting all the safety the `with` block provides.) Also, reading through the Python docs, it seems like the
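A quick stdlib-only sketch of the `pool.terminate()` suggestion (the task and durations here are illustrative): `terminate()` stops the workers without draining outstanding tasks, so a long-running job does not delay the return.

```python
import time
from multiprocessing import Pool


def slow_task(seconds):
    # stand-in for a long-running CPU-bound routine
    time.sleep(seconds)
    return "done"


start = time.time()
pool = Pool(processes=2)
pool.apply_async(slow_task, (30,))
time.sleep(0.2)  # give a worker time to pick the task up

# terminate() stops the worker processes immediately instead of
# waiting for the 30-second task; join() then reaps them
pool.terminate()
pool.join()
elapsed = time.time() - start
```
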
Thank you for the pointer. I can replicate this issue in starlette too, so it is not a FastAPI-specific issue. Would you recommend I close this issue and open one in starlette's repo? Hypercorn doesn't seem to run a starlette/FastAPI application out of the box (despite the ASGI promise?). I'll try to debug it and post. Good spot! I had left out the global timer since it added noise; probably useful to add it back.
Using `multiprocessing.Pool`:

```python
import os
import time
from multiprocessing import Pool

import uvicorn
from fastapi import FastAPI


def simple_routine(sleep_for):
    print(f"PID {os.getpid()} has sleep time: {sleep_for}")
    time.sleep(sleep_for)
    return "done"


app = FastAPI()


@app.post("/test-endpoint")
async def test_endpoint():
    print(f"main process: {os.getpid()}")
    START_TIME = time.time()
    STOP_TIME = START_TIME + 2

    pool = Pool(processes=3)
    futures = [
        pool.apply_async(simple_routine, [1]),
        pool.apply_async(simple_routine, [1]),
        pool.apply_async(simple_routine, [10]),
    ]

    results = []
    for fut in futures:
        remains = max(STOP_TIME - time.time(), 0)
        try:
            results.append(fut.get(timeout=remains))
        except Exception:
            results.append("not done")

    # terminate the entire pool
    pool.terminate()
    print("exiting at: ", int(time.time() - START_TIME))
    return "True"


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=5900)
```
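Stripped of the web server, the same timeout pattern runs cleanly as a plain script, which is consistent with the server integration, not the pool logic, being at fault. A sketch of the pattern in isolation (sleep durations as in the repro):

```python
import time
from multiprocessing import Pool
from multiprocessing import TimeoutError as MpTimeout


def simple_routine(sleep_for):
    time.sleep(sleep_for)
    return "done"


START_TIME = time.time()
STOP_TIME = START_TIME + 2

pool = Pool(processes=3)
futures = [pool.apply_async(simple_routine, [s]) for s in (1, 1, 10)]

results = []
for fut in futures:
    remains = max(STOP_TIME - time.time(), 0)
    try:
        # AsyncResult.get raises multiprocessing.TimeoutError on expiry
        results.append(fut.get(timeout=remains))
    except MpTimeout:
        results.append("not done")

pool.terminate()
pool.join()
elapsed = int(time.time() - START_TIME)
```
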
Okay, this version (using ProcessPoolExecutor) at least works for me (the server remains responsive and returns valid responses), though with some ugly exceptions getting raised inside the executor:

```python
import os
import time
from concurrent.futures.process import ProcessPoolExecutor

import uvicorn
from fastapi import FastAPI


def simple_routine(sleep_for):
    print(f"PID {os.getpid()} has sleep time: {sleep_for}")
    time.sleep(sleep_for)
    return "done"


app = FastAPI()


@app.post("/test-endpoint")
async def test_endpoint():
    print(f"main process: {os.getpid()}")
    START_TIME = time.time()
    STOP_TIME = START_TIME + 2

    pool = ProcessPoolExecutor(max_workers=3)
    futures = [
        # submit takes the argument directly, not wrapped in a list
        pool.submit(simple_routine, 1),
        pool.submit(simple_routine, 1),
        pool.submit(simple_routine, 10),
    ]

    results = []
    for fut in futures:
        remains = max(STOP_TIME - time.time(), 0)
        try:
            # concurrent.futures uses Future.result(), not .get()
            results.append(fut.result(timeout=remains))
        except Exception:
            results.append("not done")

    # shut down without waiting for the remaining workers
    pool.shutdown(wait=False)
    print("exiting at: ", int(time.time() - START_TIME))
    return "True"


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=5900)
```

Might be worth raising an issue in starlette/uvicorn if you can throw together a minimal example.
Just kidding: the processes were being created but not cleaned up. Ugh. Yeah, I'd probably open a starlette issue.
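One stdlib-only way to check for leaked workers like this: `multiprocessing.active_children()` lists the child processes of the current process that are still alive. A small sketch (task and worker count are illustrative):

```python
import time
from multiprocessing import Pool, active_children


def sleeper(n):
    time.sleep(n)


pool = Pool(processes=2)
pool.apply_async(sleeper, (0.1,))

# the pool's worker processes show up as children of this process
workers_alive = len(active_children())

pool.terminate()
pool.join()

# after terminate() + join() the workers have been reaped
workers_after = len(active_children())
```
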
Thank you @dmontagu for looking. I created an issue in the uvicorn repository; will close this one.
Describe the bug
I'm trying to run parallel tasks with a timeout per task (using multiprocessing) inside an API method. On trying to terminate the child processes post the time limit, the server process shuts down and disconnects.
To Reproduce
Save the example as repro.py, then run:
python repro.py
Expected behavior
The server shouldn't shut down and should continue serving requests. Interestingly, the server logs a shutdown signal when I try to terminate the subprocesses, but it doesn't actually exit until the long-running process is complete.
Environment
Additional context
I'm trying to move a working WSGI API written with Flask to FastAPI, really for the brevity and automatic documentation. The parallel tasks are CPU-intensive methods, and the termination works as expected with Flask; for context, the output from the Flask logger shows the expected behavior.
The issue likely has something to do with how FastAPI runs sync methods, but I can't quite figure it out.
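For what it's worth, one common way to keep an `async def` endpoint responsive during CPU-bound work is to offload it to a process pool via the event loop's `run_in_executor`. A minimal stdlib sketch with illustrative names (no FastAPI involved, so it only shows the offloading pattern, not a fix for this bug):

```python
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor


def cpu_task(n):
    # stand-in for a CPU-intensive routine
    time.sleep(n)
    return n * 2


async def handler():
    # awaiting run_in_executor keeps the event loop free while the
    # work runs in a separate process
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor(max_workers=2) as pool:
        return await loop.run_in_executor(pool, cpu_task, 1)


result = asyncio.run(handler())
```
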
Thank you for looking.