This issue was moved to a discussion.

FastAPI gets terminated when child multiprocessing process terminated #1487

Closed
jongwon-yi opened this issue May 27, 2020 · 23 comments
Labels
question Question or problem question-migrate

Comments

@jongwon-yi

Describe the bug

Create a multiprocessing Process and start it.
Right after the process is terminated, FastAPI itself (the parent) terminates as well.

To Reproduce

Start command: /usr/local/bin/uvicorn worker.stts_api:app --host 127.0.0.1 --port 8445

  1. Create a file with:
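import multiprocessing

from fastapi import FastAPI

app = FastAPI()

# TaskOptionBody, task, xxxx, task_id and result_OK are elided in this report;
# proc has to be saved somewhere both handlers can reach it.


@app.post('/task/run')
def task_run(task_config: TaskOptionBody):
    proc = multiprocessing.Process(
        target=task.run,
        args=(xxxx,))
    proc.start()
    return task_id

@app.get('/task/abort')
def task_abort(task_id: str):
    proc.terminate()
    return result_OK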
  2. Run task_run and, while the process is alive, trigger task_abort.
  3. After the child process is terminated, the parent (FastAPI) terminates as well.

Expected behavior

Parent process should not be terminated after child terminated.

Environment

  • OS: Linux
  • FastAPI Version 0.54.1
  • Python version 3.8.2

Additional context

I tried the same code with Flask under gunicorn, and the parent never terminated.

@jongwon-yi jongwon-yi added the bug Something isn't working label May 27, 2020
@zdutta

zdutta commented May 31, 2020

Hi @jongwon-yi,
Just wanted to ask what 'TaskOptionBody' is referring to in the definition of task_run

@jongwon-yi
Author

> Hi @jongwon-yi,
> Just wanted to ask what 'TaskOptionBody' is referring to in the definition of task_run

class TaskOptionBody(BaseModel):
    owner: str
    description: str
    subscribers: str
    devices: list
    options: dict
    protocol: int

This behavior isn't related to request headers or payloads. You can reproduce it simply:

  1. Run FastAPI with uvicorn.
  2. Start a multiprocessing Process with the default (fork) start method and save the proc somewhere.
  3. Terminate the child through the saved proc (proc.terminate()).
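
The "save the proc somewhere" step could look roughly like the sketch below. It is not the reporter's code: the registry `TASKS`, the helpers `run_task` and `abort_task`, and the stand-in job `long_task` are hypothetical names, and the sketch deliberately leaves out FastAPI and uvicorn. Run as a plain script, the parent survives the abort; the shutdown described in this issue only shows up when the parent is the uvicorn server.

```python
import multiprocessing
import signal
import time
import uuid
from multiprocessing.process import BaseProcess
from typing import Dict, Optional

# "fork" is the start method this report is about; it only exists on POSIX.
FORK = multiprocessing.get_context("fork")

# Hypothetical module-level registry: task id -> running child process.
TASKS: Dict[str, BaseProcess] = {}


def long_task(seconds: float) -> None:
    """Stand-in for the real job; it only needs to outlive the abort call."""
    time.sleep(seconds)


def run_task(seconds: float) -> str:
    """Start the job in a forked child and remember its handle under a new id."""
    task_id = uuid.uuid4().hex
    proc = FORK.Process(target=long_task, args=(seconds,))
    proc.start()
    TASKS[task_id] = proc
    return task_id


def abort_task(task_id: str) -> Optional[int]:
    """Terminate the saved child (SIGTERM on POSIX) and return its exit code."""
    proc = TASKS.pop(task_id)
    proc.terminate()
    proc.join()
    return proc.exitcode
```

A route handler for /task/run would call `run_task(...)` and return the id, and /task/abort would call `abort_task(task_id)`.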

@victorphoenix3
Contributor

@Kludex May I work on this?

@Kludex
Member

Kludex commented Jun 12, 2020

@victorphoenix3 I'm not in charge of anything hahaha If I were you, I'd wait for someone else to confirm the bug (you can confirm by yourself), then if it's really a problem, you can work on it. There are no PRs related to this issue. :)

@victorphoenix3
Contributor

@tiangolo @Kludex I did not find this issue on my system. Terminating the child process did not terminate the parent. Can you please re-confirm?
This is the code I used:

from fastapi import FastAPI
from pydantic import BaseModel
import multiprocessing
import os, signal
import psutil

app = FastAPI()

class TaskOptionBody(BaseModel): 
    owner: str 
    description: str 
    subscribers: str 
    devices: list 
    options: dict 
    protocol: int

def task(pid :int):
    print(f"{pid} {os.getpid()}")

@app.post('/task/run')
def task_run(task_config: TaskOptionBody, pid: int):
    proc = multiprocessing.Process(
        target=task,
        args=(pid,))
    proc.start()
    return os.getpid()

@app.get('/task/abort')
def task_abort(pid: int):
    proc = psutil.Process(pid)
    proc.terminate()
    return 0

@tiangolo
Member

Oh, that's amazing, thanks a lot for taking the time to debug and try to reproduce it @victorphoenix3 ! 👏 🙇 That helps a lot! 🍰

I tried it locally and wasn't able to reproduce it either. @jongwon-yi please check with @victorphoenix3's example.

@jongwon-yi
Author

@victorphoenix3
It seems your process needs to have some long-running code.
Just add "time.sleep(30)" and try to abort it within that time.

I tried your code and there was no issue. (Because the subprocess had already terminated?)
1 21742
INFO: 127.0.0.1:52778 - "POST /task/run HTTP/1.1" 200 OK
INFO: 127.0.0.1:52780 - "GET /task/abort?pid=21742 HTTP/1.1" 200 OK

But after I added a 30-second sleep, the issue appeared.
1 21982
INFO: 127.0.0.1:52802 - "POST /task/run HTTP/1.1" 200 OK
INFO: 127.0.0.1:52804 - "GET /task/abort?pid=21982 HTTP/1.1" 200 OK
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [21973]

@github-actions github-actions bot removed the answered label Jun 17, 2020
@Kludex
Member

Kludex commented Jun 18, 2020

With the sleep I was able to reproduce it as well. jfyk

@victorphoenix3
Contributor

@Kludex can you tell me what I am doing wrong? I still haven't been able to reproduce it.
Here's how I modified the child process to add the sleep; I aborted it before it printed the "child process complete" message.

def task(pid :int):
    print(f"{pid} {os.getpid()}")
    time.sleep(50)
    print("child process complete")

and it terminates without shutting down FastAPI:

INFO:     127.0.0.1:40570 - "POST /task/run?pid=0 HTTP/1.1" 200 OK
0 10409
INFO:     127.0.0.1:40578 - "GET /task/abort?pid=10409 HTTP/1.1" 200 OK

@Kludex
Member

Kludex commented Jun 19, 2020

You're doing the same thing as us, but it works just fine for you. I'll paste here my configs and python packages/version later.

@Mixser

Mixser commented Jul 12, 2020

Hi, I have looked into this situation and came to the following conclusions:

  1. Uvicorn registers signal handlers, and the child process inherits them (along with the ThreadPoolExecutor and other resources).
  2. You cannot set signal handlers from anywhere other than the main thread.
  3. The task function is executed in the ThreadPoolExecutor, so, as I said earlier, you cannot change signal handlers in this function.

The second and third conclusions are not true; the real problem was found and is described below.

But it is still possible to solve this problem (without changing FastAPI or uvicorn): you can change the multiprocessing start method to spawn, and your child process will start clean (without inherited signal handlers, thread pools, and other state).

@app.post('/task/run')
def task_run():
    multiprocessing.set_start_method('spawn')
    proc = multiprocessing.Process(
        target=task,
        args=(10,))
    proc.start()
    
    return proc.pid

It works for me (Python 3.7, macOS 10.15.5).

@johnthagen
Contributor

When I tried @Mixser's solution, the second time task_run() is called, a RuntimeError is thrown:

RuntimeError('context has already been set')

@Mixser

Mixser commented Jun 2, 2021

Hi, @johnthagen.

The multiprocessing.set_start_method method appears to be system-dependent, so it may behave differently on different OSes. Please move this call as early as you can in your code and call it only once.

@johnthagen
Contributor

johnthagen commented Jun 2, 2021

To avoid RuntimeError('context has already been set') when set_start_method is called multiple times within a route, I moved the call into FastAPI's startup event handler so it is called only once, as prescribed in the stdlib docs. This solved the issue for me.

import multiprocessing

...
app = FastAPI()


@app.on_event("startup")
def startup_event() -> None:
    multiprocessing.set_start_method("spawn")
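
An alternative that avoids touching the process-wide setting at all is `multiprocessing.get_context()`, which returns a context object bound to one start method and can be called repeatedly without raising. The sketch below is only an illustration under that assumption; `SPAWN`, `long_task` and `start_task` are hypothetical names, not part of anyone's code in this thread.

```python
import multiprocessing
import time

# get_context() can be called any number of times; unlike set_start_method()
# it does not change the default start method for the rest of the program.
SPAWN = multiprocessing.get_context("spawn")


def long_task(seconds: float) -> None:
    """Hypothetical stand-in for the real long-running job."""
    time.sleep(seconds)


def start_task(seconds: float):
    """Start long_task in a freshly spawned interpreter and return the Process.

    With "spawn" the child re-imports the module that defines the target, so
    long_task must live at module level in an importable file.
    """
    proc = SPAWN.Process(target=long_task, args=(seconds,))
    proc.start()
    return proc
```

A route handler would call `start_task(...)` and keep the returned handle so it can later call `terminate()` and `join()` on it.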

@Mixser

Mixser commented Jun 4, 2021

@johnthagen Your comment is not clear - please clarify, did it help?

@johnthagen
Contributor

@Mixser I edited my original message above.

@pdesjardins90

Ran into this too and although @johnthagen's fix worked, it made my child processes swallow their logs. I tried to find a solution that didn't involve OS signals at all and came up with this:

My child process is an infinite executor that can stop itself

from asyncio import gather, get_running_loop, run
from multiprocessing import JoinableQueue, Process

from my_code.some_infinite_executor import SomeInfiniteExecutor

STOP_FLAG = 'STOP'


def start_process(flag_queue: JoinableQueue) -> Process:
    some_process = Process(
        target=start_process_target,
        args=(flag_queue, )
    )

    some_process.start()
    # Return the handle so the caller can pass it to stop_process() later.
    return some_process

def start_process_target(flag_queue: JoinableQueue) -> None:
    run(execute(flag_queue))

async def execute(flag_queue: JoinableQueue) -> None:
    executor = SomeInfiniteExecutor()
    await gather(
        executor.execute(),
        wait_for_stop(executor, flag_queue)
    )

async def wait_for_stop(executor: SomeInfiniteExecutor, flag_queue: JoinableQueue) -> None:
    event_loop = get_running_loop()
    # Block in a worker thread until the parent puts the stop flag on the queue.
    await event_loop.run_in_executor(None, flag_queue.get)
    await executor.stop()
    flag_queue.task_done()

def stop_process(some_process: Process, flag_queue: JoinableQueue) -> None:
    flag_queue.put(STOP_FLAG)
    some_process.join()

The trick is to share a Queue between the parent process and its child. The child has to wait for a stop flag in the queue from the parent and stop itself gracefully (like by cancelling its tasks, etc.), the parent just has to join the child process after sending this flag.

API functions can simply call start_process and stop_process
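
For readers who do not need asyncio in the child, the same stop-flag idea can be sketched with a plain polling loop. This is a simplified variant, not the code above: it swaps the executor for a loop that checks the queue between units of work, and `worker`, `start_worker` and `stop_worker` are hypothetical names. It assumes the fork start method (POSIX only) to match the setup discussed in this thread.

```python
import multiprocessing
import queue
import time

STOP_FLAG = "STOP"

# Assumption: fork start method, as in the original report (POSIX only).
FORK = multiprocessing.get_context("fork")


def worker(flag_queue) -> None:
    """Do small units of work until the parent sends STOP_FLAG."""
    while True:
        try:
            flag = flag_queue.get(timeout=0.1)
        except queue.Empty:
            time.sleep(0.01)  # placeholder for one unit of real work
            continue
        if flag == STOP_FLAG:
            return  # leave normally, so the exit code is 0


def start_worker():
    """Create the shared queue, start the child, and return both handles."""
    flag_queue = FORK.Queue()
    proc = FORK.Process(target=worker, args=(flag_queue,))
    proc.start()
    return proc, flag_queue


def stop_worker(proc, flag_queue, timeout: float = 5.0):
    """Ask the child to stop, wait for it, and return its exit code."""
    flag_queue.put(STOP_FLAG)
    proc.join(timeout)
    return proc.exitcode
```

No signal is sent at any point, so the parent's signal handling never comes into play.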

@mgolchert-inwt

I tried to implement multiprocessing in fastAPI using the fork and the spawn method (as suggested in #1487 (comment)). While the usage of the fork method makes the API shut down after one call, the spawn method makes the API extremely slow. Does anybody else experience this issue and have a solution for that? Thanks a lot!

@Mixser

Mixser commented Jun 15, 2022

Hi, I've found a new approach to avoiding this behavior. But first, let's figure out what is going on here. We have a master process (uvicorn) that listens for and accepts requests on the port. During server initialization, uvicorn sets up signal listeners (via asyncio's loop.add_signal_handler calls) for graceful shutdown of the application. These use file-descriptor event notification instead of the default approach (see https://docs.python.org/3/library/signal.html#signal.set_wakeup_fd).
After initialization, we have an open socket (the one passed to signal.set_wakeup_fd), and our main process waits for data from this socket. Any data received is interpreted as a signal.

Next, we create a new process in our HTTP handler. This new process inherits the open socket (an open file descriptor) together with the signal handling wired to it.

When we send a signal to the child, it goes to this shared socket. But our parent process listens on this socket too, so it receives what looks like a signal to terminate and shuts down the application.

How to avoid it: we need to restore the default signal handler behavior in the child process and stop using the fd inherited from the parent. We can achieve this by calling signal.set_wakeup_fd(-1).
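
The wakeup-fd mechanism can be observed in isolation, without a server. The sketch below is only an illustration: `install_wakeup_socket`, `send_and_read` and `demo` are hypothetical helpers, SIGUSR1 is used so nothing is actually shut down (POSIX only), and everything must run in the main thread because `signal.set_wakeup_fd` and `signal.signal` require it.

```python
import os
import signal
import socket


def install_wakeup_socket():
    """Mimic the event loop: route incoming signal numbers into a socket."""
    reader, writer = socket.socketpair()
    writer.setblocking(False)  # set_wakeup_fd() requires a non-blocking fd
    reader.settimeout(2.0)     # let the demo wait briefly for the byte
    signal.set_wakeup_fd(writer.fileno())
    return reader, writer


def send_and_read(reader, signum=signal.SIGUSR1) -> bytes:
    """Send signum to this process and return the byte that lands on the socket."""
    # A Python-level handler must be installed; otherwise the default action
    # for SIGUSR1 (terminate the process) would run instead.
    previous = signal.signal(signum, lambda number, frame: None)
    try:
        os.kill(os.getpid(), signum)
        return reader.recv(1)
    finally:
        if previous is not None:
            signal.signal(signum, previous)


def demo() -> bytes:
    """Install the socket, deliver one signal, then detach and clean up."""
    reader, writer = install_wakeup_socket()
    try:
        return send_and_read(reader)
    finally:
        signal.set_wakeup_fd(-1)  # same call the child uses to detach
        reader.close()
        writer.close()
```

`demo()` returns a single byte whose value is the signal number. Whoever reads the other end of that socket sees the signal, which is why a forked child that still holds the parent's wakeup fd ends up notifying the parent.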

Here is a PoC:

from time import sleep
from fastapi import FastAPI
import os, signal
import psutil
import multiprocessing


app = FastAPI()

def task(pid: int):
    signal.set_wakeup_fd(-1)

    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    signal.signal(signal.SIGINT, signal.SIG_DFL)


    print(f"{pid} {os.getpid()}")

    while True:
        sleep(1)

@app.get('/task/run')
async def task_run():
    pid = os.getpid()
    proc = multiprocessing.Process(
        target=task,
        args=(pid, ))
    proc.start()
    return proc.pid

@app.get('/task/abort')
def task_abort(pid: int):
    proc = psutil.Process(pid)
    proc.terminate()
    return 0

@Mixser

Mixser commented Jun 16, 2022

@tiangolo I think you can close this, because it is not related to FastAPI or uvicorn; it is specific behaviour of signal handling in asyncio.

@muety

muety commented Oct 12, 2022

I added force=True to the set_start_method() method to avoid the context has already been set error. No idea what it does, but seems to work.
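
For context, `force=True` makes `set_start_method()` replace a start method that has already been fixed instead of raising RuntimeError. A more conservative sketch, assuming you only want to force when the current method actually differs, is shown below; `ensure_start_method` is a hypothetical helper name.

```python
import multiprocessing


def ensure_start_method(method: str = "spawn") -> str:
    """Make `method` the process-wide start method and return the active one."""
    # allow_none=True reports None if no method has been fixed yet, instead of
    # silently fixing the platform default.
    current = multiprocessing.get_start_method(allow_none=True)
    if current is None:
        multiprocessing.set_start_method(method)
    elif current != method:
        # Assumption: overriding is acceptable here. This changes the method
        # for every later Process created through the default context.
        multiprocessing.set_start_method(method, force=True)
    return multiprocessing.get_start_method()
```

Calling the helper more than once is harmless, so it can sit in a startup hook or at the top of a route.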

@tiangolo
Member

Thanks for the discussion everyone! I think @Mixser's tips would do it, right? If that solves the problem you can close the issue @jongwon-yi. ☕

@tiangolo tiangolo added question Question or problem and removed bug Something isn't working labels Oct 19, 2022
@tiangolo tiangolo changed the title [BUG] FastAPI gets terminated when child multiprocessing process terminated FastAPI gets terminated when child multiprocessing process terminated Oct 19, 2022
@github-actions
Contributor

Assuming the original need was handled, this will be automatically closed now. But feel free to add more comments or create new issues or PRs.

@tiangolo tiangolo reopened this Feb 28, 2023
@github-actions github-actions bot removed the answered label Feb 28, 2023
@fastapi fastapi locked and limited conversation to collaborators Feb 28, 2023
@tiangolo tiangolo converted this issue into discussion #7442 Feb 28, 2023
