Support for post_init lifecycle/lifespan hook. #10810
Unanswered
msteiner-google asked this question in Questions
Replies: 1 comment
For those interested, the workaround I use for achieving this can be described as follows.
In code:

```python
# Imports reconstructed for completeness; the original snippet elided them.
# create_app, types, _HOST, and _AIP_HTTP_PORT come from the author's own code.
import logging
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any

import tensorflow as tf
import uvicorn
from fastapi import APIRouter, FastAPI, Response, status
from injector import Injector

# Server creation.
...
server = create_app(...)

# Start loading the model asynchronously and only attach the router to the
# server once the load has completed.
e = ThreadPoolExecutor(max_workers=1)
f = e.submit(load_model, injector)
f.add_done_callback(attach_routes(injector, server))
uvicorn.run(server, host=_HOST.value, port=_AIP_HTTP_PORT.value)
e.shutdown()
```

and

```python
# Router and endpoint creation; attach them to the already-running server.
def get_health_route() -> Any:
    async def _healthz() -> Response:
        return Response(status_code=status.HTTP_200_OK)

    return _healthz


def get_predict_route(model: tf.keras.Model) -> Any:
    async def _predict() -> Response:
        return Response(status_code=status.HTTP_200_OK)

    return _predict


def get_router(
    healthz_path: str, predict_path: str, model: tf.keras.Model
) -> APIRouter:
    router = APIRouter(on_startup=[])
    router.add_api_route(
        path=healthz_path, endpoint=get_health_route(), methods={"GET"}
    )
    router.add_api_route(
        path=predict_path, endpoint=get_predict_route(model=model), methods={"POST"}
    )
    return router


def load_model(injector: Injector) -> tf.keras.Model:
    model_uri = injector.get(types.ModelStorageURI)
    ...
    logging.info("Loading model.")
    ...  # the elided code downloads the model and sets `dir`
    model = tf.keras.models.load_model(dir)
    logging.info(f"Loaded model: {model}")
    return model


def attach_routes(injector: Injector, server: FastAPI) -> Any:
    def _callback(future: Future[tf.keras.Model]) -> None:
        healthz = injector.get(types.HealthZRoute)
        predict = injector.get(types.PredictRoute)
        router = get_router(
            healthz_path=healthz, predict_path=predict, model=future.result()
        )
        server.include_router(router)

    return _callback
```

It works, but it feels like I am working against the framework to achieve what I want.
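For comparison, here is a minimal sketch (not from the original post) of a more framework-native variant: FastAPI's lifespan handler kicks off the model load in a background task, the server answers probes immediately, and the health endpoint returns 503 until the load finishes. `load_model` and the model URI are placeholders.

```python
import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI, Response, status


def load_model(uri: str):
    """Placeholder for a blocking load, e.g. tf.keras.models.load_model(uri)."""
    ...


@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.model = None
    loop = asyncio.get_running_loop()

    async def _load() -> None:
        # Run the blocking load in the default executor so the event loop
        # (and the probe endpoints) stays responsive.
        app.state.model = await loop.run_in_executor(None, load_model, "gs://...")

    task = asyncio.create_task(_load())
    yield  # the server starts serving requests here, before the load finishes
    task.cancel()


app = FastAPI(lifespan=lifespan)


@app.get("/healthz")
async def healthz() -> Response:
    # Not ready until the model is loaded; the prober keeps retrying until 200.
    if app.state.model is None:
        return Response(status_code=status.HTTP_503_SERVICE_UNAVAILABLE)
    return Response(status_code=status.HTTP_200_OK)
```

The readiness signal lives in `app.state` instead of being expressed by attaching routes late, so there is no executor bookkeeping around `uvicorn.run`.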
Hello all!
First of all, thank you for the work you folks are putting in.
Going back to the FR: as per the title, this discussion is about supporting some post_init hook. In this state the server is already running and serving requests, but some resources are still being initialized, and we should have a way to make this information available from within the endpoints' methods.

Why would that be helpful?
If we look at Google's Vertex AI support for custom containers (documentation here), we see that two different checks happen. The liveness one only checks that the server is running; if it fails 5 times in a row (~50 sec), the container is restarted. The health check, instead, can fail for as long as we want, and only once the models are loaded should it start returning 200s, so that the load balancer knows the instance is ready to receive predict requests.

So, summarizing: FastAPI can't be used to serve models whose load takes longer than 50 seconds, since the container will be restarted.
Therefore, I suggest enabling a state in which the server starts, so it can answer the liveness probes, while providing a mechanism to check from within the endpoint methods whether a resource is fully loaded. I am not sure this should be called post_init, but I don't have much imagination :). Happy to discuss this further.
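For concreteness, a minimal sketch of the desired behavior (illustrative names; the `model_ready` flag stands in for whatever signal such a hook would provide):

```python
from fastapi import FastAPI, Response, status

app = FastAPI()
model_ready = False  # flipped to True by whatever loads the model


@app.get("/liveness")
async def liveness() -> Response:
    # Always 200 once the server is up, so the container is not restarted
    # while the model is still loading.
    return Response(status_code=status.HTTP_200_OK)


@app.get("/healthz")
async def healthz() -> Response:
    # 503 until the model is loaded; the load balancer holds traffic until 200.
    if not model_ready:
        return Response(status_code=status.HTTP_503_SERVICE_UNAVAILABLE)
    return Response(status_code=status.HTTP_200_OK)
```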