
[Refactor] fork server processes earlier #1476

Merged: 10 commits into master on Mar 10, 2021
Conversation

bojiang (Member) commented Mar 1, 2021

Description

Requires the test utils from #1467 and #1347.

Fork before importing third-party dependencies, to avoid hacks like #925.

We should introduce a process-management library in the future. For now, this functionality is provided by Gunicorn.
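A minimal, hypothetical sketch of the fork-before-import pattern this PR moves toward (names are illustrative, and `json` stands in for a heavy third-party dependency):

```python
import multiprocessing


def run_worker(name: str) -> None:
    # The heavy import happens only inside the child process, after the
    # fork, so the parent process never loads the dependency at all.
    import json  # stand-in for a heavy third-party dependency
    print(json.dumps({"worker": name}))


if __name__ == "__main__":
    # Fork first, import later: each child does its own imports.
    proc = multiprocessing.Process(target=run_worker, args=("model-server",))
    proc.start()
    proc.join()
```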

Motivation and Context

How Has This Been Tested?

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature and improvements (non-breaking change which adds/improves functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Code Refactoring (internal change which is not user facing)
  • Documentation
  • Test, CI, or build

Component(s) if applicable

  • BentoService (service definition, dependency management, API input/output adapters)
  • Model Artifact (model serialization, multi-framework support)
  • Model Server (micro-batching, dockerisation, logging, OpenAPI, instruments)
  • YataiService gRPC server (model registry, cloud deployment automation)
  • YataiService web server (nodejs HTTP server and web UI)
  • Internal (BentoML's own configuration, logging, utility, exception handling)
  • BentoML CLI

Checklist:

  • My code follows the bentoml code style; both the ./dev/format.sh and
    ./dev/lint.sh scripts have passed
    (instructions).
  • My change reduces project test coverage and requires unit tests to be added
  • I have added unit tests covering my code change
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

@bojiang bojiang changed the title [Refactor] earlier fork server processes [Refactor] fork server processes earlier Mar 1, 2021
codecov bot commented Mar 1, 2021

Codecov Report

Merging #1476 (e214050) into master (b040add) will increase coverage by 0.97%.
The diff coverage is 88.54%.

@@            Coverage Diff             @@
##           master    #1476      +/-   ##
==========================================
+ Coverage   68.65%   69.62%   +0.97%     
==========================================
  Files         150      150              
  Lines       10045    10072      +27     
==========================================
+ Hits         6896     7013     +117     
+ Misses       3149     3059      -90     
Impacted Files Coverage Δ
bentoml/marshal/marshal.py 74.25% <70.00%> (+48.42%) ⬆️
bentoml/server/gunicorn_server.py 87.50% <90.00%> (-0.88%) ⬇️
bentoml/server/__init__.py 69.33% <91.11%> (+19.33%) ⬆️
bentoml/cli/bento_service.py 80.85% <100.00%> (-1.62%) ⬇️
bentoml/configuration/containers.py 86.48% <100.00%> (ø)
bentoml/server/api_server.py 71.21% <100.00%> (-0.22%) ⬇️
bentoml/server/marshal_server.py 97.61% <100.00%> (+4.28%) ⬆️
bentoml/utils/log.py 95.74% <100.00%> (-0.18%) ⬇️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@bojiang bojiang added the pr/merge-hold Requires further discussions before a pull request can be merged label Mar 1, 2021
bojiang (Member Author) commented Mar 3, 2021

Updated: fixed a DI error on macOS.

bojiang (Member Author) commented Mar 4, 2021

req: #1483

@bojiang bojiang removed the pr/merge-hold Requires further discussions before a pull request can be merged label Mar 4, 2021
@bojiang bojiang requested review from ssheng and removed request for ssheng March 5, 2021 02:23
bojiang (Member Author) commented Mar 5, 2021

cc @ssheng

bentoml/configuration/containers.py (outdated review thread)
@@ -0,0 +1,133 @@
# Copyright 2019 Atalaya Tech, Inc.
Collaborator:

What do you envision will be added under the /entrypoint package? What do we gain by moving to a new package?

Member Author:

The previous location was /bentoml/server/__init__.py. I don't think that's a proper place to wire packages, including bentoml.server itself.

Collaborator:

It should be fine to wire one's own package. Taking a step back, we should think about how we'd like to expose our public APIs. Having a module like BentoMLController is one choice. What are some of the best practices in Python?

Member:

@bojiang @ssheng I think what @bojiang suggested last time, having an /api/ module for exposing public APIs is probably the convention among python-based DS/ML tools:

/api
├── server.py
├── __init__.py
├── yatai.py
├── bundle.py

Collaborator:

Discussed offline, we will keep these in the original location under server/__init__.py.

bentoml/entrypoint/__init__.py (outdated review thread)
def _start_prod_server(
saved_bundle_path: str,
config: BentoMLConfiguration,
port: Optional[int] = None,
Collaborator:

Why do we need to explicitly have port here instead of from config?

bojiang (Member Author), Mar 5, 2021:

I do not want to change the value of config.api_server.port. Seeing the batching app and the model app together as the whole API server, it makes sense to keep config.api_server as-is rather than overriding it with the randomly picked port.

ssheng (Collaborator), Mar 6, 2021:

I understand the concern; this is why we should restructure the config keys and terminology. Before we do that, I think there is an opportunity here to make the intermediate port injectable as well. First, we can introduce an intermediate port (for lack of a better name):

api_server:
    marshal:
        intermediate_port: Null # or an actual port e.g. 6000

In the container, we can introduce a provider that first checks whether an intermediate port is defined, and if not, randomly reserves one:

intermediate_port = providers.Callable(
    lambda port: port if port else reserve_free_port(),
    config.marshal.intermediate_port,
)

Basically, a lot of the logic here can be refactored as a provider in containers.py.

bojiang (Member Author), Mar 6, 2021:

It's great to have a provider for the intermediate_port, but we have to wire after creating the new process. In your solution, reserve_free_port would be called twice, giving intermediate_port a different value in each process.

Collaborator:

Good call that reserve_free_port might be called twice, but it should be a solvable problem. We can use a singleton provider for creating the intermediate port; all users of the provider will get the same port. We will need to move the container creation out, however. Happy to discuss with you over a Zoom call.
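A dependency-free sketch of the singleton idea discussed here (functools.lru_cache stands in for a dependency_injector Singleton provider, and the port reservation is a simplified illustration, not BentoML's actual reserve_free_port helper):

```python
import functools
import socket
from typing import Optional


@functools.lru_cache(maxsize=None)
def intermediate_port(configured: Optional[int] = None) -> int:
    # Singleton-style: the first call reserves (or accepts) a port, and
    # every later call returns the same cached value, so all consumers
    # in one process agree on the port.
    if configured:
        return configured
    with socket.socket() as s:
        s.bind(("localhost", 0))  # ask the OS for any free port
        return s.getsockname()[1]
```

Because the value is cached per process, this still would not survive a fork by itself; the point of the discussion above is that wiring must happen after the new process is created.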

Collaborator:

UDS is superior, but we might have to leave the TCP option open to keep remote marshaling possible. We can structure the config meaningfully to reflect this.

Member Author:

> to leave the TCP option open to make remote marshaling a possibility. We can structure the config meaningfully to reflect this.

api_server:
  port: 5000
  enable_microbatch: False
  run_with_ngrok: False
  enable_swagger: True
  enable_metrics: True
  enable_feedback: True
  max_request_size: 20971520
  workers: 1
  timeout: 60

model_server:
  port: Null  # default: api_server.port when enable_marshal=False

marshal_server:
  port: Null  # default: api_server.port when enable_marshal=True


like this?

ssheng (Collaborator), Mar 8, 2021:

I'm thinking of something like the following, where the user can choose the connector type and provide related configs. Basically, the schema for connector is a union of the UDS and TCP connector schemas.

If UDS is chosen,

api_server:
  marshal:
    connector:
      type: UDS
      uds_related_key1: value1
      uds_related_key2: value2

Or, if TCP is chosen,

api_server:
  marshal:
    connector:
      type: TCP
      port: None # or some configured port
      tcp_related_key1: value1

Member Author:

For UDS support, I'd like to have a simple host field in URI format, just like gunicorn and Nginx do:

  • 127.0.0.1:5000
  • unix:/tmp/gunicorn.sock

Your schema is fancy, but it looks too powerful to me.
I think you could draft a new PR to demonstrate your idea.
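A sketch of parsing such a gunicorn-style bind string (a hypothetical helper, not part of this PR):

```python
from typing import Optional, Tuple


def parse_bind(bind: str) -> Tuple[str, str, Optional[int]]:
    # "unix:/tmp/gunicorn.sock" -> ("unix", "/tmp/gunicorn.sock", None)
    # "127.0.0.1:5000"          -> ("tcp", "127.0.0.1", 5000)
    if bind.startswith("unix:"):
        return ("unix", bind[len("unix:"):], None)
    host, _, port = bind.rpartition(":")
    return ("tcp", host, int(port))
```

A single string field like this keeps the config flat while still leaving both UDS and TCP open.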

Member Author:

After all, the config structure reflects our system design. We can continue the discussion in the new PR/issue.


def _start_prod_batching_server(
saved_bundle_path: str,
api_server_port: int,
Collaborator:

Same question here. Why do we need to explicitly have api_server_port here instead of from config?

Member Author:

Same as above.

@@ -178,6 +180,7 @@ def __init__(
"or launch more microbatch instances to accept more concurrent connection.",
self.CONNECTION_LIMIT,
)
self._client = None
Collaborator:

We probably don't need to lazily initialize the client here. The client is pretty much always needed, correct?

bojiang (Member Author), Mar 6, 2021:

> Client is pretty much always needed, correct?

Yeah. But IMO we should only do value-assignment operations in __init__.

In addition, in this case, an aiohttp client session should be initialized under a running asyncio event loop; see aio-libs/aiohttp#3331.
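A dependency-free sketch of the lazy-initialization pattern described above (asyncio.Queue stands in for aiohttp.ClientSession, which likewise should be created while an event loop is running):

```python
import asyncio
from typing import Optional


class LazyClient:
    def __init__(self) -> None:
        # __init__ does plain value assignment only; the loop-bound
        # object is created later, inside a coroutine.
        self._session: Optional[asyncio.Queue] = None

    async def session(self) -> asyncio.Queue:
        if self._session is None:
            # Created here, where an event loop is guaranteed to be
            # running, so it binds to the correct loop.
            self._session = asyncio.Queue()
        return self._session
```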

if self._client is None:
    jar = aiohttp.DummyCookieJar()
    if self.outbound_unix_socket:
        conn = aiohttp.UnixConnector(path=self.outbound_unix_socket)
Collaborator:

+1 for UDS.

bento_service, port=api_server_port, enable_swagger=enable_swagger
)
api_server.start()
api_server = BentoAPIServer(bento_service, enable_swagger=enable_swagger)
Collaborator:

Do you think start_dev_server could follow the same pattern as start_prod_server?

Member Author:

Yeah, but that should be done in another PR; it involves more than the prod server.

@@ -183,14 +181,18 @@ def __init__(

self.setup_routes()

def start(self):
def start(self, port: int, host: str = "127.0.0.1"):
Collaborator:

Can host be in the config?

Member Author:

Yeah, but it should be done in another PR.


import psutil
from dependency_injector.wiring import Provide as P
ssheng (Collaborator), Mar 9, 2021:

Not a big issue, but I think it's more readable to spell it out fully. Same for container.

@parano parano merged commit 97051d2 into bentoml:master Mar 10, 2021
aarnphm pushed a commit to aarnphm/BentoML that referenced this pull request Jul 29, 2022
* [Refactor] earlier fork

* [Refactor] Add server entrypoint module

* format code

* clean

* move configuration to entrypoint

* clean

* revert changes about container

* Move start_prod_server

* make pylint happy

* fix wiring
3 participants