fix(scheduling): numpy worker environs are not taking effect #2893

Merged
merged 11 commits into from
Aug 18, 2022

Conversation

Member

@bojiang bojiang commented Aug 11, 2022

Solution:

  • Only allow scheduling_strategy to control the environment variables.
  • Set up the environment variables before importing bentoml.
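
The ordering in the second bullet matters because the linked issue asks for OMP_NUM_THREADS and related variables to be set before numpy is imported. A minimal sketch of that ordering, not the code in this PR: `apply_env_then_import` is a hypothetical helper, and the stdlib `json` module stands in for numpy or bentoml so the snippet runs without extra dependencies.

```python
import importlib
import os
from types import ModuleType
from typing import Any, Mapping


def apply_env_then_import(env: Mapping[str, Any], module_name: str) -> ModuleType:
    """Export the variables first, then import the module that reads them."""
    for key, value in env.items():
        # os.environ only accepts strings, so coerce ints such as thread counts.
        os.environ[key] = str(value)
    # The import happens only after the environment is in place.
    return importlib.import_module(module_name)


# Stand-in usage: "json" replaces numpy here (assumption for illustration).
module = apply_env_then_import({"OMP_NUM_THREADS": 1}, "json")
```

If the module had already been imported earlier in the process, setting the variables afterwards would come too late for any value it read at import time, which is why the variables need to be exported first.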

WARNING:
This is a breaking change for custom schedulers, an undocumented public API.

fix: #2787

@bojiang bojiang requested review from ssheng, parano and a team as code owners August 11, 2022 16:02
@bojiang bojiang requested review from jjmachan and removed request for a team August 11, 2022 16:02
codecov bot commented Aug 11, 2022

Codecov Report

Merging #2893 (3280c4b) into main (3ec89ec) will decrease coverage by 1.53%.
The diff coverage is 21.66%.

❗ Current head 3280c4b differs from pull request most recent head f0b7386. Consider uploading reports for the commit f0b7386 to get more accurate results

@@            Coverage Diff             @@
##             main    #2893      +/-   ##
==========================================
- Coverage   70.88%   69.35%   -1.54%     
==========================================
  Files         103      103              
  Lines        9335     9374      +39     
==========================================
- Hits         6617     6501     -116     
- Misses       2718     2873     +155     
Impacted Files Coverage Δ
bentoml/_internal/yatai_client/__init__.py 24.06% <6.06%> (-0.94%) ⬇️
bentoml/_internal/yatai_rest_api_client/yatai.py 31.25% <27.27%> (-0.64%) ⬇️
bentoml/_internal/yatai_rest_api_client/schemas.py 93.53% <100.00%> (+0.06%) ⬆️
bentoml/_internal/utils/buildx.py 0.00% <0.00%> (-49.00%) ⬇️
bentoml/_internal/utils/docker.py 34.48% <0.00%> (-34.49%) ⬇️
bentoml/_internal/utils/circus/__init__.py 60.00% <0.00%> (-30.00%) ⬇️
bentoml/_internal/utils/platform.py 66.66% <0.00%> (-8.34%) ⬇️
bentoml/_internal/runner/container.py 83.98% <0.00%> (-6.07%) ⬇️
bentoml/_internal/runner/runner_handle/remote.py 83.87% <0.00%> (-4.31%) ⬇️
bentoml/_internal/runner/utils.py 86.88% <0.00%> (-3.28%) ⬇️
... and 7 more

@@ -51,20 +58,10 @@ def main(
- file:///path/to/unix.sock
- fd://12
working_dir: (Optional) the working directory
- worker_id: (Optional) if set, the runner will be started as a worker with the given ID
+ worker_id: (Optional) if set, the runner will be started as a worker with the given ID. Important: begin from 1.
Member

Add an arg doc for worker_env_map.

required=False,
type=click.STRING,
default=None,
help="The environment variables to pass to the worker process. The format is a JSON string, e.g. '{0: {\"CUDA_VISIBLE_DEVICES\": 0}}'.",
Member

Should we use a dotenv file instead of JSON? That seems easier for debugging purposes.

Member Author

@bojiang bojiang Aug 15, 2022

The env map includes all envvars for each worker.

{
    0: {"CUDA_VISIBLE_DEVICES": 0},
    1: {"CUDA_VISIBLE_DEVICES": 1},
}

That does not seem easy to represent with dotenv files.
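
One detail when this map travels as a JSON string: JSON object keys are always strings, so the integer worker IDs have to be converted back after parsing. A small sketch of that round trip follows; `dump_worker_env_map` and `load_worker_env_map` are hypothetical names for illustration, not functions from this PR.

```python
import json
from typing import Any, Dict


def dump_worker_env_map(env_map: Dict[int, Dict[str, Any]]) -> str:
    # json.dumps turns the integer worker IDs into string keys.
    return json.dumps(env_map)


def load_worker_env_map(raw: str) -> Dict[int, Dict[str, Any]]:
    # Convert the string keys back to integer worker IDs after parsing.
    return {int(worker_id): env for worker_id, env in json.loads(raw).items()}


env_map = {0: {"CUDA_VISIBLE_DEVICES": 0}, 1: {"CUDA_VISIBLE_DEVICES": 1}}
encoded = dump_worker_env_map(env_map)
decoded = load_worker_env_map(encoded)
```

Here `encoded` carries the keys as "0" and "1", and `decoded` compares equal to the original `env_map`.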

Collaborator

@ssheng ssheng Aug 17, 2022

Agree that passing JSON in CLI isn't the most intuitive. Using an .env file, we can allow multiple arguments of key-value pairs.

--worker-env 0:worker_0.env --worker-env 1:worker_1.env
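
For comparison, a rough sketch of what parsing this proposed form could involve. The `--worker-env` flag is only a suggestion in this thread and was not adopted; `parse_dotenv` and `load_worker_env_specs` are hypothetical helpers, and the dotenv handling covers only plain KEY=VALUE lines (no quoting or variable expansion).

```python
from typing import Dict, Iterable


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def load_worker_env_specs(specs: Iterable[str]) -> Dict[int, Dict[str, str]]:
    """Turn specs such as "0:worker_0.env" into {worker_id: env}."""
    env_map: Dict[int, Dict[str, str]] = {}
    for spec in specs:
        # Split on the first colon only, so the path may contain colons.
        worker_id, _, path = spec.partition(":")
        with open(path, encoding="utf-8") as handle:
            env_map[int(worker_id)] = parse_dotenv(handle.read())
    return env_map
```

Each worker then needs its own file on disk, and every value arrives as a string, whereas the JSON form keeps the whole map in one argument.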

Member Author

@ssheng I think that's too complicated. This is not the public API.
The public API is bentoml serve and bentoml.serve.

worker_id,
)
@property
def scheduled_worker_env_map(self) -> dict[int, dict[str, t.Any]]:
Member

Does Yatai need this information for scheduling runners?

Member Author

The worker concept is transparent to Yatai.

Member

@bojiang Got it. In the case of Yatai, it will just use the resources available from the system.

Member Author

yeah
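
For readers unfamiliar with the shape returned by scheduled_worker_env_map in this thread, here is an illustrative sketch of a map of type dict[int, dict[str, Any]] that gives each worker one GPU. `build_worker_env_map` and its parameters are assumptions for illustration, not the implementation in this PR; the default starting ID of 1 follows the worker_id docstring in the diff above.

```python
from typing import Any, Dict, Sequence


def build_worker_env_map(
    gpu_ids: Sequence[int], first_worker_id: int = 1
) -> Dict[int, Dict[str, Any]]:
    """Assign one GPU to each worker, keyed by worker ID."""
    return {
        first_worker_id + offset: {"CUDA_VISIBLE_DEVICES": gpu_id}
        for offset, gpu_id in enumerate(gpu_ids)
    }


# Two GPUs yield two workers, with IDs 1 and 2 under the assumed default.
env_map = build_worker_env_map([0, 1])
```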

@bojiang bojiang force-pushed the fix-scheduling branch 2 times, most recently from 6a26765 to f0b7386 Compare August 18, 2022 04:54
@ssheng ssheng merged commit ff7d608 into bentoml:main Aug 18, 2022
Successfully merging this pull request may close these issues.

chore: Set OMP_NUM_THREADS and related env vars prior to importing numpy
3 participants