I have a machine with two GPUs. I ran the model with the openllm start command and everything went well:
CUDA_VISIBLE_DEVICES=0,1 TRANSFORMERS_OFFLINE=1 openllm start mistral --model-id mymodel --dtype float16 --gpu-memory-utilization 0.95 --workers-per-resource 0.5
In this case two processes appear, one on each GPU: one for the service and another for the Ray instance.
When I run the start command without --gpu-memory-utilization 0.95 --workers-per-resource 0.5, only one GPU runs the service and a CUDA out-of-memory error occurs.
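As a rough sanity check (assuming a 7B-parameter Mistral-class model in float16; the parameter count is my guess, not measured), the weights alone need about 13 GiB, so a single GPU can easily run out of memory:
# Back-of-the-envelope estimate; 7B parameters is an assumption
python3 -c 'print(f"~{7_000_000_000 * 2 / 1024**3:.1f} GiB")'   # float16 = 2 bytes/param -> ~13.0 GiB, before KV cache
Sharding across both GPUs roughly halves the per-GPU weight footprint, which would explain why the run with --workers-per-resource 0.5 succeeds.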
However, when I build the image, follow the steps to create a container, and then run the Docker image, it raises a CUDA out-of-memory error, just like the second case without these args: --gpu-memory-utilization 0.95 --workers-per-resource 0.5
@aarnphm What is the difference between the previous two cases, such that the first case launches two processes, one for the Ray worker and the other for the BentoML service (i.e., when using --gpu-memory-utilization 0.95 --workers-per-resource 0.5)?
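My current understanding, stated as an assumption rather than something confirmed from the OpenLLM source, is that --workers-per-resource 0.5 on two GPUs amounts to tensor parallelism of degree 2, which is why vLLM spawns a Ray worker alongside the service. The vLLM-level equivalent of the working invocation would be roughly:
# Hedged sketch: approximate vLLM equivalent of the working openllm start flags
# (assumption: --workers-per-resource 0.5 on 2 GPUs ~ --tensor-parallel-size 2)
python -m vllm.entrypoints.api_server \
    --model mymodel \
    --dtype float16 \
    --gpu-memory-utilization 0.95 \
    --tensor-parallel-size 2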
Describe the bug
I have a machine with two GPUs. I ran the model with the openllm start command and everything went well:
CUDA_VISIBLE_DEVICES=0,1 TRANSFORMERS_OFFLINE=1 openllm start mistral --model-id mymodel --dtype float16 --gpu-memory-utilization 0.95 --workers-per-resource 0.5
However, when I build the image, follow the steps to create a container, and then run the Docker image, it raises a CUDA out-of-memory error, just like the second case without these args:
--gpu-memory-utilization 0.95 --workers-per-resource 0.5
Steps:
openllm build mymodel --backend vllm --serialization safetensors
bentoml containerize mymodel-service:12345 --opt progress=plain
docker run --rm --gpus all -p 3000:3000 -it mymodel-service:12345
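To rule out the container simply not seeing both GPUs, a quick sanity check is to run nvidia-smi inside the image (the NVIDIA container runtime should inject the binary when --gpus all is passed; overriding the entrypoint is only for this one-off check):
# One-off check: confirm both GPUs are visible inside the container
docker run --rm --gpus all --entrypoint nvidia-smi mymodel-service:12345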
To reproduce
No response
Logs
No response
Environment
$ bentoml -v
bentoml, version 1.1.11
$ openllm -v
openllm, 0.4.45.dev2 (compiled: False)
Python (CPython) 3.11.7
System information (Optional)
No response