
[Bugfix] Fix inappropriate content of model_name tag in Prometheus metrics #3937

Merged
18 commits merged into vllm-project:main on May 4, 2024

Conversation

DearPlanet (Contributor) commented Apr 9, 2024

Description

When deploying a local model, such as one located at a path like /mnt/Qwen1.5-14B-chat/, the Prometheus metrics output exposes the local model path.

The metrics output looks like this:

...
# HELP vllm:num_requests_running Number of requests currently running on GPU.
# TYPE vllm:num_requests_running gauge
vllm:num_requests_running{model_name="/mnt/Qwen1.5-14B-Chat/"} 0.0
# HELP vllm:num_requests_swapped Number of requests swapped to CPU.
# TYPE vllm:num_requests_swapped gauge
vllm:num_requests_swapped{model_name="/mnt/Qwen1.5-14B-Chat/"} 0.0
# HELP vllm:num_requests_waiting Number of requests waiting to be processed.
# TYPE vllm:num_requests_waiting gauge
vllm:num_requests_waiting{model_name="/mnt/Qwen1.5-14B-Chat/"} 0.0
# HELP vllm:gpu_cache_usage_perc GPU KV-cache usage. 1 means 100 percent usage.
# TYPE vllm:gpu_cache_usage_perc gauge
vllm:gpu_cache_usage_perc{model_name="/mnt/Qwen1.5-14B-Chat/"} 0.0
# HELP vllm:cpu_cache_usage_perc CPU KV-cache usage. 1 means 100 percent usage.
# TYPE vllm:cpu_cache_usage_perc gauge
vllm:cpu_cache_usage_perc{model_name="/mnt/Qwen1.5-14B-Chat/"} 0.0
# HELP vllm:prompt_tokens_total Number of prefill tokens processed.
# TYPE vllm:prompt_tokens_total counter
vllm:prompt_tokens_total{model_name="/mnt/Qwen1.5-14B-Chat/"} 0.0
# HELP vllm:generation_tokens_total Number of generation tokens processed.
...

This raises two issues:

  • The local deployment path of the model is inadvertently revealed, which is a security risk;
  • The model_name in the metrics does not match the model name provided in the ModelCard by openai.api_server.

This PR solves the problem by passing the served_model_name parameter into ModelConfig and parsing it in the LLMEngine. For scenarios that do not use openai.apiserver, served_model_name can also be passed to AsyncEngineArgs/EngineArgs. When served_model_name is not specified, the Prometheus metrics output behaves as before (although I still do not recommend relying on that).
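
As a minimal sketch of the intended usage (the path and served name below are placeholders, and served_model_name on the engine args is what this PR adds):

from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine

# The local path stays internal; the served name is what Prometheus reports.
engine_args = AsyncEngineArgs(
    model="/mnt/Qwen1.5-14B-Chat/",        # placeholder local path
    served_model_name="qwen1.5-14b-chat",  # label used for model_name in metrics
)
engine = AsyncLLMEngine.from_engine_args(engine_args)

With the OpenAI-compatible server, the existing --served-model-name flag already controls the name shown in the ModelCard; after this change it is also the value of the model_name label.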

@DearPlanet DearPlanet changed the title [Bugfix] Fix model_name tag in Prometheus metrics [Bugfix] Fix inappropriate content of model_name tag in Prometheus metrics Apr 9, 2024
@njhill njhill self-assigned this Apr 9, 2024
attr: getattr(args, attr)
for attr in attrs if hasattr(args, attr)
})
if hasattr(args, "served_model_name"):


Can this be simplified to

if engine_args.served_model_name is None:
    engine_args.served_model_name = args.model

The above dict comprehension would have already set engine_args.served_model_name if served_model_name were an attribute on args.
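
For reference, a runnable toy version of what is being discussed, with a hypothetical dataclass standing in for EngineArgs and a plain argparse.Namespace standing in for args (not the real vLLM classes):

from argparse import Namespace
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class EngineArgsSketch:  # stand-in for vLLM's EngineArgs, not the real class
    model: str = "facebook/opt-125m"
    served_model_name: Optional[str] = None

def from_cli_args(args: Namespace) -> EngineArgsSketch:
    attrs = [f.name for f in fields(EngineArgsSketch)]
    # The comprehension already copies served_model_name whenever args has it...
    engine_args = EngineArgsSketch(**{
        attr: getattr(args, attr)
        for attr in attrs if hasattr(args, attr)
    })
    # ...so the only remaining case is "absent or unset", handled by the fallback.
    if engine_args.served_model_name is None:
        engine_args.served_model_name = args.model
    return engine_args

print(from_cli_args(Namespace(model="/mnt/Qwen1.5-14B-Chat/")))
print(from_cli_args(Namespace(model="/mnt/Qwen1.5-14B-Chat/",
                              served_model_name="qwen1.5-14b-chat")))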

DearPlanet (Contributor Author)

Thanks for your suggestion!

I adopted this approach to accommodate vllm.entrypoint.apiserver. For now, served_model_name only appears in vllm.entrypoint.openai.apiserver and currently has no description; when using vllm.entrypoint.apiserver, served_model_name is not an attribute of args.

As another approach, perhaps we could add the served_model_name parameter to vllm.entrypoint.apiserver? If so, the implementation here could be simplified as suggested, and the change would not affect anyone using AsyncEngineArgs/EngineArgs directly.

I believe served_model_name is useful in all scenarios; when it cannot be set, as in vllm.entrypoint.apiserver, a local path sometimes ends up as the model's label, which does not seem right.

Collaborator

@DearPlanet vllm.entrypoint.apiserver is meant more as a toy example and is not being actively improved.

That said, I agree that if we are using this name for internal purposes it makes more sense to move it into EngineArgs; would you be OK making that change? You could also augment its existing description to mention that the first name will also be used in the metrics output. This should also address some of the latest CI test failures.

It may also be best for the check to be if not engine_args.served_model_name: just in case it's an empty list (in theory it shouldn't be).
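
To illustrate why the falsy check is the safer variant (values here are just examples):

# "if not served_model_name" treats None, an empty string and an empty list alike.
for value in (None, "", [], ["qwen1.5-14b-chat", "qwen-chat"]):
    if not value:
        print(repr(value), "-> fall back to the model path")
    else:
        name = value[0] if isinstance(value, list) else value
        print(repr(value), "-> use", name, "as the metrics label")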

njhill (Collaborator) left a comment

@DearPlanet thank you very much for the contribution!

Would you also be willing to add a unit test for this? Hopefully it should be straightforward to extend the existing test_metrics.py.
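
For what it's worth, a rough sketch of such a test (not the actual contents of test_metrics.py; it assumes served_model_name is forwarded from LLM to EngineArgs, which is what this PR enables, and MODEL_PATH is a hypothetical local path):

import pytest
from prometheus_client import REGISTRY
from vllm import LLM, SamplingParams

MODEL_PATH = "/mnt/Qwen1.5-14B-Chat/"  # hypothetical local path

@pytest.mark.parametrize("served_model_name", ["qwen1.5-14b-chat"])
def test_metrics_use_served_model_name(served_model_name: str) -> None:
    llm = LLM(model=MODEL_PATH, served_model_name=served_model_name,
              disable_log_stats=False)
    llm.generate(["Hello, my name is"], SamplingParams(max_tokens=8))

    # Gather every model_name label value exported by vllm metrics and check
    # that the served name is used while the local path never leaks.
    labels = {
        sample.labels.get("model_name")
        for metric in REGISTRY.collect() if metric.name.startswith("vllm")
        for sample in metric.samples
    }
    assert served_model_name in labels
    assert MODEL_PATH not in labels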


for attr in attrs if hasattr(args, attr)
})
if hasattr(args, "served_model_name"):
engine_args.served_model_name = args.served_model_name
Collaborator

It is a multi-valued arg and will be a list

Suggested change:
- engine_args.served_model_name = args.served_model_name
+ engine_args.served_model_name = args.served_model_name[0]

DearPlanet (Contributor Author)

Thanks for the suggestions @njhill!
I agree that moving served_model_name to EngineArgs is more elegant and has less impact on the other parts. In addition, this parameter should support either a list or a str.
The latest update contains:

  • Move served_model_name to EngineArgs
  • Support served_model_name as either a list or a str (see the sketch after this list)
  • Add a description for this parameter
  • Add a test case in test_metrics.py
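
To show the intended list-or-str behaviour, a small illustrative helper (not the exact code in this PR):

from typing import List, Optional, Union

def resolve_served_model_name(model: str,
                              served_model_name: Optional[Union[str, List[str]]]) -> str:
    # Unset (None, "" or []) keeps the old behaviour: the model path is used.
    if not served_model_name:
        return model
    # For a list, the first entry is the name exposed in the metrics labels.
    if isinstance(served_model_name, list):
        return served_model_name[0]
    return served_model_name

assert resolve_served_model_name("/mnt/Qwen1.5-14B-Chat/", None) == "/mnt/Qwen1.5-14B-Chat/"
assert resolve_served_model_name("/mnt/Qwen1.5-14B-Chat/", "qwen1.5-14b-chat") == "qwen1.5-14b-chat"
assert resolve_served_model_name("/mnt/Qwen1.5-14B-Chat/", ["qwen1.5-14b-chat", "qwen-chat"]) == "qwen1.5-14b-chat"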

@DearPlanet DearPlanet requested a review from njhill April 28, 2024 04:38
njhill (Collaborator) left a comment

Thanks @DearPlanet, just a couple of minor suggestions.

vllm/config.py: two review threads (outdated, resolved)
DearPlanet (Contributor Author)

Thank you again for the suggestions @njhill! 😊

@DearPlanet DearPlanet requested a review from njhill May 4, 2024 12:43
@njhill njhill merged commit 4302987 into vllm-project:main May 4, 2024
59 checks passed
robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request May 6, 2024
z103cb pushed a commit to z103cb/opendatahub_vllm that referenced this pull request May 7, 2024
dtrifiro pushed a commit to opendatahub-io/vllm that referenced this pull request May 7, 2024