
[Feature]: Propagate served-model-name to LMCache metrics #30007

@mustafayildirim

Description

Summary

When running vLLM through the OpenAI-compatible API server, the --served-model-name parameter is not propagated to LMCache-related Prometheus metrics. Instead, the internal model_name label reflects the underlying model path (e.g. /huggingface/cache/.../DeepSeek-V3-0324).

This makes it difficult to correlate metrics with the externally exposed model name, especially when multiple served aliases point to the same model or when deployments rely on logical model names rather than filesystem paths.

Current Behavior

Running vLLM with:

--served-model-name deepseek-v3-0324-longrunning
--kv-transfer-config='{"kv_connector":"LMCacheConnectorV1","kv_role":"kv_both"}'

Yields metrics such as:

lmcache_local_cache_usage{
  model_name="/huggingface/cache/hub/deepseek-ai/DeepSeek-V3-0324",
  ...
}

The model_name label is always the internal model path, not the served alias.
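
For reference, the label values can be inspected by scraping the metrics endpoint directly; a minimal sketch, assuming the server exposes Prometheus metrics on localhost:8000 (adjust host and port to match the deployment):

import urllib.request

# Fetch the raw Prometheus exposition text from the running vLLM server.
with urllib.request.urlopen("http://localhost:8000/metrics") as resp:
    body = resp.read().decode("utf-8")

# Print only the LMCache series; today, model_name shows the filesystem
# path rather than the served alias.
for line in body.splitlines():
    if line.startswith("lmcache_"):
        print(line)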

Requested Behavior

Expose the value of --served-model-name in LMCache-related metrics, either by:

  • Replacing model_name with the served name, or
  • Adding a second label, e.g. served_model_name, while preserving the internal model path.

Example desired metric:

lmcache_local_cache_usage{
  model_name="/huggingface/cache/hub/deepseek-ai/DeepSeek-V3-0324",
  served_model_name="deepseek-v3-0324-longrunning",
  ...
}
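
For illustration, the requested shape can be mocked up with the prometheus_client library. This is a minimal sketch of the label layout only, not LMCache's actual instrumentation; the metric name and values are copied from the example above:

from prometheus_client import Gauge

# Minimal sketch: a gauge carrying both the internal path and the served
# alias as labels. LMCache's real metric registration may differ.
local_cache_usage = Gauge(
    "lmcache_local_cache_usage",
    "Local KV cache usage",
    labelnames=["model_name", "served_model_name"],
)

local_cache_usage.labels(
    model_name="/huggingface/cache/hub/deepseek-ai/DeepSeek-V3-0324",
    served_model_name="deepseek-v3-0324-longrunning",
).set(0.42)

Keeping model_name intact and adding served_model_name means existing dashboards and alerts keep working, while new queries can group by the alias.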

Motivation

  • Easier multi-model monitoring when models are served under logical aliases.
  • Prometheus/Grafana dashboards become simpler; no need for external relabeling.
  • Avoids confusion when multiple deployments load the same underlying model path but serve it under different names.
  • Aligns with user expectations: the model name shown to API clients should match the model name appearing in server metrics.

Use Case Example

Running:

python3 -m vllm.entrypoints.openai.api_server \
  --model /huggingface/cache/hub/deepseek-ai/DeepSeek-V3-0324 \
  --served-model-name deepseek-v3-0324-longrunning \
  --kv-transfer-config='{"kv_connector":"LMCacheConnectorV1","kv_role":"kv_both"}'

Metrics show:

model_name="/huggingface/cache/hub/deepseek-ai/DeepSeek-V3-0324"

Expected:

served_model_name="deepseek-v3-0324-longrunning"

Proposed Implementation

  • Inject the served model name into the worker model metadata.
  • When generating LMCache metrics, add it as an additional Prometheus label (see the sketch after this list).
  • Backward compatible: existing label values remain unchanged.
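
As a rough illustration of the wiring, both label values could be derived in one place from the engine's model config. This is a hedged sketch: build_lmcache_metric_labels is a hypothetical helper, and the attribute names read from vLLM's ModelConfig are assumptions about where the served alias lives:

def build_lmcache_metric_labels(model_config) -> dict:
    """Derive both Prometheus label values from the engine's model config.

    Hypothetical helper; attribute names are assumptions, not vLLM's
    confirmed API.
    """
    return {
        # Internal model path, kept unchanged for backward compatibility.
        "model_name": model_config.model,
        # Alias from --served-model-name; fall back to the path when unset.
        "served_model_name": getattr(
            model_config, "served_model_name", model_config.model
        ),
    }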

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
