[Core] Modify the initialization parameters of the lora manager #25249
Conversation
Code Review
This pull request refactors the initialization of the LoRA manager to use a single `vllm_config` object, which simplifies the API and improves code clarity. The changes are well implemented across the codebase. However, I've found a critical issue in one of the updated tests where the configuration for the test is not correctly set up, which will lead to test failures. I've provided a suggestion to fix it.
```diff
 model_config = ModelConfig(max_model_len=16)
 vllm_config = VllmConfig(model_config=model_config,
                          lora_config=lora_config)

 vllm_config.scheduler_config.max_num_seqs = 4
 vllm_config.scheduler_config.max_num_batched_tokens = 2
 worker_adapter_manager = LRUCacheWorkerLoRAManager(
-    4, 2,
-    dummy_model.unpadded_vocab_size - lora_config.lora_extra_vocab_size,
-    lora_config, device, EMBEDDING_MODULES, EMBEDDING_PADDING_MODULES)
+    vllm_config, device, EMBEDDING_MODULES, EMBEDDING_PADDING_MODULES)

 worker_adapter_manager.max_num_seqs = 4
 worker_adapter_manager.max_num_batched_tokens = 2
```
The `vllm_config` is not correctly initialized for the test. The `ModelConfig` within it doesn't have the `hf_config` from the `dummy_model`, which will cause `vllm_config.model_config.get_vocab_size()` to return 0 inside `LRUCacheWorkerLoRAManager`. This leads to incorrect behavior, especially when calculating `target_embedding_padding`.

To fix this, you should associate the `dummy_model.config` with the `vllm_config`'s `model_config`. Also, the manual setting of `max_num_seqs` and `max_num_batched_tokens` on `worker_adapter_manager` is redundant, as these are already set during initialization from the `vllm_config`.
Suggested change:

```diff
 model_config = ModelConfig(max_model_len=16)
 vllm_config = VllmConfig(model_config=model_config,
                          lora_config=lora_config)
+# Manually set hf_config for the test since ModelConfig doesn't take it
+# in __init__ and we are not loading from a real model path.
+vllm_config.model_config.hf_config = dummy_model.config
 vllm_config.scheduler_config.max_num_seqs = 4
 vllm_config.scheduler_config.max_num_batched_tokens = 2
 worker_adapter_manager = LRUCacheWorkerLoRAManager(
     vllm_config, device, EMBEDDING_MODULES, EMBEDDING_PADDING_MODULES)
-worker_adapter_manager.max_num_seqs = 4
-worker_adapter_manager.max_num_batched_tokens = 2
```
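A quick way to check that the suggestion behaves as described is a couple of assertions at the end of the test. This is only an illustrative sketch, assuming the suggested `hf_config` assignment above has been applied; it uses nothing beyond the attributes already discussed in this thread.

```python
# Illustrative sanity checks, assuming the suggested change above is applied.

# With dummy_model.config attached as hf_config, the vocab size should no
# longer be 0, so target_embedding_padding is derived from real values.
assert vllm_config.model_config.get_vocab_size() > 0

# The limits are read from vllm_config.scheduler_config during initialization,
# which is why the manual assignments on worker_adapter_manager are redundant.
assert worker_adapter_manager.max_num_seqs == 4
assert worker_adapter_manager.max_num_batched_tokens == 2
```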
Just some nits. Otherwise LGTM
Thanks for the reminder! @jeejeelee
@jeejeelee, thanks for the CC; we will update our plugin accordingly.
…-project#25249) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
@Yikun @wangxiyuan Updated our plugin: vllm-project/vllm-ascend#3095
### What this PR does / why we need it?
Fix the impact on LoRA that vllm-project/vllm#25249 brought.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
pytest -sv tests/e2e/singlecard/test_ilama_lora.py
pytest -sv tests/e2e/multicard/test_ilama_lora_tp2.py

- vLLM version: v0.10.2
- vLLM main: vllm-project/vllm@9607d5e

Signed-off-by: paulyu12 <507435917@qq.com>
This PR broke LoRA in the TPU plugin, but thanks to @gpolovets1 for the fix: https://github.com/vllm-project/tpu_commons/pull/720. cc: @jeejeelee @Isotr0py
…-project#25249) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com> Signed-off-by: charlifu <charlifu@amd.com>
Purpose
- Pass `vllm_config` to the internal LoRA manager to facilitate the advancement of [Core] Enable LoRA support for classification model #24596, as #24596 needs to obtain detailed information such as the task type.
- Rename (`lora.py` -> `lora_weight.py`).

This PR will slightly affect the LoRA implementation of hardware plugins. cc @xuechendi @Yikun
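For hardware-plugin maintainers, here is a minimal sketch of the call-site change, based only on the constructor arguments visible in this PR's diff; the surrounding variable names are assumed to already exist on the plugin side.

```python
# Sketch of adapting a plugin to the new LRUCacheWorkerLoRAManager signature.
# Assumes the plugin already builds a VllmConfig (vllm_config) and has the
# device and embedding-module constants in scope.

# Before this PR: scalar limits, the vocab size, and lora_config were passed
# explicitly, e.g.:
# worker_lora_manager = LRUCacheWorkerLoRAManager(
#     max_num_seqs, max_num_batched_tokens,
#     unpadded_vocab_size - lora_config.lora_extra_vocab_size,
#     lora_config, device, EMBEDDING_MODULES, EMBEDDING_PADDING_MODULES)

# After this PR: pass the whole vllm_config and let the manager derive the
# scheduler, model, and LoRA settings from it.
worker_lora_manager = LRUCacheWorkerLoRAManager(
    vllm_config, device, EMBEDDING_MODULES, EMBEDDING_PADDING_MODULES)
```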
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
Update `supported_models.md` and `examples` for a new model.