Defensively copy sampling_params (vllm-project#2881)

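If the caller mutates the SamplingParams object passed to LLMEngine.add_request() after the call returns, the change could affect the asynchronous sampling process for that request. Copying the object inside add_request() isolates the request from such later mutations. Suggested by @Yard1 in vllm-project#2514 (comment).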
xjpang · Mar 4, 2024 · 1f431c9
1 parent 1a639a8
Showing 1 changed file with 3 additions and 0 deletions.
diff --git a/vllm/engine/llm_engine.py b/vllm/engine/llm_engine.py
@@ -464,6 +464,9 @@ def add_request(
             prompt_token_ids[:prefix_pos], lora_request.lora_int_id
             if lora_request else 0) if prefix_pos is not None else None
 
+        # Defensive copy of SamplingParams, which are used by the sampler
+        sampling_params = copy.deepcopy(sampling_params)
+
         # Create the sequence group.
         seq_group = SequenceGroup(request_id, [seq], sampling_params,
                                   arrival_time, lora_request, prefix)
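
The hazard this change guards against can be reproduced outside vLLM. The following is a minimal sketch, not vLLM code: `ToySamplingParams`, `ToyEngine`, and `run_demo` are hypothetical stand-ins invented for illustration, and the assumption is only that the engine holds on to the params object until sampling runs later.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToySamplingParams:
    # Hypothetical stand-in for SamplingParams; not the real vLLM class.
    temperature: float = 1.0
    stop: list = field(default_factory=list)


class ToyEngine:
    # Hypothetical stand-in for LLMEngine: it stores the params and would
    # read them later, when sampling for the request actually happens.
    def __init__(self, defensive_copy: bool):
        self.defensive_copy = defensive_copy
        self.pending = {}

    def add_request(self, request_id: str, sampling_params: ToySamplingParams) -> None:
        if self.defensive_copy:
            # Same idea as the commit: snapshot the params at submission time.
            sampling_params = copy.deepcopy(sampling_params)
        self.pending[request_id] = sampling_params


def run_demo(defensive_copy: bool) -> ToySamplingParams:
    engine = ToyEngine(defensive_copy)
    params = ToySamplingParams(temperature=0.0, stop=["\n"])
    engine.add_request("req-0", params)
    # The caller reuses and mutates the object after add_request() returned.
    params.temperature = 1.5
    params.stop.append("###")
    # Return what the engine would see when it gets around to sampling.
    return engine.pending["req-0"]


if __name__ == "__main__":
    print("without copy:", run_demo(defensive_copy=False))
    print("with copy:   ", run_demo(defensive_copy=True))
```

In this sketch, a deep copy rather than a shallow one is what keeps the nested `stop` list independent: `copy.copy` would duplicate the outer object but still share the list with the caller.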