Defensively copy sampling_params (vllm-project#2881)
If the SamplingParams object passed to LLMEngine.add_request() is mutated after it returns, it could affect the async sampling process for that request.

Suggested by @Yard1 vllm-project#2514 (comment)
njhill authored and jimpang committed Mar 4, 2024
1 parent 1a639a8 commit 1f431c9
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions vllm/engine/llm_engine.py
@@ -464,6 +464,9 @@ def add_request(
                 prompt_token_ids[:prefix_pos], lora_request.lora_int_id
                 if lora_request else 0) if prefix_pos is not None else None
 
+        # Defensive copy of SamplingParams, which are used by the sampler
+        sampling_params = copy.deepcopy(sampling_params)
+
         # Create the sequence group.
         seq_group = SequenceGroup(request_id, [seq], sampling_params,
                                   arrival_time, lora_request, prefix)
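
The hazard described in the commit message can be reproduced in isolation. The following is a minimal sketch, not vLLM code: `ToySamplingParams`, `ToyEngine`, and `run` are hypothetical stand-ins invented for illustration, and only `copy.deepcopy` mirrors the call added in the diff. It assumes the engine keeps a reference to the params object and reads it again at some later step.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToySamplingParams:
    """Hypothetical stand-in for SamplingParams; not the vLLM class."""
    temperature: float = 1.0
    stop: List[str] = field(default_factory=list)


class ToyEngine:
    """Hypothetical engine that stores params until a later step reads them."""

    def __init__(self, defensive_copy: bool) -> None:
        self.defensive_copy = defensive_copy
        self.pending: Dict[str, ToySamplingParams] = {}

    def add_request(self, request_id: str, params: ToySamplingParams) -> None:
        if self.defensive_copy:
            # Same idea as the commit: snapshot the caller's object.
            params = copy.deepcopy(params)
        self.pending[request_id] = params

    def step(self, request_id: str) -> ToySamplingParams:
        # Stands in for the sampler reading the params some time later.
        return self.pending[request_id]


def run(defensive_copy: bool) -> ToySamplingParams:
    engine = ToyEngine(defensive_copy)
    params = ToySamplingParams(temperature=0.2, stop=["\n"])
    engine.add_request("req-0", params)
    # The caller reuses and mutates the object after add_request() returns.
    params.temperature = 1.5
    params.stop.append("END")
    return engine.step("req-0")


shared = run(defensive_copy=False)
isolated = run(defensive_copy=True)
print(shared.temperature, shared.stop)      # the caller's mutation leaked in
print(isolated.temperature, isolated.stop)  # values as submitted
```

In this sketch a shallow `copy.copy` would protect `temperature` but not the nested `stop` list, which is one plausible reason to prefer `copy.deepcopy` for an object with mutable fields.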