Replies: 1 comment
-
You’ve identified a classic determinism failure—ProblemMap No. 15: “inference seed drift & non-reproducibility.”
This means that even with the same chat_template, you’ll hit small, often invisible numerical differences that amplify down the decoding path. Problem details and mitigation tips are mapped in the public ProblemMap index.
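To see why “invisible numerical differences” matter for greedy decoding: floating-point addition is not associative, so a different reduction order (e.g., from a different tensor-parallel split or kernel) can flip low-order bits of the logits. A self-contained illustration in plain Python (the numbers are chosen purely to make the effect visible):

```python
# Floating-point addition is not associative: summing the same three
# numbers in a different order gives a different result.
a = (1e16 + 1.0) + -1e16   # 1.0 is absorbed by the huge term first
b = (1e16 + -1e16) + 1.0   # the huge terms cancel first
print(a, b)  # 0.0 1.0

# In an LLM, a low-order-bit difference like this can shift two nearly
# tied logits, so argmax (temperature=0) picks a different token at some
# step, and the divergence then compounds through the rest of the decode.
```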
-
We are using Qwen2.5-14B-Instruct with vLLM. However, we found that the following can change the output even when we set temperature=0, top_p=1, seed=42:
- vllm serve gives different output from vLLM offline inference, even with the same chat_template
- vllm serve gives different output when run with a different number of cards
That is strange. Can someone tell me why, and how can I keep the output fixed when changing inference environments?
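As a starting point for pinning things down, a minimal sketch of holding the commonly drifting knobs constant across both environments (flag names are from vLLM's CLI; the model path and values here are examples, not a guaranteed fix):

```shell
# Keep the seed AND the parallelism layout identical across environments.
# --tensor-parallel-size changes how the model is sharded across GPUs,
# which changes reduction order and hence low-order floating-point bits.
# --enforce-eager disables CUDA graphs, removing one source of kernel
# variation at some cost in throughput.
vllm serve Qwen/Qwen2.5-14B-Instruct \
    --seed 42 \
    --tensor-parallel-size 2 \
    --dtype bfloat16 \
    --enforce-eager
```

Even with all of these pinned, bitwise-identical output across different GPU counts is generally not guaranteed, because the sharded matmuls themselves reduce in a different order.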