docs: FAQ: inference.nvidia.com has no response diversity by bxyu-nvidia · Pull Request #718 · NVIDIA-NeMo/Gym

bxyu-nvidia · 2026-02-17T17:01:03Z

No description provided.

Signed-off-by: Brian Yu <bxyu@nvidia.com>

jfarris-nvidia · 2026-02-17T19:15:49Z

+
+
+# FAQ: Model responses from inference.nvidia.com have no diversity
+`inference.nvidia.com` uses LiteLLM caching by default which leads to no diversity in model responses (pass@1 similar to pass@5). Please set something like the following flags in order to enable diverse responses:


Potentially include link to LiteLLM docs for full info: https://docs.litellm.ai/docs/proxy/caching#no-cache

Signed-off-by: Brian Yu <bxyu@nvidia.com>

Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Frankie Siino <fsiino@nvidia.com>

…o#718) Signed-off-by: Brian Yu <bxyu@nvidia.com>

add snippet

cd1ebfa

Signed-off-by: Brian Yu <bxyu@nvidia.com>

bxyu-nvidia requested a review from jfarris-nvidia February 17, 2026 19:13

jfarris-nvidia reviewed Feb 17, 2026

View reviewed changes

jfarris-nvidia approved these changes Feb 17, 2026

View reviewed changes

bxyu-nvidia merged commit 1065096 into main Feb 17, 2026
6 checks passed

bxyu-nvidia deleted the bxyu/inference-nvidia-no-cache branch February 17, 2026 19:34

fsiino-nvidia pushed a commit that referenced this pull request Feb 21, 2026

docs: FAQ: inference.nvidia.com has no response diversity (#718)

1097d31

Signed-off-by: Brian Yu <bxyu@nvidia.com>

fsiino-nvidia pushed a commit that referenced this pull request Feb 21, 2026

docs: FAQ: inference.nvidia.com has no response diversity (#718)

3fdedca

Signed-off-by: Brian Yu <bxyu@nvidia.com>

fsiino-nvidia pushed a commit that referenced this pull request Feb 21, 2026

docs: FAQ: inference.nvidia.com has no response diversity (#718)

d67f587

Signed-off-by: Brian Yu <bxyu@nvidia.com> Signed-off-by: Frankie Siino <fsiino@nvidia.com>

abubakaria56 pushed a commit to abubakaria56/Gym that referenced this pull request Mar 2, 2026

docs: FAQ: inference.nvidia.com has no response diversity (NVIDIA-NeM…

00439ee

…o#718) Signed-off-by: Brian Yu <bxyu@nvidia.com>

abubakaria56 pushed a commit to abubakaria56/Gym that referenced this pull request Mar 2, 2026

docs: FAQ: inference.nvidia.com has no response diversity (NVIDIA-NeM…

60002cd

…o#718) Signed-off-by: Brian Yu <bxyu@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: FAQ: inference.nvidia.com has no response diversity#718

docs: FAQ: inference.nvidia.com has no response diversity#718
bxyu-nvidia merged 1 commit intomainfrom
bxyu/inference-nvidia-no-cache

bxyu-nvidia commented Feb 17, 2026

Uh oh!

jfarris-nvidia Feb 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants



		# FAQ: Model responses from inference.nvidia.com have no diversity
		`inference.nvidia.com` uses LiteLLM caching by default which leads to no diversity in model responses (pass@1 similar to pass@5). Please set something like the following flags in order to enable diverse responses:

Conversation

bxyu-nvidia commented Feb 17, 2026

Uh oh!

jfarris-nvidia Feb 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants