Skip to content

docs: FAQ: inference.nvidia.com has no response diversity#718

Merged
bxyu-nvidia merged 1 commit intomainfrom
bxyu/inference-nvidia-no-cache
Feb 17, 2026
Merged

docs: FAQ: inference.nvidia.com has no response diversity#718
bxyu-nvidia merged 1 commit intomainfrom
bxyu/inference-nvidia-no-cache

Conversation

@bxyu-nvidia
Copy link
Copy Markdown
Contributor

No description provided.

Signed-off-by: Brian Yu <bxyu@nvidia.com>
Comment thread docs/reference/faq.md


# FAQ: Model responses from inference.nvidia.com have no diversity
`inference.nvidia.com` uses LiteLLM caching by default which leads to no diversity in model responses (pass@1 similar to pass@5). Please set something like the following flags in order to enable diverse responses:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially include link to LiteLLM docs for full info: https://docs.litellm.ai/docs/proxy/caching#no-cache

@bxyu-nvidia bxyu-nvidia merged commit 1065096 into main Feb 17, 2026
6 checks passed
@bxyu-nvidia bxyu-nvidia deleted the bxyu/inference-nvidia-no-cache branch February 17, 2026 19:34
fsiino-nvidia pushed a commit that referenced this pull request Feb 21, 2026
fsiino-nvidia pushed a commit that referenced this pull request Feb 21, 2026
fsiino-nvidia pushed a commit that referenced this pull request Feb 21, 2026
Signed-off-by: Brian Yu <bxyu@nvidia.com>
Signed-off-by: Frankie Siino <fsiino@nvidia.com>
abubakaria56 pushed a commit to abubakaria56/Gym that referenced this pull request Mar 2, 2026
abubakaria56 pushed a commit to abubakaria56/Gym that referenced this pull request Mar 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants