Added backend_options parameter to llm judges. #963
Merged
As part of a community task I have been collaborating on, I encountered various challenges with the litellm judge backend; they are described in #962. This PR addresses them in the following way:
- The `JudgeLM` and `JudgeLLM` constructors now accept a `backend_options` parameter (a dict). For the litellm backend, it is converted into a new `LitellmBackendOptions` dataclass, which allows specifying whether to use caching, how many concurrent requests to perform, and whether to increase the number of output tokens for reasoning models.
- The litellm backend now ignores, by default, chat completion arguments that are not supported by the currently used inference provider.
- The `max_tokens` parameter is now respected by the litellm backend instead of a hardcoded value.

I'm looking forward to discussing the current solution or potential alternatives!
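
For illustration, here is a minimal sketch of how the new `backend_options` parameter could look in use. The option names, defaults, and the commented-out constructor call are assumptions drawn from this description, not the exact API:

```python
from dataclasses import dataclass


@dataclass
class LitellmBackendOptions:
    """Sketch of the options the litellm backend reads from `backend_options`.

    Field names and defaults are assumptions based on the PR description,
    not the actual implementation.
    """
    caching: bool = True                      # enable/disable litellm response caching
    max_concurrent_requests: int = 10         # cap on parallel chat-completion calls
    increase_max_output_tokens: bool = False  # raise output-token budget for reasoning models


# The constructors accept a plain dict; for the litellm backend it is converted
# into the dataclass above.
backend_options = {
    "caching": False,
    "max_concurrent_requests": 4,
    "increase_max_output_tokens": True,
}
options = LitellmBackendOptions(**backend_options)
print(options)

# Hypothetical judge construction (argument names other than `backend_options`
# are placeholders, kept commented out because the exact signature may differ):
# judge = JudgeLLM(..., backend="litellm", backend_options=backend_options)
```

Accepting a plain dict keeps the judge constructors backend-agnostic; each backend can then validate only the options it understands.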