refactor!: Simplify LLMResult and ChatResult classes #363
Relates to #266
We've updated the `LanguageModelResult` class structure, along with its subclasses `LLMResult` and `ChatResult`, to simplify their design. Previously, these classes contained a list of `generations` for each model's outputs, which complicated single-output access since users often need only one generation.

**Key changes:**

- **Simplified design:** each `LanguageModelResult` now stores a single output directly. To generate multiple outputs, use the new `.batch` method (instead of `.invoke` or `.stream`), which returns a list of `LanguageModelResult` instances.
- **Unified finish reason:** we've standardized the finish reason across all providers, simplifying the process of switching between providers for users whose logic depends on the finish reason.
**Result class details:**

LLMs like `OpenAI` return an `LLMResult`, while chat models like `ChatOpenAI` return a `ChatResult`. Both classes inherit from `LanguageModelResult` and contain:

- `id`: identifier for the generation.
- `output`: the output (a `String` for LLMs, an `AIChatMessage` for chat models).
- `finishReason`: enum specifying why the model stopped generating tokens.
- `metadata`: a map containing provider-specific details about the generation (such as model info, block reason, citations, etc.).
- `usage`: token usage statistics reported by the provider.
- `streaming`: indicates whether the result was streamed.

## Migration guide
### LLM output

To get the `String` output from an LLM (e.g. `OpenAI`):

Before: the following options were available to get the `String` output.

After:
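For illustration, a minimal sketch of the migration (hypothetical variable names such as `openAiApiKey`, and the exact prompt helper used, are assumptions rather than the literal examples from this PR):

```dart
final llm = OpenAI(apiKey: openAiApiKey);

// Before: the result held a list of generations, so even a single
// output had to be accessed by index, e.g.:
// final res = await llm.invoke(PromptValue.string('Hello world!'));
// final text = res.generations.first.output;

// After: the result stores a single output directly.
final res = await llm.invoke(PromptValue.string('Hello world!'));
final String text = res.output;
```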
### Chat model output

To get the `String` output from a chat model (e.g. `ChatOpenAI`):

Before: the following options were available to get the `String` output.

After:
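A comparable sketch for chat models (again assuming hypothetical setup variables; the `content` getter on `AIChatMessage` is an assumption based on the class name in this PR):

```dart
final chatModel = ChatOpenAI(apiKey: openAiApiKey);

// Before: the chat message had to be pulled out of a generations list:
// final res = await chatModel.invoke(PromptValue.string('Hello!'));
// final text = res.generations.first.output.content;

// After: the result stores a single AIChatMessage directly.
final res = await chatModel.invoke(PromptValue.string('Hello!'));
final AIChatMessage message = res.output;
final String text = message.content;
```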
### StringOutputParser

If you are using a `StringOutputParser` in a chain to get the output as a `String`, you don't need to change anything.

### Finish reason
To get the finish reason of a generation:

Before: it was part of the `generationInfo` metadata, and it was a `String` with different values for different providers.

After: it is now an enum with standardized values across all providers. If a provider doesn't return a finish reason, `FinishReason.unspecified` is returned.

Unified finish reasons:
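For reference, a plausible sketch of such a standardized enum; only `unspecified` is confirmed by this PR, and the other values are assumptions based on common provider finish reasons:

```dart
/// Hypothetical sketch of a unified finish reason enum; the actual
/// values shipped in the package may differ.
enum FinishReason {
  stop,          // natural stop point or a stop sequence was reached
  length,        // the maximum token limit was reached
  contentFilter, // the output was flagged by a safety/content filter
  toolCalls,     // the model requested a tool/function call
  unspecified,   // the provider did not report a finish reason
}
```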
### ID

Before, not every provider returned a generation id; now all of them do.
### Token usage

You can still access the number of tokens consumed for the generation (not every provider returns it).
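A sketch of how this might look; the exact field names on the usage object (`promptTokens`, `responseTokens`, `totalTokens`) are assumptions, and they are shown as nullable because not every provider reports them:

```dart
final res = await chatModel.invoke(PromptValue.string('Hello!'));

// Usage fields are nullable: providers that don't report token
// counts leave them as null.
final int? promptTokens = res.usage.promptTokens;
final int? responseTokens = res.usage.responseTokens;
final int? totalTokens = res.usage.totalTokens;
```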
### Other metadata

To access other provider-specific metadata:

Before:

After:
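For illustration (the `'model'` key is a hypothetical example of a provider-specific entry, not a documented key from this PR):

```dart
final res = await chatModel.invoke(PromptValue.string('Hello!'));

// Before: provider-specific details lived in each generation's
// generationInfo map:
// final model = res.generations.first.generationInfo?['model'];

// After: they are exposed directly in the result's metadata map.
final model = res.metadata['model'];
```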
### Batch generation

Some providers (e.g. `OpenAI`) used to allow generating multiple outputs for the same prompt in a single request:

Before:

The same functionality can now be achieved with the new standard `.batch` method that every `Runnable` supports.

After:
TODO