…g models

Add fallback behavior when the LLM returns empty/blank content with finish_reason=length, which occurs when reasoning models exhaust their token budget on internal reasoning with no tokens left for output.

- Add isEmptyContentWithLengthFinish() utility method in AbstractLlmClient
- Apply fallback to intent detection (returns fallbackSearch result)
- Apply fallback to relevance evaluation (returns fallbackAllRelevant result)
- Add Javadoc to ChatAction.getUserId()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary

Handle the edge case where LLM reasoning models return empty/blank content with finish_reason=length, which occurs when the model exhausts its token budget entirely on internal chain-of-thought reasoning, leaving no tokens for the actual output.

Changes Made

- AbstractLlmClient: Added isEmptyContentWithLengthFinish() utility method to detect the empty-content-with-length-finish pattern
- AbstractLlmClient: Applied fallback to intent detection paths; returns IntentDetectionResult.fallbackSearch(userMessage) when token exhaustion is detected
- AbstractLlmClient: Applied fallback to relevance evaluation; returns RelevanceEvaluationResult.fallbackAllRelevant(allDocIds) when token exhaustion is detected
- ChatAction: Added Javadoc to getUserId() for clarity

Testing
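The detection check described above can be sketched as follows. This is a minimal illustrative version: the PR places the method in AbstractLlmClient, but the wrapping class here and the string-typed parameters are assumptions, not the project's actual API.

```java
/**
 * Illustrative sketch only; in the PR this utility lives in
 * AbstractLlmClient.
 */
class LlmResponseGuards {

    /**
     * Returns true when the model produced no usable content and stopped
     * with finish_reason=length, i.e. the token budget was exhausted by
     * internal reasoning before any output tokens were emitted.
     */
    static boolean isEmptyContentWithLengthFinish(String content, String finishReason) {
        return "length".equals(finishReason)
                && (content == null || content.isBlank());
    }
}
```

Callers would run this check before parsing the response, substituting the appropriate fallback result instead of attempting to parse empty content.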
Breaking Changes
None. The changes add fallback behavior for an edge case that previously would have led to empty/null content being parsed.
Additional Notes
This issue is specific to reasoning models (e.g., o1, o3, DeepSeek R1) that spend tokens on internal reasoning steps. When finish_reason=length is combined with blank content, it is a reliable signal that the reasoning consumed all available tokens with no room for the final answer.
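To show how the fallback slots into one of the affected paths, here is a self-contained sketch of the relevance-evaluation case. The names fallbackAllRelevant and the doc-id list follow the PR description, but the record shapes, the comma-separated parse format, and the evaluate signature are assumptions made for illustration.

```java
import java.util.List;

// Illustrative result type; the real RelevanceEvaluationResult may differ.
record RelevanceResult(List<String> relevantDocIds, boolean isFallback) {
    static RelevanceResult fallbackAllRelevant(List<String> allDocIds) {
        // Conservative default: keep every candidate document rather than
        // silently dropping all of them on a blank response.
        return new RelevanceResult(allDocIds, true);
    }
}

class RelevanceEvaluator {
    static RelevanceResult evaluate(String content, String finishReason,
                                    List<String> allDocIds) {
        // Token-exhaustion guard: blank content + finish_reason=length.
        boolean emptyWithLength = "length".equals(finishReason)
                && (content == null || content.isBlank());
        if (emptyWithLength) {
            return RelevanceResult.fallbackAllRelevant(allDocIds);
        }
        // Normal path (assumed format): parse a comma-separated doc-id list.
        return new RelevanceResult(List.of(content.trim().split(",\\s*")), false);
    }
}
```

The design choice worth noting is that the fallback fails open: when the model gives no answer, every document is treated as relevant, so downstream retrieval degrades gracefully instead of returning nothing.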