
fix(llm): handle empty content with length finish reason for reasoning models #3061

Merged

marevol merged 1 commit into master from fix/llm-reasoning-model-token-exhaustion-fallback on Mar 8, 2026
Conversation


@marevol marevol commented Mar 8, 2026

Summary

Handle the edge case where LLM reasoning models return empty/blank content with finish_reason=length, which occurs when the model exhausts its token budget entirely on internal chain-of-thought reasoning, leaving no tokens for the actual output.

Changes Made

  • AbstractLlmClient: Added isEmptyContentWithLengthFinish() utility method to detect the empty-content-with-length-finish pattern
  • AbstractLlmClient: Applied fallback to intent detection paths — returns IntentDetectionResult.fallbackSearch(userMessage) when token exhaustion is detected
  • AbstractLlmClient: Applied fallback to relevance evaluation — returns RelevanceEvaluationResult.fallbackAllRelevant(allDocIds) when token exhaustion is detected
  • ChatAction: Added Javadoc to getUserId() for clarity
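
The detection utility can be sketched roughly as follows. The method name `isEmptyContentWithLengthFinish()` comes from this PR, but the response type, signature, and surrounding class here are illustrative assumptions, not the actual Fess code:

```java
// Sketch of the token-exhaustion guard described above. The method name
// isEmptyContentWithLengthFinish() is from the PR; the LlmResponse record
// and the exact signature are assumptions made for this example.
public class LlmResponseGuard {

    // Minimal stand-in for an LLM completion response.
    public record LlmResponse(String content, String finishReason) {}

    // True when the model stopped because it hit the token limit but
    // produced no visible output -- the reasoning-model edge case where
    // the whole budget went to internal chain-of-thought.
    public static boolean isEmptyContentWithLengthFinish(final LlmResponse response) {
        if (response == null) {
            return false;
        }
        final boolean contentBlank = response.content() == null || response.content().isBlank();
        return contentBlank && "length".equals(response.finishReason());
    }

    public static void main(final String[] args) {
        System.out.println(isEmptyContentWithLengthFinish(new LlmResponse("", "length")));     // true
        System.out.println(isEmptyContentWithLengthFinish(new LlmResponse("answer", "length"))); // false
        System.out.println(isEmptyContentWithLengthFinish(new LlmResponse("", "stop")));       // false
    }
}
```

Checking both `null` and `isBlank()` matters here: some providers return `null` content and others return an empty or whitespace-only string for the same condition.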

Testing

  • The fallbacks degrade gracefully: intent detection falls back to a standard search, and relevance evaluation treats all documents as relevant rather than failing

Breaking Changes

None. The changes add fallback behavior for an edge case that previously would have led to empty/null content being parsed.

Additional Notes

This issue is specific to reasoning models (e.g., o1, o3, DeepSeek R1) that use tokens for internal reasoning steps. When finish_reason=length is combined with blank content, it's a reliable signal that the reasoning consumed all available tokens with no room for the final answer.
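
In call-site terms, the intent-detection fallback described above might look like this. The `IntentDetectionResult.fallbackSearch(...)` factory name comes from the PR; the response shape, method names, and parsing stub are assumptions for the sketch:

```java
// Illustrative call site for the intent-detection fallback. Only the
// fallbackSearch(...) name is taken from the PR; everything else
// (LlmResponse, detectIntent, parse) is assumed for this example.
public class IntentDetector {

    public record LlmResponse(String content, String finishReason) {}

    public record IntentDetectionResult(String intent, String query) {
        // Degrade gracefully: treat the user's message as a plain search query.
        public static IntentDetectionResult fallbackSearch(final String userMessage) {
            return new IntentDetectionResult("search", userMessage);
        }
    }

    public static IntentDetectionResult detectIntent(final String userMessage, final LlmResponse response) {
        final boolean contentBlank = response.content() == null || response.content().isBlank();
        if (contentBlank && "length".equals(response.finishReason())) {
            // Token budget exhausted on reasoning: fall back to a standard
            // search instead of trying to parse empty output.
            return IntentDetectionResult.fallbackSearch(userMessage);
        }
        return parse(response.content());
    }

    private static IntentDetectionResult parse(final String content) {
        // Real code would parse the model output; stubbed for the sketch.
        return new IntentDetectionResult("chat", content);
    }
}
```

The relevance-evaluation path follows the same pattern, except that its fallback marks every candidate document as relevant rather than discarding any.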

fix(llm): handle empty content with length finish reason for reasoning models

Add fallback behavior when LLM returns empty/blank content with
finish_reason=length, which occurs when reasoning models exhaust their
token budget on internal reasoning with no tokens left for output.

- Add isEmptyContentWithLengthFinish() utility method in AbstractLlmClient
- Apply fallback to intent detection (returns fallbackSearch result)
- Apply fallback to relevance evaluation (returns fallbackAllRelevant result)
- Add Javadoc to ChatAction.getUserId()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@marevol marevol added this to the 15.6.0 milestone Mar 8, 2026
@marevol marevol self-assigned this Mar 8, 2026
@marevol marevol merged commit 758d33a into master Mar 8, 2026
1 check passed