Add titleMaxTokens #8127

Open · wants to merge 1 commit into base: dev

Conversation

twinity1 (Contributor)

Summary

This PR adds a titleMaxTokens configuration option. The default limit of 75 tokens is sometimes insufficient when generating conversation titles, resulting in truncated responses. This change allows users to specify a higher token limit for title generation. This is primarily an issue for LiteLLM with AWS Bedrock.
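For illustration, a minimal sketch of how such an option could feed into the title-generation payload. The function and field names here are hypothetical, not LibreChat's actual internals; only titleMaxTokens and the 75-token default come from this PR:

```javascript
// Hypothetical sketch: resolve the max_tokens for title generation,
// preferring a configured titleMaxTokens over the hard-coded default of 75.
const DEFAULT_TITLE_MAX_TOKENS = 75;

function buildTitlePayload(endpointConfig, messages) {
  return {
    model: endpointConfig.titleModel,
    stream: false,
    messages,
    // Fall back to 75 when titleMaxTokens is not set in the endpoint config.
    max_tokens: endpointConfig.titleMaxTokens ?? DEFAULT_TITLE_MAX_TOKENS,
  };
}
```

With titleMaxTokens: 150 in the endpoint config, the payload's max_tokens becomes 150 instead of the default 75.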

Request:

```json
{
  "user": "686055ad0943bd2e1cd2457c",
  "model": "eu.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": "Write a concise title for this conversation in the detected language. Title in 5 Words or Less. No Punctuation or Quotation.\nUser: hello\nAI: Hello! How are you today? I'm here to help with any questions or tasks you might have. Feel free to ask about anything you're curious about or need assistance with."
    }
  ],
  "max_tokens": 75,
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "extract",
      "schema": {
        "type": "object",
        "$schema": "http://json-schema.org/draft-07/schema#",
        "required": [
          "title"
        ],
        "properties": {
          "title": {
            "type": "string",
            "description": "A concise title for the conversation in 5 words or less, without punctuation or quotation"
          }
        },
        "additionalProperties": false
      },
      "strict": true
    }
  }
}
```

Response:

```json
{
  "id": "chatcmpl-73395bfd-633d-4275-8a40-47116d256686",
  "model": "arn:aws:bedrock:eu-north-1:502626041493:inference-profile/eu.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "usage": {
    "total_tokens": 542,
    "prompt_tokens": 467,
    "completion_tokens": 75,
    "prompt_tokens_details": {
      "audio_tokens": null,
      "cached_tokens": 0
    },
    "cache_read_input_tokens": 0,
    "completion_tokens_details": null,
    "cache_creation_input_tokens": 0
  },
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{}",
        "tool_calls": null,
        "function_call": null
      },
      "finish_reason": "length" // this is the problem
    }
  ],
  "created": 1751201718,
  "system_fingerprint": null
}
```

Note "finish_reason": "length": the model exhausted the 75-token budget before it could complete the structured title, so the returned content is an empty object.
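This failure mode is easy to detect programmatically. A minimal sketch, assuming the OpenAI-style completion shape shown in the response above (the function name is illustrative):

```javascript
// Sketch: flag a title-generation response that was cut off by max_tokens.
// A finish_reason of "length" means the model hit the token limit mid-output.
function titleWasTruncated(completion) {
  const choice = completion.choices && completion.choices[0];
  return Boolean(choice) && choice.finish_reason === "length";
}
```

Applied to the response above, this returns true, which is exactly the case a higher titleMaxTokens is meant to avoid.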

Change Type

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Testing

I tested the integration by configuring a LiteLLM endpoint with titleMaxTokens: 150 and verified that title generation now completes successfully without being cut off due to token limits.

Test Configuration:

  • LiteLLM endpoint with Claude 3.7 Sonnet model (AWS Bedrock)
  • Title generation with default 75 tokens (fails with "finish_reason": "length")
  • Title generation with 150 tokens (succeeds with complete title)

```yaml
endpoints:
  custom:
    - name: "LiteLLM"
      apiKey: "xxxxx"
      baseURL: "https://litellm.domain.com/v1"
      models:
        default: [ "eu.anthropic.claude-3-7-sonnet-20250219-v1:0" ]
        fetch: true
      titleConvo: true
      titleModel: "eu.anthropic.claude-3-7-sonnet-20250219-v1:0"
      titleMessageRole: "user"
      titleMaxTokens: 150
      forcePrompt: false
      modelDisplayLabel: "LiteLLM"
```

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • I have made pertinent documentation changes
  • My changes do not introduce new warnings
  • My changes are effective and the feature works as expected
  • Local unit tests pass with my changes
  • Any changes dependent on mine have been merged and published in downstream modules
  • A pull request for updating the documentation has been submitted

twinity1 (Contributor, Author)

This would solve BerriAI/litellm#9857, as there has been no response from the LiteLLM team (the key is to define titleMessageRole: "user" together with the newly added titleMaxTokens: 150).

@danny-avila if you agree with the PR, I will prepare the docs.
