@Spycsh Spycsh commented Jun 28, 2024

Description

The default TGI max_total_tokens is 2048, which is too small for long retrieved input context.
When the prompt plus the requested output exceeds that limit, TGI rejects the request with an error like: Input validation error: inputs tokens + max_new_tokens must be <= 2048

One fix is to double the total token limit by launching TGI with: --max-input-length 2048 --max-total-tokens 4096
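
To illustrate why the larger limits help, here is a minimal sketch of the budget rule stated in the error message above. It is not TGI's actual validation code; the function name `fits_tgi_limits` and the token counts are hypothetical and chosen only for illustration.

```python
from typing import Optional


def fits_tgi_limits(
    input_tokens: int,
    max_new_tokens: int,
    max_total_tokens: int,
    max_input_length: Optional[int] = None,
) -> bool:
    """Return True if a request fits the configured limits.

    Hypothetical helper: mirrors the rule from the error message,
    input tokens + max_new_tokens <= max_total_tokens, plus an
    optional cap on the input length itself.
    """
    if max_input_length is not None and input_tokens > max_input_length:
        return False
    return input_tokens + max_new_tokens <= max_total_tokens


# Hypothetical request: a long retrieved context plus a generation budget.
prompt_tokens = 1900
new_tokens = 512

# Default total limit from the description: 1900 + 512 = 2412 > 2048.
default_ok = fits_tgi_limits(prompt_tokens, new_tokens, max_total_tokens=2048)

# Limits proposed in this PR: 1900 <= 2048 and 2412 <= 4096.
raised_ok = fits_tgi_limits(
    prompt_tokens, new_tokens, max_total_tokens=4096, max_input_length=2048
)

print(default_ok, raised_ok)
```

With the raised limits, a prompt still has to stay within 2048 tokens; only the room left for prompt plus generation grows to 4096.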

The other issue that can break the flow is in TEI reranking. It usually shows up with the error log: Input validation error: inputs must have less than 512 tokens. TEI has a limit of 512 input tokens here, and the --auto-truncate flag remedies this by truncating over-long inputs instead of rejecting them.
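
The difference between rejecting and truncating can be sketched as follows. This is a simplified illustration, not TEI's implementation: the helper `validate_or_truncate` is hypothetical, tokens are modeled as a plain list of integers, and both the exact boundary (more than 512 tokens fails) and the choice to keep the leading tokens are assumptions.

```python
from typing import List


def validate_or_truncate(
    token_ids: List[int], max_tokens: int = 512, auto_truncate: bool = False
) -> List[int]:
    """Return token_ids if they fit, truncate them, or raise.

    Hypothetical helper. Assumes over-long inputs are cut from the
    right, so the leading max_tokens tokens are kept.
    """
    if len(token_ids) <= max_tokens:
        return token_ids
    if auto_truncate:
        return token_ids[:max_tokens]
    # Message text taken from the error log quoted in the description.
    raise ValueError(
        f"Input validation error: inputs must have less than {max_tokens} tokens"
    )


# Hypothetical reranker input of 600 tokens.
long_input = list(range(600))

# Without truncation the request is rejected.
try:
    validate_or_truncate(long_input)
    rejected = False
except ValueError:
    rejected = True

# With truncation enabled the input is shortened and accepted.
truncated = validate_or_truncate(long_input, auto_truncate=True)

print(rejected, len(truncated))
```

Truncation trades completeness for robustness: the tail of a long retrieved passage is dropped before scoring, so the rerank score reflects only the part that was kept.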

Issues

n/a

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

None

Tests

None

@Spycsh Spycsh changed the title from "add TEI and TGI parameters for long docs" to "Add key TEI and TGI parameters for handling long retrievals" Jun 28, 2024
@chensuyue chensuyue merged commit 1b30783 into opea-project:main Jun 28, 2024
yogeshmpandey pushed a commit to hteeyeoh/GenAIExamples that referenced this pull request Aug 12, 2024
JakubLedworowski pushed a commit to JakubLedworowski/GenAIExamples that referenced this pull request Jan 28, 2025
Signed-off-by: Yingchun Guo <yingchun.guo@intel.com>