
decrease default number of tokens in UI #351

Merged
merged 1 commit into from
Mar 11, 2024
Conversation

@pmeier pmeier (Member) commented Mar 11, 2024

This is a temporary fix for the Cohere assistants. Although in theory they support a 4k context window

_CONTEXT_SIZE: int = 4_000
# See https://docs.cohere.com/docs/models#command

they seem to be using a very different tokenizer than what we use. When trying the assistant in the web UI with a document that actually supplies 4k tokens, I get the following error message in the response:

too many tokens: size limit exceeded by 1502 tokens. Try using shorter or fewer inputs, or setting prompt_truncation='AUTO'.

So the actual token count differs from our estimate by almost 40% (1502 / 4000 ≈ 38%).

In this PR I simply reduce the default number of tokens that we pull from the source storage to feed into the assistant. Setting this automatically depending on the assistant is a larger task that I'll open an issue about. Plus, the tokenizer would actually need to be dependent on the assistant as well.
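As a rough sketch of the idea, the fix amounts to shrinking the token budget for retrieved sources by a safety factor derived from the observed overshoot. The names `CONTEXT_SIZE`, `SAFETY_FACTOR`, and `max_source_tokens` below are illustrative assumptions, not the actual ragna API:

```python
# Hypothetical sketch, not the actual ragna implementation.

CONTEXT_SIZE: int = 4_000  # nominal Cohere context window
# See https://docs.cohere.com/docs/models#command

# Observed overshoot: the assistant counted ~1502 extra tokens on a
# prompt we measured at ~4000 tokens, i.e. our tokenizer undercounts
# by close to 40%. Leave that much headroom.
SAFETY_FACTOR: float = 0.6

def max_source_tokens(context_size: int = CONTEXT_SIZE) -> int:
    """Token budget for sources pulled from storage, with headroom
    for the mismatch between our tokenizer and the assistant's."""
    return int(context_size * SAFETY_FACTOR)

print(max_source_tokens())  # 2400
```

A per-assistant budget (and per-assistant tokenizer) would replace the hard-coded factor, which is the follow-up issue mentioned above.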

Functionally, this change shouldn't make too much of a difference. Unless the user has a very complex query, by default we are still going to pull in four sources. That should be sufficient.

@pmeier pmeier requested a review from nenb March 11, 2024 14:18
@pmeier pmeier merged commit 329a1a0 into main Mar 11, 2024
10 checks passed
@pmeier pmeier deleted the default-num-tokens branch March 11, 2024 20:37