
Is context window size limited to 2k tokens, regardless of the model used? #1781

Closed
brankoradovanovic-mcom opened this issue Dec 26, 2023 · 3 comments


brankoradovanovic-mcom commented Dec 26, 2023

It seems that the message "Recalculating context" in the chat (or "LLaMA: reached the end of the context window so resizing" during API calls) appears after 2k tokens, regardless of the model used.

When that happens, the models indeed forget the content that preceded the current context window. For example, they become unable to answer multiple questions about a chunk of text, because each question and answer adds to the conversation and eventually pushes the chunk outside the context window. From that point on, the answers are pure hallucination.
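For illustration, here is a minimal sketch of that failure mode using the gpt4all Python bindings (the model filename and input document are placeholders):

```python
# Sketch: reproduce the forgetting behavior with the gpt4all Python bindings.
# The model filename and document are placeholders.
from gpt4all import GPT4All

model = GPT4All("some-local-model.gguf")

with open("document.txt") as f:
    text = f.read()

with model.chat_session():
    # The first answer can draw on the document...
    print(model.generate(f"Here is a document:\n{text}\n\nSummarize it."))
    # ...but every follow-up question and answer adds tokens. Once the
    # conversation passes the 2k window, the document itself is pushed
    # out, and later answers are no longer grounded in it.
    for question in ("What is the main argument?",
                     "Who is the intended audience?",
                     "List three supporting examples."):
        print(model.generate(question))
```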

There are many models now that advertise large context window sizes - Yi LLMs in particular. However, from what I've tried, none of that seems to work in GPT4All: "Recalculating context" always appears at the 2k mark. Is it really supposed to be this way?

(Note this is not the same issue as #1638. I'd argue it is actually worse. Prompt-size limitations could be worked around by chunking the input. Unfortunately, since the context window is the same size as the maximum prompt, chunking the input doesn't help at all.)
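(For reference, a hypothetical chunking helper is sketched below; it is illustration only, not part of GPT4All, and it shows why chunking addresses a prompt-size limit but not a small context window:)

```python
# Hypothetical helper (illustration only, not part of GPT4All).
def chunk_text(text: str, max_chars: int = 4000) -> list[str]:
    """Split text into pieces that each fit under a prompt-size limit."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

# Chunking works around a maximum *prompt* size: each request only has to
# fit one chunk. It cannot work around a small *context window*: as later
# chunks and answers are appended, the earlier chunks fall out of the
# window anyway, so the model still forgets them.
```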

@dlippold

Until recently, the 2k-token context size was hard-coded, independent of the model used. It has now been fixed; see d1c56b8 and the bug reports #1749 and #1668. However, that fix is not yet in the current release (there has been no new release since the fix).
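(Assuming the next release exposes the now-configurable context size through the Python bindings as `n_ctx`, which is how later gpt4all releases surface it, requesting a larger window would look something like this:)

```python
# Sketch: request a larger context window (assumes a gpt4all release that
# includes commit d1c56b8 and exposes the setting as n_ctx).
from gpt4all import GPT4All

# Ask for an 8k-token window instead of the old hard-coded 2048 tokens.
# The usable size is still capped by the model's training context,
# and a larger window costs more memory.
model = GPT4All("some-local-model.gguf", n_ctx=8192)
```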

@cebtenzzre
Member

I temporarily reopened #1668 for visibility.

@cebtenzzre closed this as not planned on Dec 29, 2023

Zibri commented May 13, 2024

I put 200000 tokens in the GPT4All CLI program, but I still get that error often, and probably way before 200000!
