Token context window reaches max too fast #126
Hey @TheWhiteWord, thanks for opening. We may need to do some tricky stuff to solve this, but it's possible. Is there a way you envision these parameters being set? Perhaps we should add a flag for it. The problem with raising the context window is that pushing the token count too high can overwhelm a machine, so we would either need a wizard to guide people to safe limits, or we could chop off the top of the prompt once it gets too long.
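For the sake of discussion, a launch-time flag might look something like this (a minimal sketch; the flag names and defaults here are assumptions, not the current interface):

```python
# Hypothetical sketch of launch-time flags for the context window.
# Neither --context_window nor --max_tokens is claimed to exist yet;
# both names and defaults are placeholders for discussion.
import argparse

parser = argparse.ArgumentParser(prog="interpreter")
parser.add_argument(
    "--context_window",
    type=int,
    default=2048,
    help="Maximum number of tokens the model's context may hold.",
)
parser.add_argument(
    "--max_tokens",
    type=int,
    default=750,
    help="Maximum number of tokens to generate per response.",
)
args = parser.parse_args()

# These values would then be passed through to the model loader.
print(f"context_window={args.context_window}, max_tokens={args.max_tokens}")
```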
Hey @KillianLucas, the pleasure is mine. You and your team have been doing a great job; this is so exciting. I think that setting the parameters on activation is definitely something worth investigating. However, and this comes from a "user" point of view (I do not have much technical knowledge of what is possible), if the parameter could be changed during the session through some command, that would let one adjust to the situation, since it is often unpredictable whether the issue will happen, or when.

1. I totally see the problem with raising the token count too high... my PC did not like that too much (xd). A wizard is a good solution to keep people from pushing the limits too far.
2. Still, I think the proper solution is indeed to find a compromise and probably chop off the top of the prompt, even though my instinct tells me that may cause "memory" problems; but again, I have little technical knowledge. I wonder if there is some clever way for it to recognize when repetitions are identical, and past a certain amount, force it to sum them up as one. In fact, if the repetitions are identical it means that no progress was made, so the information can be reduced; see the sketch after this comment.

Anyhow, I am sure you know far better than me what can and can't be done. I mean, you built the whole thing, and that in itself is a great accomplishment. Well done! And thanks for replying so fast... I guess you are getting no sleep for the next few weeks XD GOOD LUCK!
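To make that idea concrete, here is a rough sketch of collapsing identical consecutive messages (purely illustrative; the {"role": ..., "content": ...} message format is an assumption):

```python
# Sketch: collapse runs of identical messages before sending the history
# to the model. If a message is byte-for-byte identical to the previous
# one, no progress was made, so a single copy carries the same information.
def collapse_repetitions(messages):
    collapsed = []
    for message in messages:
        previous = collapsed[-1] if collapsed else None
        if (
            previous is not None
            and previous["role"] == message["role"]
            and previous["content"] == message["content"]
        ):
            continue  # drop the exact repeat
        collapsed.append(message)
    return collapsed
```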
Adding to this conversation: I changed the max token count. As the coder(s) of tokentrim mention, they're not sure of the token size for this model. Referring to the CodeLlama model card in their white paper, its models were trained on sequences of 16K tokens (read the white paper about issues when surpassing 16K, though). So I went with 4096 and put that in tokentrim AND in interpreter.py and tokentrim.py.
I now got past that earlier-than-expected boundary and am running into some other issues - of course 😁
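For anyone curious about the shape of the fix, here is a minimal sketch of trimming the oldest messages to a token budget, in the spirit of what tokentrim does (the whitespace-split token count is a crude stand-in for a real tokenizer, and the message format is assumed):

```python
MAX_TOKENS = 4096  # the value I settled on above

def rough_token_count(message):
    # Crude approximation; a real implementation would use the
    # model's tokenizer instead of splitting on whitespace.
    return len(message["content"].split())

def trim_to_budget(messages, max_tokens=MAX_TOKENS):
    trimmed = list(messages)
    total = sum(rough_token_count(m) for m in trimmed)
    # Chop off the top of the prompt (the oldest messages) until we fit,
    # always keeping the most recent message.
    while total > max_tokens and len(trimmed) > 1:
        total -= rough_token_count(trimmed.pop(0))
    return trimmed
```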
@merlinfrombelgium Thank you! Your solution works for me. Max context is also hard-coded at line 176 in llama_2.py
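If you load the model through llama-cpp-python yourself, the context size is fixed when the model is constructed, via n_ctx; a minimal sketch (the model path is a placeholder):

```python
from llama_cpp import Llama

# n_ctx sets the context window and must be chosen at load time;
# the path below is a placeholder, not a real file.
llm = Llama(
    model_path="./codellama-7b.q4_0.bin",
    n_ctx=4096,
)
```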
Tried changing the tokens in these 3 files, but for me it always stops at around a certain character count. These mods make no difference in my case; it keeps repeating the same code over and over.
Another example:
It's in interpreter there, @zono50. Change it to the actual path, or run it from where you were.
The code was refactored recently, so it's in a different file now.
Yeah, my bad. Should still apply.
cc @ishaan-jaff: feels relevant to your token trimmer work.
I've also seen CodeLlama-34B repeating its own code when asked to fix a bug; sometimes it even repeats simple things. There must be something wrong with open-interpreter's prompt chaining/processing.
I'm going to close this one as stale for now. Feel free to reopen if there's more to discuss with the latest version.
I keep running out of tokens before managing to accomplish any "multiple steps" task. The program runs into errors from its own mistakes or, like in this case, does not manage to pip install the libraries that are needed for the task.
I have looked into the documentation but I cannot find a way to give the bot admin power to install the pip packages it needs at any given time. I also cannot find a way to extend the number of tokens.
Can anyone help?
(Model = Llama2 7B Large)
Edit: it turns out that it actually manages to pip install the libraries, but it still crashes from going over the token limit. My intuition is that it keeps making mistakes and trying again until it succeeds, but by doing so it runs out of tokens.
Edit 2: I have managed to extend the context window by manually modifying the number of tokens in the Llama2.py file. However (FEATURE REQUEST), it would be good to be able to do this inside the program itself. But maybe I am just too much of a noob to know how.
Also:
(Testing further, it seems the token-usage problem may also be connected to the fact that, when there is an error, the model often repeats the error text multiple times in the same prompt, eventually creating a long list of the same repetition that eats up the allowance. I can paste examples if you think this is a correct intuition and they would be helpful.)
EXAMPLE OF THE ISSUE:
curl -s https://www.bbc.com/news/rss.xml | xmllint --format - | grep "<title>" | head -n 10 | sed 's/<[^>]*>//g'
Would you like to run this code? (y/n)
y
curl -s https://www.bbc.com/news/rss.xml | xmllint --format - | grep "<title>" | head -n 10 | sed 's/<[^>]*>//g'
Microsoft Windows [Version 10.0.22621.2134]
(c) Microsoft Corporation. All rights reserved.
(open-interpreter) C:\Users\DCinz>curl -s https://www.bbc.com/news/rss.xml | xmllint --format - | grep "<title>" |
head -n 10 | sed 's/<[^>]*>//g'
'xmllint' is not recognized as an internal or external command,
operable program or batch file.
(open-interpreter) C:\Users\DCinz>
it appears that you are using Windows 10 and have xmllint installed on your system. However, the command is not
recognized as a valid internal or external command.
To fix this issue, you can try installing xmllint by following these steps:
1 Open the Command Prompt as an administrator. To do this, right-click on the Start button and select "Command
Prompt (Admin)".
2 Type the following command and press Enter:
choco install xmllint
Would you like to run this code? (y/n)
y
choco install xmllint
Traceback (most recent call last):
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\interpreter\code_interpreter.py", line 158,
in run
code = self.add_active_line_prints(code)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\interpreter\code_interpreter.py", line 249,
in add_active_line_prints
return add_active_line_prints_to_python(code)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\interpreter\code_interpreter.py", line 411,
in add_active_line_prints_to_python
tree = ast.parse(code)
^^^^^^^^^^^^^^^
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\ast.py", line 50, in parse
return compile(source, filename, mode, flags,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "", line 1
choco install xmllint
^^^^^^^
SyntaxError: invalid syntax
it appears that you are using Windows 10 and have xmllint installed on your system. However, the command is not
recognized as a valid internal or external command.
To fix this issue, you can try
llama_tokenize_with_model: too many tokens
Traceback (most recent call last):
File "", line 198, in _run_module_as_main
File "", line 88, in run_code
File "C:\Users\DCinz\anaconda3\envs\open-interpreter\Scripts\interpreter.exe_main.py", line 7, in
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\interpreter\interpreter.py", line 104, in cli
cli(self)
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\interpreter\cli.py", line 46, in cli
interpreter.chat()
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\interpreter\interpreter.py", line 256, in chat
self.respond()
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\interpreter\interpreter.py", line 469, in respond
for chunk in response:
File "C:\Users\DCinz\anaconda3\envs\Open-interpreter\Lib\site-packages\llama_cpp\llama.py", line 901, in _create_completion
raise ValueError(
ValueError: Requested tokens (1114) exceed context window of 1048
(open-interpreter) C:\Users\DCinz>