
Fix the maximum context length issue by chunking #3222

Merged
merged 20 commits into from
May 1, 2023

Conversation

kinance
Contributor

@kinance kinance commented Apr 25, 2023

Background

Multiple issues have been opened about the same problem, e.g. #2801, #2871, #2906 and more. Several commands call memory.add(), which in turn calls create_embedding_with_ada; when the input text exceeds the model's 8191-token limit, we get an InvalidRequestError saying "This model's maximum context length is 8191 tokens...".
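To illustrate the failure mode, a guard like the following could check the token count before calling the embedding API. This is a minimal sketch, not the PR's code: the whitespace split is a stand-in for the real tiktoken-based token count, and the function name is hypothetical.

```python
TOKEN_LIMIT = 8191  # context limit quoted in the InvalidRequestError message


def within_limit(text: str, limit: int = TOKEN_LIMIT) -> bool:
    """Return True if `text` fits in the model's context window.

    Whitespace words approximate tokens here; Auto-GPT's real code
    counts tokens with tiktoken, which gives different numbers.
    """
    approx_tokens = len(text.split())  # stand-in token count
    return approx_tokens <= limit
```

Without a check (or the chunking this PR adds), any text over the limit is sent straight to the API and the request is rejected.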

Resolves #2801, resolves #2871, resolves #2906, resolves #3244

Changes

The issue is fixed by chunking the input text, running the embedding on each chunk individually, and then combining the results by weighted averaging. This approach is suggested by OpenAI, and the change is modeled after the OpenAI Cookbook. This PR should fix a number of open issues, including the ones mentioned above and more.
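The approach described above can be sketched as follows. This is a simplified illustration of the Cookbook recipe, not the PR's implementation: `embed` stands in for create_embedding_with_ada, and whitespace words approximate tokens (the real code uses tiktoken).

```python
import math

MAX_CHUNK_TOKENS = 8191  # ada-002 context limit


def chunked(tokens, size):
    """Yield consecutive chunks of at most `size` tokens."""
    for i in range(0, len(tokens), size):
        yield tokens[i:i + size]


def embed_long_text(text, embed, chunk_size=MAX_CHUNK_TOKENS):
    """Embed text of any length by chunking and weighted averaging."""
    tokens = text.split()  # stand-in tokenizer
    chunk_embeddings, weights = [], []
    for chunk in chunked(tokens, chunk_size):
        chunk_embeddings.append(embed(" ".join(chunk)))
        weights.append(len(chunk))  # weight each chunk by its length

    # Length-weighted average across chunks, dimension by dimension
    dim = len(chunk_embeddings[0])
    total = sum(weights)
    avg = [
        sum(e[d] * w for e, w in zip(chunk_embeddings, weights)) / total
        for d in range(dim)
    ]
    # Normalize to unit length, as the Cookbook recipe does
    norm = math.sqrt(sum(x * x for x in avg)) or 1.0
    return [x / norm for x in avg]
```

Weighting by chunk length means a short trailing chunk does not pull the combined vector as strongly as the full-size chunks before it.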

PR Quality Checklist

  • My pull request is atomic and focuses on a single change.
  • I have thoroughly tested my changes with multiple different prompts.
  • I have considered potential risks and mitigations for my changes.
  • I have documented my changes clearly and comprehensively.
  • I have not snuck in any "extra" small tweaks or changes

@codecov

codecov bot commented Apr 25, 2023

Codecov Report

Patch coverage: 86.48% and project coverage change: +0.24 🎉

Comparison is base (0ef6f06) 60.31% compared to head (572cac9) 60.55%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3222      +/-   ##
==========================================
+ Coverage   60.31%   60.55%   +0.24%     
==========================================
  Files          69       69              
  Lines        3152     3184      +32     
  Branches      525      528       +3     
==========================================
+ Hits         1901     1928      +27     
- Misses       1118     1122       +4     
- Partials      133      134       +1     
Impacted Files              Coverage Δ
autogpt/llm/__init__.py     100.00% <ø> (ø)
autogpt/llm/modelsinfo.py   100.00% <ø> (ø)
autogpt/config/config.py     76.25% <66.66%> (-0.58%) ⬇️
autogpt/llm/llm_utils.py     66.66% <92.85%> (+5.34%) ⬆️

☔ View full report in Codecov by Sentry.

@kinance kinance requested a review from Pwuts April 25, 2023 15:28
@kinance
Contributor Author

kinance commented Apr 25, 2023

@Pwuts I think this change can fix and close multiple open issues. Could you please review, approve and merge?

@Pwuts
Member

Pwuts commented Apr 25, 2023

Please link issues if this PR resolves them

@Pwuts
Member

Pwuts commented Apr 25, 2023

Also, this is missing test coverage. Can you fix that (using pytest, not unittest)?

Member

@Pwuts Pwuts left a comment


Please add unit tests using pytest
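The requested pytest coverage could look something like this. These are illustrative tests against a hypothetical `chunked` helper that mirrors the PR's token-splitting logic; the real function name and signature in llm_utils.py may differ.

```python
def chunked(tokens, size):
    """Split a token list into consecutive chunks of at most `size`."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def test_chunked_splits_evenly():
    assert chunked([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]


def test_chunked_keeps_remainder():
    # The last chunk may be shorter than `size`
    assert chunked([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]


def test_chunked_handles_empty_input():
    assert chunked([], 3) == []
```

Plain functions with `test_` prefixes are all pytest needs, so no unittest.TestCase boilerplate is required.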

@github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Apr 25, 2023
@github-actions
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

@GoMightyAlgorythmGo

GoMightyAlgorythmGo commented Apr 25, 2023

Endless crashes — a lot over the last 4 days, though happening less often than 2-3 weeks ago. It crashed 4 times in a row, and constantly for 3 hours on every restart. Here is some code to cap the max length for GPT-3.5-turbo, because the max is about 8191 tokens, so to be safe, staying under 24000 seems to be fine most of the time. Here is the code:

#3239 (comment)
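The capping workaround described above amounts to truncating the input to a fixed budget before sending it to the model. A rough sketch, with whitespace words standing in for real token counting (the linked code and Auto-GPT itself use tiktoken):

```python
MAX_TOKENS = 8191  # approximate context limit mentioned above


def cap_text(text: str, max_tokens: int = MAX_TOKENS) -> str:
    """Truncate `text` so it stays within a fixed token budget.

    Truncation silently drops the tail of the input, which is why
    this PR's chunk-and-average approach is preferable for embeddings.
    """
    words = text.split()
    if len(words) <= max_tokens:
        return text
    return " ".join(words[:max_tokens])
```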


Add basic unit test for the new chunked func

@github-actions
Contributor

Conflicts have been resolved! 🎉 A maintainer will review the pull request shortly.

@github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label Apr 26, 2023
@kinance
Contributor Author

kinance commented Apr 26, 2023

Linked the issues that this PR fixes and added a unit test for the new chunk-token function

@github-actions github-actions bot added the conflicts Automatically applied to PRs with merge conflicts label Apr 26, 2023
@github-actions
Contributor

This pull request has conflicts with the base branch, please resolve those so we can evaluate the pull request.

sidewaysthought added a commit to sidewaysthought/Auto-GPT that referenced this pull request Apr 26, 2023
sidewaysthought added a commit to sidewaysthought/Auto-GPT that referenced this pull request Apr 27, 2023
@github-actions github-actions bot removed the conflicts Automatically applied to PRs with merge conflicts label Apr 27, 2023
Pwuts
Pwuts previously approved these changes May 1, 2023
Member

@Pwuts Pwuts left a comment


Best we can do for now; we'll have to iterate on this when reworking the memory system

waynehamadi
waynehamadi previously approved these changes May 1, 2023
tests/integration/conftest.py Outdated Show resolved Hide resolved
@Pwuts Pwuts dismissed stale reviews from waynehamadi and themself via 82c5ae0 May 1, 2023 17:45
@github-actions
Contributor

github-actions bot commented May 1, 2023

This PR exceeds the recommended size of 200 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size

richbeales
richbeales previously approved these changes May 1, 2023

@Pwuts Pwuts merged commit 4767fe6 into Significant-Gravitas:master May 1, 2023
11 checks passed
@arrfonseca

I've committed the changes to the master branch locally.
I still get the token length error when reading a long .txt file. Nevertheless, I noticed it performed an extra step before crashing and displayed all the text in the terminal window... Maybe it's related to the limit of 200 lines?

/lib/python3.10/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 308617 tokens. Please reduce the length of the messages.

@kinance
Contributor Author

kinance commented May 1, 2023

@arrfonseca could you paste your call stack and the steps to reproduce? This PR only fixed the max-length issue for the embeddings around memory.add(), which multiple commands use.

@arrfonseca

arrfonseca commented May 1, 2023

I'm asking the AI to make a screenplay in 5 parts of 15 minutes each, based on a textbook. I have the text in .txt format.
The traceback is as follows.

Traceback (most recent call last):
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/alessandro/Auto-GPT/autogpt/main.py", line 5, in <module>
autogpt.cli.main()
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/click/core.py", line 1635, in invoke
rv = super().invoke(ctx)
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/Users/alessandro/Auto-GPT/autogpt/cli.py", line 90, in main
run_auto_gpt(
File "/Users/alessandro/Auto-GPT/autogpt/main.py", line 157, in run_auto_gpt
agent.start_interaction_loop()
File "/Users/alessandro/Auto-GPT/autogpt/agent/agent.py", line 93, in start_interaction_loop
assistant_reply = chat_with_ai(
File "/Users/alessandro/Auto-GPT/autogpt/llm/chat.py", line 166, in chat_with_ai
agent.summary_memory = update_running_summary(
File "/Users/alessandro/Auto-GPT/autogpt/memory_management/summary_memory.py", line 114, in update_running_summary
current_memory = create_chat_completion(messages, cfg.fast_llm_model)
File "/Users/alessandro/Auto-GPT/autogpt/llm/llm_utils.py", line 166, in create_chat_completion
response = api_manager.create_chat_completion(
File "/Users/alessandro/Auto-GPT/autogpt/llm/api_manager.py", line 55, in create_chat_completion
response = openai.ChatCompletion.create(
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/openai/api_resources/chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
response, _, api_key = requestor.request(
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/openai/api_requestor.py", line 226, in request
resp, got_stream = self._interpret_response(result, stream)
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/openai/api_requestor.py", line 619, in _interpret_response
self._interpret_response_line(
File "/Users/alessandro/anaconda3/envs/ale-gpt/lib/python3.10/site-packages/openai/api_requestor.py", line 682, in _interpret_response_line
raise self.handle_error_response(
openai.error.InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 308617 tokens. Please reduce the length of the messages.

Before that the call, if I get this right, was:
THOUGHTS: I need to start by analyzing the text in /Users/alessandro/Auto-GPT/auto_gpt_workspace/text-noam.txt to get a better understanding of the book 'Manufacturing Consent' by Noam Chomsky. I should also do some research on the internet to gather more information about the book and its themes. Once I have a good understanding of the book, I can start planning the structure of the documentary and writing the scripts for each episode.
REASONING: Analyzing the text and doing research will give me the necessary information to create a well-informed documentary.
PLAN:

  • Analyze the text in /Users/alessandro/Auto-GPT/auto_gpt_workspace/text-noam.txt
  • Do research on the internet to gather more information
  • Plan the structure of the documentary
  • Write the scripts for each episode
CRITICISM: I need to make sure that I am thorough in my research and that I am accurately representing the themes of the book in the documentary.
Error:
Attempted to access absolute path '/Users/alessandro/Auto-GPT/auto_gpt_workspace/text-noam.txt' in workspace '/Users/alessandro/Auto-GPT/autogpt/auto_gpt_workspace'.
NEXT ACTION: COMMAND = read_file ARGUMENTS = {'filename': '/Users/alessandro/Auto-GPT/auto_gpt_workspace/text-noam.txt'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 's' to run self-feedback commands'n' to exit program, or enter feedback for ...
Asking user via keyboard...
Input:y

After the authorization, the process displayed all the book text in the terminal window, without line breaks or anything. Then Auto-GPT crashed. I'm working on the master branch after a git pull.

@arrfonseca

So sorry about that. I realized I had to switch to the fix/crash-on-context-overflow branch, and now it gives me:
-=-=-=-=-=-=-= COMMAND AUTHORISED BY USER -=-=-=-=-=-=-=
SYSTEM: Failure: command read_file returned too much output. Do not execute this command again with the same arguments.
It summarized the .txt file to circumvent the "too much output" error, so the work is done on a very small portion of the text.
I'm trying to tell it to break the .txt file into smaller files so it can read the complete book.

@kinance kinance deleted the fix-bug-2801-2871-2906 branch May 2, 2023 15:50
@kinance
Contributor Author

kinance commented May 2, 2023


File "/Users/alessandro/Auto-GPT/autogpt/memory_management/summary_memory.py", line 114, in update_running_summary current_memory = create_chat_completion(messages, cfg.fast_llm_model) File "/Users/alessandro/Auto-GPT/autogpt/llm/llm_utils.py", line 166, in create_chat_completion response = api_manager.create_chat_completion(

It's a new bug introduced from a change in memory management.

@Pwuts
Member

Pwuts commented May 2, 2023

That (new) issue should be mitigated by #3646 while we work on a better and more permanent fix

@meanostrich

Can someone please point me to a newbie's guide for implementing the patch changes you're releasing? I get this error all the time. I see that this was corrected by modifying 9 files. Is there an easy way to apply the changes without going through each file and making the changes manually? I'm worried I may miss something, and I'm not as astute with modifying Auto-GPT. Thank you in advance, and sorry for the newb question.

@Pwuts
Member

Pwuts commented May 30, 2023

@meanostrich I would advise against applying individual patches. Instead, keep an eye on our GitHub and other channels for new releases which contain the newest patches.
