-
Notifications
You must be signed in to change notification settings - Fork 8.3k
fix: Context window truncation using CondensationAction
#7578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Context window truncation using CondensationAction
#7578
Conversation
…ving last of the truncation tests
| assert controller.state.truncation_id is not None | ||
| assert controller.state.truncation_id > controller.state.start_id | ||
|
|
||
| def test_history_restoration_after_truncation(self, mock_event_stream, mock_agent): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was moved to test_agent_controller.py with no change.
|
|
||
|
|
||
| class TestTruncation: | ||
| def test_apply_conversation_window_basic(self, mock_event_stream, mock_agent): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was moved to test_agent_controller.py, and extended with the checks from test_context_window_exceeded_handling.
| assert len(controller.state.history) == history_len | ||
| assert len(controller.get_trajectory()) == history_len | ||
|
|
||
| def test_context_window_exceeded_handling(self, mock_event_stream, mock_agent): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was folded into test_apply_conversation_window_basic during the move.
| if isinstance(event, CmdOutputObservation): | ||
| assert any(e._id == event._cause for e in truncated[: i + 1]) | ||
|
|
||
| def test_truncation_does_not_impact_trajectory(self, mock_event_stream, mock_agent): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check for this test was folded into test_context_window_exceeded_error_handling (that's not the test originally from this file, note the extra error in the name).
|
To anyone looking at this code: I recommend starting with the The other changes are important but mostly serve to test/support the above. |
neubig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks good to me!
I have a slight concern about the efficiency implications of calling View.from_events many times, but maybe it's not a major concern.
Hmm, I see what you mean. I think we'll always have to call it once per-step, but since it's linear in the size of the history that's probably okay. I can optimize away any more calls by caching the view construction in the state. |
Co-authored-by: Graham Neubig <neubig@gmail.com>
enyst
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice PR, thank you for this! I now see better what you had in mind. 👍
|
I tried to play with older streams, and it looks good. I had some fun with Haiku-generated summaries for 400+ events - amazingly it did mostly do a summary indeed, although it didn't respect the format. Sometimes when it's large enough, Sonnet gets taken away by the events and forgets to summarize. 😅 It did indeed get some I didn't try streams generated on branch (just a few older ones), I think it's fine though. |
There's a few instances we've seen where the summarization forgets pretty important bits. If you're good with these changes LMK -- once I get this merged I'm going to start improving the summarization prompt. |
enyst
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to note, it looks like it also works as expected with no condenser enabled/found.
Thank you for this, I think it's absolutely worth taking this in asap and have fun. ☕️
|
Sorry I didn't get a chance to review carefully but I skimmed over it and didn't spot any major issue. Thanks for the fix! |
…7578) Co-authored-by: Calvin Smith <calvin@all-hands.dev> Co-authored-by: Graham Neubig <neubig@gmail.com>
…7578) Co-authored-by: Calvin Smith <calvin@all-hands.dev> Co-authored-by: Graham Neubig <neubig@gmail.com>
End-user friendly description of the problem this fixes or functionality that this introduces.
Give a summary of what the PR does, explaining any non-trivial design decisions.
When an LLM hits a context window limitation, the agent controller chops the history roughly in-half so the agent can continue to make forward progress. This behavior is a necessary fallback but introduces some strange behavior (#7175, for example) when sessions are reloaded, especially when condensers are in play.
#7353 introduced two new primitives that can help to resolve this issue: the
CondensationActionand theView. Instead of relying on the agent controller to maintain an event stream and a history and keep them in-sync modulo any truncations, we can insteadCondensationActionevent. This event records which parts of the history should be removed.Viewobjects downstream instead of the rawState.history. When given a list of events, theViewautomatically applies any condensation actions to produce a smaller history that can be used by agents.This approach effectively stores the context truncation in the event stream/history. If we reload a session or some state gets out-of-sync, we can produce the exact same truncated history directly from the event stream.
(As an added benefit, this uses the same system as condensers, so they shouldn't end up in a position where they have to fight for priority on what/how to truncate.)
To support these changes, this PR makes the following edits:
Viewdefinition relocated andopenhands.memorysub-module structure slightly reorganized. Necessary to avoid circular imports for the next edit.Stateobjects have aState.viewproperty which constructs aViewdirectly from the state's history.CondensationActioninstead of manually chopping up the history.state.historyin agents updated tostate.view.Link of any specific issues this addresses.