Skip to content

Conversation

@csmith49
Copy link
Collaborator

  • This change is worth documenting at https://docs.all-hands.dev/
  • Include this change in the Release Notes. If checked, you must provide an end-user friendly description for your change below

End-user friendly description of the problem this fixes or functionality that this introduces.


Give a summary of what the PR does, explaining any non-trivial design decisions.

When an LLM hits a context window limitation, the agent controller chops the history roughly in-half so the agent can continue to make forward progress. This behavior is a necessary fallback but introduces some strange behavior (#7175, for example) when sessions are reloaded, especially when condensers are in play.

#7353 introduced two new primitives that can help to resolve this issue: the CondensationAction and the View. Instead of relying on the agent controller to maintain an event stream and a history and keep them in-sync modulo any truncations, we can instead

  1. Modify the context truncation behavior to emit a CondensationAction event. This event records which parts of the history should be removed.
  2. Rely on the use of View objects downstream instead of the raw State.history. When given a list of events, the View automatically applies any condensation actions to produce a smaller history that can be used by agents.

This approach effectively stores the context truncation in the event stream/history. If we reload a session or some state gets out-of-sync, we can produce the exact same truncated history directly from the event stream.

(As an added benefit, this uses the same system as condensers, so they shouldn't end up in a position where they have to fight for priority on what/how to truncate.)

To support these changes, this PR makes the following edits:

  • View definition relocated and openhands.memory sub-module structure slightly reorganized. Necessary to avoid circular imports for the next edit.
  • State objects have a State.view property which constructs a View directly from the state's history.
  • Agent controller context truncation updated to emit CondensationAction instead of manually chopping up the history.
  • Tests updated.
  • References to state.history in agents updated to state.view.

Link of any specific issues this addresses.

@csmith49 csmith49 marked this pull request as ready for review March 28, 2025 23:52
assert controller.state.truncation_id is not None
assert controller.state.truncation_id > controller.state.start_id

def test_history_restoration_after_truncation(self, mock_event_stream, mock_agent):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was moved to test_agent_controller.py with no change.



class TestTruncation:
def test_apply_conversation_window_basic(self, mock_event_stream, mock_agent):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was moved to test_agent_controller.py, and extended with the checks from test_context_window_exceeded_handling.

assert len(controller.state.history) == history_len
assert len(controller.get_trajectory()) == history_len

def test_context_window_exceeded_handling(self, mock_event_stream, mock_agent):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was folded into test_apply_conversation_window_basic during the move.

if isinstance(event, CmdOutputObservation):
assert any(e._id == event._cause for e in truncated[: i + 1])

def test_truncation_does_not_impact_trajectory(self, mock_event_stream, mock_agent):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check for this test was folded into test_context_window_exceeded_error_handling (that's not the test originally from this file, note the extra error in the name).

@csmith49
Copy link
Collaborator Author

To anyone looking at this code: I recommend starting with the View class to see how CondensationAction events are used, then taking a look at the changes in the agent controller.

The other changes are important but mostly serve to test/support the above.

@csmith49 csmith49 requested review from enyst and li-boxuan March 30, 2025 14:28
Copy link
Contributor

@neubig neubig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks good to me!

I have a slight concern about the efficiency implications of calling View.from_events many times, but maybe it's not a major concern.

@csmith49
Copy link
Collaborator Author

Overall, this looks good to me!

I have a slight concern about the efficiency implications of calling View.from_events many times, but maybe it's not a major concern.

Hmm, I see what you mean. I think we'll always have to call it once per-step, but since it's linear in the size of the history that's probably okay. I can optimize away any more calls by caching the view construction in the state.

Copy link
Collaborator

@enyst enyst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice PR, thank you for this! I now see better what you had in mind. 👍

@enyst
Copy link
Collaborator

enyst commented Mar 31, 2025

I tried to play with older streams, and it looks good. I had some fun with Haiku-generated summaries for 400+ events - amazingly it did mostly do a summary indeed, although it didn't respect the format. Sometimes when it's large enough, Sonnet gets taken away by the events and forgets to summarize. 😅

It did indeed get some truncation_id without blinking.

I didn't try streams generated on branch (just a few older ones), I think it's fine though.

@csmith49
Copy link
Collaborator Author

I tried to play with older streams, and it looks good. I had some fun with Haiku-generated summaries for 400+ events - amazingly it did mostly do a summary indeed, although it didn't respect the format. Sometimes when it's large enough, Sonnet gets taken away by the events and forgets to summarize. 😅

There's a few instances we've seen where the summarization forgets pretty important bits. If you're good with these changes LMK -- once I get this merged I'm going to start improving the summarization prompt.

Copy link
Collaborator

@enyst enyst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note, it looks like it also works as expected with no condenser enabled/found.

Thank you for this, I think it's absolutely worth taking this in asap and have fun. ☕️

@csmith49 csmith49 merged commit abaf0da into OpenHands:main Mar 31, 2025
24 checks passed
@li-boxuan
Copy link
Collaborator

Sorry I didn't get a chance to review carefully but I skimmed over it and didn't spot any major issue. Thanks for the fix!

doew pushed a commit to doew/OpenHands that referenced this pull request Apr 2, 2025
…7578)

Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
shabbir-shakudo pushed a commit to devsentient/OpenHands that referenced this pull request Jul 15, 2025
…7578)

Co-authored-by: Calvin Smith <calvin@all-hands.dev>
Co-authored-by: Graham Neubig <neubig@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants