Skip to content

Conversation

@ErikBjare
Copy link
Member

@ErikBjare ErikBjare commented Sep 17, 2025

Implements #546


Important

Introduces auto-compacting for conversations with massive tool results, including a new /compact command and automatic hooks, with comprehensive tests.

  • Behavior:
    • Implements /compact command in commands.py for auto-compacting conversations using either an aggressive algorithm or LLM-powered resume generation.
    • Adds autocompact_hook in tools/autocompact.py to automatically trigger compacting when needed.
    • Introduces SessionCompleteException in tools/complete.py to handle session completion.
  • Utilities:
    • Adds auto_compact_log() and should_auto_compact() in util/auto_compact.py for handling large tool results.
    • Implements _create_tool_result_summary() to summarize large tool outputs.
  • Tests:
    • Adds test_auto_compact.py to test auto-compacting functionality.
    • Updates test_tools_todo.py to include new todo states and summary format.
    • Adds test_chat_history.py to test chat history context feature.

This description was created by Ellipsis for 9d6001a. You can customize this summary. It will automatically update as commits are pushed.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed everything up to d9c75ee in 1 minute and 58 seconds. Click for details.
  • Reviewed 690 lines of code in 6 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. gptme/commands.py:217
  • Draft comment:
    Potential division by zero when original_tokens is 0. Consider checking before division.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50% The code does have a potential division by zero if original_tokens is 0. However, this would only happen if either: 1) no model is found (m is None), or 2) the input messages are empty. In both cases, we probably shouldn't even be trying to compact the messages. Looking at line 192, there's a check should_auto_compact() that likely prevents these edge cases. But without seeing that function, I can't be 100% sure. I don't have visibility into should_auto_compact() or len_tokens() functions. The division by zero could theoretically happen if those functions don't properly handle edge cases. While there is a theoretical risk, the code has validation earlier in the function that makes this unlikely. The suggested fix adds complexity for an edge case that probably can't occur. The comment identifies a real but likely impossible edge case. The existing validation probably prevents it, and the suggested fix adds unnecessary complexity.

Workflow ID: wflow_wSZZ5J9Sm7p6XBzt

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@codecov
Copy link

codecov bot commented Sep 17, 2025

Codecov Report

❌ Patch coverage is 58.03109% with 81 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
gptme/commands.py 13.20% 46 Missing ⚠️
gptme/tools/autocompact.py 39.58% 29 Missing ⚠️
gptme/chat.py 53.84% 6 Missing ⚠️

📢 Thoughts on this report? Let us know!

@openhands-ai
Copy link

openhands-ai bot commented Sep 17, 2025

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Test

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #645 at branch `dev/resume`

Feel free to include any additional details that might help me get this PR into a better state.

You can manage your notification settings

@ErikBjare
Copy link
Member Author

ErikBjare commented Oct 3, 2025

This should probably be implemented as a tool and make use of hooks and commands: #660

ErikBjare and others added 2 commits October 3, 2025 12:52
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed b84905b in 1 minute and 50 seconds. Click for details.
  • Reviewed 19 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. tests/test_auto_compact.py:23
  • Draft comment:
    Good use of varied filenames to simulate non-compressible tool output.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. tests/test_auto_compact.py:26
  • Draft comment:
    Ensure that using target_tokens//2 filenames (~2 tokens each) always yields a sufficiently massive message (both in token and char count) to satisfy test thresholds, especially if model contexts vary.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None

Workflow ID: wflow_iOZEjBbHlo1Thvpn

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

- Replace repeated 'x' chars with varied filenames that tokenize properly
- Repeated chars compress too well during tokenization
- Filenames like 'file_N.txt' tokenize less efficiently for accurate testing
@ErikBjare ErikBjare changed the title feat: started working on /compact and auto-compacting feat: implement /compact and auto-compacting Oct 3, 2025
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 11e5c16 in 3 minutes and 5 seconds. Click for details.
  • Reviewed 319 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 6 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. gptme/chat.py:131
  • Draft comment:
    Good use of SessionCompleteException to exit the main loop cleanly via exception handling.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. gptme/chat.py:180
  • Draft comment:
    The 'assert msg is not None' check on the prompt queue is a solid defensive programming measure.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
3. gptme/chat.py:238
  • Draft comment:
    Delegating complete tool detection to its hook (raising the exception) simplifies the loop logic.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
4. gptme/logmanager.py:355
  • Draft comment:
    Replacing the auto-compaction check with regular reduction delegates compaction to the dedicated hook; ensure that all compaction cases are still handled as expected.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
5. gptme/tools/autocompact.py:44
  • Draft comment:
    Consider enhancing fork name uniqueness (e.g., by appending a timestamp or UUID) to avoid potential collisions on repeated compactions.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
6. gptme/tools/complete.py:39
  • Draft comment:
    Using slicing (log[-5:]) to check recent messages is safe even when the log has fewer than 5 entries.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_6JbZX93KIehzm56X

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Critical fixes to prevent infinite loop issue from 2025-10-03-hopping-purple-squid:

1. Add reentrancy guard (60s minimum interval between attempts)
   - Prevents hook from running repeatedly if it fails
   - Breaks potential infinite loop cycles

2. Change success message to hide=True
   - Prevents assistant from responding to autocompact messages
   - Eliminates message cascade that triggered loops

3. Add comprehensive error handling around compaction
   - Wrap all compaction logic in try-except
   - Log errors but don't yield messages
   - Prevents cascading failures

Root cause: Hook failures + assistant responses created message cascade
that triggered MESSAGE_POST_PROCESS repeatedly, causing infinite loop.
Changed from checking exact string match to checking components,
since the new format uses 'state: count' instead of 'count state'
and Counter iteration order may vary.
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 15ce28d in 1 minute and 48 seconds. Click for details.
  • Reviewed 103 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. gptme/tools/autocompact.py:96
  • Draft comment:
    Guard against division by zero in the percentage reduction calculation.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. gptme/tools/autocompact.py:16
  • Draft comment:
    Consider thread-safety for the global reentrancy guard if autocompact_hook is used concurrently.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The code does use global state for reentrancy protection. If autocompact_hook is called from multiple threads, there could be race conditions around reading/writing _last_autocompact_time. However, I don't have enough context about whether this code is actually used in a multi-threaded context. The comment is speculative without evidence that threading is actually used. I don't have proof that this code runs in a multi-threaded environment. The comment may be raising a theoretical issue that doesn't exist in practice. While thread safety is important, we should only keep comments that point out actual, not theoretical, issues. Without evidence of multi-threading, this is too speculative. Delete the comment as it makes a speculative suggestion without evidence that thread-safety is actually needed in this codebase.
3. gptme/tools/autocompact.py:68
  • Draft comment:
    Review error handling consistency: fork failures yield a user message while auto-compaction errors only log the error. Confirm if this disparity is intentional.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_y64nUV6ZGHrYbsI3

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed 9d6001a in 1 minute and 4 seconds. Click for details.
  • Reviewed 17 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. tests/test_tools_todo.py:151
  • Draft comment:
    Refactoring the summary assertions to check individual substrings instead of a fixed format is a good improvement to account for order variations. For future maintainability, consider using a regular expression or structured parsing if the summary format might change further.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_hemRUYAkkfgi86o1

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@ErikBjare ErikBjare merged commit 61392fc into master Oct 3, 2025
10 checks passed
@ErikBjare ErikBjare mentioned this pull request Oct 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants