
research(context): HiAgent subgoal-aware context compaction for long-horizon task coherence #2022

@bug-ops

Description


Source

HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with LLMs
ACL 2025 — https://aclanthology.org/2025.acl-long.1575.pdf

Core Idea

Rather than compressing history based on recency alone, the agent tracks the current subgoal and compresses only information that is no longer relevant to that subgoal. The mechanism:

  1. Before each action cluster, the agent formulates a subgoal (1-2 sentences)
  2. Context is partitioned into: relevant-to-subgoal (kept), completed-subgoal (summarizable), and outdated context (compressible)
  3. When the context nears its token budget, only outdated and completed sections are summarized; active working memory is left untouched
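
A minimal sketch of the three-way partition above, assuming each context segment carries an optional subgoal id. `Segment`, `Relevance`, `classify`, and `compact_under_pressure` are hypothetical names for illustration, not taken from the paper or from Zeph, and the placeholder string stands in for a real summarization call.

```rust
use std::collections::HashSet;

/// How a context segment relates to the subgoal currently being pursued.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Relevance {
    /// Serves the current subgoal: kept verbatim.
    Active,
    /// Served a subgoal that has finished: summarizable.
    Completed,
    /// Not tied to any live subgoal: compressible.
    Outdated,
}

/// Hypothetical segment record; `subgoal` is `None` when nothing tagged it.
#[derive(Debug, Clone)]
pub struct Segment {
    pub text: String,
    pub subgoal: Option<u32>,
}

/// Classify one segment against the current subgoal and the finished ones.
pub fn classify(seg: &Segment, current: u32, completed: &HashSet<u32>) -> Relevance {
    match seg.subgoal {
        Some(id) if id == current => Relevance::Active,
        Some(id) if completed.contains(&id) => Relevance::Completed,
        _ => Relevance::Outdated,
    }
}

/// When over budget, replace every non-active segment with a placeholder
/// summary and return how many segments were rewritten.
pub fn compact_under_pressure(
    segments: &mut [Segment],
    current: u32,
    completed: &HashSet<u32>,
    over_budget: bool,
) -> usize {
    if !over_budget {
        return 0;
    }
    let mut rewritten = 0;
    for seg in segments.iter_mut() {
        if classify(seg, current, completed) != Relevance::Active {
            // Stand-in for an LLM-generated summary (assumption).
            seg.text = format!("[summary: {} bytes elided]", seg.text.len());
            rewritten += 1;
        }
    }
    rewritten
}
```

The point of the sketch is that recency never appears in `classify`: a segment from the current subgoal survives however old it is, and a recent segment from a finished subgoal is eligible for summarization.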

Results (as reported in the paper's abstract): a twofold increase in success rate and an average reduction of 3.8 steps across five long-horizon tasks.

Current Zeph Gap

Zeph's compaction strategies (reactive, task_aware, MIG) compress based on token thresholds and recency, but do not consider whether a message segment is still relevant to the current task goal. This means:

  • Active reasoning chains can get summarized while in use
  • Completed subtask context (no longer needed) remains in full detail
  • The agent cannot distinguish "I just used this" from "I used this 10 turns ago and it's done"

The task_aware strategy is closest, but it compresses at fixed thresholds rather than dynamically tracking subgoal relevance.

Implementation Sketch

  1. Add subgoal tracking: after each assistant response containing a plan step or tool result, extract a 1-sentence subgoal description (fire-and-forget LLM call, similar to existing trajectory summarization)
  2. Tag context segments with the subgoal they served (stored in message metadata or a side table)
  3. During compaction, compress segments whose subgoal is marked "completed" before segments tied to the current (active) subgoal
  4. Expose current subgoal in debug output and optionally in TUI status bar

This is complementary to existing compaction strategies -- it's an input signal to the compaction decision, not a replacement.
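
Steps 2 and 3 could look roughly like the sketch below: a side table maps messages to subgoals, and compaction picks candidates in priority order until enough tokens are freed. All names (`SubgoalTable`, `tag`, `complete`, `compaction_order`) are hypothetical and not part of zeph-core. Two assumptions are baked in: untagged messages rank between completed and active ones, and compressing a message frees its full token count.

```rust
use std::collections::HashMap;

pub type MessageId = u64;
pub type SubgoalId = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SubgoalState {
    Active,
    Completed,
}

/// Side table mapping messages to the subgoal they served.
#[derive(Debug, Default)]
pub struct SubgoalTable {
    tags: HashMap<MessageId, SubgoalId>,
    states: HashMap<SubgoalId, SubgoalState>,
}

impl SubgoalTable {
    /// Record that `msg` served `subgoal`; new subgoals start as active.
    pub fn tag(&mut self, msg: MessageId, subgoal: SubgoalId) {
        self.tags.insert(msg, subgoal);
        self.states.entry(subgoal).or_insert(SubgoalState::Active);
    }

    /// Mark a subgoal as finished.
    pub fn complete(&mut self, subgoal: SubgoalId) {
        self.states.insert(subgoal, SubgoalState::Completed);
    }

    /// Lower rank means "compress earlier".
    fn rank(&self, msg: MessageId) -> u8 {
        match self.tags.get(&msg).and_then(|g| self.states.get(g)) {
            Some(SubgoalState::Completed) => 0,
            None => 1, // untagged: left to the existing strategy (assumption)
            Some(SubgoalState::Active) => 2,
        }
    }

    /// `messages` holds `(id, token_count)` pairs, oldest first. Returns the
    /// ids to compress, completed subgoals first and oldest first within a
    /// rank, stopping once at least `tokens_to_free` tokens are covered.
    pub fn compaction_order(
        &self,
        messages: &[(MessageId, usize)],
        tokens_to_free: usize,
    ) -> Vec<MessageId> {
        let mut candidates: Vec<(u8, usize, MessageId, usize)> = messages
            .iter()
            .enumerate()
            .map(|(pos, &(id, tokens))| (self.rank(id), pos, id, tokens))
            .collect();
        candidates.sort_by_key(|&(rank, pos, _, _)| (rank, pos));

        let mut freed = 0;
        let mut picked = Vec::new();
        for (_, _, id, tokens) in candidates {
            if freed >= tokens_to_free {
                break;
            }
            picked.push(id);
            freed += tokens;
        }
        picked
    }
}
```

Because the table only produces an ordering, an existing strategy can keep its own threshold logic and consult `compaction_order` when deciding what to summarize first. Active segments are still reachable as a last resort if the budget cannot be met otherwise.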

Complexity

Medium. Requires subgoal extraction (new LLM call, fire-and-forget), metadata tagging on messages, and compaction priority logic. The infrastructure (fire-and-forget LLM calls, message metadata) already exists in zeph-core.
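
To make the fire-and-forget part concrete, here is a rough sketch using only the standard library. It is not how zeph-core does it: the real code presumably uses its existing async machinery, and `extract_subgoal` here is a trivial stand-in for the LLM call. `SubgoalTracker`, `request`, and `poll` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Stand-in for the LLM call (assumption): takes the first sentence of the
/// assistant response as the subgoal description.
pub fn extract_subgoal(assistant_text: &str) -> String {
    let first = assistant_text.split('.').next().unwrap_or("").trim();
    format!("Subgoal: {}", first)
}

/// Tracks the most recent subgoal without blocking the agent loop.
pub struct SubgoalTracker {
    tx: Sender<(u64, String)>,
    rx: Receiver<(u64, String)>,
    current: Option<(u64, String)>,
}

impl SubgoalTracker {
    pub fn new() -> Self {
        let (tx, rx) = channel();
        Self { tx, rx, current: None }
    }

    /// Fire-and-forget: start extraction in the background and return at once.
    pub fn request(&self, turn: u64, assistant_text: String) {
        let tx = self.tx.clone();
        thread::spawn(move || {
            let subgoal = extract_subgoal(&assistant_text);
            // The tracker may already be dropped; a failed send is ignored.
            let _ = tx.send((turn, subgoal));
        });
    }

    /// Non-blocking: drain finished extractions, keep the one from the
    /// latest turn, and return the current subgoal if any is known.
    pub fn poll(&mut self) -> Option<&str> {
        while let Ok((turn, subgoal)) = self.rx.try_recv() {
            let newer = self.current.as_ref().map_or(true, |(t, _)| turn >= *t);
            if newer {
                self.current = Some((turn, subgoal));
            }
        }
        self.current.as_ref().map(|(_, s)| s.as_str())
    }
}
```

The agent loop would call `request` after each qualifying assistant response and `poll` just before a compaction decision. If extraction has not finished yet, compaction simply proceeds with the previous subgoal, so the extra LLM call never sits on the critical path.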

Expected Benefit

  • Reduces compaction of active working memory (less context thrashing)
  • More efficient compression of completed subtask history
  • Improved coherence in long multi-step tasks (WebArena/SWE-bench class)


Labels: P2 (High value, medium complexity); context (Context management and message handling); research (Research-driven improvement)
