Problem
When querying documents with deep heading hierarchies (e.g. 5+ levels), the agentic retrieval agent consistently returns high-level summaries instead of the precise leaf-level content that answers the question.
Root Cause Observed (User Perspective)
The document navigation phase only "sees" the top-level outline sections. When a parent node (e.g. Section 5: Safety Measures) is selected during discovery, its child chunks (e.g. 5.3 / 5.3.1 Monitoring) are loaded into a flat buffer rather than being properly nested into the document tree. As a result:
- The agent cannot drill down to child sections within the same navigation turn.
- Deep chunks are displayed as orphaned text instead of being nested under their section headings.
- The answer is assembled from the generic parent summary, missing the specific technical details.
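The failure mode described above can be illustrated with a small sketch. All names here (`Chunk`, `section_path`, `outline`, `flat_buffer`) and the sample text are hypothetical, invented only to show the symptom; they are not taken from the actual codebase.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    section_path: tuple  # hypothetical key, e.g. ("5", "3", "1") for section 5.3.1
    title: str
    text: str


# What the navigation phase "sees": level-1 outline entries only.
outline = {("5",): "Safety Measures"}

# Children of section 5, loaded into a flat buffer when the parent is selected.
flat_buffer = [
    Chunk(("5", "3"), "Monitoring", "placeholder overview text"),
    Chunk(("5", "3", "1"), "Monitoring", "placeholder leaf-level detail"),
]

# Only outline entries are navigable, so the agent can reach section 5 ...
reachable = sorted(outline)

# ... while the deeper chunks sit in the buffer with no node to drill into.
orphaned = [c.section_path for c in flat_buffer if c.section_path not in outline]
```

In this sketch the leaf-level detail is present in memory, yet no navigation step can select it, which matches the observed behavior of answers being built from the parent summary.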
Steps to Reproduce
- Ingest a structured document with at least 4 heading levels (e.g. a technical construction or safety report).
- Ask a question that requires a specific detail from a level-3 or deeper sub-section.
- Observe that the retrieved evidence contains only level-1 section summaries.
Expected Behavior
- When the agent COLLECTs a non-leaf section, all descendant chunks should be reparented into the correct child subtree.
- The navigation runner should build a hierarchical outline by nesting outline metadata + hydrated leaf content, level by level.
- A shallow hydration mode should exist so the agent can fetch direct children only (avoiding over-fetching the entire subtree).
- The depth-limit filter in section loading should be bypassable when the navigation runner explicitly needs the full tree.
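The expected behavior above could be sketched roughly as follows. This is a minimal illustration under assumptions, not the project's implementation: `Node`, `build_tree`, `find`, `hydrate`, and the `shallow` / `max_depth` parameters are hypothetical names, and chunks are assumed to carry a section path such as `("5", "3", "1")`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    path: tuple  # hypothetical section path, e.g. ("5", "3", "1")
    title: str = ""
    text: str = ""
    children: dict = field(default_factory=dict)


def build_tree(chunks):
    """Nest flat (path, title, text) records under their parents, level by level."""
    root = Node(path=())
    for path, title, text in chunks:
        node = root
        # Walk down one level at a time, creating missing ancestors as placeholders.
        for depth in range(1, len(path) + 1):
            prefix = path[:depth]
            node = node.children.setdefault(prefix, Node(path=prefix))
        node.title, node.text = title, text
    return root


def find(root, path):
    """Return the node at `path`; raises KeyError if it does not exist."""
    node = root
    for depth in range(1, len(path) + 1):
        node = node.children[path[:depth]]
    return node


def hydrate(node, shallow=False, max_depth=None):
    """Return descendants of `node` in document order.

    shallow=True   -> direct children only.
    max_depth=None -> no depth limit (the filter is bypassed).
    """
    if shallow:
        return list(node.children.values())
    result = []
    stack = [(child, 1) for child in reversed(list(node.children.values()))]
    while stack:
        current, depth = stack.pop()
        if max_depth is not None and depth > max_depth:
            continue  # depth-limit filter, applied only when a limit is given
        result.append(current)
        stack.extend(
            (child, depth + 1) for child in reversed(list(current.children.values()))
        )
    return result
```

With a tree built this way, collecting Section 5 with `hydrate(find(root, ("5",)))` would return 5.3 and 5.3.1 as nested nodes, while `shallow=True` would return only 5.3. Making `max_depth` optional is one way to let the navigation runner opt out of the depth limit without changing the default for other callers.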
Impact
Users asking detailed, fact-specific questions on long documents receive dangerously incomplete answers. This erodes trust in the knowledge retrieval system for professional use cases (legal, engineering, medical documentation).