feat: hierarchical outline index for long markdown artifacts#241
feat: hierarchical outline index for long markdown artifacts#241MainQuest99 wants to merge 4 commits intorohitg00:mainfrom
Conversation
Adds a hierarchical outline index for long markdown artefacts (CLAUDE.md,
briefs, audits) inspired by PageIndex. Agents can fetch the heading tree
plus extract a single section without loading the whole file.
- new mem:outlines KV scope (schema.ts)
- Outline / OutlineNode types
- buildOutline: ATX heading parser, ignores fenced code blocks, positional
node ids ("1.2.3"), line_end = previous line of next heading <= level
- mem::outline-build / -get / -section iii functions with mtime+size
staleness detection
- 6 unit tests for the parser (flat, code-blocks, 4-level nesting, DFS,
no-headings, sibling H1s)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Wires memory_build_outline, memory_get_outline, memory_get_section into the MCP server and the REST API: - new V089_OUTLINE_TOOLS registry block (3 tools) - mcp::tools::call switch handlers (path/artifact_id/node_id validation) - POST /agentmemory/outline/build, GET /agentmemory/outline, POST /agentmemory/outline/section - 5 KV-roundtrip tests (build, get, section extract, stale detection, missing node) + 5 integration tests on realistic CLAUDE-style fixtures including a 500-line latency probe Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- src/hooks/outline-regen.ts: standalone PostToolUse hook. When Write / Edit / MultiEdit touches a tracked path (CLAUDE.md, MEMORY.md, AGENTS.md by default; whitelist at ~/.config/agentmemory/outline-tracked.txt), POSTs /agentmemory/outline/build with a 2s timeout. Best-effort, never blocks the agent. - scripts/outline-backfill.ts: walks ~/CaptainAgent, ~/.claude/projects, and ~ (depth-limited, skips node_modules / .git / dist), builds an outline for every CLAUDE.md / MEMORY.md / AGENTS.md it finds and prints built/failed counts. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- AGENTS.md: 43 -> 46 MCP tools, 103 -> 106 REST endpoints - README.md: 4x "43 MCP tools" -> "46", 1x "43 tools, 6 resources" -> "46", "109 endpoints on port" -> "112" - plugin/.claude-plugin/plugin.json: 43 MCP tools -> 46 - src/index.ts: register registerOutlineFunctions + update startup log line - docs/outline.md: feature guide (why, schema, MCP tool signatures, REST endpoints, hook + backfill, limitations) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
Someone is attempting to deploy a commit to the rohitg00's projects Team on Vercel. A member of the Team first needs to authorize it. |
📝 WalkthroughWalkthroughThis PR introduces an Outline Index system for large Markdown artifacts. It adds heading-tree parsing and persistent storage for Markdown files, three new MCP tools and REST endpoints to build/retrieve outlines and extract sections, auto-regeneration hooks that watch specific agent tool writes, a backfill utility for bulk outline construction, and comprehensive tests covering parsing, storage staleness, and error handling. Tool count increases from 43 to 46; REST endpoints from 103 to 106. ChangesOutline Index Feature Implementation
Sequence DiagramssequenceDiagram
participant Client
participant BuildService as outlineBuild
participant Filesystem as FS
participant KVStore as KV
Client->>BuildService: path
BuildService->>FS: readFile
BuildService->>BuildService: parseHeadings
BuildService->>BuildService: buildOutline tree
BuildService->>KVStore: persist outline
BuildService-->>Client: { success, nodeCount }
sequenceDiagram
participant Client
participant SectionService as outlineSection
participant KVStore as KV
participant Filesystem as FS
Client->>SectionService: artifact_id, node_id
SectionService->>KVStore: loadOutline
SectionService->>SectionService: findNode
SectionService->>FS: readFile current
SectionService->>SectionService: validateStaleness
alt stale file
SectionService-->>Client: error { stale: true }
else fresh
SectionService-->>Client: { text, length }
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
AGENTS.md (1)
108-113:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale test count in the "Current Stats" block.
Line 113 still shows
627 testsand line 101 still reads(613+ tests), but the PR reports 735/735 passing (16 new outline tests added). These two counts were missed while the adjacent tool/endpoint counts on lines 108–109 were updated.🔧 Proposed fix
-All tests must pass before PR: `npm test` (613+ tests) +All tests must pass before PR: `npm test` (735+ tests)-- 627 tests +- 735 tests🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@AGENTS.md` around lines 108 - 113, Update the stale test counts in the "Current Stats" block: replace the occurrences of "627 tests" and "(613+ tests)" with the current total reported by the PR (e.g., "735 tests" and "(735 tests)" or consistent phrasing used elsewhere), ensuring the strings matching "627 tests" and "(613+ tests)" in the AGENTS.md "Current Stats" section are updated to reflect the new total so counts and phrasing remain consistent with the PR status.README.md (2)
136-136:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale endpoint badge in the "Works with every agent" table.
The "Any agent" cell badge still reads
109-endpoints. The API section (line 755) was updated to 112, but this shield badge was missed.🔧 Proposed fix
-<img src="https://img.shields.io/badge/109-endpoints-1f6feb?style=flat-square" alt="REST API" width="48" /> +<img src="https://img.shields.io/badge/112-endpoints-1f6feb?style=flat-square" alt="REST API" width="48" />Also worth updating line 747 while here:
-# Tool visibility: "core" (7 tools) or "all" (43 tools) +# Tool visibility: "core" (7 tools) or "all" (46 tools)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` at line 136, Update the stale shields.io badge that still reads "109-endpoints" in the "Any agent" table cell: find the <img ...> tag whose src contains "shields.io" and the text "109-endpoints" and change it to "112-endpoints" (and update any alt/title text that mentions 109 accordingly); also update the other badge referenced near the "line 747" mention the same way so all endpoint badges consistently show 112.
565-585:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winStale tool counts in MCP Server section.
Line 565 was updated to "46 tools" but the section heading (line 567) and the collapsible details label (line 585) still reference the old count of 43.
🔧 Proposed fix
-### 43 Tools +### 46 Tools-<summary>Extended tools (43 total — set AGENTMEMORY_TOOLS=all)</summary> +<summary>Extended tools (46 total — set AGENTMEMORY_TOOLS=all)</summary>🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 565 - 585, Update the stale tool-counts in the README so they match the new total (46): change the heading text "### 43 Tools" to "### 46 Tools" and update the collapsible details summary "Extended tools (43 total — set AGENTMEMORY_TOOLS=all)" to "Extended tools (46 total — set AGENTMEMORY_TOOLS=all)"; ensure both exact strings are replaced so the visible counts are consistent.src/types.ts (1)
449-495:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRecord audits for outline operations as required by the pattern.
The
outline-build,outline-get, andoutline-sectionfunctions register viasdk.registerFunction()but skiprecordAudit()calls, violating the required pattern. Each function performs state mutations (kv.set,kv.get) and should record these operations with entries like"outline_build","outline_get","outline_section"added toAuditEntry.operation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/types.ts` around lines 449 - 495, The AuditEntry.operation union in the AuditEntry interface is missing entries for outline operations used elsewhere; update the AuditEntry type (the operation union in the AuditEntry interface) to include "outline_build", "outline_get", and "outline_section" so the functions that call kv.set/kv.get and recordAudit() can log those operations (refer to AuditEntry and its operation union to locate where to add the three new string literals).
🧹 Nitpick comments (4)
scripts/outline-backfill.ts (2)
99-108: 💤 Low valueSequential
fetches will be slow on large backfills.
for (const path of targets)awaits each POST in turn. For dozens to hundreds of artifacts plus a 5s per-request timeout, this becomes the dominant runtime. Consider a small concurrency pool (e.g., 4–8 in flight) usingPromise.allover chunked batches; this keeps memory bounded but materially shortens wall time. Not a blocker for correctness.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/outline-backfill.ts` around lines 99 - 108, The current sequential loop over targets calling buildOne(path) is slow; change the loop to process targets with a small concurrency pool (e.g., 4–8) so multiple buildOne POSTs run in parallel: implement chunking or a simple worker-pool that launches up to N concurrent buildOne calls and uses Promise.all to await each batch, updating built/failed counters and logging inside the resolved promises; reference the existing variables/identifiers targets, buildOne(path), built, failed and the console.log messages to preserve result handling and error reporting.
70-93: 💤 Low valueConsider gating the
$HOMEroot behind an opt-in flag.Even with
depthCap = 2,walk(home, 0, 2)issuesreaddiragainst every non-hidden top-level dir under$HOME(Documents, Desktop, Downloads, source-tree roots, etc.) on every backfill run. For a typical developer machine this is a lot of stat traffic for what is usually a small set of CLAUDE.md/MEMORY.md/AGENTS.md files, and it can pull in personal/non-project markdown that the user may not want indexed via the API.Suggested options (any one is fine):
- Make
$HOMEopt-in via a CLI flag or env var (e.g.,BACKFILL_INCLUDE_HOME=1).- Replace the broad home walk with a curated list (e.g.,
~/dev,~/code,~/projects) discovered from config.- Print the resolved root list before scanning so the user can ^C if it surprises them.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/outline-backfill.ts` around lines 70 - 93, The current main function unconditionally includes the user's home directory in the roots array (roots, used by walk and depthCap) which causes broad readdir/stat traffic; make inclusion of home opt-in by checking a CLI flag or environment variable (e.g., BACKFILL_INCLUDE_HOME) before adding join(home, ...) to roots, or alternatively replace that array entry with a curated list when a different opt-in flag is absent; update the logic around where roots is built (referencing main, roots, depthCap, and walk) to print or log the final resolved roots before scanning so the user can abort if unexpected.src/functions/outline-build.ts (2)
208-218: ⚡ Quick win
mem::outline-sectionassumesartifact_idis a filesystem path.
docs/outline.mddescribesartifact_idas "absolute path or stable id", andmem::outline-buildaccepts an explicitartifact_iddistinct frompath. However, this handler then re-reads the source viafs.stat(data.artifact_id)/fs.readFile(data.artifact_id, ...), so any non-path id will always fail here as a (misleading) "outline stale, rebuild needed" error.Either persist the original
pathon the storedOutlineand re-read from that, or document thatartifact_idfor section retrieval must be the on-disk path. Worth a follow-up since it widens the working contract beyond what tests currently cover.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/functions/outline-build.ts` around lines 208 - 218, The handler in outline-build.ts treats data.artifact_id as a filesystem path (using fs.stat/fs.readFile) which breaks when artifact_id is a stable id rather than a path; update the code to read from the original on-disk path stored on the Outline (persist the original path when creating/storing the Outline in mem::outline-build) and use that stored path (e.g., outline.path or originalPath) when calling fs.stat/fs.readFile instead of data.artifact_id, or alternatively enforce/document that artifact_id must be a path and validate/throw a clear error if it is not; locate usages around the data.artifact_id reads in the outline-section/outline-build handlers and the Outline construction to make the change consistently.
139-146: ⚡ Quick winParallelize
fs.stat+fs.readFile.The two filesystem reads are independent; running them in parallel halves the IO wait on the hot path (build and section retrieval).
♻️ Proposed change for both call sites
- let stat; - let content: string; - try { - stat = await fs.stat(data.path); - content = await fs.readFile(data.path, "utf-8"); - } catch (err) { - return { success: false, error: `read failed: ${String(err)}` }; - } + let stat: Awaited<ReturnType<typeof fs.stat>>; + let content: string; + try { + [stat, content] = await Promise.all([ + fs.stat(data.path), + fs.readFile(data.path, "utf-8"), + ]); + } catch (err) { + return { success: false, error: `read failed: ${String(err)}` }; + }The same change applies to lines 208-218 in
mem::outline-section.As per coding guidelines: "Use parallel operations where possible with
Promise.all()for independent key-value writes and reads".Also applies to: 208-218
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/functions/outline-build.ts` around lines 139 - 146, Parallelize the independent filesystem operations by replacing the sequential await calls for fs.stat and fs.readFile in the outline build logic: run them via Promise.all (e.g., const [stat, content] = await Promise.all([fs.stat(data.path), fs.readFile(data.path, "utf-8")])), preserve the existing try/catch and return shape on error (return { success: false, error: `read failed: ${String(err)}` }), and update the analogous code in the mem::outline-section site to the same Promise.all pattern to avoid duplicate IO waits.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/outline.md`:
- Around line 86-90: Update the fenced code block containing the REST endpoints
(the lines starting with "POST /agentmemory/outline/build", "GET
/agentmemory/outline?artifact_id=<id>", "POST /agentmemory/outline/section") so
the opening fence is tagged with a language hint (e.g., replace the leading
"```" with "```text" or "```http") to satisfy markdownlint MD040 and prevent
code highlighting.
In `@src/functions/outline-build.ts`:
- Around line 132-166: The mem::outline-build handler in
registerOutlineFunctions currently writes to KV.outlines via kv.set but doesn't
call recordAudit; fix by capturing a single timestamp (const ts = new
Date().toISOString()), pass that ts into buildOutline (so outline.generated_at
is set from the same ts rather than internally) and after the successful kv.set
call invoke recordAudit(...) with operation "outline-build" (or add this
operation to AuditEntry.operation union in src/types.ts), the artifact_id,
actor/context info, and the same ts; update registerOutlineFunctions to perform
recordAudit before returning success and ensure buildOutline, kv.set,
recordAudit, and KV.outlines are the referenced spots to modify.
In `@src/hooks/outline-regen.ts`:
- Around line 36-41: Update shouldTrack to only treat whitelist entries as
matching path components by anchoring suffix checks at a path separator: instead
of using filePath.endsWith(p) use a separator-aware check (e.g., filePath === p
|| filePath.endsWith(path.sep + p) or use path.posix/path.sep normalization) and
ensure path is normalized before comparison; reference the shouldTrack function
and import/use Node's path utilities if needed so entries like "notes.md" only
match ".../notes.md" and not "...mynotes.md".
In `@src/mcp/server.ts`:
- Around line 974-1007: The MCP handler for "memory_build_outline" currently
forwards arbitrary paths to mem::outline-build; update the
"memory_build_outline" case in src/mcp/server.ts to validate args.path
endsWith(".md") (and is non-empty) before calling sdk.trigger, rejecting
requests with a 400 error if not; additionally, normalize and constrain the path
to a safe prefix (e.g., resolve and ensure it is inside the server's
working/project directory) or consult a configurable allowlist of allowed base
directories, and mention these same constraints in any error responses so
callers get clear feedback; keep the existing calls to mem::outline-build and
mem::outline-section but enforce this validation in the memory_build_outline
handler to prevent arbitrary file reads.
---
Outside diff comments:
In `@AGENTS.md`:
- Around line 108-113: Update the stale test counts in the "Current Stats"
block: replace the occurrences of "627 tests" and "(613+ tests)" with the
current total reported by the PR (e.g., "735 tests" and "(735 tests)" or
consistent phrasing used elsewhere), ensuring the strings matching "627 tests"
and "(613+ tests)" in the AGENTS.md "Current Stats" section are updated to
reflect the new total so counts and phrasing remain consistent with the PR
status.
In `@README.md`:
- Line 136: Update the stale shields.io badge that still reads "109-endpoints"
in the "Any agent" table cell: find the <img ...> tag whose src contains
"shields.io" and the text "109-endpoints" and change it to "112-endpoints" (and
update any alt/title text that mentions 109 accordingly); also update the other
badge referenced near the "line 747" mention the same way so all endpoint badges
consistently show 112.
- Around line 565-585: Update the stale tool-counts in the README so they match
the new total (46): change the heading text "### 43 Tools" to "### 46 Tools" and
update the collapsible details summary "Extended tools (43 total — set
AGENTMEMORY_TOOLS=all)" to "Extended tools (46 total — set
AGENTMEMORY_TOOLS=all)"; ensure both exact strings are replaced so the visible
counts are consistent.
In `@src/types.ts`:
- Around line 449-495: The AuditEntry.operation union in the AuditEntry
interface is missing entries for outline operations used elsewhere; update the
AuditEntry type (the operation union in the AuditEntry interface) to include
"outline_build", "outline_get", and "outline_section" so the functions that call
kv.set/kv.get and recordAudit() can log those operations (refer to AuditEntry
and its operation union to locate where to add the three new string literals).
---
Nitpick comments:
In `@scripts/outline-backfill.ts`:
- Around line 99-108: The current sequential loop over targets calling
buildOne(path) is slow; change the loop to process targets with a small
concurrency pool (e.g., 4–8) so multiple buildOne POSTs run in parallel:
implement chunking or a simple worker-pool that launches up to N concurrent
buildOne calls and uses Promise.all to await each batch, updating built/failed
counters and logging inside the resolved promises; reference the existing
variables/identifiers targets, buildOne(path), built, failed and the console.log
messages to preserve result handling and error reporting.
- Around line 70-93: The current main function unconditionally includes the
user's home directory in the roots array (roots, used by walk and depthCap)
which causes broad readdir/stat traffic; make inclusion of home opt-in by
checking a CLI flag or environment variable (e.g., BACKFILL_INCLUDE_HOME) before
adding join(home, ...) to roots, or alternatively replace that array entry with
a curated list when a different opt-in flag is absent; update the logic around
where roots is built (referencing main, roots, depthCap, and walk) to print or
log the final resolved roots before scanning so the user can abort if
unexpected.
In `@src/functions/outline-build.ts`:
- Around line 208-218: The handler in outline-build.ts treats data.artifact_id
as a filesystem path (using fs.stat/fs.readFile) which breaks when artifact_id
is a stable id rather than a path; update the code to read from the original
on-disk path stored on the Outline (persist the original path when
creating/storing the Outline in mem::outline-build) and use that stored path
(e.g., outline.path or originalPath) when calling fs.stat/fs.readFile instead of
data.artifact_id, or alternatively enforce/document that artifact_id must be a
path and validate/throw a clear error if it is not; locate usages around the
data.artifact_id reads in the outline-section/outline-build handlers and the
Outline construction to make the change consistently.
- Around line 139-146: Parallelize the independent filesystem operations by
replacing the sequential await calls for fs.stat and fs.readFile in the outline
build logic: run them via Promise.all (e.g., const [stat, content] = await
Promise.all([fs.stat(data.path), fs.readFile(data.path, "utf-8")])), preserve
the existing try/catch and return shape on error (return { success: false,
error: `read failed: ${String(err)}` }), and update the analogous code in the
mem::outline-section site to the same Promise.all pattern to avoid duplicate IO
waits.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 24327f8c-4747-468c-a9e2-fb9e75017554
📒 Files selected for processing (16)
AGENTS.mdREADME.mddocs/outline.mdplugin/.claude-plugin/plugin.jsonscripts/outline-backfill.tssrc/functions/outline-build.tssrc/hooks/outline-regen.tssrc/index.tssrc/mcp/server.tssrc/mcp/tools-registry.tssrc/state/schema.tssrc/triggers/api.tssrc/types.tstest/outline-build.test.tstest/outline-integration.test.tstest/outline-tools.test.ts
| ``` | ||
| POST /agentmemory/outline/build { path, artifact_id? } | ||
| GET /agentmemory/outline?artifact_id=<id> | ||
| POST /agentmemory/outline/section { artifact_id, node_id } | ||
| ``` |
There was a problem hiding this comment.
Add a language hint to the REST endpoints code block.
markdownlint flags MD040 for the fenced block on line 86. Tag it with text (or http) so renderers don't try to highlight it as code:
📝 Suggested change
-```
-POST /agentmemory/outline/build { path, artifact_id? }
-GET /agentmemory/outline?artifact_id=<id>
-POST /agentmemory/outline/section { artifact_id, node_id }
-```
+```text
+POST /agentmemory/outline/build { path, artifact_id? }
+GET /agentmemory/outline?artifact_id=<id>
+POST /agentmemory/outline/section { artifact_id, node_id }
+```📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ``` | |
| POST /agentmemory/outline/build { path, artifact_id? } | |
| GET /agentmemory/outline?artifact_id=<id> | |
| POST /agentmemory/outline/section { artifact_id, node_id } | |
| ``` |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 86-86: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/outline.md` around lines 86 - 90, Update the fenced code block
containing the REST endpoints (the lines starting with "POST
/agentmemory/outline/build", "GET /agentmemory/outline?artifact_id=<id>", "POST
/agentmemory/outline/section") so the opening fence is tagged with a language
hint (e.g., replace the leading "```" with "```text" or "```http") to satisfy
markdownlint MD040 and prevent code highlighting.
| export function registerOutlineFunctions(sdk: ISdk, kv: StateKV): void { | ||
| sdk.registerFunction( | ||
| { id: "mem::outline-build" }, | ||
| async (data: { path: string; artifact_id?: string }) => { | ||
| if (!data.path || typeof data.path !== "string") { | ||
| return { success: false, error: "path is required" }; | ||
| } | ||
| let stat; | ||
| let content: string; | ||
| try { | ||
| stat = await fs.stat(data.path); | ||
| content = await fs.readFile(data.path, "utf-8"); | ||
| } catch (err) { | ||
| return { success: false, error: `read failed: ${String(err)}` }; | ||
| } | ||
|
|
||
| const artifact_id = data.artifact_id || data.path; | ||
| const outline = buildOutline( | ||
| content, | ||
| artifact_id, | ||
| stat.mtime.toISOString(), | ||
| stat.size, | ||
| ); | ||
|
|
||
| await kv.set(KV.outlines, artifact_id, outline); | ||
|
|
||
| return { | ||
| success: true, | ||
| artifact_id, | ||
| title: outline.title, | ||
| nodeCount: countNodes(outline.nodes), | ||
| rootCount: outline.nodes.length, | ||
| }; | ||
| }, | ||
| ); |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Missing recordAudit() on the state-changing build operation.
mem::outline-build writes to KV (kv.set(KV.outlines, ...) on line 156) but does not record an audit entry. As per coding guidelines, mem-functions should "validate inputs, do work via kv operations, record audit via recordAudit(), return success object" and "Use recordAudit() for all state-changing operations". Capture the timestamp once and reuse it for both outline.generated_at (currently set inside buildOutline) and the audit entry, per the "Capture timestamps once with new Date().toISOString()" guideline.
A new outline-build (or similar) operation will likely also need to be added to the AuditEntry.operation union in src/types.ts.
As per coding guidelines: "Register functions using the pattern: sdk.registerFunction(...) ... record audit via recordAudit()" and "Use recordAudit() for all state-changing operations".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/functions/outline-build.ts` around lines 132 - 166, The
mem::outline-build handler in registerOutlineFunctions currently writes to
KV.outlines via kv.set but doesn't call recordAudit; fix by capturing a single
timestamp (const ts = new Date().toISOString()), pass that ts into buildOutline
(so outline.generated_at is set from the same ts rather than internally) and
after the successful kv.set call invoke recordAudit(...) with operation
"outline-build" (or add this operation to AuditEntry.operation union in
src/types.ts), the artifact_id, actor/context info, and the same ts; update
registerOutlineFunctions to perform recordAudit before returning success and
ensure buildOutline, kv.set, recordAudit, and KV.outlines are the referenced
spots to modify.
| function shouldTrack(filePath: string, whitelist: string[]): boolean { | ||
| if (whitelist.length > 0) { | ||
| return whitelist.some((p) => filePath === p || filePath.endsWith(p)); | ||
| } | ||
| return DEFAULT_PATTERNS.some((re) => re.test(filePath)); | ||
| } |
There was a problem hiding this comment.
Anchor whitelist suffix matches at a path separator.
filePath.endsWith(p) matches any string suffix, not a path-component suffix. A whitelist line like notes.md will incorrectly match /repo/mynotes.md, and docs/api.md will match /repo/legacy-docs/api.md. Anchor at / to avoid this:
🛡️ Suggested fix
-function shouldTrack(filePath: string, whitelist: string[]): boolean {
- if (whitelist.length > 0) {
- return whitelist.some((p) => filePath === p || filePath.endsWith(p));
- }
- return DEFAULT_PATTERNS.some((re) => re.test(filePath));
-}
+function shouldTrack(filePath: string, whitelist: string[]): boolean {
+ if (whitelist.length > 0) {
+ return whitelist.some(
+ (p) => filePath === p || filePath.endsWith(`/${p}`),
+ );
+ }
+ return DEFAULT_PATTERNS.some((re) => re.test(filePath));
+}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/hooks/outline-regen.ts` around lines 36 - 41, Update shouldTrack to only
treat whitelist entries as matching path components by anchoring suffix checks
at a path separator: instead of using filePath.endsWith(p) use a separator-aware
check (e.g., filePath === p || filePath.endsWith(path.sep + p) or use
path.posix/path.sep normalization) and ensure path is normalized before
comparison; reference the shouldTrack function and import/use Node's path
utilities if needed so entries like "notes.md" only match ".../notes.md" and not
"...mynotes.md".
| case "memory_build_outline": { | ||
| if (typeof args.path !== "string" || !args.path.trim()) { | ||
| return { status_code: 400, body: { error: "path is required" } }; | ||
| } | ||
| const outlineBuildResult = await sdk.trigger("mem::outline-build", { | ||
| path: args.path, | ||
| artifact_id: typeof args.artifact_id === "string" ? args.artifact_id : undefined, | ||
| }); | ||
| return { status_code: 200, body: { content: [{ type: "text", text: JSON.stringify(outlineBuildResult, null, 2) }] } }; | ||
| } | ||
|
|
||
| case "memory_get_outline": { | ||
| if (typeof args.artifact_id !== "string" || !args.artifact_id.trim()) { | ||
| return { status_code: 400, body: { error: "artifact_id is required" } }; | ||
| } | ||
| const outlineGetResult = await sdk.trigger("mem::outline-get", { | ||
| artifact_id: args.artifact_id, | ||
| }); | ||
| return { status_code: 200, body: { content: [{ type: "text", text: JSON.stringify(outlineGetResult, null, 2) }] } }; | ||
| } | ||
|
|
||
| case "memory_get_section": { | ||
| if (typeof args.artifact_id !== "string" || !args.artifact_id.trim()) { | ||
| return { status_code: 400, body: { error: "artifact_id is required" } }; | ||
| } | ||
| if (typeof args.node_id !== "string" || !args.node_id.trim()) { | ||
| return { status_code: 400, body: { error: "node_id is required" } }; | ||
| } | ||
| const outlineSectionResult = await sdk.trigger("mem::outline-section", { | ||
| artifact_id: args.artifact_id, | ||
| node_id: args.node_id, | ||
| }); | ||
| return { status_code: 200, body: { content: [{ type: "text", text: JSON.stringify(outlineSectionResult, null, 2) }] } }; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check backend path validation in outline-build.ts
fd -e ts "outline-build" --exec cat {}Repository: rohitg00/agentmemory
Length of output: 10968
Add .md extension validation to memory_build_outline MCP handler to restrict file reads.
The backend function mem::outline-build in src/functions/outline-build.ts performs no validation on the path argument—it directly calls fs.stat() and fs.readFile() without extension or directory checks. Combined with the line offset storage in mem::outline-section, this enables arbitrary file reads:
- Call
memory_build_outline({ path: "/home/user/.ssh/config" })— any file with ATX-style headings (e.g.,# Host) gets indexed. - Retrieve the outline and node offsets.
- Call
memory_get_section()with those offsets to extract raw file content.
The MCP handler is the security boundary and must validate. At minimum, enforce .md extension:
case "memory_build_outline": {
if (typeof args.path !== "string" || !args.path.trim()) {
return { status_code: 400, body: { error: "path is required" } };
}
+ if (!args.path.trim().toLowerCase().endsWith(".md")) {
+ return { status_code: 400, body: { error: "path must point to a .md file" } };
+ }
const outlineBuildResult = await sdk.trigger("mem::outline-build", {A stronger fix restricts path against a configurable allowlist of safe prefixes (e.g., project working directory only).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/mcp/server.ts` around lines 974 - 1007, The MCP handler for
"memory_build_outline" currently forwards arbitrary paths to mem::outline-build;
update the "memory_build_outline" case in src/mcp/server.ts to validate
args.path endsWith(".md") (and is non-empty) before calling sdk.trigger,
rejecting requests with a 400 error if not; additionally, normalize and
constrain the path to a safe prefix (e.g., resolve and ensure it is inside the
server's working/project directory) or consult a configurable allowlist of
allowed base directories, and mention these same constraints in any error
responses so callers get clear feedback; keep the existing calls to
mem::outline-build and mem::outline-section but enforce this validation in the
memory_build_outline handler to prevent arbitrary file reads.
|
Thanks @MainQuest99 — outline indexing for long markdown artifacts is a great idea, and PageIndex is exactly the right reference. Three new MCP tools + REST endpoints + a hook + a backfill script is a lot of surface, but it all looks coherent. Two things before merge: 1. Rebase needed. Branch is currently CONFLICTING against 2. Consistency-rule audit per
I see a few of these touched in the diff; please confirm all are. Once rebased + green, this is in scope for Q2 (Depth) and we can take it. One scope question while you're here: is the |
Summary
memory_build_outline,memory_get_outline,memory_get_section) + 3 REST endpoints under/agentmemory/outline/*.CLAUDE.md/MEMORY.md/AGENTS.mdfiles in the user's repos.Schema
New KV scope
mem:outlineskeyed byartifact_id(absolute path). Outline stores the heading tree, sourcemtime+sizefor staleness detection, and positionalnode_ids like1.2.3. Markdown ATX headings only; fenced code blocks are skipped.Performance
Round-trip on a real 597-line / 37 KB CLAUDE.md averages 0.16 ms for parse + node lookup + line slice (well under the 100 ms target). Pure parser on a 600-line synthetic doc: <30 ms.
Test plan
npm test— 735 / 735 passing (16 new outline tests)npx tsx scripts/outline-backfill.tsonce the engine is up locallyOut of scope
No embeddings, no PDF support, no semantic summary on nodes (
summaryfield reserved). Seedocs/outline.mdfor full schema, hook setup, and limitations.🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Documentation