Skip to content

Add agentic retrieval: session state, vector grep, virtual files, related command, telemetry#9

Merged
jpr5 merged 19 commits into
mainfrom
feature/agentic-retrieval
Apr 6, 2026
Merged

Add agentic retrieval: session state, vector grep, virtual files, related command, telemetry#9
jpr5 merged 19 commits into
mainfrom
feature/agentic-retrieval

Conversation

@jpr5

@jpr5 jpr5 commented Apr 6, 2026

Copy link
Copy Markdown
Contributor

Summary

Evolves mcp-docs bash tools from static in-memory filesystem exploration into an integrated agentic retrieval system that matches and exceeds Mintlify's ChromaFS approach. All features are opt-in via the bash: config subsection — existing configs work unchanged.

  • Session state: Persistent CWD tracking across commands. cd /docs && ls works across separate tool calls. Path validation rejects nonexistent directories.
  • Vector-backed grep: Configurable 3-pass search (semantic embeddings + ILIKE text + dedup) with graceful degradation when search infrastructure is unavailable. Falls back to in-memory grep for bash-only configs.
  • Virtual files: Auto-generated INDEX.md (file listing) and SEARCH_TIPS.md (usage guidance) injected into the virtual filesystem at startup.
  • Cross-paradigm hints: related /path/to/file command finds semantically similar files across all sources. Grep misses suggest companion search tools.
  • File metadata: buildFileMetadata and formatLsLong for ls -l style output with sizes and line counts.
  • Workspace tracker: Per-session writable /workspace/ directory with 1MB size cap.
  • Telemetry hooks: Tracks file access, grep misses, and commands with buffer overflow protection (10K cap).
  • Webhook refresh: Atomic bash instance swap when sources are reindexed via GitHub webhooks.
  • 154 tests covering all features, edge cases, and error paths.

Full design: Proposal on Notion

Test plan

  • All 154 tests pass (npx vitest run)
  • TypeScript compiles clean (npx tsc --noEmit)
  • 2-round CR loop with 7 agents — 7 bugs found and fixed, round 2 clean
  • Deploy to Railway staging and verify bash tools work with new config options
  • Verify related command returns meaningful results against CopilotKit docs
  • Verify vector grep returns results for common queries (useAction, streaming, etc.)

🤖 Generated with Claude Code

jpr5 added 5 commits April 6, 2026 13:23
BashOptionsSchema and BashCacheConfigSchema in types.ts, persistent CWD
tracking with SessionStateManager and WorkspaceTracker in bash-session.ts,
plus comprehensive tests for session state, workspace, tool config, and
bash tool integration.
parseGrepCommand and vectorGrep in bash-grep.ts with semantic + ILIKE text
+ dedup search passes, textSearchChunks with proper ILIKE escaping in
queries.ts, and comprehensive grep tests.
Virtual files (INDEX.md, SEARCH_TIPS.md) in bash-virtual-files.ts, file
metadata with buildFileMetadata and formatLsLong in bash-fs.ts, related
command with self-exclusion fix and grep-miss suggestions in bash-related.ts,
rebuildBashInstance webhook refresh, and tests for all features.
BashTelemetry in bash-telemetry.ts with buffer overflow protection and
comprehensive telemetry tests.
Integrate all bash tool features (cd interception, grep interception,
multi-source search, path validation) in bash.ts, session state wiring
in server.ts and index.ts, bash tool options in mcp-docs.yaml, and
documentation updates in README.md.

@mme mme left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

Reviewed by 5 specialized agents (code-reviewer, test-analyzer, silent-failure-hunter, type-design-analyzer, comment-analyzer) across all 33 changed files.


Critical Issues (3 found — must fix before merge)

1. buildBashFilesMap doesn't resolve git-based source paths

File: src/mcp/tools/bash-fs.ts:55

buildBashFilesMap resolves source paths using path.resolve(source.path), which resolves relative to process.cwd(). Production sources have repo: fields and are cloned by the indexing orchestrator into /tmp/mcp-repos/<repoName>/. The function doesn't account for this — it needs to prepend the clone directory and repo name for git-based sources.

Both explore-docs and explore-code in mcp-docs.yaml reference git-based sources, so both bash tools will have zero files in production.

Fix: Accept a cloneDir parameter and, for sources with a repo field, construct the path as path.join(cloneDir, repoName, source.path) — mirroring how SourceIndexer resolves repo paths.

2. Bash grep_strategy: vector/hybrid without search tools crashes at runtime

Files: src/index.ts:318-324, src/config.ts:49-53

DB initialization (initializeSchema()) is gated on hasSearchTools() || hasCollectTools(). But a bash-only config with grep_strategy: vector calls vectorGrepsearchChunksgetPool(), which throws because the pool was never created.

Fix: Add a hasBashVectorGrep() helper and include it in the DB init condition:

const needsRag = hasSearchTools() || hasBashVectorGrep();
const needsDb = needsRag || hasCollectTools();

3. Shared mutable Bash filesystem across sessions

Files: src/index.ts:36, src/mcp/server.ts:54

A single Bash instance per tool name is shared across all MCP sessions. The just-bash filesystem persists writes across exec() calls, so one session's echo "x" > /docs/file.mdx or rm /INDEX.md corrupts the shared instance for all concurrent and future sessions.

Fix: Consider per-session OverlayFs copy-on-write layers, per-session instances, or at minimum document this as intentional and accept that the FS is shared.


Important Issues (11 found — should fix)

4. vectorGrep silently degrades to single-backend search

src/mcp/tools/bash-grep.ts:72-87 — When one of semantic/text search fails, results are silently incomplete. Only a console.warn is emitted. The agent gets a normal-looking grep result with no indication it's partial.

5. Empty file map from bad paths silently produces useless tool

src/mcp/tools/bash-fs.ts:54-59 — Misconfigured paths or bad volume mounts → 0 files loaded with only a console.warn. The tool appears healthy but returns empty results.

6. refreshBashInstances is fire-and-forget with unreliable 5s timeout

src/index.ts:73-77 — Webhook returns 200, then refresh is scheduled via setTimeout(5000). If refresh fails, bash serves stale data indefinitely. The 5s heuristic is also likely too short for actual reindexing.

7. Config.databaseUrl set to empty string '' when not needed

src/config.ts:72-73 — Creates a latent crash if any code path reaches getPool() unexpectedly. Should use undefined instead.

8. parseGrepCommand mishandles flags with arguments

src/mcp/tools/bash-grep.ts:36-47grep -m 5 "pattern" treats 5 as the pattern. Flags like -e, -m, -f take arguments that aren't consumed by the parser.

9. Command injection in shell strings via unescaped paths

src/mcp/tools/bash.ts:88,111related command interpolates user paths into cat "${path}" without escaping. Limited blast radius (WASM sandbox) but still a bug. Use bash.exec('cat', { args: [resolvedPath] }) instead.

10. Misleading comment: "stateless per-request"

src/mcp/server.ts:10-14createMcpServer JSDoc says "stateless per-request" but the function creates per-session servers with shared state.

11. Misleading comment: "each exec starts from /"

src/index.ts:330 — False when session_state is enabled; commands start from the session's persisted CWD.

12. README documents workspace & telemetry as features, but neither is wired up

WorkspaceTracker and BashTelemetry are defined and tested but never instantiated in the actual tool execution path. The README presents them as working features.

13. BashOptionsSchema .default().partial() anti-pattern

src/types.ts:61-68 — The .default() calls on inner fields are unreachable after .partial() on the outer object. Consumers must use ?? everywhere. The schema promises defaults it cannot deliver.

14. Piped grep silently bypasses vector search

src/mcp/tools/bash.ts:124-139cat file | grep pattern falls through to in-memory bash, completely bypassing vector infrastructure with no indication to the agent.


Suggestions (8 found — nice to have)

  • TelemetryEvent should be a discriminated union instead of optional fields — currently allows invalid combinations (bash-telemetry.ts:1-7)
  • ParsedGrep should use discriminated union instead of isGrep: boolean — TypeScript can't narrow the type (bash-grep.ts:4-9)
  • No integration test for vector grep interception through registerBashTool — unit tests exist for both sides but wiring is untested (bash.ts:119-135)
  • No test for createMcpServer bash tool wiring — all MCP tests call registerBashTool directly (server.ts:51-73)
  • No test for related command interception through the registered tool (bash.ts:102-117)
  • textSearchChunks ILIKE escaping has no unit test (db/queries.ts:85-116)
  • SessionStateManager has no size cap — unbounded memory growth under burst load (bash-session.ts:22-45)
  • Vector grep ignores file path arguments from grep command — not documented (bash.ts:123-125)

Strengths

  • Clean discriminated union for tool types (AnyToolConfigSchema) with exhaustive switch
  • Thorough Zod cross-field validation in superRefine
  • Strong BashSessionState encapsulation with proper POSIX path handling
  • Excellent degradation testing in vectorGrep — all 4 backend combinations covered
  • Good MCP protocol integration tests using InMemoryTransport
  • 154 tests covering features, edge cases, and error paths
  • Config validation tests are especially thorough

Cross-cutting Theme

The most dangerous pattern across this PR is silent degradation presented as normal operation. Issues #4, #5, #6, and #14 all share the same problem: when infrastructure fails or is misconfigured, the system continues with reduced capability while giving the agent zero indication that results are incomplete or stale. In a retrieval system where the agent makes decisions based on what it finds, silently incomplete results lead to wrong conclusions with no reason to suspect the search infrastructure.

🤖 Reviewed with Claude Code

@marthakelly

Copy link
Copy Markdown
Contributor

In addition to the previous preview:

Review feedback

bash-related.ts:42 — self-exclusion false positive

The filePath.endsWith('/' + r.file_path) check will incorrectly drop any file sharing a basename with the source file. In the CopilotKit docs, index.mdx exists in many directories — calling related on /docs/components/index.mdx would exclude every index.mdx across the entire doc set. The existing review doesn't mention this. Fix is to compare full normalized paths rather than basename suffix match.

-l flag forwarding

The review notes "vector grep ignores file path arguments" only in the suggestions section, but the flag issue is equally concrete. grep -l "pattern" (list files only) is one of the most common grep invocations agents use to build a file list before reading. Getting full file:line:content output when you asked for filenames only will break any agent workflow that pipes the result into a next step.

Workspace and telemetry are uninstantiated

This is the point I'd most want to echo from the existing review. The README presents both as working features, but neither is hooked into the execution path.

@mme

mme commented Apr 6, 2026

Copy link
Copy Markdown
Contributor

Worth considering not to mess around with grep - agents use it all the time and expect the flags to work the way they specify.

Instead, expose a command line tool that is used in the real world for semantic search. https://github.com/tobi/qmd is well known, used in the memory engine of OpenClaw and feels like a great candidate.

jpr5 added 14 commits April 6, 2026 14:25
…n and command injection

- Remove grep interception from bash tool; grep now passes through to bash unchanged
- Add qmd command interception for vector-backed semantic search (inspired by tobi/qmd)
- Add parseQmdCommand() to bash-grep.ts with support for quoted and unquoted queries
- Fix parseGrepCommand to skip argument tokens for flags -e, -m, -f, -A, -B, -C
- Fix command injection in related command by escaping path before interpolation
- Fix related self-exclusion false positive: use exact vPath match instead of basename suffix
- Update grep-miss suggestion to recommend qmd alongside search tools
- Add tests for parseQmdCommand, flag-with-argument parsing, self-exclusion fix, grep passthrough
- buildBashFilesMap now resolves git-based source paths via cloneDir parameter
- Add hasBashSemanticSearch() and include it in needsDb condition
- Change Config.databaseUrl to string|undefined, remove empty-string fallback
- Add undefined guards in db/client.ts getPool/initializePGlite/initializeSchema
- Remove misleading .default() calls from BashOptionsSchema (negated by .partial())
- Fix stale comments in server.ts and index.ts about stateless behavior
- Increase webhook bash refresh delay from 5s to 30s with TODO for event-based
…e unadvertised features

- Replace grep_strategy description to document the qmd semantic search
  command instead of claiming grep is intercepted/routed through vector search
- Add qmd command usage example and description alongside existing related command
- Remove workspace and telemetry bullet points (WorkspaceTracker and
  BashTelemetry are defined but never instantiated in the execution path)
- Remove workspace: true from the YAML example
- Add shared filesystem known limitation note
- Update grep_strategy default comment in YAML example
…dup order, partial failure warnings, and self-exclusion test

- config.ts: parseConfig() needsDb now includes hasBashSemanticSearch() to match index.ts logic
- bash-grep.ts: iterate semantic results first in dedup to preserve similarity scores
- bash-grep.ts: add stderr warnings when one search backend fails but the other succeeds
- bash-related.test.ts: fix self-exclusion test filePath to match vPath construction (/${source_name}/${file_path})
- server.ts: Fix JSDoc to reflect that each bash tool gets its own FS instance
- README: Fix cd example (only bare cd persists CWD, not compound commands)
- README: Fix 3-pass to 2-pass search description
- README: Fix shared FS note (filesystem is read-only, no write path)
- README: Remove max_file_size and cache from bash YAML (dead/ignored config)
@jpr5 jpr5 merged commit 798ae16 into main Apr 6, 2026
@jpr5 jpr5 deleted the feature/agentic-retrieval branch April 6, 2026 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants