Deterministic insights and review-only suggestions#48

Open
tony wants to merge 11 commits into streamline-04 from streamline-05

tony commented Jun 6, 2026

Stacked on #47 — retarget to master after it merges; until then the diff below is insights/suggestions only.

Summary

  • Add deterministic feature and artifact storage: the DB index gains a record_features table (simhash/minhash/quality flags) plus insight-run, cluster, variant-edge, omission-finding, and suggestion schema. db sync --features defer|inline defers expensive feature extraction by default, with a batched missing-feature refresh API for insight runs.
  • Add a deterministic insights engine (ADR 0006): agentgrep insights analyze|list|explain computes similarity variant edges (simhash/minhash candidates confirmed by token-set Jaccard) and omission findings against a --target file, persisting runs, clusters, and evidence rows with provenance — no LLM involved.
  • Add live analyze progress: multi-line stderr progress with feature-refresh and artifact-write counters, Enter-to-exit, and clean JSON stdout.
  • Add bounded persisted listings: insights list pages with a default limit backed by a confidence-order index, explain is count-only, and listing payloads render as human summaries by default with explicit --json/--ndjson.
  • Add review-only suggestion artifacts (ADR 0007): agentgrep suggestions list|show|render turns insight evidence into persisted instruction-change suggestions for AGENTS.md/skill surfaces. Suggestions render for human review; agentgrep never edits instruction files or calls an LLM automatically.
  • Add read-only MCP tools insights_list and suggestions_list with bounded listing payloads and result totals.

Changes by area

Storage

src/agentgrep/db.py: feature builders, the feature/artifact schema, --features sync modes with deferred counters, refresh_missing_features with progress reporting, and similarity-row iteration.
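The batched missing-feature refresh can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the code in db.py: the `records` layout, the `token_count` column, and the `refresh_missing` name and signature are all hypothetical, and only the `record_features` table name comes from this PR.

```python
from __future__ import annotations

import sqlite3
from collections.abc import Callable


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory cache with hypothetical table layouts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE record_features ("
        "record_id INTEGER PRIMARY KEY REFERENCES records(id), "
        "token_count INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO records (id, text) VALUES (?, ?)",
        [(i, f"record number {i}") for i in range(1, 6)],
    )
    # Pretend an inline sync already extracted features for record 1.
    conn.execute("INSERT INTO record_features VALUES (1, 3)")
    conn.commit()
    return conn


def refresh_missing(
    conn: sqlite3.Connection,
    batch_size: int = 2,
    on_progress: Callable[[int], None] | None = None,
) -> int:
    """Fill in feature rows for records that lack them, one batch at a time."""
    done = 0
    while True:
        # Anti-join: only records without a matching feature row.
        rows = conn.execute(
            "SELECT r.id, r.text FROM records AS r "
            "LEFT JOIN record_features AS f ON f.record_id = r.id "
            "WHERE f.record_id IS NULL ORDER BY r.id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return done
        conn.executemany(
            "INSERT INTO record_features (record_id, token_count) VALUES (?, ?)",
            [(record_id, len(text.split())) for record_id, text in rows],
        )
        conn.commit()  # Commit per batch so an interrupted run keeps its work.
        done += len(rows)
        if on_progress is not None:
            on_progress(done)
```

Calling `refresh_missing(make_demo_db(), on_progress=print)` would report the running total after each batch; a second call on the same connection finds nothing left to do.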

Engines

src/agentgrep/insights.py: InsightEngine with similarity and omission analysis, bounded listings, and count APIs. src/agentgrep/suggestions.py: SuggestionEngine producing persisted, review-only SuggestionArtifact rows tied to insight evidence.
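To make the similarity path concrete, here is a minimal sketch of simhash candidate generation confirmed by token-set Jaccard. It rests on assumptions that are not taken from insights.py: whitespace tokenization, a 64-bit simhash built from BLAKE2b, and the `max_hamming` and `min_jaccard` thresholds are hypothetical, and the minhash side is omitted for brevity.

```python
import hashlib
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercased whitespace tokens; a stand-in for the real tokenizer."""
    return set(text.lower().split())


def simhash(toks: set[str], bits: int = 64) -> int:
    """Classic simhash: a per-bit majority vote over stable token hashes."""
    weights = [0] * bits
    for tok in toks:
        digest = hashlib.blake2b(tok.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        for bit in range(bits):
            weights[bit] += 1 if (value >> bit) & 1 else -1
    return sum(1 << bit for bit in range(bits) if weights[bit] > 0)


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-set Jaccard overlap; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def variant_edges(
    records: dict[str, str],
    max_hamming: int = 16,      # hypothetical candidate threshold
    min_jaccard: float = 0.6,   # hypothetical confirmation threshold
) -> list[tuple[str, str, float]]:
    """Return (left, right, score) for pairs that pass both stages."""
    toks = {rid: tokens(text) for rid, text in records.items()}
    sigs = {rid: simhash(t) for rid, t in toks.items()}
    edges = []
    # All-pairs comparison keeps the sketch short; a real index would bucket.
    for left, right in combinations(sorted(records), 2):
        distance = bin(sigs[left] ^ sigs[right]).count("1")
        if distance > max_hamming:
            continue  # not a candidate, skip the exact check
        score = jaccard(toks[left], toks[right])
        if score >= min_jaccard:
            edges.append((left, right, round(score, 3)))
    return edges
```

Because every step is a pure function of the record text, the same inputs always produce the same edges, which is the reproducibility property the deterministic-first decision relies on.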

CLI

src/agentgrep/cli/parser.py, src/agentgrep/cli/render.py: the insights and suggestions command groups, analyze progress rendering, bounded listing and suggestion formatters, parse-time --target requirement for omission runs.
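A parse-time requirement of this kind can be sketched with plain argparse. The `--kind` flag and the error message below are hypothetical stand-ins, since this PR only states that omission runs require `--target`; the point is that validation happens before any cache is opened.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; the real flags live in cli/parser.py."""
    parser = argparse.ArgumentParser(prog="insights-analyze-sketch")
    parser.add_argument(
        "--kind", choices=["similarity", "omission"], default="similarity"
    )
    parser.add_argument("--target", help="instruction file to check for omissions")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Cross-argument rule: reject the run at parse time, before any work starts.
    if args.kind == "omission" and args.target is None:
        parser.error("--target is required for omission runs")
    return args
```

`parser.error` prints the usage line to stderr and exits with status 2, so a missing `--target` fails the same way as any other argparse usage error.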

MCP

src/agentgrep/mcp/tools/insight_tools.py, src/agentgrep/mcp/models.py: two read-only tools with pydantic response models and capability registration.
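One way to honor a read-only contract on a SQLite cache is to open it through a `mode=ro` URI and close it on every path. The sketch below is an assumption about the approach, not the code in insight_tools.py: the helper names and the `suggestions` table layout are hypothetical.

```python
from __future__ import annotations

import sqlite3
from contextlib import closing
from pathlib import Path


def open_read_only(db_path: Path) -> sqlite3.Connection | None:
    """Open an existing cache read-only; never create a missing file."""
    if not db_path.exists():
        return None
    # mode=ro makes SQLite reject writes and refuse to create the file.
    return sqlite3.connect(f"{db_path.resolve().as_uri()}?mode=ro", uri=True)


def count_rows_read_only(db_path: Path) -> int:
    """Count rows in a hypothetical suggestions table without side effects."""
    conn = open_read_only(db_path)
    if conn is None:
        return 0  # a missing cache yields an empty answer, not an error
    with closing(conn):  # closes the connection even if the query raises
        return conn.execute("SELECT COUNT(*) FROM suggestions").fetchone()[0]
```

Note that `sqlite3.Connection` used directly as a context manager only manages transactions; `contextlib.closing` is what actually closes the connection.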

Docs

ADRs 0006–0007; per-command CLI pages for insights and suggestions; the /insights/ feature section; MCP tool documentation.

Design decisions

  • Deterministic before semantic: similarity uses simhash/minhash candidate generation confirmed by Jaccard overlap, so insight runs are reproducible and evidence-backed; semantic backends such as LanceDB remain an optional later addition, not a required dependency (ADR 0006).
  • Hard review boundary for suggestions: suggestion workflows never edit AGENTS.md/skills and never invoke an LLM on their own — an LLM may only reach this surface by explicitly calling the CLI/MCP tools (ADR 0007).
  • Features are deferred secondary state: sync writes the ledger, records, and FTS index fast by default and lets insight runs batch-refresh missing features.
  • Listings are bounded by construction: variant edges grow superlinearly on duplicate-heavy histories, so list surfaces page with a default limit and report totals instead of materializing full result sets.
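The bounded-listing decision can be illustrated with a small sketch: count cheaply, fetch one page in confidence order, and report whether the page was cut short. The `variant_edges` columns, the index name, and the function names here are hypothetical; the `total`, `returned`, and `truncated` fields mirror the payload shape this PR describes.

```python
import sqlite3


def make_edges_db(edges: list[tuple[str, str, float]]) -> sqlite3.Connection:
    """In-memory table with a hypothetical variant-edge layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variant_edges ("
        "left_id TEXT NOT NULL, right_id TEXT NOT NULL, confidence REAL NOT NULL)"
    )
    # The index matches the ORDER BY below, so a page can be read
    # from the front of the index instead of sorting every row.
    conn.execute(
        "CREATE INDEX idx_variant_edges_confidence "
        "ON variant_edges (confidence DESC, left_id, right_id)"
    )
    conn.executemany("INSERT INTO variant_edges VALUES (?, ?, ?)", edges)
    conn.commit()
    return conn


def list_variant_edges(conn: sqlite3.Connection, limit: int = 50) -> dict:
    """Return one page of edges plus totals, never the full result set."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    total = conn.execute("SELECT COUNT(*) FROM variant_edges").fetchone()[0]
    rows = conn.execute(
        "SELECT left_id, right_id, confidence FROM variant_edges "
        "ORDER BY confidence DESC, left_id, right_id LIMIT ?",
        (limit,),
    ).fetchall()
    return {
        "total": total,
        "returned": len(rows),
        "truncated": len(rows) < total,
        "items": [
            {"left": left, "right": right, "confidence": confidence}
            for left, right, confidence in rows
        ],
    }
```

The tie-breaking columns in the `ORDER BY` keep page contents stable across calls when several edges share the same confidence.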

Test plan

  • tests/test_db_index.py — feature modes, batched feature refresh with progress, deferred counters
  • tests/test_cache_cli.py — insights/suggestions CLI contracts: bounded listings, human summaries, analyze progress, early exit, --target requirement, JSON/NDJSON stdout
  • tests/test_agentgrep_mcp.py — bounded MCP listing payloads with totals
  • tests/test_widgets.py — rendered command-page docs for both groups
  • Full gate per commit: ruff check / ruff format, ty check, uv run pytest (incl. doctests), just build-docs

tony force-pushed the streamline-05 branch from 1553fd6 to 63b7302 on June 6, 2026 22:14
tony force-pushed the streamline-05 branch from 4ab05ce to d7376df on June 6, 2026 22:38
tony force-pushed the streamline-05 branch from d7376df to d695b0d on June 6, 2026 23:31
tony force-pushed the streamline-04 branch from d1d9c97 to 456ea4a on June 7, 2026 00:15
tony force-pushed the streamline-05 branch from 9df8d3a to 22ee543 on June 7, 2026 00:16
tony force-pushed the streamline-05 branch from 22ee543 to 0fdd15c on June 7, 2026 01:13
tony force-pushed the streamline-04 branch from 8f15333 to c4b8fdc on June 7, 2026 01:38
tony added 11 commits June 6, 2026 20:40

why: The insights analysis and suggestion workflows have different
ownership boundaries and failure modes from the DB cache. Recording
them as separate ADRs keeps the architecture reviewable while
preserving the deterministic-first insight decision and the review-only
suggestion boundary.

what:
- Add ADR 0006 for deterministic insights, variants, omissions, and
  optional semantic backends.
- Add ADR 0007 for reviewable suggestion skills and AGENTS.md/skill
  change semantics.
- Register both ADRs in the development architecture decision index.

why: ADR 0006 needs the DB index to carry deterministic similarity
features and persisted artifact tables before the insight engines can
land. Deferring feature extraction by default keeps db sync responsive
while inline mode populates the feature cache during sync itself.

what:
- Add the record_features table plus insight-run, cluster, variant-edge,
  omission-finding, and suggestion artifact schema to DbStore.
- Add simhash/minhash/quality-flag feature builders with a batched
  missing-feature refresh API and feature-refresh progress reporting.
- Add db sync --features defer|inline, features-deferred counters in
  sync results, progress lines, and human summaries.
- Extend db status and the db_status MCP payload with feature and
  artifact counts, and cover feature modes and refresh with tests.

why: ADRs 0006 and 0007 call for deterministic insight outputs and
review-only instruction suggestions over the DB index. The engines
generate persisted, evidence-backed artifacts without calling an LLM
and without editing instruction files.

what:
- Add InsightEngine with similarity variant edges, omission findings,
  bounded persisted listings, count APIs, and live analyze progress
  with feature-refresh and artifact-write counters.
- Add the review-only SuggestionEngine with persisted suggestion
  artifacts rendered for human review.
- Add agentgrep insights analyze|list|explain and suggestions
  list|show|render with human summaries by default and explicit JSON
  and NDJSON modes.
- Add the read-only insights_list and suggestions_list MCP tools with
  bounded payloads and totals.
- Cover engines, CLI contracts, progress rendering, and MCP payloads
  with regression tests.

why: The insights and suggestions command groups should follow the
existing CLI documentation contract, with feature explanations under
/insights/ and the MCP tool surface documented alongside the other
read-only tools.

what:
- Add argparse-backed CLI pages for insights analyze|list|explain and
  suggestions list|show|render with grid cards and toctree entries.
- Add the /insights/ feature guide and suggestions workflow page.
- Register insights_list and suggestions_list across the MCP docs,
  reference pages, and docs-build shim.
- Extend rendered-doc regression coverage to the new command pages.

why: The insight and suggestion listing tools opened the cache through
the migration path without closing it, leaking one SQLite connection
per MCP call and writing schema metadata from surfaces documented as
read-only. The insights and suggestions CLI commands leaked their
per-call connections the same way.

what:
- Open the insights_list and suggestions_list helpers read-only and
  close them via the runtime context manager.
- Close the per-call runtime in run_insights_command and
  run_suggestions_command on every exit path.
- Assert both tool helpers and the CLI path leave their connections
  closed.

why: The doctest guidance scopes runnable examples to pure helpers
with no external state. The insight count formatters, confidence
formatter, and identifier shortener qualify and had none.

what:
- Add Examples sections to the five insight count formatters,
  _format_confidence, and _short_identifier.

why: The schema-version-mismatch rebuild dropped only the base tables,
so feature rows and insight artifacts survived a rebuild with stale
contents keyed to regenerated record ids — breaking ADR 0008's promise
that a mismatch recreates the schema empty.

what:
- Extend the rebuild drop list with the feature and artifact tables,
  children before parents.
- Seed features and variant edges in the rebuild test, assert every
  status count is zero afterward, and assert sqlite_master matches a
  fresh create so future drop/create drift fails mechanically.
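The rebuild fix and its mechanical guard can be sketched as follows. The three-table schema is a hypothetical miniature of the real one; the pattern it illustrates is dropping child tables before their parents and comparing `sqlite_master` against a fresh create.

```python
import sqlite3

# Hypothetical miniature schema: two child tables referencing records.
SCHEMA = (
    "CREATE TABLE records (id INTEGER PRIMARY KEY, text TEXT NOT NULL)",
    "CREATE TABLE record_features ("
    "record_id INTEGER PRIMARY KEY REFERENCES records(id), simhash TEXT NOT NULL)",
    "CREATE TABLE variant_edges ("
    "left_id INTEGER NOT NULL REFERENCES records(id), "
    "right_id INTEGER NOT NULL REFERENCES records(id), "
    "confidence REAL NOT NULL)",
)

# Children before parents, so no drop leaves dangling foreign-key rows.
DROP_ORDER = ("variant_edges", "record_features", "records")


def create_schema(conn: sqlite3.Connection) -> None:
    for statement in SCHEMA:
        conn.execute(statement)


def rebuild(conn: sqlite3.Connection) -> None:
    """Drop every table, features and artifacts included, then recreate empty."""
    for table in DROP_ORDER:
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    create_schema(conn)
    conn.commit()


def schema_snapshot(conn: sqlite3.Connection) -> list[tuple]:
    """Schema as SQLite records it; equal snapshots mean equal schemas."""
    return conn.execute(
        "SELECT type, name, sql FROM sqlite_master ORDER BY type, name"
    ).fetchall()
```

A test in this style seeds rows, calls `rebuild`, and asserts both that every table is empty and that `schema_snapshot` equals the snapshot of a freshly created database, so a table added to the schema but forgotten in the drop list fails immediately.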

why: insights list and explain, suggestions show and render, and
suggestions list without a target only read persisted artifacts, yet
they opened the cache through the migration path, which writes schema
metadata on every call and creates a missing cache as a side effect.
The db status surfaces and the MCP listing tools already follow the
read-only contract.

what:
- Route the read-only insights and suggestions actions through a
  shared read-only open with the db-status semantics: empty payloads
  for missing caches without creating the file, and clean errors for
  foreign files.
- Keep writable opens for insights analyze and suggestions list with a
  target.
- Cover missing-cache list behavior for both command groups and pin
  the read actions to read-only captured opens.

why: Suggestion listings fetched every persisted row while the sibling
insight listings were explicitly bounded with totals and truncation
flags. Suggestion rows accumulate across targets and runs, so the list
surfaces need the same page discipline.

what:
- Add count_suggestions and a limit parameter to list_suggestions,
  backed by a confidence-order index.
- Add --limit to suggestions list with parse-time validation, and emit
  a bounded payload with total, returned, and truncated fields plus a
  matching human summary.
- Extend the suggestions_list MCP tool and response model with limit,
  total, and truncation fields.
- Cover parser cases, the bounded CLI payload, and the bounded MCP
  payload, and document the new flag and response shape.

why: The lazy-import rule reserves function-local imports for heavy
modules that hurt CLI cold-start. json is a cheap stdlib module, and
insights.py is itself only imported by insight commands.

what:
- Move the json import from InsightEngine._record_run to module level.

why: simhash_hex, minhash_signature, and format_db_deferred_count are
pure, offline-runnable helpers whose siblings all carry Examples
blocks; the doctest guidance covers them the same way.

what:
- Add Examples sections to the two feature-signature helpers and the
  deferred-count formatter.
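For readers unfamiliar with the convention, an Examples section on a pure helper looks roughly like the sketch below. Both functions are hypothetical stand-ins written for illustration; they are not the project's `_short_identifier` or `_format_confidence`.

```python
def short_identifier(identifier: str, length: int = 8) -> str:
    """Shorten a long identifier for human summaries.

    Examples
    --------
    >>> short_identifier("0123456789abcdef")
    '01234567'
    >>> short_identifier("abc")
    'abc'
    """
    return identifier[:length]


def format_confidence(value: float) -> str:
    """Render a confidence score with two decimal places.

    Examples
    --------
    >>> format_confidence(0.5)
    '0.50'
    >>> format_confidence(1)
    '1.00'
    """
    return f"{value:.2f}"
```

Helpers like these need no database, network, or filesystem, which is what makes them safe to execute as doctests during a normal pytest run.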