feat(dsl): SYS EXPLAIN REMEMBER dry-runs the pipeline #151
Merged
KailasMahavarkar merged 1 commit into main on Apr 20, 2026
Conversation
Step 2 of the retrieval-observability effort. `SYS EXPLAIN REMEMBER "q"`
runs the gather + fuse + temporal pipeline without materializing nodes,
running the reranker, expanding nucleus, or mutating recall counts.
Returns a plan listing candidate slot ids with per-signal scores, plus
the same `meta["signals"]` telemetry shipped in Step 1.
Usage:
gs.execute('SYS EXPLAIN REMEMBER "European capitals" LIMIT 3')
# kind="plan"
# data.candidates = [
# {slot, id, fused_score, vector_sim, bm25_score, recency_score,
# graph_score, co_bonus, recall_boost},
# ...
# ]
# meta.signals = {fusion, recency, stages, reranker, nucleus, ...}
Implementation:
- RememberQuery handler gains an internal `_plan_only=False` kwarg. When
True, returns after fusion + temporal filter with a `Result(kind="plan",
data={candidates}, meta={signals})`. Skips: materialization, rerank,
nucleus expansion, recall-count bumps, similarity buffer.
- SYS EXPLAIN handler (sys/queries.py) dispatches `RememberQuery` inners
to `self._executor._remember(inner, _plan_only=True)` via a new
`_executor` back-reference on `SystemExecutor`. Wired in store.py at
construction.
- Empty-store and empty-gather branches also respect `_plan_only` and
return plan-shaped results.
Why:
- Callers tuning fusion weights or debugging "why did this rank where it
did" needed the signal breakdown without paying recall-count mutation
cost on every inspection.
- Foundation for Step 3 (ANSWER verb) which will reuse the dry-run path
to fetch candidates before handing them to a reader LLM.
Tests:
test_sys_explain_remember_returns_plan_without_side_effects
test_sys_explain_remember_empty_store_returns_empty_plan
Full suite: 1758 passed, 101 skipped (+2 new), zero regressions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KailasMahavarkar added a commit that referenced this pull request on Apr 20, 2026
We shipped three retrieval-observability features in #150/#151/#152 but
the skills, docs, and README said nothing about them. An LLM loading
graphstore-dsl today wouldn't know ANSWER exists; a human reading the
README wouldn't either. Fix:
graphstore-dsl SKILL:
- ANSWER verb in the reads table + dedicated subsection explaining
reader resolution + error capture semantics
- Full per-node signal list (old doc predicted _graph_score / _recall
before they shipped; now they do and we have _co_bonus, _recall_boost,
_rank_stage, _fusion_score, _rerank_score)
- meta["signals"] telemetry block documented
- SYS EXPLAIN REMEMBER dry-run subsection
- Added ANSWER + SYS EXPLAIN rows to query-generation pattern table
graphstore-builder SKILL:
- q.answer(...) row in the reads table
- Debugging section expanded: full signal list + meta block JSON +
q.sys.explain(inner) dry-run example + q.answer() end-to-end
example + named-reader A/B + reader resolution order
website/docs/dsl/reference.md:
- ANSWER examples (bare + USING "reader")
- New subsections on signal scores, SYS EXPLAIN REMEMBER dry-run, and
ANSWER retrieval-augmented synthesis
website/docs/query-builder.md:
- q.answer(...) row in reads table
- Retrieval-pipeline observability section
- Retrieval + reader synthesis section with named-reader A/B pattern
README.md:
- REMEMBER section expanded: per-node scores + meta["signals"] +
SYS EXPLAIN REMEMBER dry-run example
- New ANSWER section showing reader wiring, cited_slots, named readers
Every remaining claim verified against code. Em dash sweep clean (Rule 9).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KailasMahavarkar added a commit that referenced this pull request on Apr 20, 2026
* feat(dsl): ANSWER verb - retrieval + pluggable reader LLM
Step 3 of the retrieval-observability effort. Full-loop: retrieve with
REMEMBER + synthesize with a user-configured reader callable.
Grammar:
answer_q: "ANSWER" STRING at_clause? tokens_clause?
limit_clause? where_clause? using_reader?
using_reader: "USING" STRING
Usage:
def my_reader(prompt: str, max_tokens: int = 1000) -> str: ...
gs = GraphStore(reader=my_reader)
gs.execute('ANSWER "What is the capital of France?" LIMIT 3')
# Result(
# kind="answer",
# data={
# "answer": "Paris",
# "cited_slots": ["n0", "n1", "n2"],
# "candidates": [<full REMEMBER nodes>],
# "reader": None,
# },
# count=1,
# meta={"signals": <REMEMBER's full meta block>},
# )
Named reader registry for A/B'ing reader LLMs:
gs = GraphStore(readers={"fast": a, "careful": b})
gs.execute('ANSWER "q" LIMIT 3 USING "careful"')
q.answer("q", limit=3, using="fast")
Implementation:
- Grammar rule `answer_q` mirrors `remember_q` shape + adds optional
`USING "reader-name"` suffix.
- New `AnswerQuery` dataclass in ast_nodes.py.
- Transformer wires `answer_q` + `using_reader` Lark rules.
- Handler `_answer` in intelligence.py:
- Resolves reader: `USING name` looks up `self._readers[name]`;
else falls back to `self._reader` (default); else to sole entry
of `_readers` if exactly one; else raise GraphStoreError.
- Builds equivalent RememberQuery with same limit / where / at /
tokens; calls the real `_remember` (bumps recall counts; intentional).
- Formats retrieved passages as numbered context blocks with source
ids. Empty retrieval still surfaces "(no retrieved context)" to
reader so it can say "no information available".
- Reader exception caught: returns Result with data["error"] and
empty answer. Callers inspect without try/except on execute.
- Builder `q.answer(text, limit, tokens, at, where, using)` added to
reads.py. Registered on `q` namespace. Parser-roundtrip-verified.
- GraphStore gains `reader` and `readers` kwargs. Validated callable.
Held as live refs on the executor; not in the config layer
(callables are not serialisable).
Zero LLM dependency in core. graphstore ships no HTTP client, no
litellm, no openai. Bring-your-own reader.
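The reader resolution order, context formatting, and error-capture semantics described above can be sketched as below. This is an illustrative mock-up under the rules stated in this message; `resolve_reader`, `format_context`, and `call_reader` are hypothetical helper names, not the actual functions in intelligence.py:

```python
class GraphStoreError(Exception):
    pass

def resolve_reader(using, default_reader, readers):
    # 1) USING "name" -> explicit registry lookup; unknown names raise.
    if using is not None:
        if using not in readers:
            raise GraphStoreError(f"unknown reader: {using!r}")
        return readers[using]
    # 2) Fall back to the default reader (GraphStore(reader=...)).
    if default_reader is not None:
        return default_reader
    # 3) Else the sole registry entry, if there is exactly one.
    if len(readers) == 1:
        return next(iter(readers.values()))
    raise GraphStoreError("no reader configured")

def format_context(passages):
    # Retrieved passages become numbered context blocks with source ids.
    # Empty retrieval still surfaces a marker so the reader can answer
    # "no information available" rather than hallucinate.
    if not passages:
        return "(no retrieved context)"
    return "\n\n".join(f"[{i}] ({slot}) {text}"
                       for i, (slot, text) in enumerate(passages, 1))

def call_reader(reader, prompt, max_tokens=1000):
    # Reader exceptions are captured into the result instead of raised,
    # so callers inspect data["error"] without try/except on execute.
    try:
        return {"answer": reader(prompt, max_tokens=max_tokens),
                "error": None}
    except Exception as exc:
        return {"answer": "", "error": str(exc)}
```

Because the reader is any `(prompt, max_tokens) -> str` callable, the named registry makes A/B comparisons a one-argument swap at query time.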
Tests (tests/test_answer.py):
- test_answer_end_to_end
- test_answer_without_reader_raises
- test_answer_picks_named_reader_via_using
- test_answer_unknown_named_reader_raises
- test_answer_reader_exception_surfaced_in_result
- test_answer_builder_roundtrip_matches_string_dsl
- test_answer_builder_compiles_full_surface
- test_answer_on_empty_store_still_calls_reader
Also: test_query_coverage.py EXPECTED_VERBS += "answer".
Full suite: 1766 passed, 101 skipped, zero regressions.
Next (Step 4): temporal anchor extraction at query time. Auto-add AT
clauses when the question has a date. Targets temporal F1 (weakest
LoCoMo category for us).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: surface-sync for REMEMBER telemetry + EXPLAIN + ANSWER (Steps 1-3)
We shipped three retrieval-observability features in #150/#151/#152 but
the skills, docs, and README said nothing about them. An LLM loading
graphstore-dsl today wouldn't know ANSWER exists; a human reading the
README wouldn't either. Fix:
graphstore-dsl SKILL:
- ANSWER verb in the reads table + dedicated subsection explaining
reader resolution + error capture semantics
- Full per-node signal list (old doc predicted _graph_score / _recall
before they shipped; now they do and we have _co_bonus, _recall_boost,
_rank_stage, _fusion_score, _rerank_score)
- meta["signals"] telemetry block documented
- SYS EXPLAIN REMEMBER dry-run subsection
- Added ANSWER + SYS EXPLAIN rows to query-generation pattern table
graphstore-builder SKILL:
- q.answer(...) row in the reads table
- Debugging section expanded: full signal list + meta block JSON +
q.sys.explain(inner) dry-run example + q.answer() end-to-end
example + named-reader A/B + reader resolution order
website/docs/dsl/reference.md:
- ANSWER examples (bare + USING "reader")
- New subsections on signal scores, SYS EXPLAIN REMEMBER dry-run, and
ANSWER retrieval-augmented synthesis
website/docs/query-builder.md:
- q.answer(...) row in reads table
- Retrieval-pipeline observability section
- Retrieval + reader synthesis section with named-reader A/B pattern
README.md:
- REMEMBER section expanded: per-node scores + meta["signals"] +
SYS EXPLAIN REMEMBER dry-run example
- New ANSWER section showing reader wiring, cited_slots, named readers
Every remaining claim verified against code. Em dash sweep clean (Rule 9).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KailasMahavarkar added a commit that referenced this pull request on Apr 20, 2026
v0.4 ships retrieval observability triangle:
- REMEMBER signal telemetry + rich meta["signals"] (#150)
- SYS EXPLAIN REMEMBER dry-run (#151)
- ANSWER verb with pluggable reader LLM (#152)
Plus:
- Skills split: graphstore-dsl (runtime) + graphstore-builder (Python) (#148)
- Skill-guided LLM ingest adapter + LoCoMo wiring fix (#149)
- Docusaurus docs site @ graphstore-docs.orkait.com (#142-147)
Breaking changes: none. All additions are additive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KailasMahavarkar added a commit that referenced this pull request on Apr 20, 2026
* chore(release): bump v0.3.0 -> v0.4.0
v0.4 ships retrieval observability triangle:
- REMEMBER signal telemetry + rich meta["signals"] (#150)
- SYS EXPLAIN REMEMBER dry-run (#151)
- ANSWER verb with pluggable reader LLM (#152)
Plus:
- Skills split: graphstore-dsl (runtime) + graphstore-builder (Python) (#148)
- Skill-guided LLM ingest adapter + LoCoMo wiring fix (#149)
- Docusaurus docs site @ graphstore-docs.orkait.com (#142-147)
Breaking changes: none. All additions are additive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(release): bump pyproject.toml version to 0.4.0
Missed in 07c9986. Pairs with src/graphstore/__init__.py bump.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 2 of retrieval-observability. `SYS EXPLAIN REMEMBER "q"` runs gather + fuse + temporal filter without materializing nodes, running the reranker, expanding nucleus, or mutating recall counts. Returns candidate slot ids with per-signal scores + the Step 1 telemetry block.
Example
```python
r = gs.execute('SYS EXPLAIN REMEMBER "European capitals" LIMIT 3')
r.kind # 'plan'
r.data
{
"verb": "REMEMBER",
"query": "European capitals",
"limit": 3,
"candidates": [
{"slot": 9, "id": "mem1", "fused_score": 0.42,
"vector_sim": 0.52, "bm25_score": 0.0, "recency_score": 1.0,
"graph_score": 0.0, "co_bonus": 0.0, "recall_boost": 0.0},
...
]
}
r.meta["signals"] # same shape as real REMEMBER: fusion / recency /
# stages / reranker / nucleus / sentence-query-expansion
```
Why
Tuning fusion weights or diagnosing "why did this rank where it did" previously required running a real REMEMBER, which bumps `recall_count` on every candidate. EXPLAIN is side-effect free, so it is safe to call repeatedly during interactive debugging.
Also the foundation for Step 3 (`ANSWER` verb), which will consume the same candidate plan before handing context to a reader LLM.
Full suite
1758 passed, 101 skipped (+2 new), zero regressions.
Follow-up
Step 3: `ANSWER` verb. Retrieves + formats context + calls a user-configured reader callable (no bundled LLM client). Returns `{answer, cited_slots, signals}`. Reuses the plan-only path internally to grab candidates before generation.
🤖 Generated with Claude Code