feat(dsl): ANSWER verb - retrieval + pluggable reader LLM#152

Merged

KailasMahavarkar merged 2 commits into main from feat/answer-verb on Apr 20, 2026
Conversation

@KailasMahavarkar
Contributor

Step 3 of the retrieval-observability effort. Full loop: retrieve with REMEMBER + synthesize via a user-configured reader callable.

Shape

```python
def my_reader(prompt: str, max_tokens: int = 1000) -> str: ...

gs = GraphStore(reader=my_reader)

r = gs.execute('ANSWER "What is the capital of France?" LIMIT 3')
# Result(
#   kind="answer",
#   data={
#     "answer": "Paris",
#     "cited_slots": ["n0", "n1", "n2"],
#     "candidates": [<full REMEMBER nodes>],
#     "reader": None,
#   },
#   count=1,
#   meta={"signals": <REMEMBER's full meta block>},
# )
```
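For illustration, any callable matching that shape works as a reader; a scripted stub (hypothetical, not part of the library) is enough to exercise the full loop in tests:

```python
# Hypothetical scripted reader: matches the (prompt, max_tokens) -> str
# shape expected by GraphStore(reader=...). No LLM or HTTP involved.
def scripted_reader(prompt: str, max_tokens: int = 1000) -> str:
    # The executor feeds numbered context blocks; on empty retrieval it
    # surfaces "(no retrieved context)" so the reader can decline.
    if "(no retrieved context)" in prompt:
        return "No information available."
    return "Paris"
```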

Named registry for A/B'ing reader LLMs:

```python
gs = GraphStore(readers={"fast": a, "careful": b})
gs.execute('ANSWER "q" LIMIT 3 USING "careful"')
q.answer("q", limit=3, using="fast")
```

Grammar

```lark
answer_q: "ANSWER" STRING at_clause? tokens_clause? limit_clause? where_clause? using_reader?
using_reader: "USING" STRING
```

Mirrors `remember_q` shape + adds optional reader pick.

Zero LLM dep in core

graphstore ships no HTTP client, no litellm, no openai. Bring-your-own reader. The reader is a plain callable, not a config setting; it's held as a live reference on the executor, since callables are not serialisable.

Reader resolution order

  1. `USING "name"` → `self._readers[name]` (raise if not found)
  2. `self._reader` (default configured at construction)
  3. Sole entry of `self._readers` if exactly one registered
  4. None → `GraphStoreError`
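The resolution order above can be sketched as a standalone helper (hypothetical; the real executor internals may differ):

```python
class GraphStoreError(Exception):
    pass

def resolve_reader(using, default_reader, readers):
    """Sketch of the four-step reader resolution order.

    using          -- optional name from a USING "name" clause
    default_reader -- the `reader=` kwarg passed at construction, or None
    readers        -- the `readers={...}` named registry dict
    """
    if using is not None:                     # 1. explicit USING "name"
        if using not in readers:
            raise GraphStoreError(f"unknown reader: {using!r}")
        return readers[using]
    if default_reader is not None:            # 2. default from construction
        return default_reader
    if len(readers) == 1:                     # 3. sole registry entry
        return next(iter(readers.values()))
    raise GraphStoreError("no reader configured")  # 4. nothing left
```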

Reader errors don't raise

Reader exceptions are caught. Result surfaces `data["error"]` with empty `answer`. Caller inspects without try/except on execute. Retrieval state is still returned so the caller can see what would've been fed to a working reader.
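Caller-side, that means inspecting the result rather than wrapping execute. A minimal sketch, assuming the `data` dict shape above (the helper itself is hypothetical):

```python
def handle_answer(result_data: dict) -> str:
    """Inspect an ANSWER result without try/except around execute().

    result_data is the Result.data dict: on reader failure it carries
    "error" plus an empty "answer", with "candidates" still populated.
    """
    if result_data.get("error"):
        # Reader raised; retrieval state is still attached for debugging.
        return f"reader failed: {result_data['error']}"
    return result_data["answer"]
```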

Tests

  • `test_answer_end_to_end` - real retrieval + fake scripted reader
  • `test_answer_without_reader_raises` - no reader configured
  • `test_answer_picks_named_reader_via_using` - registry lookup
  • `test_answer_unknown_named_reader_raises` - missing name
  • `test_answer_reader_exception_surfaced_in_result` - isolation
  • `test_answer_builder_roundtrip_matches_string_dsl` - builder parity
  • `test_answer_builder_compiles_full_surface` - all clauses
  • `test_answer_on_empty_store_still_calls_reader` - degenerate case

Full suite

1766 passed, 101 skipped, zero regressions.

What this enables

  • LoCoMo + LongMemEval runners can replace their hand-rolled retrieve→format→LLM loops with a single `ANSWER` call.
  • A/B over reader models becomes a one-line config.
  • Attribution (which slots informed the answer) is first-class.
  • Signal telemetry (Step 1) + plan dry-run (Step 2) + answer synthesis (Step 3) form a coherent observability + retrieval + synthesis triangle.

Next

Step 4: temporal anchor extraction. If the question contains a date ("in May 2024", "last Tuesday"), auto-add the `AT` clause. Targets temporal F1 - our weakest LoCoMo category.
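A first cut at that extraction could look like the sketch below. Purely illustrative: Step 4 has not shipped, the `AT "May 2024"` clause syntax is assumed from the grammar above, and relative dates ("last Tuesday") are deliberately omitted.

```python
import re

# Illustrative month-year detector for the planned behaviour: if the
# question mentions "in May 2024", anchor retrieval with an AT clause.
MONTH_YEAR = re.compile(
    r"\bin\s+(January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+(\d{4})\b",
    re.IGNORECASE,
)

def add_temporal_anchor(question: str) -> str:
    m = MONTH_YEAR.search(question)
    if not m:
        return f'ANSWER "{question}" LIMIT 3'
    anchor = f"{m.group(1)} {m.group(2)}"
    return f'ANSWER "{question}" AT "{anchor}" LIMIT 3'
```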

🤖 Generated with Claude Code

KailasMahavarkar and others added 2 commits April 20, 2026 12:38
Step 3 of the retrieval-observability effort. Full-loop: retrieve with
REMEMBER + synthesize with a user-configured reader callable.

Grammar:
    answer_q: "ANSWER" STRING at_clause? tokens_clause?
              limit_clause? where_clause? using_reader?
    using_reader: "USING" STRING

Usage:

    def my_reader(prompt: str, max_tokens: int = 1000) -> str: ...

    gs = GraphStore(reader=my_reader)
    gs.execute('ANSWER "What is the capital of France?" LIMIT 3')
    # Result(
    #   kind="answer",
    #   data={
    #     "answer": "Paris",
    #     "cited_slots": ["n0", "n1", "n2"],
    #     "candidates": [<full REMEMBER nodes>],
    #     "reader": None,
    #   },
    #   count=1,
    #   meta={"signals": <REMEMBER's full meta block>},
    # )

Named reader registry for A/B'ing reader LLMs:

    gs = GraphStore(readers={"fast": a, "careful": b})
    gs.execute('ANSWER "q" LIMIT 3 USING "careful"')
    q.answer("q", limit=3, using="fast")

Implementation:

- Grammar rule `answer_q` mirrors `remember_q` shape + adds optional
  `USING "reader-name"` suffix.
- New `AnswerQuery` dataclass in ast_nodes.py.
- Transformer wires `answer_q` + `using_reader` Lark rules.
- Handler `_answer` in intelligence.py:
    - Resolves reader: `USING name` looks up `self._readers[name]`;
      else falls back to `self._reader` (default); else to sole entry
      of `_readers` if exactly one; else raise GraphStoreError.
    - Builds equivalent RememberQuery with same limit / where / at /
      tokens; calls real `_remember` (bumps recall counts - intentional).
    - Formats retrieved passages as numbered context blocks with source
      ids. Empty retrieval still surfaces "(no retrieved context)" to
      reader so it can say "no information available".
    - Reader exception caught: returns Result with data["error"] and
      empty answer. Callers inspect without try/except on execute.
- Builder `q.answer(text, limit, tokens, at, where, using)` added to
  reads.py. Registered on `q` namespace. Parser-roundtrip-verified.
- GraphStore gains `reader` and `readers` kwargs. Validated callable.
  Held as live refs on the executor; not in the config layer
  (callables are not serialisable).

Zero LLM dependency in core. graphstore ships no HTTP client, no
litellm, no openai. Bring-your-own reader.

Tests (tests/test_answer.py):
- test_answer_end_to_end
- test_answer_without_reader_raises
- test_answer_picks_named_reader_via_using
- test_answer_unknown_named_reader_raises
- test_answer_reader_exception_surfaced_in_result
- test_answer_builder_roundtrip_matches_string_dsl
- test_answer_builder_compiles_full_surface
- test_answer_on_empty_store_still_calls_reader

Also: test_query_coverage.py EXPECTED_VERBS += "answer".

Full suite: 1766 passed, 101 skipped, zero regressions.

Next (Step 4): temporal anchor extraction at query time. Auto-add AT
clauses when the question has a date. Targets temporal F1 (weakest
LoCoMo category for us).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
We shipped three retrieval-observability features in #150/#151/#152 but
the skills, docs, and README said nothing about them. An LLM loading
graphstore-dsl today wouldn't know ANSWER exists; a human reading the
README wouldn't either. Fix:

graphstore-dsl SKILL:
  - ANSWER verb in the reads table + dedicated subsection explaining
    reader resolution + error capture semantics
  - Full per-node signal list (old doc predicted _graph_score / _recall
    before they shipped; now they do and we have _co_bonus, _recall_boost,
    _rank_stage, _fusion_score, _rerank_score)
  - meta["signals"] telemetry block documented
  - SYS EXPLAIN REMEMBER dry-run subsection
  - Added ANSWER + SYS EXPLAIN rows to query-generation pattern table

graphstore-builder SKILL:
  - q.answer(...) row in the reads table
  - Debugging section expanded: full signal list + meta block JSON +
    q.sys.explain(inner) dry-run example + q.answer() end-to-end
    example + named-reader A/B + reader resolution order

website/docs/dsl/reference.md:
  - ANSWER examples (bare + USING "reader")
  - New subsections on signal scores, SYS EXPLAIN REMEMBER dry-run, and
    ANSWER retrieval-augmented synthesis

website/docs/query-builder.md:
  - q.answer(...) row in reads table
  - Retrieval-pipeline observability section
  - Retrieval + reader synthesis section with named-reader A/B pattern

README.md:
  - REMEMBER section expanded: per-node scores + meta["signals"] +
    SYS EXPLAIN REMEMBER dry-run example
  - New ANSWER section showing reader wiring, cited_slots, named readers

Every remaining claim verified against code. Em dash sweep clean (Rule 9).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@KailasMahavarkar KailasMahavarkar merged commit 3c6b552 into main Apr 20, 2026
4 checks passed
@KailasMahavarkar KailasMahavarkar deleted the feat/answer-verb branch April 20, 2026 07:21
KailasMahavarkar added a commit that referenced this pull request Apr 20, 2026
v0.4 ships retrieval observability triangle:
- REMEMBER signal telemetry + rich meta["signals"] (#150)
- SYS EXPLAIN REMEMBER dry-run (#151)
- ANSWER verb with pluggable reader LLM (#152)

Plus:
- Skills split: graphstore-dsl (runtime) + graphstore-builder (Python) (#148)
- Skill-guided LLM ingest adapter + LoCoMo wiring fix (#149)
- Docusaurus docs site @ graphstore-docs.orkait.com (#142-147)

Breaking changes: none. All additions are additive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
KailasMahavarkar added a commit that referenced this pull request Apr 20, 2026
* chore(release): bump v0.3.0 -> v0.4.0

v0.4 ships retrieval observability triangle:
- REMEMBER signal telemetry + rich meta["signals"] (#150)
- SYS EXPLAIN REMEMBER dry-run (#151)
- ANSWER verb with pluggable reader LLM (#152)

Plus:
- Skills split: graphstore-dsl (runtime) + graphstore-builder (Python) (#148)
- Skill-guided LLM ingest adapter + LoCoMo wiring fix (#149)
- Docusaurus docs site @ graphstore-docs.orkait.com (#142-147)

Breaking changes: none. All additions are additive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): bump pyproject.toml version to 0.4.0

Missed in 07c9986. Pairs with src/graphstore/__init__.py bump.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>