feat: lexical citation verifier — flag bogus [N] cites in synthesized answers #22

Merged: askalf merged 1 commit into master from claude/plan-deep-dive-next-7b8Lx on May 5, 2026

@askalf askalf commented May 5, 2026

Summary

After the final synthesis, deepdive now checks every [N] reference in the answer body against the extracted text of source N. This catches the dominant failure mode of cited-answer tools: a confident sentence pointing at a source that doesn't actually support it.

  • Pure-function lexical scoring, no second LLM "judge" pass (would reintroduce the very hallucination class we're catching) and no new runtime deps.
  • Multi-cite rule: require all. A sentence with [1][3] is supported only when every cited source clears the recall threshold — a bogus [3] buried in an otherwise-true sentence is still flagged.
  • Tokenization: lowercase, drop ~80 stop-words, split on non-alphanumeric and digit/letter boundaries (5h → ["5","h"], 5-hour → ["5","hour"]) so numeric anchors survive paraphrase. Numbers and alphabetic tokens of length ≥ 3 are kept; everything else is dropped.
  • Quiet by default. Clean runs print nothing extra. When something fails: a small ## Citation health footer in the markdown, one warning line per unsupported sentence in --verbose, a verification key in the --json payload.
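
For readers who want a concrete picture of the tokenization and recall scoring described above, here is a minimal sketch. It is not the code in src/verify.ts: the names `tokenize` and `recall`, the ASCII-only regex, the short stop-word subset, and the choice to treat a claim with no checkable tokens as fully supported are all assumptions made for illustration.

```typescript
// Hypothetical stop-word subset; the PR mentions ~80 words but does not list them.
const STOP_WORDS = new Set<string>([
  "the", "and", "for", "that", "with", "was", "are", "this", "from",
]);

// Lowercase, then pull out runs of digits and runs of letters. This splits on
// non-alphanumerics and on digit/letter boundaries in one pass (ASCII only).
// Keep numbers, plus alphabetic tokens of length >= 3 that are not stop-words.
function tokenize(text: string): string[] {
  const pieces: string[] = text.toLowerCase().match(/[0-9]+|[a-z]+/g) || [];
  return pieces.filter(
    (t) => /^[0-9]+$/.test(t) || (t.length >= 3 && !STOP_WORDS.has(t)),
  );
}

// Fraction of distinct claim tokens that also appear in the source text.
function recall(claim: string, source: string): number {
  const claimTokens = Array.from(new Set(tokenize(claim)));
  // Assumption: a claim with nothing checkable is treated as supported.
  if (claimTokens.length === 0) return 1;
  const sourceTokens = new Set(tokenize(source));
  const hits = claimTokens.filter((t) => sourceTokens.has(t)).length;
  return hits / claimTokens.length;
}

// "5-hour" and "5h" both yield the numeric anchor "5".
const hourTokens = tokenize("The 5-hour limit resets"); // ["5", "hour", "limit", "resets"]
const shortTokens = tokenize("5h");                     // ["5"]

// Claim tokens {limit, 5, hours}; the source contains "limit" and "5" -> 2/3.
const paraphraseRecall = recall("The limit is 5 hours", "A 5h rolling limit applies");
```

In this sketch the paraphrased claim scores 2/3, which would clear the default threshold of 0.4 even though "hours" never appears verbatim in the source.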

What's added

  • src/verify.ts: verifyCitations + four exported helpers, all pure
  • New CLI flags: --strict-cites, --cite-min-recall=<0..1> (default 0.4), --no-verify-cites
  • Env vars: DEEPDIVE_STRICT_CITES, DEEPDIVE_CITE_MIN_RECALL, DEEPDIVE_NO_VERIFY_CITES
  • New verify.done agent event
  • AgentResult.verification: VerificationReport | undefined + usage.{citationsTotal, citationsSupported} for library consumers
  • README "Citation verification" section explaining what it catches and what it doesn't (explicitly: not a semantic judge — flags hallucinated names/numbers/dates with high precision, can flag paraphrased-but-truthful sentences below threshold)
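
As an illustration of how the --cite-min-recall flag and DEEPDIVE_CITE_MIN_RECALL could resolve to a single threshold, here is a hedged sketch. The helper names `parseRecall` and `resolveMinRecall` are hypothetical, and two behaviors are assumptions not stated in this PR: that the flag takes precedence over the env var, and that out-of-range or non-numeric values fall back silently instead of raising an error.

```typescript
const DEFAULT_MIN_RECALL = 0.4; // default stated in the PR description

// Accept only finite numbers in the closed range [0, 1].
function parseRecall(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return isFinite(n) && n >= 0 && n <= 1 ? n : undefined;
}

// Assumed precedence: CLI flag, then env var, then the default.
function resolveMinRecall(flagValue?: string, envValue?: string): number {
  const fromFlag = parseRecall(flagValue);
  if (fromFlag !== undefined) return fromFlag;
  const fromEnv = parseRecall(envValue);
  if (fromEnv !== undefined) return fromEnv;
  return DEFAULT_MIN_RECALL;
}
```

Passing the raw strings in as arguments, instead of reading the environment inside the helper, keeps the function pure and easy to unit test.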

What it explicitly is not (v1)

  • A semantic / embedding-based judge
  • Run inside the deep loop (post-final-synthesis only — feeding unsupported cites back to the critic is a v2 conversation)
  • Detection of uncited claims (orthogonal feature)

Test plan

  • 39 new unit tests in test/verify.test.mjs covering all five exported helpers + faithful/hallucinated/multi-cite/threshold/strip-sources cases
  • 2 new agent-loop integration tests: end-to-end bogus-cite flagging (synth output cites [1] for content not in the source → unsupported.length === 1) and verifyCitations: false skip path
  • 4 new CLI/config flag-plumbing tests
  • npm run typecheck clean
  • npm run build clean
  • npm test — 251/251 passing (up from 212)
  • --help smoke test shows the three new flags

… answers

After the final synthesis, every [N] reference in the answer body is checked
against the extracted text of source N. Sentences are split, claim tokens
are extracted (lowercased, stop-words dropped, numbers and digit/letter
boundaries preserved), and recall is scored against each cited source.
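
The sentence-splitting step above can be sketched roughly as follows. This is an illustration, not the code in verify.ts: `extractCitedSentences` and `CitedSentence` are hypothetical names, and the splitter is deliberately naive (it would break on decimals such as "0.4" and on abbreviations).

```typescript
interface CitedSentence {
  text: string;    // the sentence, with its [N] markers left in place
  cites: number[]; // distinct source numbers cited, in order of appearance
}

function extractCitedSentences(answer: string): CitedSentence[] {
  // Naive split: a run of non-terminator characters, optional terminal
  // punctuation, then any [N] markers that trail the punctuation directly.
  const chunks: string[] = answer.match(/[^.!?]+[.!?]*(?:\[\d+\])*/g) || [];
  const out: CitedSentence[] = [];
  chunks.forEach((chunk) => {
    const text = chunk.trim();
    const cites: number[] = [];
    const marker = /\[(\d+)\]/g;
    let m: RegExpExecArray | null;
    while ((m = marker.exec(text)) !== null) {
      const n = Number(m[1]);
      if (cites.indexOf(n) === -1) cites.push(n);
    }
    // Uncited sentences are skipped; detecting uncited claims is out of scope.
    if (cites.length > 0) out.push({ text, cites });
  });
  return out;
}

const citedDemo = extractCitedSentences(
  "The cap is 5 hours [1][3]. It resets weekly [2]. No cite here.",
);
// citedDemo has two entries, with cites [1, 3] and [2].
```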

Multi-cite sentences ([1][3]) are supported only when EVERY cited source
clears the threshold — a bogus cite buried in an otherwise-true sentence
is still flagged. No second LLM "judge" pass; this is pure-function lexical
scoring with zero new deps.
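
A minimal sketch of the require-all rule, under stated assumptions: `isSupported`, `Scorer`, and the toy `wordOverlap` scorer are hypothetical names used only for illustration, and treating a cite to a missing source as unsupported is a guess at reasonable behavior, not something this commit message states.

```typescript
type Scorer = (claim: string, sourceText: string) => number;

// A sentence is supported only if EVERY cited source clears the threshold.
function isSupported(
  claim: string,
  cites: number[],
  sources: Record<number, string>,
  minRecall: number,
  score: Scorer,
): boolean {
  return cites.every((n) => {
    const text = sources[n];
    // Assumption: a cite pointing at a source we do not have is unsupported.
    return text !== undefined && score(claim, text) >= minRecall;
  });
}

// Toy scorer for the demo: fraction of claim words found in the source.
const wordOverlap: Scorer = (claim, sourceText) => {
  const words = claim.toLowerCase().split(/\s+/).filter((w) => w.length > 0);
  if (words.length === 0) return 1;
  const haystack = sourceText.toLowerCase().split(/\s+/);
  return words.filter((w) => haystack.indexOf(w) !== -1).length / words.length;
};

const demoSources: Record<number, string> = {
  1: "the limit is five hours",
  3: "pricing starts at ten dollars",
};

const singleCite = isSupported("limit five hours", [1], demoSources, 0.4, wordOverlap);    // true
const buriedBogus = isSupported("limit five hours", [1, 3], demoSources, 0.4, wordOverlap); // false
```

The second call fails because source 3 shares no words with the claim, even though source 1 supports it fully.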

New flags: --strict-cites, --cite-min-recall=<0..1> (default 0.4),
--no-verify-cites. Env vars mirror the flags. Verbose stream gets a verify line per
unsupported sentence; --json output gains a `verification` key; markdown
output appends a "## Citation health" footer only when something fails
(clean runs stay clean). AgentResult.verification + usage.{citationsTotal,
citationsSupported} added for library consumers.
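
The "footer only when something fails" behavior could look roughly like the sketch below. Only the "## Citation health" heading comes from this commit message; the `renderCitationHealth` name, the `UnsupportedSentence` shape, and the wording and layout of the footer body are assumptions for illustration.

```typescript
interface UnsupportedSentence {
  sentence: string;
  cites: number[];
}

// Returns an empty string for clean runs so the markdown output is unchanged.
function renderCitationHealth(unsupported: UnsupportedSentence[]): string {
  if (unsupported.length === 0) return "";
  const items = unsupported.map(
    (u) => `- ${u.cites.map((n) => `[${n}]`).join("")} ${u.sentence}`,
  );
  return [
    "",
    "## Citation health",
    "",
    `${unsupported.length} sentence(s) cite a source that does not appear to support them:`,
    "",
  ]
    .concat(items)
    .join("\n");
}

const cleanFooter = renderCitationHealth([]); // ""
const failingFooter = renderCitationHealth([
  { sentence: "The cap is 9 hours.", cites: [3] },
]);
```

Appending the returned string to the answer unconditionally is safe in this sketch, since a clean run contributes nothing.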

39 new unit tests across all five exported helpers in verify.ts, plus
2 agent-loop integration tests (end-to-end bogus-cite flagging and
verifyCitations:false skip path), plus 4 CLI/config tests covering the
new flag plumbing. Suite goes from 212 → 251.

v0.5.0.
@askalf askalf merged commit 74316dc into master May 5, 2026
5 checks passed
@askalf askalf deleted the claude/plan-deep-dive-next-7b8Lx branch May 5, 2026 21:08