The UNCERTAIN_MARKERS list contained the bare token "check", which fired on routine medical phrases like "check-up with the doctor" and routed those facts to clarification instead of ADD. Replace it with a pair of regexes that only match real uncertainty wording:
- "need/needs/needed/will/should to check"
- "check later/tomorrow/again/back"
Adds a regression test covering "Sam had a check-up with Sam's doctor".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
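A minimal sketch of what such a regex pair could look like. The actual patterns are not shown in the commit message, so `UNCERTAIN_CHECK_PATTERNS` and `hasUncertainCheck` are hypothetical names, and treating "to" as optional is an assumption.

```typescript
// Hypothetical patterns; the real regexes in the commit may differ.
const UNCERTAIN_CHECK_PATTERNS: RegExp[] = [
  // "need to check", "needs to check", "will check", "should check", ...
  /\b(?:need|needs|needed|will|should)\s+(?:to\s+)?check\b/i,
  // "check later", "check tomorrow", "check again", "check back"
  /\bcheck\s+(?:later|tomorrow|again|back)\b/i,
];

// Returns true only when "check" appears in an uncertainty phrase.
function hasUncertainCheck(text: string): boolean {
  return UNCERTAIN_CHECK_PATTERNS.some((pattern) => pattern.test(text));
}
```

Because neither pattern matches a bare "check", a phrase such as "check-up" no longer counts as an uncertainty marker under this sketch.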
…tion
Adds a default-off CHUNKED_EXTRACTION_FALLBACK_ENABLED flag. When normal extraction returns zero facts on a conversation longer than the configured chunk size, the consensus path now retries with chunked extraction. This recovers from extraction failures on long inputs without enabling chunked extraction unconditionally.
Refactors chunkedExtractFacts to take its config as an explicit argument instead of reading the module-level singleton, so per-request runtime overrides flow through. extractOnce now also branches directly on runtime extractionCacheEnabled rather than always routing through cachedExtractFacts (which reads the singleton internally); this lets config_override.extractionCacheEnabled actually take effect during benchmark sweeps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
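The fallback condition described above can be sketched roughly as follows. This is an illustration only: `ExtractionConfig`, `Extractor`, and `extractWithFallback` are hypothetical names, the chunk size is assumed to be measured in characters, and the extractors are synchronous here for brevity where the real ones are likely asynchronous.

```typescript
// Hypothetical config shape; only the flag's meaning comes from the commit message.
interface ExtractionConfig {
  chunkedExtractionFallbackEnabled: boolean;
  chunkSize: number; // assumed to be a character count
}

// Config is passed explicitly rather than read from a module-level singleton.
type Extractor = (conversation: string, config: ExtractionConfig) => string[];

function extractWithFallback(
  conversation: string,
  config: ExtractionConfig,
  extractNormal: Extractor,
  extractChunked: Extractor,
): string[] {
  const facts = extractNormal(conversation, config);
  // Retry only when all three conditions hold: nothing extracted,
  // the flag is on, and the input is longer than one chunk.
  const shouldRetry =
    facts.length === 0 &&
    config.chunkedExtractionFallbackEnabled &&
    conversation.length > config.chunkSize;
  return shouldRetry ? extractChunked(conversation, config) : facts;
}
```

Passing `config` as an argument is what lets a per-request override change the outcome without touching global state.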
Adds an observed_at companion to created_at on stored memory rows and threads logical session timestamps from the ingest path through to canonical fact storage, projection storage, supersede, and clarification writes. Without this, mutations recorded during transcript replay (and benchmark sweeps with explicit session timestamps) all stamped wall-clock created_at, breaking temporal ordering and ranking on those slices.
- StoreMemoryInput accepts observedAt; defaults to createdAt at the repository layer.
- storeProjection groups its trailing arguments into a StoreProjectionOptions object (cmoId, logicalTimestamp, workspace); now that the call sites need three optional fields, the options bag avoids adding a fifth positional argument.
- AUDN clarify, opinion-confidence-collapse, and supersede write paths pass through the logical timestamp.
- Composite generation in ingest-post-write picks up observedAt to match created_at.
- Unit tests cover the timestamp threading at storage, ingest, AUDN, composite, and integration (temporal-mutation-regression) layers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
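The defaulting rule for observedAt might look like the sketch below. The field set on `StoreMemoryInput` is trimmed to what the commit message mentions, and `MemoryRow` and `toMemoryRow` are hypothetical names used only for illustration.

```typescript
// Trimmed, assumed shape; the real StoreMemoryInput has more fields.
interface StoreMemoryInput {
  content: string;
  createdAt?: Date;
  observedAt?: Date; // logical session timestamp, when the caller has one
}

// Hypothetical row shape for this sketch.
interface MemoryRow {
  content: string;
  createdAt: Date;
  observedAt: Date;
}

// Repository-layer defaulting: observedAt falls back to createdAt,
// and createdAt falls back to the wall clock.
function toMemoryRow(input: StoreMemoryInput, now: Date = new Date()): MemoryRow {
  const createdAt = input.createdAt ?? now;
  return {
    content: input.content,
    createdAt,
    observedAt: input.observedAt ?? createdAt,
  };
}
```

During transcript replay a caller would pass the session timestamp as `observedAt`, so ordering by `observedAt` reflects when the event happened rather than when the row was written.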
Adds a query-aware temporal endpoint block to tiered injection. Recognizes repeated-event temporal questions (e.g. "How many months between the first and second appointment?") and emits a compact two-endpoint summary plus the elapsed duration when retrieved memories contain two distinct dates matching the queried event terms.
- temporal-endpoint-evidence.ts: identifies the question shape, scores candidate memories by event-term overlap, picks the earliest two distinct dates, emits the block. Synonym table covers the common appointment/doctor surface; reverse index resolves plurals back to the canonical singular so "first and second appointments" expands via "appointment".
- retrieval-format.ts: builds the endpoint block before tier assignment so its tokens are subtracted from the assignment budget and counted in estimatedContextTokens. Otherwise the appended block would silently exceed the caller's budget and underreport packaged tokens.
- query-term-visibility.ts: extracted from retrieval-format. Same upgrade-tier-when-query-terms-hidden logic, just split out so retrieval-format stays under the file-size guideline and the helper is independently testable.
- temporal-format.ts: shared formatDateLabel + formatDuration so retrieval-format and temporal-endpoint-evidence don't each carry their own copy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
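As a rough illustration of the "earliest two distinct dates plus elapsed duration" step, here is a sketch under simplifying assumptions: dates are ISO `YYYY-MM-DD` strings, duration is reported in whole months, and `DatedMemory`, `pickEndpoints`, and `wholeMonthsBetween` are hypothetical names rather than the functions in temporal-endpoint-evidence.ts.

```typescript
// Assumed minimal shape of a retrieved memory that already matched the event terms.
interface DatedMemory {
  text: string;
  date: string; // ISO calendar date, YYYY-MM-DD
}

// Picks the two earliest distinct dates, or null if fewer than two exist.
function pickEndpoints(memories: DatedMemory[]): [string, string] | null {
  // ISO dates sort chronologically when sorted as strings.
  const distinct = Array.from(new Set(memories.map((m) => m.date))).sort();
  const [first, second] = distinct;
  if (first === undefined || second === undefined) return null;
  return [first, second];
}

function parseIsoDate(iso: string): { year: number; month: number; day: number } {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(iso);
  if (!match) throw new Error(`not an ISO date: ${iso}`);
  return { year: Number(match[1]), month: Number(match[2]), day: Number(match[3]) };
}

// Whole calendar months elapsed from start to end (end is assumed not to precede start).
function wholeMonthsBetween(startIso: string, endIso: string): number {
  const start = parseIsoDate(startIso);
  const end = parseIsoDate(endIso);
  const months = (end.year - start.year) * 12 + (end.month - start.month);
  // Subtract one if the final month is not yet complete.
  return end.day < start.day ? months - 1 : months;
}
```

Returning `null` when there are fewer than two distinct dates corresponds to emitting no endpoint block at all.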
PR #46's paydown landed StoreMemoryInput centralization, response-schema namespace imports, and the postJson test helper into main while this branch was open. Rebasing onto the new main shifted line numbers in the accepted-debt entries; regenerate the baselines so fallow audit compares against the post-paydown layout. The original baseline-refresh commit from this branch was dropped during rebase since #46's baselines superseded it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ndidates
Previously buildRepeatedEventEndpointBlock flattened every query event term and its synonyms into one list, then accepted any memory with at least one match. For "first and second doctor appointment", a memory mentioning only "doctor" and another mentioning only "appointment" would falsely become the two endpoints, even though neither proves the combined event happened.
Group synonyms by canonical concept (doctor synonyms vs appointment synonyms), and require a candidate to hit at least one synonym in EVERY group. Adds a regression test covering the partial-match false-positive case codex flagged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
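The every-group requirement can be sketched as below. `matchesEveryGroup` is a hypothetical helper, the synonym lists in the usage note are invented for illustration, and the tokenizer is deliberately naive.

```typescript
// One group per canonical concept, e.g. all "doctor" synonyms together.
type SynonymGroup = string[];

// A candidate qualifies only if it hits at least one synonym in EVERY group.
function matchesEveryGroup(text: string, groups: SynonymGroup[]): boolean {
  // Naive tokenizer: lowercase, split on anything that is not a letter or digit.
  const tokens = new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
  return (
    groups.length > 0 &&
    groups.every((group) => group.some((term) => tokens.has(term)))
  );
}
```

With groups `[["doctor", "physician"], ["appointment", "visit"]]`, a memory that mentions only "doctor" is rejected, while "Physician visit on March 3" is accepted because it covers both concepts.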
…t, refactor to policy chain
Two related changes:
1. CLARIFY decisions with an explicit replacement signal ("replacing X",
"no longer Y", "instead of Z", "correction: ...") were being promoted
to ADD. promoteToAdd clears targetMemoryId, leaving the stale memory
active alongside the new one — which silently fails the user's
explicit replacement intent for current-state facts like "replacing
Alice Morgan with Bob Chen". When AUDN identified a target, upgrade
to SUPERSEDE so the stale memory is expired. When no target was
identified, keep CLARIFY rather than fall through to ADD; the user
asked for a replacement we can't pin down, so defer to them.
2. applyClarificationOverrides was at 14 cyclomatic / 19 cognitive,
above fallow's moderate threshold and growing every time a new
policy was added. Refactor into a POLICIES list of small, named
transformers; the dispatcher becomes a 6-line loop. Each policy
returns null to defer or a transformed AUDNDecision to commit.
Adds two regression tests: CLARIFY+target+replacement → SUPERSEDE,
and CLARIFY+no-target+replacement → keep CLARIFY.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
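A minimal sketch of the policy-chain shape this commit describes, covering only the behaviour stated in this commit message. The type shapes, the `REPLACEMENT_SIGNAL` regex, and the always-promote behaviour of `promoteToAdd` are simplifying assumptions; only the names `POLICIES`, `applyClarificationOverrides`, `promoteToAdd`, and `AUDNDecision` come from the message.

```typescript
type Action = "ADD" | "SUPERSEDE" | "CLARIFY";

// Assumed minimal decision shape.
interface AUDNDecision {
  action: Action;
  targetMemoryId: string | null;
}

interface PolicyContext {
  factText: string;
}

// A policy returns null to defer, or a decision to commit.
type Policy = (decision: AUDNDecision, ctx: PolicyContext) => AUDNDecision | null;

// Assumed detector for the explicit replacement signals listed in the message.
const REPLACEMENT_SIGNAL = /\b(?:replacing|no longer|instead of)\b|\bcorrection:/i;

const explicitReplacement: Policy = (decision, ctx) => {
  if (decision.action !== "CLARIFY" || !REPLACEMENT_SIGNAL.test(ctx.factText)) return null;
  if (decision.targetMemoryId === null) return decision; // keep CLARIFY, defer to the user
  const upgraded: AUDNDecision = { ...decision, action: "SUPERSEDE" };
  return upgraded;
};

// Simplified: promotes every remaining CLARIFY and clears the target.
const promoteToAdd: Policy = (decision) =>
  decision.action === "CLARIFY" ? { action: "ADD", targetMemoryId: null } : null;

const POLICIES: Policy[] = [explicitReplacement, promoteToAdd];

function applyClarificationOverrides(decision: AUDNDecision, ctx: PolicyContext): AUDNDecision {
  for (const policy of POLICIES) {
    const result = policy(decision, ctx);
    if (result !== null) return result; // first policy to commit wins
  }
  return decision;
}
```

Ordering matters in this design: placing the replacement policy ahead of `promoteToAdd` is what prevents an explicit replacement from being promoted to ADD.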
…aseline
The applyClarificationOverrides refactor in the previous commit cleared the moderate-complexity finding for conflict-policy.ts (the dispatcher went from 14 cyclomatic to ~3, with each policy now a separately-named small function). Re-save the baseline to remove the stale entry: per the workspace rule, fix fallow findings rather than baseline them.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…seding on CLARIFY
The previous fix upgraded CLARIFY+explicit-replacement to SUPERSEDE whenever decision.targetMemoryId was set, but didn't verify the target was actually in the candidate set. If AUDN returned a stale or invalid target ID, memory-audn rejects the SUPERSEDE (target not found) and falls back to canonical storage, which silently leaves the old memory active. That is the same bug the SUPERSEDE upgrade was meant to fix.
Check candidates.some(c => c.id === targetMemoryId) before superseding; keep CLARIFY otherwise. Adds a regression test using a target ID that's not in the candidate set.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
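The guard can be illustrated with a small standalone function. `Candidate`, `Decision`, and `upgradeIfTargetKnown` are hypothetical names, and the function assumes the caller has already established that the decision is a CLARIFY carrying an explicit replacement signal.

```typescript
// Assumed minimal shapes for this sketch.
interface Candidate {
  id: string;
}

interface Decision {
  action: "ADD" | "SUPERSEDE" | "CLARIFY";
  targetMemoryId: string | null;
}

// Upgrade to SUPERSEDE only when the target is a known candidate;
// otherwise return the decision unchanged (CLARIFY is kept).
function upgradeIfTargetKnown(decision: Decision, candidates: Candidate[]): Decision {
  const targetId = decision.targetMemoryId;
  if (targetId === null) return decision;
  if (!candidates.some((c) => c.id === targetId)) return decision;
  return { action: "SUPERSEDE", targetMemoryId: targetId };
}
```

Validating against the candidate set up front keeps the failure visible as a CLARIFY instead of letting a rejected SUPERSEDE fall through downstream.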
…seline ratchet
CI's shrink-only baseline ratchet flagged a +1 dupes entry on this branch. The cause: PR #46 unified StoreMemoryInput in repository-types and re-exported it from stores.ts and repository-write.ts, but missed memory-repository.ts which still carried its own local copy. With this branch's observedAt addition, the local copy fell back into clone-group overlap with the centralized type.
Drop the local StoreMemoryInput from memory-repository.ts and import the centralized one from repository-types instead. Regenerate baselines to reflect the current line numbers.
Net: dupes baseline 30 -> 29 entries (matches origin/main); ratchet script reports "dupes: 29 -> 29 (unchanged)".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Four feature concerns shipped as one branch, each with regression tests:
1. Repeated-event temporal endpoint formatting (`feat(retrieval)`)
Adds a query-aware temporal endpoint block to tiered injection. Recognizes "first ... second" temporal questions (e.g. "How many months between the first and second appointment?") and emits a compact two-endpoint block with elapsed duration when the retrieved memories contain two distinct dates matching the queried event terms.
2. AUDN session timestamp threading (`feat(audn)`)
Adds an `observed_at` companion to `created_at` on stored memory rows. Logical session timestamps from ingest now flow through canonical fact storage, projection storage, supersede, and clarification writes. Without this, mutations recorded during transcript replay all stamped wall-clock `created_at`, breaking temporal ordering on those slices.
3. Chunked extraction fallback (`feat(extraction)`)
Adds a default-off `CHUNKED_EXTRACTION_FALLBACK_ENABLED` flag. When normal extraction returns zero facts on a conversation longer than the configured chunk size, the consensus path retries with chunked extraction.
Also fixes a runtime-config bug: `extractOnce` now branches directly on `config.extractionCacheEnabled` rather than always routing through `cachedExtractFacts` (which reads the singleton and silently ignored `config_override.extractionCacheEnabled=false`).
4. Conflict policy fixes (`fix(conflict-policy)`)
Codex review trail
Codex reviewed in 5 passes. All findings addressed:
Verification
🤖 Generated with Claude Code