feat: temporal repeated-event emphasis + AUDN session timestamps + extraction fallback by ethanj · Pull Request #47 · atomicmemory/atomicmemory-core

ethanj · 2026-04-25T06:30:03Z

Summary

Four feature concerns shipped as one branch, each with regression tests:

1. Repeated-event temporal endpoint formatting (`feat(retrieval)`)

Adds a query-aware temporal endpoint block to tiered injection. Recognizes "first ... second" temporal questions (e.g. "How many months between the first and second appointment?") and emits a compact two-endpoint block with elapsed duration when the retrieved memories contain two distinct dates matching the queried event terms.

Concept-group matching: a candidate must hit a synonym in EVERY query concept group (doctor AND appointment), not just one — partial matches like "only doctor" + "only appointment" no longer falsely become endpoints.
Endpoint block tokens are subtracted from the tier-assignment budget up front and counted in `estimatedContextTokens`, so the block never silently exceeds the caller's budget.
Plural↔canonical synonym resolution via reverse index (`appointments` → `appointment`).
Extracted `query-term-visibility.ts` and `temporal-format.ts` modules to keep `retrieval-format.ts` focused.
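
A minimal sketch of the concept-group rule and the plural reverse index described above, not the code from `temporal-endpoint-evidence.ts`. The synonym table contents and the names `SYNONYMS`, `REVERSE_INDEX`, `tokenize`, `conceptGroups`, and `matchesAllGroups` are assumptions made for illustration.

```typescript
// Hypothetical synonym table: canonical concept -> surface forms (illustrative only).
const SYNONYMS: Record<string, string[]> = {
  doctor: ["doctor", "doctors", "physician"],
  appointment: ["appointment", "appointments", "visit", "check-up"],
};

// Reverse index: every surface form, plurals included, maps back to its canonical concept.
const REVERSE_INDEX = new Map<string, string>();
for (const canonical of Object.keys(SYNONYMS)) {
  for (const form of SYNONYMS[canonical] ?? []) REVERSE_INDEX.set(form, canonical);
}

// Lowercase word tokens; hyphenated words such as "check-up" stay whole.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z]+(?:-[a-z]+)*/g) ?? [];
}

// Resolve the query to the distinct concept groups it mentions.
function conceptGroups(query: string): string[] {
  const groups = new Set<string>();
  for (const token of tokenize(query)) {
    const canonical = REVERSE_INDEX.get(token);
    if (canonical !== undefined) groups.add(canonical);
  }
  return Array.from(groups);
}

// A candidate qualifies only if it hits at least one synonym in EVERY group.
function matchesAllGroups(memoryText: string, groups: string[]): boolean {
  if (groups.length === 0) return false;
  const tokens = new Set(tokenize(memoryText));
  return groups.every((group) =>
    (SYNONYMS[group] ?? []).some((form) => tokens.has(form)),
  );
}
```

Under these assumptions, a memory that mentions only "doctor" fails the check for a "doctor appointments" query, while one that mentions both concepts passes.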

2. AUDN session timestamp threading (`feat(audn)`)

Adds an `observed_at` companion to `created_at` on stored memory rows. Logical session timestamps from ingest now flow through canonical fact storage, projection storage, supersede, and clarification writes. Without this, mutations recorded during transcript replay all stamped wall-clock `created_at`, breaking temporal ordering on those slices.

`StoreMemoryInput` accepts `observedAt` (default = `createdAt`).
`storeProjection` groups its trailing args into `StoreProjectionOptions` (`cmoId`, `logicalTimestamp`, `workspace`) — avoids a fifth positional argument.
AUDN clarify, opinion-confidence-collapse, and supersede write paths thread the timestamp.
Composite generation in `ingest-post-write` picks up `observedAt`.
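
The defaulting rule and the options bag can be sketched roughly as follows. Only `observedAt`, `createdAt`, `cmoId`, `logicalTimestamp`, and `workspace` come from this PR; every other name and shape here (`StoreMemoryInputSketch`, `toRow`, `storeProjectionSketch`, the `content` field) is hypothetical, and stamping `created_at` from the wall clock is an assumption of the sketch.

```typescript
// Hypothetical input shape; only observedAt/createdAt are named in the PR.
interface StoreMemoryInputSketch {
  content: string;
  createdAt: Date;
  observedAt?: Date; // logical session time; falls back to createdAt when absent
}

interface StoredRowSketch {
  content: string;
  created_at: string;
  observed_at: string;
}

// Repository-layer defaulting: observed_at mirrors created_at unless the caller supplies it.
function toRow(input: StoreMemoryInputSketch): StoredRowSketch {
  const observedAt = input.observedAt ?? input.createdAt;
  return {
    content: input.content,
    created_at: input.createdAt.toISOString(),
    observed_at: observedAt.toISOString(),
  };
}

// Options bag in place of trailing positional arguments.
interface StoreProjectionOptionsSketch {
  cmoId?: string;
  logicalTimestamp?: Date;
  workspace?: string;
}

// Assumption of this sketch: created_at is stamped from the wall clock,
// while the logical session timestamp lands in observed_at.
function storeProjectionSketch(
  content: string,
  options: StoreProjectionOptionsSketch = {},
): StoredRowSketch {
  return toRow({
    content,
    createdAt: new Date(),
    observedAt: options.logicalTimestamp,
  });
}
```

With an options object, call sites that only need `logicalTimestamp` do not have to pass placeholder values for `cmoId` or `workspace`.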

3. Chunked extraction fallback (`feat(extraction)`)

Adds a default-off `CHUNKED_EXTRACTION_FALLBACK_ENABLED` flag. When normal extraction returns zero facts on a conversation longer than the configured chunk size, the consensus path retries with chunked extraction.

Also fixes a runtime-config bug: `extractOnce` now branches directly on `config.extractionCacheEnabled` rather than always routing through `cachedExtractFacts` (which reads the singleton and silently ignored `config_override.extractionCacheEnabled=false`).
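
A rough sketch of the fallback condition, with the config passed explicitly rather than read from a singleton. It is synchronous for brevity (real extraction would presumably be asynchronous), and `ExtractionConfigSketch`, `chunkSizeChars`, `splitIntoChunks`, and `extractWithFallback` are hypothetical names, as is the character-based chunking.

```typescript
// Hypothetical per-request config; the flag mirrors CHUNKED_EXTRACTION_FALLBACK_ENABLED.
interface ExtractionConfigSketch {
  chunkedExtractionFallbackEnabled: boolean;
  chunkSizeChars: number; // assumed unit for the configured chunk size
}

type ExtractFnSketch = (text: string) => string[];

// Naive fixed-width chunking, for illustration only.
function splitIntoChunks(text: string, size: number): string[] {
  const step = Math.max(1, size);
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += step) chunks.push(text.slice(i, i + step));
  return chunks;
}

// Retry with chunked extraction only when the flag is on, the first pass
// found nothing, and the conversation is longer than one chunk.
function extractWithFallback(
  text: string,
  extract: ExtractFnSketch,
  config: ExtractionConfigSketch,
): string[] {
  const facts = extract(text);
  const shouldRetry =
    config.chunkedExtractionFallbackEnabled &&
    facts.length === 0 &&
    text.length > config.chunkSizeChars;
  if (!shouldRetry) return facts;

  const recovered: string[] = [];
  for (const chunk of splitIntoChunks(text, config.chunkSizeChars)) {
    recovered.push(...extract(chunk));
  }
  return recovered;
}
```

Because the config is an argument, a per-request override changes behavior without touching module-level state, which is the same property the `extractOnce` fix relies on.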

4. Conflict policy fixes (`fix(conflict-policy)`)

Stop matching medical "check-up" wording as uncertainty (the bare "check" marker fired on routine medical phrases). Replaced with regexes that only match real uncertainty: `need/needs/will/should to check`, `check later/tomorrow/again/back`.
A CLARIFY decision that carries an explicit replacement signal ("replacing X", "no longer Y", "correction: ...") now upgrades to SUPERSEDE only when the target ID is present in the candidate set; otherwise it stays CLARIFY. The previous fall-through to ADD silently kept the stale memory active alongside the new one. Stale or invalid target IDs that AUDN may return now keep CLARIFY rather than producing a SUPERSEDE that downstream rejects.
Refactored `applyClarificationOverrides` from a 14-cyclomatic if/else chain into a POLICIES list of small named transformers; dispatcher is a 6-line loop.
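
The policy-list shape and the valid-target guard could look roughly like this. It is a sketch under assumptions, not the contents of `conflict-policy.ts`: the type names, `REPLACEMENT_SIGNAL`, `explicitReplacement`, and `applyOverridesSketch` are hypothetical, and only a single policy is shown.

```typescript
type AudnActionSketch = "ADD" | "CLARIFY" | "SUPERSEDE";

interface AudnDecisionSketch {
  action: AudnActionSketch;
  targetMemoryId: string | null;
}

interface PolicyContextSketch {
  factText: string;
  candidates: { id: string }[];
}

// Each policy returns null to defer, or a decision to commit.
type PolicySketch = (
  decision: AudnDecisionSketch,
  ctx: PolicyContextSketch,
) => AudnDecisionSketch | null;

// Illustrative signal list, taken from the phrases quoted in this PR.
const REPLACEMENT_SIGNAL = /\b(?:replacing|no longer|instead of)\b|\bcorrection:/i;

const explicitReplacement: PolicySketch = (decision, ctx) => {
  if (decision.action !== "CLARIFY" || !REPLACEMENT_SIGNAL.test(ctx.factText)) {
    return null; // not this policy's case; defer to the next one
  }
  const targetIsCandidate =
    decision.targetMemoryId !== null &&
    ctx.candidates.some((c) => c.id === decision.targetMemoryId);
  if (!targetIsCandidate) return decision; // missing or stale target: keep CLARIFY
  const upgraded: AudnDecisionSketch = { ...decision, action: "SUPERSEDE" };
  return upgraded;
};

const POLICIES: PolicySketch[] = [explicitReplacement];

// Dispatcher: the first policy that commits wins; otherwise the decision is unchanged.
function applyOverridesSketch(
  decision: AudnDecisionSketch,
  ctx: PolicyContextSketch,
): AudnDecisionSketch {
  for (const policy of POLICIES) {
    const result = policy(decision, ctx);
    if (result !== null) return result;
  }
  return decision;
}
```

Adding a policy then means appending one small named function to `POLICIES`, which keeps the dispatcher's complexity flat.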

Codex review trail

Codex reviewed in 5 passes. All findings addressed:

Pass 1 (HIGH): `extractionCacheEnabled` runtime override bypass — fixed in `extractOnce`.
Pass 1 (MEDIUM): endpoint block tokens unbudgeted — counted in `estimatedContextTokens`, subtracted from assignment budget up front.
Pass 1 (MEDIUM): plural-asymmetric synonym expansion — reverse index added.
Pass 2 (MEDIUM): partial-match false endpoints — concept-group requirement.
Pass 2 (MEDIUM): CLARIFY + explicit replacement → ADD silently kept stale memory — upgrades to SUPERSEDE on valid target.
Pass 2 (LOW): new fallow complexity_moderate baseline entry — refactored function below threshold, baseline cleaned.
Pass 3 (MEDIUM): SUPERSEDE on stale targetMemoryId not in candidates — added `candidates.some` check.
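
For the Pass 1 budgeting finding, the idea of reserving the endpoint block's tokens before assignment can be sketched as below. This is a simplification: it includes or skips whole memories instead of assigning tiers, the 4-characters-per-token estimate is a stand-in for a real tokenizer, and `estimateTokens`, `PackResultSketch`, and `packWithEndpointBlock` are hypothetical names.

```typescript
// Rough stand-in for a tokenizer (assumption: about 4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface PackResultSketch {
  included: string[];
  endpointBlock: string | null;
  estimatedContextTokens: number;
}

function packWithEndpointBlock(
  memories: string[],
  endpointBlock: string | null,
  budget: number,
): PackResultSketch {
  // Assumption of this sketch: a block that cannot fit at all is dropped.
  const block =
    endpointBlock !== null && estimateTokens(endpointBlock) <= budget
      ? endpointBlock
      : null;

  // Reserve the block's tokens up front so memories only compete for the rest.
  let remaining = budget - (block !== null ? estimateTokens(block) : 0);

  const included: string[] = [];
  for (const memory of memories) {
    const cost = estimateTokens(memory);
    if (cost > remaining) continue; // does not fit in what is left
    included.push(memory);
    remaining -= cost;
  }

  // The reported total counts the block as well as the memories.
  return { included, endpointBlock: block, estimatedContextTokens: budget - remaining };
}
```

Building the block first is what makes the reservation possible; appending it after assignment would leave its tokens outside both the budget and the reported total.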

Verification

`npx tsc --noEmit` clean
`npx fallow audit` clean ("No issues in 23 changed files")
`npm run build` succeeds
Focused tests: 83/83 passing across conflict-policy, temporal-endpoint-evidence, retrieval-format, consensus-extraction-runtime-config, audn-workspace-scope-fence, ingest-post-write, memory-ingest-runtime-config, memory-storage-runtime-config, memory-route-config-override

🤖 Generated with Claude Code

The UNCERTAIN_MARKERS list contained the bare token "check", which fired on routine medical phrases like "check-up with the doctor" and routed those facts to clarification instead of ADD. Replace it with a pair of regexes that only match real uncertainty wording:

- "need/needs/needed/will/should to check"
- "check later/tomorrow/again/back"

Adds a regression test covering "Sam had a check-up with Sam's doctor".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
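
One way the two patterns might be written is shown below. The exact regexes in the repository are not quoted in this PR, so these are an approximation, and `UNCERTAINTY_PATTERNS` and `hasUncertaintyWording` are hypothetical names.

```typescript
// Approximations of the two patterns described above (not the repository's regexes).
const UNCERTAINTY_PATTERNS: RegExp[] = [
  // "need/needs/needed/will/should (to) check", but not "... check-up"
  /\b(?:need|needs|needed|will|should)\s+(?:to\s+)?check\b(?!-)/i,
  // "check later/tomorrow/again/back"
  /\bcheck\s+(?:later|tomorrow|again|back)\b/i,
];

// No global flag on the patterns, so RegExp.test keeps no state between calls.
function hasUncertaintyWording(text: string): boolean {
  return UNCERTAINTY_PATTERNS.some((pattern) => pattern.test(text));
}
```

The bare token "check" no longer matches by itself, so "check-up" in a routine medical sentence does not route the fact to clarification.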

…tion

Adds a default-off CHUNKED_EXTRACTION_FALLBACK_ENABLED flag. When normal extraction returns zero facts on a conversation longer than the configured chunk size, the consensus path now retries with chunked extraction. This recovers from extraction failures on long inputs without enabling chunked extraction unconditionally.

Refactors chunkedExtractFacts to take its config as an explicit argument instead of reading the module-level singleton, so per-request runtime overrides flow through. extractOnce now also branches directly on runtime extractionCacheEnabled rather than always routing through cachedExtractFacts (which reads the singleton internally); this lets config_override.extractionCacheEnabled actually take effect during benchmark sweeps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds an observed_at companion to created_at on stored memory rows and threads logical session timestamps from the ingest path through to canonical fact storage, projection storage, supersede, and clarification writes. Without this, mutations recorded during transcript replay (and benchmark sweeps with explicit session timestamps) all stamped wall-clock created_at, breaking temporal ordering and ranking on those slices.

- StoreMemoryInput accepts observedAt; defaults to createdAt at the repository layer.
- storeProjection groups its trailing arguments into a StoreProjectionOptions object (cmoId, logicalTimestamp, workspace). Now that the call sites need three optional fields, the bag avoids adding a fifth positional argument.
- AUDN clarify, opinion-confidence-collapse, and supersede write paths pass through the logical timestamp.
- Composite generation in ingest-post-write picks up observedAt to match created_at.
- Unit tests cover the timestamp threading at storage, ingest, AUDN, composite, and integration (temporal-mutation-regression) layers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a query-aware temporal endpoint block to tiered injection. Recognizes repeated-event temporal questions (e.g. "How many months between the first and second appointment?") and emits a compact two-endpoint summary plus the elapsed duration when retrieved memories contain two distinct dates matching the queried event terms.

- temporal-endpoint-evidence.ts: identifies the question shape, scores candidate memories by event-term overlap, picks the earliest two distinct dates, emits the block. Synonym table covers the common appointment/doctor surface; reverse index resolves plurals back to the canonical singular so "first and second appointments" expands via "appointment".
- retrieval-format.ts: builds the endpoint block before tier assignment so its tokens are subtracted from the assignment budget and counted in estimatedContextTokens. Otherwise the appended block would silently exceed the caller's budget and underreport packaged tokens.
- query-term-visibility.ts: extracted from retrieval-format. Same upgrade-tier-when-query-terms-hidden logic, just split out so retrieval-format stays under the file-size guideline and the helper is independently testable.
- temporal-format.ts: shared formatDateLabel + formatDuration so retrieval-format and temporal-endpoint-evidence don't each carry their own copy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
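
The "earliest two distinct dates plus elapsed duration" step might be sketched like this. It assumes candidates have already passed the event-term match and carry ISO `YYYY-MM-DD` dates; the block layout and the names `DatedMemorySketch`, `earliestTwoDistinctDates`, `wholeMonthsBetween`, and `endpointBlockSketch` are hypothetical and do not reflect the actual `formatDateLabel` or `formatDuration` output.

```typescript
// Assumption: each candidate already matched the event terms and has an ISO date.
interface DatedMemorySketch {
  text: string;
  date: string; // "YYYY-MM-DD", which sorts chronologically as a plain string
}

// Earliest two distinct dates, or null when fewer than two exist.
function earliestTwoDistinctDates(
  memories: DatedMemorySketch[],
): [string, string] | null {
  const distinct = Array.from(new Set(memories.map((m) => m.date))).sort();
  const [first, second] = distinct;
  if (first === undefined || second === undefined) return null;
  return [first, second];
}

function parseIsoDate(iso: string): { y: number; m: number; d: number } {
  return {
    y: Number(iso.slice(0, 4)),
    m: Number(iso.slice(5, 7)),
    d: Number(iso.slice(8, 10)),
  };
}

// Whole calendar months elapsed; a partial trailing month is not counted.
function wholeMonthsBetween(startIso: string, endIso: string): number {
  const start = parseIsoDate(startIso);
  const end = parseIsoDate(endIso);
  const months = (end.y - start.y) * 12 + (end.m - start.m);
  return end.d < start.d ? months - 1 : months;
}

// Hypothetical block layout, for illustration only.
function endpointBlockSketch(memories: DatedMemorySketch[]): string | null {
  const endpoints = earliestTwoDistinctDates(memories);
  if (endpoints === null) return null;
  const [first, second] = endpoints;
  const months = wholeMonthsBetween(first, second);
  return `First: ${first}\nSecond: ${second}\nElapsed: ${months} months`;
}
```

Two memories dated the same day collapse to one endpoint here, so the sketch emits no block unless two distinct dates are present.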

PR #46's paydown landed StoreMemoryInput centralization, response-schema namespace imports, and the postJson test helper into main while this branch was open. Rebasing onto the new main shifted line numbers in the accepted-debt entries; regenerate the baselines so fallow audit compares against the post-paydown layout. The original baseline-refresh commit from this branch was dropped during rebase since #46's baselines superseded it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ndidates

Previously buildRepeatedEventEndpointBlock flattened every query event term and its synonyms into one list, then accepted any memory with at least one match. For "first and second doctor appointment", a memory mentioning only "doctor" and another mentioning only "appointment" would falsely become the two endpoints, although neither proves the combined event happened.

Group synonyms by canonical concept (doctor synonyms vs appointment synonyms), and require a candidate to hit at least one synonym in EVERY group.

Adds a regression test covering the partial-match false-positive case codex flagged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…t, refactor to policy chain

Two related changes:

1. CLARIFY decisions with an explicit replacement signal ("replacing X", "no longer Y", "instead of Z", "correction: ...") were being promoted to ADD. promoteToAdd clears targetMemoryId, leaving the stale memory active alongside the new one, which silently fails the user's explicit replacement intent for current-state facts like "replacing Alice Morgan with Bob Chen". When AUDN identified a target, upgrade to SUPERSEDE so the stale memory is expired. When no target was identified, keep CLARIFY rather than fall through to ADD; the user asked for a replacement we can't pin down, so defer to them.

2. applyClarificationOverrides was at 14 cyclomatic / 19 cognitive, above fallow's moderate threshold and growing every time a new policy was added. Refactor into a POLICIES list of small, named transformers; the dispatcher becomes a 6-line loop. Each policy returns null to defer or a transformed AUDNDecision to commit.

Adds two regression tests: CLARIFY+target+replacement → SUPERSEDE, and CLARIFY+no-target+replacement → keep CLARIFY.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…aseline

The applyClarificationOverrides refactor in the previous commit cleared the moderate-complexity finding for conflict-policy.ts (the dispatcher went from 14 cyclomatic to ~3, with each policy now a separately-named small function). Re-save the baseline to remove the stale entry: per the workspace rule, fix fallow findings rather than baseline them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…seding on CLARIFY

The previous fix upgraded CLARIFY+explicit-replacement to SUPERSEDE whenever decision.targetMemoryId was set, but didn't verify the target was actually in the candidate set. If AUDN returned a stale or invalid target ID, memory-audn rejects the SUPERSEDE (target not found) and falls back to canonical storage, which silently leaves the old memory active. That is the same bug the SUPERSEDE upgrade was meant to fix.

Check candidates.some(c => c.id === targetMemoryId) before superseding; keep CLARIFY otherwise.

Adds a regression test using a target ID that's not in the candidate set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…seline ratchet

CI's shrink-only baseline ratchet flagged a +1 dupes entry on this branch. The cause: PR #46 unified StoreMemoryInput in repository-types and re-exported it from stores.ts and repository-write.ts, but missed memory-repository.ts which still carried its own local copy. With this branch's observedAt addition, the local copy fell back into clone-group overlap with the centralized type.

Drop the local StoreMemoryInput from memory-repository.ts and import the centralized one from repository-types instead. Regenerate baselines to reflect the current line numbers.

Net: dupes baseline 30 -> 29 entries (matches origin/main); ratchet script reports "dupes: 29 -> 29 (unchanged)".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ethanj and others added 10 commits April 24, 2026 23:10

ethanj merged commit eb43e1b into main Apr 25, 2026
1 check passed

ethanj deleted the benchmark/temporal-repeated-event-emphasis branch April 25, 2026 06:38
