fix(memory): harden curation tier — no silent LLM no-op + lossless update guard by Aaronontheweb · Pull Request #1242 · netclaw-dev/netclaw

Aaronontheweb · 2026-05-31T02:30:45Z

Two safety fixes to the memory curation tier (MemoryCurationActor + helpers), both about the tier failing quietly. Surfaced while measuring the dedup path against a real store — context and the broader plan in #1241.

1. The LLM curation tier silently no-ops on reasoning models

TryLlmEvaluationAsync called the model with MaxOutputTokens = 50. A reasoning model spends that budget on hidden thinking and returns empty content, so ParseResponse got nothing and the tier silently fell through to deterministic auto-resolution — the curation LLM effectively disabled, with no signal. That's a "no silent fallback" violation.

Fix

Bump the curation call to MaxOutputTokens = 512 so a thinking model has room to emit the bare keyword the prompt asks for; non-reasoning models stop after the keyword and pay nothing extra.
ParseResponse strips inline <think>…</think> traces before matching, for stacks that inline reasoning in the content.
When nothing parses, log curation_llm_no_decision (Warning) instead of returning null silently — the deterministic fallback is now observable.

Fully suppressing reasoning at the provider level is a follow-up; this hardens the failure mode.
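The parse-and-warn behaviour described above can be sketched roughly as follows. This is a Python illustration, not the PR's C# code: the keyword set, the handling of an unclosed `<think>` block (treated here as reasoning that runs to the end of the output), and the `content_length` log field are all assumptions. Only the `curation_llm_no_decision` event name and the idea of stripping `<think>…</think>` before matching come from the description.

```python
import logging
import re

log = logging.getLogger("curation")

# Hypothetical keyword set; the PR only says the prompt asks for a bare keyword.
DECISIONS = ("UPDATE", "SKIP", "CREATE")

# Closed <think>...</think> blocks, matched non-greedily across newlines.
_CLOSED_THINK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
# Assumption: an unclosed <think> is reasoning that runs to the end of the text.
_OPEN_THINK = re.compile(r"<think>.*\Z", re.DOTALL | re.IGNORECASE)


def strip_think_blocks(text):
    """Remove inline reasoning traces so they cannot be mistaken for a decision."""
    text = _CLOSED_THINK.sub("", text)
    text = _OPEN_THINK.sub("", text)
    return text.strip()


def parse_response(content):
    """Return the first decision keyword found outside think blocks, or None."""
    cleaned = strip_think_blocks(content or "")
    pattern = r"\b(" + "|".join(DECISIONS) + r")\b"
    match = re.search(pattern, cleaned, re.IGNORECASE)
    if match is None:
        # Make the deterministic fallback observable instead of silent.
        log.warning("curation_llm_no_decision content_length=%d", len(content or ""))
        return None
    return match.group(1).upper()
```

Stripping before matching matters because a reasoning trace can mention several keywords while deliberating: in `"<think>maybe update?</think>\nSKIP"` the only keyword that should count is the final `SKIP`.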

2. UPDATE can silently overwrite richer memories with narrower proposals

The UPDATE path does markdown_body = excluded.markdown_body, a wholesale replace. When a proposal is narrower than, or divergent from, the memory it updates, the existing content is destroyed. I measured this clobbering dated point-in-time readings (newer overwriting older): real, irreversible data loss.
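To make the failure concrete, here is a small Python sketch of a wholesale-replace upsert using an in-memory SQLite database. The table name, the `id` column, and the sample readings are invented for illustration, and SQLite itself is an assumption about the store; only the `markdown_body = excluded.markdown_body` clause is taken from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; only the markdown_body column name comes from the PR.
conn.execute(
    "CREATE TABLE memories (id TEXT PRIMARY KEY, markdown_body TEXT NOT NULL)"
)

# Upsert that replaces the whole body on conflict (needs SQLite 3.24 or newer).
UPSERT = """
INSERT INTO memories (id, markdown_body) VALUES (?, ?)
ON CONFLICT(id) DO UPDATE SET markdown_body = excluded.markdown_body
"""

# Existing memory holding two dated readings (made-up sample data).
conn.execute(UPSERT, ("m1", "2026-05-01: reading A\n2026-05-15: reading B"))
# A narrower proposal for the same memory carries only the newest reading.
conn.execute(UPSERT, ("m1", "2026-05-30: reading C"))

body = conn.execute(
    "SELECT markdown_body FROM memories WHERE id = ?", ("m1",)
).fetchone()[0]
print(body)  # only the newest reading survives; A and B are gone
```

Nothing in the statement itself distinguishes an enrichment from a replacement, which is why the guard has to sit in front of it.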

Fix — CurationRulesEvaluator.GuardDestructiveUpdate, applied to both rules- and LLM-tier UPDATE decisions: an UPDATE only proceeds when the proposal preserves the existing content (the existing body is wholly contained in the proposal). Otherwise it downgrades to Skip and keeps the existing memory.

Deliberate tradeoff (Option A): when a proposal isn't a superset, its new detail is dropped in favor of keeping the existing memory intact — a dropped line is recoverable, destroyed content is not. Note this also means a genuine value-replace ("v1.0" → "v2.0") will now Skip rather than overwrite (keeping the old value), since "2.0" doesn't contain "1.0". Clean handling of value-replacements and lossless enrich both need the LLM-synthesis / record-classification work tracked in #1241; this is the minimal stop-the-data-loss guard until then.
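A minimal Python sketch of the guard's rule, under stated assumptions: the `Decision` type, the function signature, and the normalization (collapse whitespace, fold case) are hypothetical stand-ins for the C# `CurationRulesEvaluator.GuardDestructiveUpdate`, and "missing target" is assumed to mean that no existing body could be loaded.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    kind: str                        # e.g. "UPDATE" or "SKIP"
    target_id: Optional[str] = None  # memory the decision applies to


def _normalize(text):
    # Collapse whitespace runs and fold case so formatting-only
    # differences are not counted as content loss.
    return re.sub(r"\s+", " ", text).strip().casefold()


def guard_destructive_update(decision, proposal_body, existing_body):
    """Downgrade an UPDATE to SKIP unless the proposal contains the existing body."""
    if decision.kind != "UPDATE":
        return decision  # non-UPDATE decisions pass through unchanged
    if existing_body is None:
        return decision  # assumption: missing target passes through
    if _normalize(existing_body) in _normalize(proposal_body):
        return decision  # proposal is a superset, nothing is lost
    return Decision("SKIP", decision.target_id)


update = Decision("UPDATE", "m1")
# Superset: the existing text survives inside the proposal, so the UPDATE proceeds.
print(guard_destructive_update(update, "Release is v1.0\nShips in June", "release is  v1.0").kind)
# Value replace: the proposal does not contain the old body, so it is skipped.
print(guard_destructive_update(update, "Release is v2.0", "Release is v1.0").kind)
```

The second call shows the tradeoff described above: a genuine value replacement is indistinguishable from content loss under a containment check, so it is skipped along with the destructive cases.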

Relation to #1241 — partial step, does not close it

This is intentionally not a resolution of #1241; it stays open after this merges. It ships only the data-loss stopgap from #1241's Problem 2 — the destructive-overwrite guard (Option A), which keeps the existing memory rather than letting a narrower auto-curation proposal clobber it.

Still open in #1241:

Problem 1 (coverage) — embedding-based candidate retrieval so paraphrase duplicates reach the gate at all. Untouched here.
Problem 2, next tranche — the lossless merge/enrich that lets the agent reword and consolidate without dropping facts. The guard here deliberately blocks reword-overwrites in the automatic pipeline; proper consolidation needs LLM synthesis + a fact-preservation check, not a string guard.

(Scope note: the guard applies only to the automatic curation pipeline. Explicit update_memory edits go through a separate find-and-replace path and are not affected.)

Tests / gates

New unit tests: CurationPromptBuilderTests (think-block stripping, unclosed think), CurationRulesEvaluatorTests (guard: drops-content→skip, superset→allow, whitespace/case-insensitive, non-update passthrough, missing-target passthrough).
slopwatch analyze clean; copyright headers present.
Touches the memory pipeline, so ./evals/run-evals.sh should run in CI before merge.

…date guard

1. Silent LLM no-op: the curation call used MaxOutputTokens=50, so a reasoning model burned the budget on hidden thinking and returned empty content; the tier then silently fell through to deterministic resolution. Bump to 512, strip inline <think>...</think> in ParseResponse, and log curation_llm_no_decision instead of returning null silently.

2. Destructive UPDATE: the update path overwrote markdown_body wholesale, so a narrower/divergent proposal destroyed the existing memory's content. GuardDestructiveUpdate now only allows an UPDATE when the proposal preserves (is a superset of) the existing content; otherwise it downgrades to Skip and keeps the existing memory. Applied to both rules- and LLM-tier decisions.

Refs netclaw-dev#1241

Aaronontheweb

Relatively targeted change, aimed at preventing useful memories from being damaged by updates and at keeping garbage tokens out of the memory observer stream in the first place.

Aaronontheweb · 2026-05-31T13:56:51Z

        return null;
    }

+    private static string StripThinkBlocks(string text)


Only impacts thinking models, obviously, but will probably need to be hardened against more variants (e.g. incomplete thinking blocks) in the future. This is more of an issue for self-hosted inference than it is for cloud-hosted.

Aaronontheweb · 2026-05-31T14:10:30Z

+    /// divergent proposal's new detail, which is recoverable; destroying accumulated
+    /// content is not. Non-UPDATE decisions pass through unchanged.
+    /// </summary>
+    public static CurationDecision GuardDestructiveUpdate(


What this essentially does is stop old memories from being overwritten by something different unless the new content (proposed by the LLM) is a word-for-word superset of the old memory. It is designed simply to prevent valuable information from being lost.

Aaronontheweb · 2026-05-31T14:11:08Z

+                // real signal (empty/garbled output, or a reasoning model that consumed
+                // its token budget). Surface it so the deterministic fallback that
+                // follows is observable rather than invisible.
+                log.Warning(


Not a lot a user can do about this other than let us know; it's a model-alignment problem rather than a technical / code issue.

Merge branch 'dev' into fix/harden-memory-curation

f6eb1d0

Aaronontheweb added the memory (Memory formation, recall, curation pipeline) and enhancement (New feature or request) labels May 31, 2026

Aaronontheweb enabled auto-merge (squash) May 31, 2026 14:12

Aaronontheweb merged commit 6f63db3 into netclaw-dev:dev May 31, 2026
14 checks passed

Aaronontheweb deleted the fix/harden-memory-curation branch May 31, 2026 14:21

This was referenced May 31, 2026

feat(memory): dedup — the gate misses paraphrase duplicates, and the merge path can overwrite data #1241

Open

Prepare release v0.22.0 #1256

Merged

Aaronontheweb merged 2 commits into netclaw-dev:dev from Aaronontheweb:fix/harden-memory-curation
