Skip to content

feat(cli/mem): add mem patch, mem hash, and reject empty-stdin in mem set#627

Merged
tlongwell-block merged 4 commits into
mainfrom
dawn/mem-patch-and-empty-guard
May 20, 2026
Merged

feat(cli/mem): add mem patch, mem hash, and reject empty-stdin in mem set#627
tlongwell-block merged 4 commits into
mainfrom
dawn/mem-patch-and-empty-guard

Conversation

@tlongwell-block
Copy link
Copy Markdown
Collaborator

What

Three changes to make sprout mem safe to use as an edit pipeline. Closes the silent-data-loss footgun where sprout mem get <slug> | <failing transform> | sprout mem set <slug> - commits an empty value when the transform errors, and adds mem patch as the safe primitive for editing memory slugs without rewriting the whole value.

Changes

1. mem set rejects empty stdin by default (the bug fix).
A zero-byte stdin read is now refused with a clear error. Pass --allow-empty to opt in. The literal positional argument form (mem set slug "") is still accepted — that's explicit intent, no pipeline involved. This closes a real footgun: an upstream Python/sed script that exits non-zero would previously close the pipe and mem set would commit empty content, silently destroying the slug.

2. mem patch <slug> — apply a unified diff to the current value (stdin or --patch-file).

Safety properties:

  • Strict context match. Powered by diffy (pure-Rust, MIT). Hunks whose context lines don't match the current value verbatim are rejected. Positional offset is fine (lines added before the hunk don't break it); content fuzz is not.
  • --base-hash <hex> required unless --no-base-hash is passed. The hash is checked against the live value before applying, so concurrent edits between two agents surface as a Conflict (exit 5) instead of silently overwriting.
  • Multi-file patches refused. A memory slug is one virtual file; a patch with multiple --- headers is ambiguous and almost certainly an operator mistake (e.g. piping a multi-file git diff).
  • Empty results refused unless --allow-empty.
  • --dry-run echoes the input patch verbatim (not regenerated, per review) plus the resulting sha256, then exits without writing.
  • On a real write, the new sha256 is printed to stderr so callers can chain edits without re-fetching.

3. mem hash <slug> — print sha256(value) in hex. Matches sprout mem get <slug> | sha256sum byte-for-byte, so the base-hash is trivially verifiable from the shell.

Why this design

Reviewed by Max (@npub1mprn...) before implementation. Key choices we landed on:

  • patch over replace as the primitive. A unified diff encodes intent + context, so a context mismatch fails loudly. Once patch exists, replace/edit can be sugar on top in follow-up PRs.
  • diffy over hand-rolled. Pure Rust, no shellout, strict apply or error. Hand-rolled patch parsers tend to become bug farms.
  • Base hash required by default. Concurrent-edit safety should be the default, not an opt-in flag agents have to remember.
  • Echo input patch verbatim in dry-run rather than regenerating from create_patch — regenerated form can reorder/normalize and confuse review.

Verification

End-to-end against a live relay (sacrificial slugs):

  • Seed → mem hash → diff → --dry-run → apply → re-hash → tombstone ✅
  • mem hash output matches mem get | shasum -a 256 byte-for-byte ✅
  • All five failure modes behave correctly:
    • Empty stdin to set without --allow-empty → user_error, slug untouched ✅
    • Stale base-hash → Conflict (exit 5) ✅
    • Missing --base-hash → user_error explaining how to capture one ✅
    • Mismatched patch context → user_error, slug untouched ✅
    • Empty patch stdin → user_error ✅
    • Multi-file patch → user_error ✅
    • --base-hash + --no-base-hash → user_error (mutually exclusive) ✅

Unit tests added:

  • SHA-256 vectors (empty, abc, abc\n) — lock that we hash bytes verbatim
  • Diffy strict-context refusal (proves diffy refuses content fuzz)
  • Diffy round-trip preserves content
  • Multi-file --- header counter

cargo test -p sprout-cli --lib (64 passed) + cargo clippy -p sprout-cli --all-targets -- -D warnings clean.

Out of scope (follow-up PRs)

  • mem edit — ergonomic $EDITOR wrapper around mem patch.
  • mem replace — exact-string-replace sugar. Likely not needed once edit exists.

DCO / authorship

Signed-off-by: tlongwell-block
Co-authored-by: Dawn (sprout agent)

… `mem set`

Three changes that make `sprout mem` safer to use as an edit pipeline,
addressing the silent-data-loss footgun where `sprout mem get <slug> |
<failing transform> | sprout mem set <slug> -` would commit an empty
value when the transform errored:

1. **`mem set` rejects empty stdin by default** — pass `--allow-empty`
   to opt in to the rare case where you really do want to clear a slug.
   A literal `""` positional argument is still accepted (explicit
   intent, no pipeline involved). This is the urgent fix.

2. **`mem patch <slug>`** — apply a unified diff (stdin or
   `--patch-file`) to the current value, with three safety properties
   the raw `get | transform | set` pipeline can't offer:
   - **Strict context match.** Hunks whose context lines don't match
     the current value verbatim are rejected. No fuzz on content;
     positional offset is fine.
   - **`--base-hash <hex>` is required** (unless `--no-base-hash`).
     The hash is checked against the live value before applying, so
     concurrent edits between two agents fail with a Conflict instead
     of silently overwriting each other.
   - **Refuses empty results** (same `--allow-empty` semantics as set).
   `--dry-run` shows the resulting diff without writing; on a real
   write, the new sha256 is printed to stderr so callers can chain
   edits without re-fetching.

3. **`mem hash <slug>`** — print sha256(value) in hex. Matches
   `sprout mem get <slug> | sha256sum`, so the value used for
   `--base-hash` is trivially verifiable from the shell.

Implementation uses the `diffy` crate (pure-Rust, MIT) for unified-diff
parsing and strict application. Unit tests pin the strict-context
behavior so a future diffy upgrade can't loosen it without us noticing.

End-to-end verified against a live relay: seed → hash → diff → dry-run
→ apply, plus all five failure modes (stale base-hash, missing
base-hash, mismatched context, empty patch stdin, conflicting flags).

Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Co-authored-by: Dawn (sprout agent) <c6237ef84fa537c78dcee78efd2d4e59f728859c7f194da42ac51ededfa0be05@sprout-oss.stage.blox.sqprod.co>
tlongwell-block and others added 3 commits May 20, 2026 11:47
Max caught this on review of PR #627: diffy::apply is strict on context
*content* but will slide a hunk forward/backward through the file to
find a position where the preimage matches. With our advertised "no
fuzz, no offset" semantics that's a real correctness gap — a patch
generated against `zero\nalpha\nbeta\ngamma\n` claiming to modify line
1 would silently land at line 2 instead of refusing.

Add `verify_hunks_at_declared_position` which, before calling
diffy::apply, checks that each hunk's preimage (Context + Delete lines)
matches the current value byte-for-byte at the line number the hunk
declares. Drift in line numbers means the file changed structurally
since the patch was generated; the strict mode wants the operator to
regenerate the patch rather than risk a silent off-target apply.

Test coverage:
- Max's exact case (offset-slide rejected; diffy would have accepted)
- Exact-position match (still accepted)
- Pure insertion into empty value (`@@ -0,0 +1,N @@`)
- No-trailing-newline value (matches diffy's strip-newline behavior on
  `\\ No newline at end of file` marker)

E2E verified against the live relay: offset-slide patch is rejected
with a precise diagnostic ("hunk #1 preimage mismatch at line 1: patch
expects ... but value has ..."), and the correctly-positioned patch
applies cleanly.

Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Co-authored-by: Dawn (sprout agent) <c6237ef84fa537c78dcee78efd2d4e59f728859c7f194da42ac51ededfa0be05@sprout-oss.stage.blox.sqprod.co>
Max called out "for multiple hunks, track the cumulative line delta from
previous hunks or validate against the original preimage." Each unified-
diff hunk's `@@ -N,M @@` references line numbers in the *original* file,
not in the file as modified by previous hunks — so validating each
hunk's preimage against the unmodified `current_lines` at the declared
position is correct, no cumulative-delta tracking needed.

E2E confirmed against the live relay: a two-hunk patch (`@@ -1 @@` +
`@@ -10 @@`) applies both edits cleanly.

Add a unit test that pins this so a future "be helpful and track
deltas" refactor doesn't accidentally break it.

Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Co-authored-by: Dawn (sprout agent) <c6237ef84fa537c78dcee78efd2d4e59f728859c7f194da42ac51ededfa0be05@sprout-oss.stage.blox.sqprod.co>
Max called out on PR #627 review: a pure-insertion hunk into a
non-empty value (`@@ -N,0 +N,M @@` with `N > 0`) is rejected because
there's no preimage to position-check against. The strict-mode safe
default is "refuse" rather than "land at an unverified position."

`diff -u` includes context lines by default, so users only hit this if
they hand-author a no-context insertion. Failure mode is rejection,
not corruption.

Document this in the code so the next person doesn't have to re-derive
it, and improve the error message to point at the workaround.

Signed-off-by: tlongwell-block <109685178+tlongwell-block@users.noreply.github.com>
Co-authored-by: Dawn (sprout agent) <c6237ef84fa537c78dcee78efd2d4e59f728859c7f194da42ac51ededfa0be05@sprout-oss.stage.blox.sqprod.co>
@tlongwell-block tlongwell-block merged commit c9dd97b into main May 20, 2026
15 checks passed
@tlongwell-block tlongwell-block deleted the dawn/mem-patch-and-empty-guard branch May 20, 2026 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant