[#17] Auto-snapshot data/ before destructive ops + xbrain snapshot subcommand by VGonPa · Pull Request #22 · VGonPa/xbrain

VGonPa · 2026-05-21T20:25:27Z

Closes #17.

Summary

A safety net for the three destructive XBrain commands. Before any of
vocab --regenerate, topics --resynth or fetch --force writes a byte,
the full data/ directory is copied to data/snapshots/<UTC-ts>-pre-<op>/
with a snapshot.json manifest. A bad re-run is now one command away from
recovery: xbrain snapshot restore <name>.

This is also the foundation for #18 (xbrain diff — compare two
snapshots, surface unexpected drift). Without snapshots, no diff.

What ships

New xbrain.snapshot module — pure lifecycle: snapshot_create,
snapshot_pre, snapshot_list, snapshot_show, snapshot_restore,
snapshot_prune. Pydantic manifest with counts, command, UTC timestamp,
xbrain_version.
Five new CLI verbs: xbrain snapshot {create,list,show,restore,prune}.
Auto-snapshot hooks wired into the three destructive code paths in
cli.py. A snapshot failure propagates and aborts the destructive op —
never silently skipped.
Vault not snapshotted (it is fully derivable from data/ via xbrain generate).

Spec deviation (worth flagging)

The PRD listed four destructive sites including enrich --regenerate. The
CLI has no such flag — re-enrichment is triggered via vocab --regenerate,
which already calls _mark_for_regenerate. Three sites in total, not four.

Tests

Suite	Count	Why
`tests/test_snapshot.py`	14	Unit tests on the module. Empty data, full data, naming, list ordering, restore round-trip, restore deletes a live file missing from the snapshot, prune keep-last, prune=0, prune rejects negative, show on unknown raises.
`tests/test_snapshot_auto.py`	10	Integration via `CliRunner`. Each destructive flag creates the expected `pre-<op>` snapshot; absence of the flag creates none; every `snapshot` verb exercised end-to-end.

Total: 245 tests (up from 235). Coverage: 87% (no regression).
uv run poe check: all-green (the existing Radon C and Deptry warns
are pre-existing on develop, not introduced here).

Docs

README: new snapshot row in Commands; new "Snapshots & safety" section
with xbrain snapshot list/create/restore/prune examples; linked from TOC.
ARCHITECTURE: new invariant WS3 — enrichment evaluation harness #8 ("destructive ops are reversible"); each
destructive step card carries a "Snapshots data/ before --flag" bullet;
snapshot.py listed in "Where things live".

Specs

PRD: vault/zz-support-files/docs/prds/2026-05-21-xbrain-17-auto-snapshot.md
Plan: vault/zz-support-files/docs/implementation-plans/2026-05-21-xbrain-17-auto-snapshot.md

Test plan

Unit tests on the snapshot module
Integration tests for auto-snapshot wiring on each destructive command
CLI verbs exercised end-to-end via CliRunner
uv run poe check all-green
README + ARCHITECTURE updated

🤖 Generated with Claude Code

…bcommand Adds an automatic safety net: every destructive command (`vocab --regenerate`, `topics --resynth`, `fetch --force`) snapshots the full `data/` directory to `data/snapshots/<UTC-ts>-pre-<command>/` *before* it writes anything. If the re-run produces worse results, `xbrain snapshot restore <name>` brings the previous good state back. Foundation for #18 (`xbrain diff`). Implementation: - New `xbrain.snapshot` module with a pure, fully-typed lifecycle API: `snapshot_create`, `snapshot_pre`, `snapshot_list`, `snapshot_show`, `snapshot_restore`, `snapshot_prune`. Each artifact is copied via `shutil.copy2`, the manifest is written through the existing atomic-write helper. Counts in the manifest come from parsing the live files (yaml for vocab, json for the rest). - Five new CLI verbs under `xbrain snapshot {create,list,show,restore,prune}`. - Auto-snapshot wired into the three destructive code paths in `cli.py`: `_run_fetch` (when `--force`), `_vocab_apply` / `_vocab_run` (when `--regenerate`), `_topics_run` (when `--resynth`). A failed snapshot propagates and aborts the destructive op — never silently skipped. - The vault is intentionally NOT snapshotted (it is fully derivable from `data/` via `xbrain generate`). Spec deviation worth flagging: the PRD mentioned `enrich --regenerate` as a fourth destructive site, but the CLI has no such flag — re-enrichment is triggered via `vocab --regenerate`, which already calls `_mark_for_regenerate`. Three destructive sites in total (vocab/topics/fetch), not four. Tests: - 14 unit tests on the snapshot module (empty data, full data, naming rules, list ordering, restore round-trip, restore deletes a live file missing from the snapshot, restore does not touch unrelated files, prune keep-last, prune=0 clears everything, prune rejects negative, show on unknown raises). - 10 integration tests via `CliRunner`: every destructive flag creates the expected `pre-<op>` snapshot; the same flag absent creates none; every new `xbrain snapshot` CLI verb is exercised end-to-end. Total: 245 tests (up from 235), 87% coverage, `uv run poe check` all-green. Docs: - README: new `snapshot` row in the Commands table + new "Snapshots & safety" section linked from the TOC. - ARCHITECTURE: new invariant #8 ("destructive ops are reversible"); each of the three destructive cards gains a "Snapshots `data/` before <flag>" note; `snapshot.py` listed in "Where things live". Closes #17. PRD: vault/zz-support-files/docs/prds/2026-05-21-xbrain-17-auto-snapshot.md Plan: vault/zz-support-files/docs/implementation-plans/2026-05-21-xbrain-17-auto-snapshot.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ening Addresses every HIGH/CRITICAL finding from the review pipeline on PR #22 (code-reviewer, silent-failure-hunter, pr-test-analyzer, python-code-reviewer, code-simplifier, spec-compliance): snapshot.py: - snapshot_create returns (Path, SnapshotManifest) — callers (incl. _auto_snapshot) now print the item count from the manifest just written, matching PRD §5 observability ("Snapshot created: <path> (N items)"). - New `dir_label` parameter separates directory naming from manifest.command: manifest now records "vocab-regenerate" (the op name) while the directory carries the `pre-` prefix. Fixes the dual-purpose smell flagged by code-reviewer. - snapshot_pre removed — inlined in _auto_snapshot (code-simplifier). - Timestamp gains millisecond precision (`%Y-%m-%dT%H-%M-%S-NNNZ`). Eliminates the same-second collision bug flagged by pr-test-analyzer #9 and python-code-reviewer #2. As a side effect, the suite no longer needs `time.sleep(1.1)` between snapshots — total test runtime dropped from 7s to <1s. - snapshot_restore now uses `shutil.copy2` symmetrically with snapshot_create instead of the text round-trip via `_atomic_write`. Binary-safe, metadata- preserving, and no longer asymmetric (code-reviewer/python-code-reviewer both flagged this as the must-fix-before-merge issue). - snapshot_restore returns a list of (artifact, action) tuples — RESTORE_COPIED, RESTORE_DELETED, RESTORE_SKIPPED. The CLI prints every action, so a deletion from a "missing in snapshot" artifact is never silent (silent-failure-hunter #1 HIGH). - snapshot_list now returns rows with `manifest=None` for corrupt directories instead of silently dropping them; the CLI marks those as CORRUPT on stderr (silent-failure-hunter #2 HIGH). - _count_* helpers now propagate exceptions instead of swallowing them — a corrupt items.json aborts the snapshot, not records a lying count=0 (silent-failure-hunter #3, code-reviewer #3). Inlined the trivial _count_items/_count_topics wrappers (code-simplifier #1). - All imports (json, yaml, importlib.metadata) at the module top (code-reviewer #3). - _count_items/_count_topics removed (one-line wrappers, code-simplifier #1). cli.py: - _OPERATOR_ERRORS now includes OSError (covers PermissionError, FileExistsError, IsADirectoryError) so snapshot I/O failures surface as clean exit-1 instead of raw tracebacks (silent-failure-hunter #4). - _auto_snapshot now reads the count from the manifest and emits the spec- mandated English message: `Snapshot created: <dir> (N items)` (pr-test-analyzer #10, python-code-reviewer #3). - snapshot_restore_cmd echoes every per-artifact action. - snapshot_list_cmd handles `manifest=None` rows as CORRUPT (to stderr). - snapshot_create_cmd uses the new (path, manifest) return shape and passes `command="manual"` + `dir_label=name`. - Strings translated to English (the whole new subcommand group; the rest of the CLI stays Spanish — out of scope here). Tests: - test_snapshot.py: 21 unit tests (up from 14). New: round-trip across ALL FOUR artifacts (pr-test-analyzer #1 CRITICAL), millisecond-collision regression, corrupt-JSON-aborts-snapshot, dir_label separation from command, shutil.copy2-preserves-bytes (binary-safety smoke test), per-artifact action codes, xbrain_version assertion (pr-test-analyzer #5), prune-with-fewer-than- keep_last (pr-test-analyzer #6). - test_snapshot_auto.py: 16 integration tests (up from 10). New: snapshot- taken-before-mutation-when-op-fails (pr-test-analyzer #2 CRITICAL — uses monkeypatch to force `_mark_for_regenerate` to raise, asserts snapshot already on disk + items.json unchanged), snapshot-failure-aborts-destructive-op (pr-test-analyzer #3 CRITICAL — monkeypatch snapshot_create to raise OSError, assert fetch --force aborts and nothing is mutated), snapshot show CLI (pr-test-analyzer #7), restore-via-CLI-with-missing-artifact (pr-test-analyzer #8), corrupt-dirs-marked-via-CLI, stdout-includes-item-count (pr-test-analyzer #10). - All 258 tests pass; coverage 87%. CONTRIBUTING.md: - Added a "Safety: destructive operations auto-snapshot" section (spec-compliance #FAIL — closes the doc gap). Deviation log unchanged: 3 destructive sites (`vocab --regenerate`, `topics --resynth`, `fetch --force`) — `enrich --regenerate` does not exist as a CLI flag, re-enrichment happens via `vocab --regenerate` which is already covered. Spec-compliance reviewer confirmed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

VGonPa and others added 2 commits May 21, 2026 22:25

VGonPa merged commit 0cf7f64 into develop May 21, 2026
1 check passed

VGonPa deleted the ws-17-auto-snapshot branch May 21, 2026 20:35

This was referenced May 21, 2026

Auto-snapshot data/ before destructive operations #17

Closed

Wave: #17 auto-snapshot + #16 configurable output language #25

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[#17] Auto-snapshot data/ before destructive ops + xbrain snapshot subcommand#22

[#17] Auto-snapshot data/ before destructive ops + xbrain snapshot subcommand#22
VGonPa merged 2 commits into
developfrom
ws-17-auto-snapshot

VGonPa commented May 21, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

VGonPa commented May 21, 2026

Summary

What ships

Spec deviation (worth flagging)

Tests

Docs

Specs

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant