Skip to content

WS2 Fase 1: enrichment core pipeline#1

Merged
VGonPa merged 23 commits into
developfrom
ws2-fase1-enrichment-core
May 18, 2026
Merged

WS2 Fase 1: enrichment core pipeline#1
VGonPa merged 23 commits into
developfrom
ws2-fase1-enrichment-core

Conversation

@VGonPa
Copy link
Copy Markdown
Owner

@VGonPa VGonPa commented May 18, 2026

WS2 Fase 1 — Enrichment core pipeline

Turns the paused enrich stub into a real, two-track enrichment pipeline.

What this adds

  • Data model: Enrichment.primary_topic, new Topic, Item.bookmark_folder (drops unused note_worthiness).
  • Config: [enrich] / [vocab] sections; claude-code is the default executor.
  • Declarative artifacts: rubrics/rubric-{summary,topics,vocab}.md + guardrails.yaml, packaged inside src/xbrain/.
  • validate.py: mechanical validator — the LLM emits only judgment, code enforces structure.
  • vocab: map-reduce taxonomy induction into data/vocab.yaml.
  • Two executor tracks: api (Anthropic API, in-process) and the worksheet handoff (manual/claude-code, no API cost), plus the enriching-x-knowledge skill.
  • CLI: new xbrain vocab; rewritten xbrain enrich (--executor, --apply).
  • generate: topics + bookmark folder as Obsidian frontmatter tags.

Quality

18 tasks via subagent-driven-development; per-task + holistic review; 99 tests, ruff clean. Bookmark-folder extraction is a separate spike-driven follow-up (X internals need live confirmation).

Plan: zz-support-files/docs/implementation-plans/2026-05-17-xbrain-ws2-fase1-enrichment-core.md

🤖 Generated with Claude Code

VGonPa and others added 23 commits May 18, 2026 08:51
…r validation, robust JSON, dup topics

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ypes, dedup cleanups

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ient

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@VGonPa VGonPa merged commit 50b941e into develop May 18, 2026
@VGonPa VGonPa deleted the ws2-fase1-enrichment-core branch May 18, 2026 08:59
VGonPa added a commit that referenced this pull request May 21, 2026
…ening

Addresses every HIGH/CRITICAL finding from the review pipeline on PR #22
(code-reviewer, silent-failure-hunter, pr-test-analyzer, python-code-reviewer,
code-simplifier, spec-compliance):

snapshot.py:
- snapshot_create returns (Path, SnapshotManifest) — callers (incl. _auto_snapshot)
  now print the item count from the manifest just written, matching PRD §5
  observability ("Snapshot created: <path> (N items)").
- New `dir_label` parameter separates directory naming from manifest.command:
  manifest now records "vocab-regenerate" (the op name) while the directory
  carries the `pre-` prefix. Fixes the dual-purpose smell flagged by code-reviewer.
- snapshot_pre removed — inlined in _auto_snapshot (code-simplifier).
- Timestamp gains millisecond precision (`%Y-%m-%dT%H-%M-%S-NNNZ`). Eliminates
  the same-second collision bug flagged by pr-test-analyzer #9 and
  python-code-reviewer #2. As a side effect, the suite no longer needs
  `time.sleep(1.1)` between snapshots — total test runtime dropped from 7s to <1s.
- snapshot_restore now uses `shutil.copy2` symmetrically with snapshot_create
  instead of the text round-trip via `_atomic_write`. Binary-safe, metadata-
  preserving, and no longer asymmetric (code-reviewer/python-code-reviewer
  both flagged this as the must-fix-before-merge issue).
- snapshot_restore returns a list of (artifact, action) tuples — RESTORE_COPIED,
  RESTORE_DELETED, RESTORE_SKIPPED. The CLI prints every action, so a deletion
  from a "missing in snapshot" artifact is never silent (silent-failure-hunter
  #1 HIGH).
- snapshot_list now returns rows with `manifest=None` for corrupt directories
  instead of silently dropping them; the CLI marks those as CORRUPT on stderr
  (silent-failure-hunter #2 HIGH).
- _count_* helpers now propagate exceptions instead of swallowing them — a
  corrupt items.json aborts the snapshot, not records a lying count=0
  (silent-failure-hunter #3, code-reviewer #3). Inlined the trivial
  _count_items/_count_topics wrappers (code-simplifier #1).
- All imports (json, yaml, importlib.metadata) at the module top (code-reviewer #3).
- _count_items/_count_topics removed (one-line wrappers, code-simplifier #1).

cli.py:
- _OPERATOR_ERRORS now includes OSError (covers PermissionError, FileExistsError,
  IsADirectoryError) so snapshot I/O failures surface as clean exit-1 instead
  of raw tracebacks (silent-failure-hunter #4).
- _auto_snapshot now reads the count from the manifest and emits the spec-
  mandated English message: `Snapshot created: <dir> (N items)` (pr-test-analyzer
  #10, python-code-reviewer #3).
- snapshot_restore_cmd echoes every per-artifact action.
- snapshot_list_cmd handles `manifest=None` rows as CORRUPT (to stderr).
- snapshot_create_cmd uses the new (path, manifest) return shape and
  passes `command="manual"` + `dir_label=name`.
- Strings translated to English (the whole new subcommand group; the rest of
  the CLI stays Spanish — out of scope here).

Tests:
- test_snapshot.py: 21 unit tests (up from 14). New: round-trip across ALL FOUR
  artifacts (pr-test-analyzer #1 CRITICAL), millisecond-collision regression,
  corrupt-JSON-aborts-snapshot, dir_label separation from command,
  shutil.copy2-preserves-bytes (binary-safety smoke test), per-artifact action
  codes, xbrain_version assertion (pr-test-analyzer #5), prune-with-fewer-than-
  keep_last (pr-test-analyzer #6).
- test_snapshot_auto.py: 16 integration tests (up from 10). New: snapshot-
  taken-before-mutation-when-op-fails (pr-test-analyzer #2 CRITICAL — uses
  monkeypatch to force `_mark_for_regenerate` to raise, asserts snapshot
  already on disk + items.json unchanged), snapshot-failure-aborts-destructive-op
  (pr-test-analyzer #3 CRITICAL — monkeypatch snapshot_create to raise OSError,
  assert fetch --force aborts and nothing is mutated), snapshot show CLI
  (pr-test-analyzer #7), restore-via-CLI-with-missing-artifact
  (pr-test-analyzer #8), corrupt-dirs-marked-via-CLI, stdout-includes-item-count
  (pr-test-analyzer #10).
- All 258 tests pass; coverage 87%.

CONTRIBUTING.md:
- Added a "Safety: destructive operations auto-snapshot" section
  (spec-compliance #FAIL — closes the doc gap).

Deviation log unchanged: 3 destructive sites (`vocab --regenerate`,
`topics --resynth`, `fetch --force`) — `enrich --regenerate` does not exist
as a CLI flag, re-enrichment happens via `vocab --regenerate` which is
already covered. Spec-compliance reviewer confirmed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VGonPa added a commit that referenced this pull request May 22, 2026
Addresses every HIGH/MEDIUM finding from the 6-reviewer panel on PR #28
(code-reviewer + python-code-reviewer + spec-compliance + test-analyzer
APPROVED; silent-failure-hunter and simplifier flagged actionable items):

silent-failure-hunter MEDIUM #1: silent empty-diff on missing dirs.
- `diff_snapshots` now validates both directories exist on disk; raises
  FileNotFoundError with the missing path if not.
- Validates that at least one artifact exists on either side; if both are
  fully empty, raises FileNotFoundError naming both dirs (guards against
  the "data/ deleted out-of-band" scenario where diff would otherwise
  silently report 'everything was removed').

silent-failure-hunter MEDIUM #2: corrupt-file errors lacked context.
- Each loader call inside `diff_snapshots` is wrapped to add the path to
  the ValueError message ("failed to load <path>: <orig msg>"), so a
  malformed items.json / vocab.yaml / topics.json surfaces with the
  specific file rather than a bare pydantic / json traceback.

simplifier #1: `_tfidf_cosine` renamed to `_tf_cosine`.
- The function uses plain TF cosine, not TF-IDF (with only 2 documents,
  IDF degenerates). Docstring already explained this; renaming the
  symbol stops the name from lying. Module docstring + every call site
  + import updated.

simplifier #2: `VocabDiff.unchanged: list[str]` was only ever consumed
as `len(...)`. Replaced with `unchanged_count: int`. JSON output is
slightly smaller on large vocabs and the data shape stops promising
information the consumers don't read.

spec-compliance follow-up: `diff.py` added to the "Where things live"
tree in ARCHITECTURE.md.

Tests:
- Three new tests for the validation: missing dir → FileNotFoundError,
  both-empty → FileNotFoundError, corrupt items.json → ValueError with
  path in message.
- Existing tests updated for the `unchanged_count` rename.

Skipped (out of scope for this round, documented in task #88):
- test-analyzer polish (French tokenizer test, JSON schema-stability
  deeper assertions, secondary-topic-no-reassign test) — improvements
  not blockers.
- simplifier #3, #4 (drop TopicChange.unchanged, drop DiffSummary) —
  borderline derivability vs. JSON consumer ergonomics.
- simplifier #5, #6 (Literal["text","json"] dispatch, drop
  diff_snapshots threshold kwargs) — internal-only style.
- code-reviewer/python-code-reviewer naming nit (`reassigned_pct`
  carrying a fraction) — purely cosmetic, internal consistency intact.

Total: 329 tests (up from 326), coverage 89%, `uv run poe check`
all-green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VGonPa added a commit that referenced this pull request May 23, 2026
Addresses the actionable findings from PR #31's 3-reviewer panel
(code-reviewer SHIP IT; silent-failure-hunter + test-analyzer
flagged improvements):

silent-failure-hunter M2 — partial-failure summary line was
indistinguishable from the per-item `warn:` lines that precede it. A
1000-item batch with 50 failures buried the single summary at the end
of 50 noise lines.
- Prefix with `SUMMARY:` so the line stands out. Trivial.
- Existing partial-failure asserts updated accordingly.

test-analyzer gap #1 (rating 7) — KeyboardInterrupt propagation was
implicitly correct (KeyboardInterrupt is a BaseException, not Exception,
so the narrow tuple naturally excludes it) but no regression test
pinned it. A future refactor that widens the catch to BaseException
would silently break Ctrl-C.
- New `test_*_propagates_keyboard_interrupt` for both modules.

test-analyzer improvement #2 (rating 5) — no negative assert that the
all-failed branch does NOT print the summary line (it raises first; a
future refactor that moves the print above the raise would slip).
- New `test_*_emits_no_summary_on_total_failure` for both modules.

test-analyzer improvement #3 (rating 4) — no negative assert that the
all-succeed path stays silent on stderr.
- New `test_*_emits_no_summary_when_all_succeed` for both modules.

Skipped: silent-failure M1 (`except ImportError` opacity for a
misinstalled anthropic SDK). The reviewer themselves noted it is
"actually safe (loud failure)" since downstream code would propagate
the SDK error as a bug rather than catch it silently. Borderline,
non-blocker.

Total: 358 tests (up from 352), coverage 89%, `uv run poe check`
all-green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VGonPa added a commit that referenced this pull request May 24, 2026
- Strip `Phase A`, `Phase B`, `(#33)`, `pre-#33`, `pre-Phase-A` markers
  from new/modified code, docstrings, and tests. Also strip `(#17)`,
  `(#19)`, `(#20)`, `(#24)` references added in this PR's diff. The PR
  description carries the issue link; code should describe lasting
  invariants, not the PR that introduced them.
- Rewrite `_TRANSIENT_MEDIA_FAILURES` cross-reference comment: was
  `_TRANSIENT_FAILURES` (the actual symbol in fetch.py).
- Reword `Failed(transient)` "terminal-ish" phrasing in ARCHITECTURE.md
  `### media` — `Failed(transient)` IS auto-retried; not terminal.
- Reconcile snapshot/media docs: snapshots cover only the four JSON
  artifacts; photo bytes under `data/media/` are NOT snapshotted today.
  Updated `config.py:media_dir` docstring and ARCHITECTURE.md to make
  the carve-out explicit; re-downloading is the recovery path.
- Trim ARCHITECTURE.md invariant #10 to match the brief style of
  invariants #1-#9. Retry contract and storage layout moved to the
  `### media` section above (now with storage layout subsection).
- Trim the `media` row in README.md Commands table to one sentence with
  a link to `Local media storage`. Mark the disk-budget numbers as
  approximate.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant