Promote to main: WS2 Fase 1+2, quality gate, vocab worksheet, Safari auth, README by VGonPa · Pull Request #9 · VGonPa/xbrain

VGonPa · 2026-05-19T16:13:26Z

Promotes everything built since the OSS-prep from develop to main.

What lands on `main`

- Quality gate (PR #2): `poe check` / `scripts/check.sh`, 10 checks, CI on every PR.
- WS2 Fase 1 (PR #1): the enrichment core, i.e. vocab stage, api executor, worksheet hand-off, declarative rubrics + guardrails, mechanical validator, `xbrain enrich`.
- WS2 Fase 2 (PR #3): topics stage (`xbrain topics`: post lists + synthesized overviews + topic pages) and fetch hardening (structured failure evidence, optional Firecrawl fallback, real X-article fetch via `fetch_x.py`).
- vocab worksheet track (PR #5): `vocab` works with no API key, like `enrich`/`topics`.
- Safari session importer (PR #4): `scripts/import_safari_session.py`.
- Comprehensive README (PR #6): layers, pipeline, commands, execution modes, diagrams.

Every change was reviewed in its own PR into develop and CI-green. This is the integration PR — CI runs the full quality gate again here.
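As a rough illustration of what a multi-check gate such as `scripts/check.sh` does, the sketch below runs a list of commands, keeps going after a failure so that every problem is reported in one pass, and yields a non-zero exit code if anything failed. It is written in Python rather than shell, and the check labels and commands are placeholders (assumptions), not the project's actual 10 checks.

```python
import subprocess
import sys

# Placeholder checks: (label, argv). The real gate's commands are not shown
# in this PR, so these only simulate one passing and one failing check.
CHECKS = [
    ("always-passes", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("always-fails", [sys.executable, "-c", "raise SystemExit(1)"]),
]


def run_gate(checks):
    """Run every check, even after a failure, and return the failed labels."""
    failed = []
    for label, argv in checks:
        result = subprocess.run(argv, capture_output=True, text=True)
        status = "ok" if result.returncode == 0 else "FAIL"
        print(f"[{status}] {label}")
        if result.returncode != 0:
            failed.append(label)
    return failed


def main() -> int:
    """Exit code for CI: non-zero if any check failed."""
    return 1 if run_gate(CHECKS) else 0
```

A caller would typically end with `sys.exit(main())`, so that a single failing check is enough to mark the CI run as failed.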

🤖 Generated with Claude Code

Narrow string params to their Literal types instead of suppressing:
- Add SourceName Literal alias to models; propagate it through parse_tweets -> collect_new_items -> extract_source -> cli source sets.
- Annotate Media `kind` as Literal["photo","video"].
- enrich.py: guard `topics` with isinstance(list) (validate_judgment already proves it) and cast the runtime-validated executor name to ExecutorName.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
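The narrowing pattern described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: the members of `SourceName` and `ExecutorName` are assumptions (the commit names the aliases but not their values), and `parse_executor`, `source_label`, and `topic_list` are hypothetical helper names. Only the `"photo"`/`"video"` pair comes from the commit message.

```python
from typing import Literal, cast, get_args

# Hypothetical members: the commit names these aliases but not their values.
SourceName = Literal["bookmarks", "likes"]
ExecutorName = Literal["api", "worksheet"]
# These two values are the ones stated in the commit message.
MediaKind = Literal["photo", "video"]


def parse_executor(raw: str) -> ExecutorName:
    """Validate at runtime, then cast so the type checker sees the narrow type."""
    if raw not in get_args(ExecutorName):
        raise ValueError(f"unknown executor: {raw!r}")
    return cast(ExecutorName, raw)


def source_label(source: SourceName) -> str:
    """Accepting the alias (not str) propagates the constraint to callers."""
    return f"source:{source}"


def topic_list(judgment: dict) -> list:
    """Guard `topics` with isinstance rather than a type-ignore comment."""
    topics = judgment.get("topics")
    return topics if isinstance(topics, list) else []
```

The `cast` is only sound because the membership check runs first; without that check it would hide the same problem that a suppression comment hides.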

…ening

Addresses every HIGH/CRITICAL finding from the review pipeline on PR #22 (code-reviewer, silent-failure-hunter, pr-test-analyzer, python-code-reviewer, code-simplifier, spec-compliance):

snapshot.py:
- snapshot_create returns (Path, SnapshotManifest); callers (incl. _auto_snapshot) now print the item count from the manifest just written, matching PRD §5 observability ("Snapshot created: <path> (N items)").
- New `dir_label` parameter separates directory naming from manifest.command: the manifest now records "vocab-regenerate" (the op name) while the directory carries the `pre-` prefix. Fixes the dual-purpose smell flagged by code-reviewer.
- snapshot_pre removed; inlined in _auto_snapshot (code-simplifier).
- Timestamp gains millisecond precision (`%Y-%m-%dT%H-%M-%S-NNNZ`). Eliminates the same-second collision bug flagged by pr-test-analyzer #9 and python-code-reviewer #2. As a side effect, the suite no longer needs `time.sleep(1.1)` between snapshots; total test runtime dropped from 7s to <1s.
- snapshot_restore now uses `shutil.copy2` symmetrically with snapshot_create instead of the text round-trip via `_atomic_write`. Binary-safe, metadata-preserving, and no longer asymmetric (code-reviewer/python-code-reviewer both flagged this as the must-fix-before-merge issue).
- snapshot_restore returns a list of (artifact, action) tuples: RESTORE_COPIED, RESTORE_DELETED, RESTORE_SKIPPED. The CLI prints every action, so a deletion from a "missing in snapshot" artifact is never silent (silent-failure-hunter #1 HIGH).
- snapshot_list now returns rows with `manifest=None` for corrupt directories instead of silently dropping them; the CLI marks those as CORRUPT on stderr (silent-failure-hunter #2 HIGH).
- _count_* helpers now propagate exceptions instead of swallowing them: a corrupt items.json aborts the snapshot instead of recording a lying count=0 (silent-failure-hunter #3, code-reviewer #3). Inlined the trivial _count_items/_count_topics wrappers (code-simplifier #1).
- All imports (json, yaml, importlib.metadata) at the module top (code-reviewer #3).
- _count_items/_count_topics removed (one-line wrappers, code-simplifier #1).

cli.py:
- _OPERATOR_ERRORS now includes OSError (covers PermissionError, FileExistsError, IsADirectoryError) so snapshot I/O failures surface as clean exit-1 instead of raw tracebacks (silent-failure-hunter #4).
- _auto_snapshot now reads the count from the manifest and emits the spec-mandated English message: `Snapshot created: <dir> (N items)` (pr-test-analyzer #10, python-code-reviewer #3).
- snapshot_restore_cmd echoes every per-artifact action.
- snapshot_list_cmd handles `manifest=None` rows as CORRUPT (to stderr).
- snapshot_create_cmd uses the new (path, manifest) return shape and passes `command="manual"` + `dir_label=name`.
- Strings translated to English (the whole new subcommand group; the rest of the CLI stays Spanish, which is out of scope here).

Tests:
- test_snapshot.py: 21 unit tests (up from 14). New: round-trip across ALL FOUR artifacts (pr-test-analyzer #1 CRITICAL), millisecond-collision regression, corrupt-JSON-aborts-snapshot, dir_label separation from command, shutil.copy2-preserves-bytes (binary-safety smoke test), per-artifact action codes, xbrain_version assertion (pr-test-analyzer #5), prune-with-fewer-than-keep_last (pr-test-analyzer #6).
- test_snapshot_auto.py: 16 integration tests (up from 10). New: snapshot-taken-before-mutation-when-op-fails (pr-test-analyzer #2 CRITICAL: uses monkeypatch to force `_mark_for_regenerate` to raise, asserts snapshot already on disk + items.json unchanged), snapshot-failure-aborts-destructive-op (pr-test-analyzer #3 CRITICAL: monkeypatch snapshot_create to raise OSError, assert fetch --force aborts and nothing is mutated), snapshot show CLI (pr-test-analyzer #7), restore-via-CLI-with-missing-artifact (pr-test-analyzer #8), corrupt-dirs-marked-via-CLI, stdout-includes-item-count (pr-test-analyzer #10).
- All 258 tests pass; coverage 87%.

CONTRIBUTING.md:
- Added a "Safety: destructive operations auto-snapshot" section (spec-compliance #FAIL; closes the doc gap).

Deviation log unchanged: 3 destructive sites (`vocab --regenerate`, `topics --resynth`, `fetch --force`). `enrich --regenerate` does not exist as a CLI flag; re-enrichment happens via `vocab --regenerate`, which is already covered. Spec-compliance reviewer confirmed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
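Two of the snapshot.py changes above, the millisecond timestamp and the restore that reports a per-artifact action, can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the function names `snapshot_stamp` and `restore`, the string values of the `RESTORE_*` constants, and the exact rule for `RESTORE_SKIPPED` (artifact absent on both sides) are all hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path

# Constant names come from the commit message; their values are assumed.
RESTORE_COPIED = "copied"
RESTORE_DELETED = "deleted"
RESTORE_SKIPPED = "skipped"


def snapshot_stamp(now: datetime) -> str:
    """Millisecond-precision stamp in the shape %Y-%m-%dT%H-%M-%S-NNNZ."""
    millis = now.microsecond // 1000
    return f"{now.strftime('%Y-%m-%dT%H-%M-%S')}-{millis:03d}Z"


def restore(snapshot_dir: Path, data_dir: Path, artifacts: list[str]) -> list[tuple[str, str]]:
    """Copy each artifact back byte-for-byte and report what happened to it."""
    actions = []
    for name in artifacts:
        saved, live = snapshot_dir / name, data_dir / name
        if saved.exists():
            # copy2 is binary-safe and preserves file metadata.
            shutil.copy2(saved, live)
            actions.append((name, RESTORE_COPIED))
        elif live.exists():
            # Missing in the snapshot: remove the live file, but report it.
            live.unlink()
            actions.append((name, RESTORE_DELETED))
        else:
            # Assumed meaning of "skipped": nothing on either side.
            actions.append((name, RESTORE_SKIPPED))
    return actions
```

Returning the action list instead of printing inside `restore` leaves the reporting to the caller, which fits the commit's point that the CLI echoes every per-artifact action.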

- Strip `Phase A`, `Phase B`, `(#33)`, `pre-#33`, `pre-Phase-A` markers from new/modified code, docstrings, and tests. Also strip `(#17)`, `(#19)`, `(#20)`, `(#24)` references added in this PR's diff. The PR description carries the issue link; code should describe lasting invariants, not the PR that introduced them.
- Rewrite `_TRANSIENT_MEDIA_FAILURES` cross-reference comment: was `_TRANSIENT_FAILURES` (the actual symbol in fetch.py).
- Reword `Failed(transient)` "terminal-ish" phrasing in ARCHITECTURE.md `### media`: `Failed(transient)` IS auto-retried; not terminal.
- Reconcile snapshot/media docs: snapshots cover only the four JSON artifacts; photo bytes under `data/media/` are NOT snapshotted today. Updated `config.py:media_dir` docstring and ARCHITECTURE.md to make the carve-out explicit; re-downloading is the recovery path.
- Trim ARCHITECTURE.md invariant #10 to match the brief style of invariants #1-#9. Retry contract and storage layout moved to the `### media` section above (now with storage layout subsection).
- Trim the `media` row in README.md Commands table to one sentence with a link to `Local media storage`. Mark the disk-budget numbers as approximate.

VGonPa and others added 30 commits May 18, 2026 08:51

[WS2-F1] chore: add anthropic SDK and pyyaml dependencies

22472e5

[WS2-F1] feat(models): primary_topic, Topic, Item.bookmark_folder; drop note_worthiness

2f036a4

[WS2-F1] feat(config): [enrich] and [vocab] pipeline settings

78b8197

[WS2-F1] feat(rubrics): summary/topics/vocab rubrics (anti-misc rule)

90ca758

[WS2-F1] feat(guardrails): mechanical enrichment constraints

364f9bd

[WS2-F1] feat(rubrics): loader for rubrics, guardrails and vocab

1d12d44

[WS2-F1] feat(validate): mechanical guardrail validator

88bc6ac

[WS2-F1] feat(executors): executor Protocol and judgment type

e059f52

[WS2-F1] feat(executors): Anthropic API executor (links + folder in prompt)

d1dfbe7

[WS2-F1] fix(executors): defensive response parsing + error-path tests

be10fe6

[WS2-F1] feat(worksheet): manual/claude-code handoff export+import

097d806

[WS2-F1] feat(vocab): map-reduce taxonomy induction

b199dc6

[WS2-F1] refactor: shared llm_json.extract_json; harden vocab response parsing

60439f1

[WS2-F1] feat(cli): xbrain vocab command

1b8389f

[WS2-F1] feat(enrich): API-executor track + worksheet-apply track

851a6d2

[WS2-F1] feat(cli): real enrich command (api track + --apply worksheet)

bb8c73b

[WS2-F1] fix(enrich): null-safe worksheet topics + help-text markup

3f45af3

[WS2-F1] feat(generate): topics + bookmark folder as Obsidian tags

6d9505c

[WS2-F1] feat(skill): enriching-x-knowledge drives the subscription track

66b9493

[WS2-F1] fix: package rubric/guardrails artifacts; worksheet executor provenance

cc77bb1

[WS2-F1] fix(review): harden LLM boundary — vocab map errors, executor validation, robust JSON, dup topics

5a603db

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

[WS2-F1] fix(review): shared ExecutorName, stricter judgment/config types, dedup cleanups

a3ce8cf

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

[WS2-F1] test(review): cover executor/CLI error paths, shared fake client

7b2d448

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge pull request #1 from VGonPa/ws2-fase1-enrichment-core

50b941e

WS2 Fase 1: enrichment core pipeline

[quality-gate] chore: add quality-tool dev dependencies

08b7852

[quality-gate] chore: tool configs (mypy, interrogate, bandit, coverage) at ratchet baselines

255869d

[quality-gate] feat: scripts/check.sh quality gate (10 checks)

c0247dd

[quality-gate] feat: poe task table

a16c560

[quality-gate] style: apply ruff format

4e56724

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

VGonPa and others added 26 commits May 18, 2026 22:06

feat: categorize fetch failures with structured FetchResult evidence

83c524b

feat: add optional Firecrawl fallback for JS-rendered pages

270fcbc

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat: fetch X tweets/threads and X articles instead of deferring them

fb7060a

feat: wire X-article fetch into the CLI and render broken-link evidence

15712bb

feat: add TopicPage model and data/topics.json store

c9f8cc8

feat: compute mechanical primary / also-relevant topic post lists

6973159

feat: add topic-overview rubric, guardrails and validator

e70a070

feat: add OverviewJudgment and the api topic-synthesis track

998c82a

test: cover the topic-overview worksheet track

b4412c0

refactor: extract shared markdown helpers into notes_io

2b84937

feat: render topic pages with derived staleness

9980a17

feat: add the xbrain topics command and [topics] config

cd4f918

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs: document the topic-overview worksheet flow in the skill

9c9b887

fix: address fetch-domain review findings (work-loss, Firecrawl evidence, URL routing)

2c42a8a

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix: address topics/validation review findings (validator strictness, complexity, dedup)

84653d3

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test: regression test for _run_fetch persisting partial work on a stage error

992423f

Merge pull request #3 from VGonPa/ws2-fase2-topics-fetch

0bafaf9

WS2 Fase 2 — topics stage + fetch hardening

feat: add Safari session importer (scripts/import_safari_session.py)

dbcfc46

Merge pull request #4 from VGonPa/safari-session-import

0962c22

Safari session importer

feat: add vocab worksheet export/import/apply functions

b633e56

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat: vocab command gains worksheet executor track (--executor / --apply)

21d144e

docs: document the vocab worksheet flow in the skill

9917d44

fix: address PR #5 review findings (worksheet root guard, write order, strict keys)

2c64667

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge pull request #5 from VGonPa/vocab-worksheet-track

d93159f

vocab: worksheet executor track (no API key needed)

docs: comprehensive README — layers, pipeline, commands, execution modes, diagrams

a5b4970

Merge pull request #6 from VGonPa/readme-overhaul

7e33a0e

docs: comprehensive README

VGonPa merged commit d47547b into main May 19, 2026
1 check passed

VGonPa mentioned this pull request May 24, 2026

[#33] Phase A: download X-post photos and render in notes #35

Merged

20 tasks

VGonPa merged 64 commits into main from develop
