Conversation
…op note_worthiness
…r validation, robust JSON, dup topics Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ypes, dedup cleanups Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ient Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WS2 Fase 1: enrichment core pipeline
…ge) at ratchet baselines
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Narrow string params to their Literal types instead of suppressing: - Add SourceName Literal alias to models; propagate it through parse_tweets -> collect_new_items -> extract_source -> cli source sets. - Annotate Media `kind` as Literal["photo","video"]. - enrich.py: guard `topics` with isinstance(list) (validate_judgment already proves it) and cast the runtime-validated executor name to ExecutorName. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nce, URL routing) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… complexity, dedup) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WS2 Fase 2 — topics stage + fetch hardening
Safari session importer
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, strict keys) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vocab: worksheet executor track (no API key needed)
docs: comprehensive README
VGonPa
added a commit
that referenced
this pull request
May 21, 2026
…ening Addresses every HIGH/CRITICAL finding from the review pipeline on PR #22 (code-reviewer, silent-failure-hunter, pr-test-analyzer, python-code-reviewer, code-simplifier, spec-compliance): snapshot.py: - snapshot_create returns (Path, SnapshotManifest) — callers (incl. _auto_snapshot) now print the item count from the manifest just written, matching PRD §5 observability ("Snapshot created: <path> (N items)"). - New `dir_label` parameter separates directory naming from manifest.command: manifest now records "vocab-regenerate" (the op name) while the directory carries the `pre-` prefix. Fixes the dual-purpose smell flagged by code-reviewer. - snapshot_pre removed — inlined in _auto_snapshot (code-simplifier). - Timestamp gains millisecond precision (`%Y-%m-%dT%H-%M-%S-NNNZ`). Eliminates the same-second collision bug flagged by pr-test-analyzer #9 and python-code-reviewer #2. As a side effect, the suite no longer needs `time.sleep(1.1)` between snapshots — total test runtime dropped from 7s to <1s. - snapshot_restore now uses `shutil.copy2` symmetrically with snapshot_create instead of the text round-trip via `_atomic_write`. Binary-safe, metadata- preserving, and no longer asymmetric (code-reviewer/python-code-reviewer both flagged this as the must-fix-before-merge issue). - snapshot_restore returns a list of (artifact, action) tuples — RESTORE_COPIED, RESTORE_DELETED, RESTORE_SKIPPED. The CLI prints every action, so a deletion from a "missing in snapshot" artifact is never silent (silent-failure-hunter #1 HIGH). - snapshot_list now returns rows with `manifest=None` for corrupt directories instead of silently dropping them; the CLI marks those as CORRUPT on stderr (silent-failure-hunter #2 HIGH). - _count_* helpers now propagate exceptions instead of swallowing them — a corrupt items.json aborts the snapshot, not records a lying count=0 (silent-failure-hunter #3, code-reviewer #3). Inlined the trivial _count_items/_count_topics wrappers (code-simplifier #1). - All imports (json, yaml, importlib.metadata) at the module top (code-reviewer #3). - _count_items/_count_topics removed (one-line wrappers, code-simplifier #1). cli.py: - _OPERATOR_ERRORS now includes OSError (covers PermissionError, FileExistsError, IsADirectoryError) so snapshot I/O failures surface as clean exit-1 instead of raw tracebacks (silent-failure-hunter #4). - _auto_snapshot now reads the count from the manifest and emits the spec- mandated English message: `Snapshot created: <dir> (N items)` (pr-test-analyzer #10, python-code-reviewer #3). - snapshot_restore_cmd echoes every per-artifact action. - snapshot_list_cmd handles `manifest=None` rows as CORRUPT (to stderr). - snapshot_create_cmd uses the new (path, manifest) return shape and passes `command="manual"` + `dir_label=name`. - Strings translated to English (the whole new subcommand group; the rest of the CLI stays Spanish — out of scope here). Tests: - test_snapshot.py: 21 unit tests (up from 14). New: round-trip across ALL FOUR artifacts (pr-test-analyzer #1 CRITICAL), millisecond-collision regression, corrupt-JSON-aborts-snapshot, dir_label separation from command, shutil.copy2-preserves-bytes (binary-safety smoke test), per-artifact action codes, xbrain_version assertion (pr-test-analyzer #5), prune-with-fewer-than- keep_last (pr-test-analyzer #6). - test_snapshot_auto.py: 16 integration tests (up from 10). New: snapshot- taken-before-mutation-when-op-fails (pr-test-analyzer #2 CRITICAL — uses monkeypatch to force `_mark_for_regenerate` to raise, asserts snapshot already on disk + items.json unchanged), snapshot-failure-aborts-destructive-op (pr-test-analyzer #3 CRITICAL — monkeypatch snapshot_create to raise OSError, assert fetch --force aborts and nothing is mutated), snapshot show CLI (pr-test-analyzer #7), restore-via-CLI-with-missing-artifact (pr-test-analyzer #8), corrupt-dirs-marked-via-CLI, stdout-includes-item-count (pr-test-analyzer #10). - All 258 tests pass; coverage 87%. CONTRIBUTING.md: - Added a "Safety: destructive operations auto-snapshot" section (spec-compliance #FAIL — closes the doc gap). Deviation log unchanged: 3 destructive sites (`vocab --regenerate`, `topics --resynth`, `fetch --force`) — `enrich --regenerate` does not exist as a CLI flag, re-enrichment happens via `vocab --regenerate` which is already covered. Spec-compliance reviewer confirmed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VGonPa
added a commit
that referenced
this pull request
May 24, 2026
- Strip `Phase A`, `Phase B`, `(#33)`, `pre-#33`, `pre-Phase-A` markers from new/modified code, docstrings, and tests. Also strip `(#17)`, `(#19)`, `(#20)`, `(#24)` references added in this PR's diff. The PR description carries the issue link; code should describe lasting invariants, not the PR that introduced them. - Rewrite `_TRANSIENT_MEDIA_FAILURES` cross-reference comment: was `_TRANSIENT_FAILURES` (the actual symbol in fetch.py). - Reword `Failed(transient)` "terminal-ish" phrasing in ARCHITECTURE.md `### media` — `Failed(transient)` IS auto-retried; not terminal. - Reconcile snapshot/media docs: snapshots cover only the four JSON artifacts; photo bytes under `data/media/` are NOT snapshotted today. Updated `config.py:media_dir` docstring and ARCHITECTURE.md to make the carve-out explicit; re-downloading is the recovery path. - Trim ARCHITECTURE.md invariant #10 to match the brief style of invariants #1-#9. Retry contract and storage layout moved to the `### media` section above (now with storage layout subsection). - Trim the `media` row in README.md Commands table to one sentence with a link to `Local media storage`. Mark the disk-budget numbers as approximate.
20 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Promotes everything built since the OSS-prep from
developtomain.What lands on
mainpoe check/scripts/check.sh, 10 checks, CI on every PR.vocabstage,apiexecutor, worksheet hand-off, declarative rubrics + guardrails, mechanical validator,xbrain enrich.xbrain topics: post lists + synthesized overviews + topic pages) and fetch hardening (structured failure evidence, optional Firecrawl fallback, real X-article fetch viafetch_x.py).vocabworksheet track (PR vocab: worksheet executor track (no API key needed) #5) —vocabworks with no API key, likeenrich/topics.scripts/import_safari_session.py.Every change was reviewed in its own PR into
developand CI-green. This is the integration PR — CI runs the full quality gate again here.🤖 Generated with Claude Code