Promote to main: WS2 Fase 1+2, quality gate, vocab worksheet, Safari auth, README #9

Merged
VGonPa merged 64 commits into main from develop
May 19, 2026
VGonPa commented May 19, 2026

Promotes everything built since the OSS-prep from develop to main.

What lands on main

Every change was reviewed in its own PR into develop and CI-green. This is the integration PR — CI runs the full quality gate again here.

🤖 Generated with Claude Code

VGonPa and others added 30 commits May 18, 2026 08:51
…r validation, robust JSON, dup topics

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ypes, dedup cleanups

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ient

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WS2 Fase 1: enrichment core pipeline
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Narrow string params to their Literal types instead of suppressing:
- Add SourceName Literal alias to models; propagate it through
  parse_tweets -> collect_new_items -> extract_source -> cli source sets.
- Annotate Media `kind` as Literal["photo","video"].
- enrich.py: guard `topics` with isinstance(list) (validate_judgment
  already proves it) and cast the runtime-validated executor name to
  ExecutorName.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
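
The narrowing described in this commit can be illustrated with a minimal sketch. It is not the project's code: the members of `SourceName` and `ExecutorName` are not listed in the commit, so the values below are placeholders, and `parse_executor` and `normalize_topics` are hypothetical helper names. Only the `Literal["photo", "video"]` annotation on `kind` comes from the commit message.

```python
from dataclasses import dataclass
from typing import Literal, cast, get_args

# Placeholder members: the commit does not list the real values.
SourceName = Literal["bookmarks", "likes"]
ExecutorName = Literal["api", "worksheet"]


@dataclass
class Media:
    url: str
    kind: Literal["photo", "video"]  # the two kinds named in the commit message


def parse_executor(raw: str) -> ExecutorName:
    """Validate at runtime, then cast so the type checker sees the narrow type."""
    if raw not in get_args(ExecutorName):
        raise ValueError(f"unknown executor: {raw!r}")
    return cast(ExecutorName, raw)


def normalize_topics(judgment: dict[str, object]) -> list[str]:
    """Guard `topics` with isinstance instead of suppressing the type error."""
    topics = judgment.get("topics")
    if not isinstance(topics, list):
        raise TypeError("topics must be a list")
    return [str(topic) for topic in topics]
```

The `cast` is only sound because the membership check runs first; without that check it would hide the same problem a suppression comment would.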
VGonPa and others added 26 commits May 18, 2026 22:06
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nce, URL routing)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… complexity, dedup)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WS2 Fase 2 — topics stage + fetch hardening
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…, strict keys)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
vocab: worksheet executor track (no API key needed)
@VGonPa VGonPa merged commit d47547b into main May 19, 2026
1 check passed
VGonPa added a commit that referenced this pull request May 21, 2026
…ening

Addresses every HIGH/CRITICAL finding from the review pipeline on PR #22
(code-reviewer, silent-failure-hunter, pr-test-analyzer, python-code-reviewer,
code-simplifier, spec-compliance):

snapshot.py:
- snapshot_create returns (Path, SnapshotManifest) — callers (incl. _auto_snapshot)
  now print the item count from the manifest just written, matching PRD §5
  observability ("Snapshot created: <path> (N items)").
- New `dir_label` parameter separates directory naming from manifest.command:
  manifest now records "vocab-regenerate" (the op name) while the directory
  carries the `pre-` prefix. Fixes the dual-purpose smell flagged by code-reviewer.
- snapshot_pre removed — inlined in _auto_snapshot (code-simplifier).
- Timestamp gains millisecond precision (`%Y-%m-%dT%H-%M-%S-NNNZ`). Eliminates
  the same-second collision bug flagged by pr-test-analyzer #9 and
  python-code-reviewer #2. As a side effect, the suite no longer needs
  `time.sleep(1.1)` between snapshots — total test runtime dropped from 7s to <1s.
- snapshot_restore now uses `shutil.copy2` symmetrically with snapshot_create
  instead of the text round-trip via `_atomic_write`. Binary-safe, metadata-
  preserving, and no longer asymmetric (code-reviewer/python-code-reviewer
  both flagged this as the must-fix-before-merge issue).
- snapshot_restore returns a list of (artifact, action) tuples — RESTORE_COPIED,
  RESTORE_DELETED, RESTORE_SKIPPED. The CLI prints every action, so a deletion
  from a "missing in snapshot" artifact is never silent (silent-failure-hunter
  #1 HIGH).
- snapshot_list now returns rows with `manifest=None` for corrupt directories
  instead of silently dropping them; the CLI marks those as CORRUPT on stderr
  (silent-failure-hunter #2 HIGH).
- _count_* helpers now propagate exceptions instead of swallowing them: a
  corrupt items.json aborts the snapshot rather than recording a misleading
  count=0 (silent-failure-hunter #3, code-reviewer #3).
- All imports (json, yaml, importlib.metadata) moved to the module top
  (code-reviewer #3).
- _count_items/_count_topics removed: they were one-line wrappers and are now
  inlined (code-simplifier #1).
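
Two of the snapshot.py changes above, the millisecond timestamp and the per-artifact restore report, can be sketched as follows. This is an illustration under assumptions, not the actual module: the function names `snapshot_timestamp` and `restore`, the string values of the three action constants, and the meaning given to `RESTORE_SKIPPED` (artifact absent on both sides) are guesses. The commit message only names the constants and the `%Y-%m-%dT%H-%M-%S-NNNZ` shape.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Constant names come from the commit message; the string values are assumed.
RESTORE_COPIED = "copied"
RESTORE_DELETED = "deleted"
RESTORE_SKIPPED = "skipped"


def snapshot_timestamp(now: Optional[datetime] = None) -> str:
    """Format a UTC timestamp with millisecond precision (…-SS-NNNZ)."""
    now = now or datetime.now(timezone.utc)
    millis = now.microsecond // 1000
    return f"{now.strftime('%Y-%m-%dT%H-%M-%S')}-{millis:03d}Z"


def restore(
    snapshot_dir: Path, data_dir: Path, artifacts: list[str]
) -> list[tuple[str, str]]:
    """Copy each artifact back with shutil.copy2 and report every action."""
    actions: list[tuple[str, str]] = []
    for name in artifacts:
        src, dst = snapshot_dir / name, data_dir / name
        if src.exists():
            shutil.copy2(src, dst)  # binary-safe copy that also keeps file metadata
            actions.append((name, RESTORE_COPIED))
        elif dst.exists():
            dst.unlink()  # missing from the snapshot: delete, but report it
            actions.append((name, RESTORE_DELETED))
        else:
            actions.append((name, RESTORE_SKIPPED))  # assumed meaning: nothing to do
    return actions
```

Returning the action list rather than printing inside `restore` leaves the reporting to the caller, which is what lets a CLI echo each deletion explicitly.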

cli.py:
- _OPERATOR_ERRORS now includes OSError (covers PermissionError, FileExistsError,
  IsADirectoryError) so snapshot I/O failures surface as clean exit-1 instead
  of raw tracebacks (silent-failure-hunter #4).
- _auto_snapshot now reads the count from the manifest and emits the spec-
  mandated English message: `Snapshot created: <dir> (N items)` (pr-test-analyzer
  #10, python-code-reviewer #3).
- snapshot_restore_cmd echoes every per-artifact action.
- snapshot_list_cmd handles `manifest=None` rows as CORRUPT (to stderr).
- snapshot_create_cmd uses the new (path, manifest) return shape and
  passes `command="manual"` + `dir_label=name`.
- Strings translated to English (the whole new subcommand group; the rest of
  the CLI stays Spanish — out of scope here).
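
The effect of adding OSError to `_OPERATOR_ERRORS` can be sketched like this. The tuple shape and the `run_command` wrapper are assumptions made for illustration; the commit only states that OSError is now included. `ValueError` stands in for whatever else the real tuple holds.

```python
import sys
from typing import Callable

# Assumed shape: a tuple of exception types treated as operator mistakes.
# Only the OSError entry is confirmed by the commit; ValueError is a placeholder.
_OPERATOR_ERRORS: tuple[type[BaseException], ...] = (ValueError, OSError)


def run_command(command: Callable[[], None]) -> int:
    """Run a CLI command; report operator errors on stderr and return exit code 1."""
    try:
        command()
    except _OPERATOR_ERRORS as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    return 0
```

Because PermissionError, FileExistsError, and IsADirectoryError are all subclasses of OSError, the single entry covers them, while unrelated exceptions such as KeyError still propagate as tracebacks.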

Tests:
- test_snapshot.py: 21 unit tests (up from 14). New: round-trip across ALL FOUR
  artifacts (pr-test-analyzer #1 CRITICAL), millisecond-collision regression,
  corrupt-JSON-aborts-snapshot, dir_label separation from command,
  shutil.copy2-preserves-bytes (binary-safety smoke test), per-artifact action
  codes, xbrain_version assertion (pr-test-analyzer #5), prune-with-fewer-than-
  keep_last (pr-test-analyzer #6).
- test_snapshot_auto.py: 16 integration tests (up from 10). New: snapshot-
  taken-before-mutation-when-op-fails (pr-test-analyzer #2 CRITICAL — uses
  monkeypatch to force `_mark_for_regenerate` to raise, asserts snapshot
  already on disk + items.json unchanged), snapshot-failure-aborts-destructive-op
  (pr-test-analyzer #3 CRITICAL — monkeypatch snapshot_create to raise OSError,
  assert fetch --force aborts and nothing is mutated), snapshot show CLI
  (pr-test-analyzer #7), restore-via-CLI-with-missing-artifact
  (pr-test-analyzer #8), corrupt-dirs-marked-via-CLI, stdout-includes-item-count
  (pr-test-analyzer #10).
- All 258 tests pass; coverage 87%.
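
The two CRITICAL ordering tests listed above check one invariant from both sides: the snapshot is taken before the mutation, and a failed snapshot stops the mutation. A self-contained sketch of that pattern is shown below. It uses plain fakes instead of pytest's monkeypatch, and `run_destructive` is a hypothetical stand-in for the real destructive commands.

```python
from typing import Callable


def run_destructive(
    take_snapshot: Callable[[], None], mutate: Callable[[], None]
) -> None:
    """Snapshot first; if the snapshot raises, the mutation never runs."""
    take_snapshot()
    mutate()


def test_snapshot_failure_aborts_mutation() -> None:
    state = {"items": ["a", "b"]}

    def failing_snapshot() -> None:
        raise OSError("disk full")

    try:
        run_destructive(failing_snapshot, state["items"].clear)
    except OSError:
        pass
    else:
        raise AssertionError("expected the snapshot failure to propagate")
    assert state["items"] == ["a", "b"]  # nothing was mutated


def test_snapshot_exists_when_mutation_fails() -> None:
    snapshots: list[str] = []

    def failing_mutation() -> None:
        raise RuntimeError("op failed")

    try:
        run_destructive(lambda: snapshots.append("pre-op"), failing_mutation)
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected the mutation failure to propagate")
    assert snapshots == ["pre-op"]  # the snapshot was taken before the failure
```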

CONTRIBUTING.md:
- Added a "Safety: destructive operations auto-snapshot" section
  (spec-compliance #FAIL — closes the doc gap).

Deviation log unchanged: 3 destructive sites (`vocab --regenerate`,
`topics --resynth`, `fetch --force`). `enrich --regenerate` does not exist
as a CLI flag; re-enrichment happens via `vocab --regenerate`, which is
already covered. Spec-compliance reviewer confirmed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
VGonPa added a commit that referenced this pull request May 24, 2026
- Strip `Phase A`, `Phase B`, `(#33)`, `pre-#33`, `pre-Phase-A` markers
  from new/modified code, docstrings, and tests. Also strip `(#17)`,
  `(#19)`, `(#20)`, `(#24)` references added in this PR's diff. The PR
  description carries the issue link; code should describe lasting
  invariants, not the PR that introduced them.
- Rewrite `_TRANSIENT_MEDIA_FAILURES` cross-reference comment: was
  `_TRANSIENT_FAILURES` (the actual symbol in fetch.py).
- Reword `Failed(transient)` "terminal-ish" phrasing in ARCHITECTURE.md
  `### media` — `Failed(transient)` IS auto-retried; not terminal.
- Reconcile snapshot/media docs: snapshots cover only the four JSON
  artifacts; photo bytes under `data/media/` are NOT snapshotted today.
  Updated `config.py:media_dir` docstring and ARCHITECTURE.md to make
  the carve-out explicit; re-downloading is the recovery path.
- Trim ARCHITECTURE.md invariant #10 to match the brief style of
  invariants #1-#9. Retry contract and storage layout moved to the
  `### media` section above (now with storage layout subsection).
- Trim the `media` row in README.md Commands table to one sentence with
  a link to `Local media storage`. Mark the disk-budget numbers as
  approximate.