[#19] Retry transiently-failed fetches without --force by VGonPa · Pull Request #26 · VGonPa/xbrain

VGonPa · 2026-05-22T16:43:31Z

Closes #19.

Summary

xbrain fetch now auto-retries items whose only recorded failures were transient (timeout, dns_error). Items with success or with terminal failures (not_found, paywall, forbidden, js_required, empty_content) stay skipped until --force is set. The --force semantics are unchanged.

What ships

_TRANSIENT_FAILURES: frozenset[FailureReason] = frozenset({"timeout", "dns_error"})
_should_refetch(content, force) -> bool — pure helper, truth-table-driven.
One-line wiring in fetch_pending replacing the old if item.content is not None and not force: continue skip.
README + ARCHITECTURE updated.

Decisions (in the PRD)

dns_error = transient. Cost of being wrong = one extra HTTP probe per run per truly-dead domain. Cheap.
empty_content = terminal. Trafilatura returned no body — extraction improvement is a code change, the user can --force then.
No new CLI flag. Behaviour falls out of the existing xbrain fetch invocation.

Scope (out of scope intentionally)

fetch_x.py (x.com articles + threads) — different failure-reason set, its own skip logic; separate issue if needed.
ContentSource.attempts integration — recorded but not consulted in the skip decision.
No retry budget / backoff cap.

Tests

Suite	Count	What it covers
Truth-table on `_should_refetch`	13	Every cell from PRD §5 matrix
Integration on `fetch_pending`	5	Each destructive vs benign path + filter precedence

Total: 296 tests (up from 278), coverage 88%, uv run poe check all-green.

Specs

PRD: vault/zz-support-files/docs/prds/2026-05-22-xbrain-19-retry-transient-fetches.md
Plan: vault/zz-support-files/docs/implementation-plans/2026-05-22-xbrain-19-retry-transient-fetches.md

Test plan

Truth-table unit tests for every (content state × force) cell
Integration tests for transient retry / terminal skip / --force override / since/until precedence
uv run poe check all-green
README + ARCHITECTURE updated

🤖 Generated with Claude Code

Today `xbrain fetch` skips every item whose `content` field is non-empty, regardless of whether the recorded sources succeeded. To retry a single network blip the user has to use `--force`, which re-fetches *everything* (wasteful, rate-limit risk). This change adds an automatic retry path for items whose only recorded failures are TRANSIENT (timeout, dns_error). Items with success or with TERMINAL failures (not_found, paywall, forbidden, js_required, empty_content) stay skipped until `--force` is set. Implementation: - New constant `_TRANSIENT_FAILURES: frozenset[FailureReason] = frozenset({"timeout", "dns_error"})` in `xbrain.fetch`. - New pure helper `_should_refetch(content, force) -> bool` that encapsulates the skip decision (truth-table-driven). - One-line wiring in `fetch_pending`: the `if item.content is not None and not force: continue` line is replaced with `if not _should_refetch(item.content, force): continue`. Scope: - Only `xbrain.fetch` (external_article links). `fetch_x.py` (x.com articles + threads) has a different failure-reason set and its own skip logic — out of scope here, separate issue if needed. - No new CLI flag — the new behaviour falls out of the existing `xbrain fetch` invocation. - `--force` semantics UNCHANGED (still refetches everything). Spec decisions documented in the PRD: - `dns_error` classified as transient (DNS provider blips happen; dead domains will cost one extra HTTP probe per run — acceptable). - `empty_content` classified as terminal (Trafilatura returned no body; if extraction improves, the user can `--force`). Tests (18 new): - 13 truth-table tests on `_should_refetch` covering every cell from the PRD §5 matrix (content=None, force ON/OFF, ok, transient, terminal, mixed transient+terminal, mixed transient+success, only-x sources). - 5 integration tests on `fetch_pending`: - timeout item retried without `--force`. - not_found item skipped without `--force`. - not_found item retried with `--force`. - successful fetch skipped without `--force`. - timeout item outside `since`/`until` still skipped (filter precedence). Total: 296 tests (up from 278), coverage 88%, `uv run poe check` all-green. Docs: - README Commands table: `fetch` row gains a sentence describing the transient/terminal distinction. - ARCHITECTURE step-by-step `fetch` card: new "Transient retries" bullet. PRD: vault/zz-support-files/docs/prds/2026-05-22-xbrain-19-retry-transient-fetches.md Plan: vault/zz-support-files/docs/implementation-plans/2026-05-22-xbrain-19-retry-transient-fetches.md Closes #19. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Addresses every HIGH/MEDIUM finding from the 4-reviewer panel on PR #26 (code-reviewer + spec-compliance APPROVED; silent-failure-hunter and pr-test-analyzer flagged the same issue plus minor cosmetics): silent-failure-hunter HIGH (also pr-test-analyzer M1 + code-reviewer nit): - ContentSource(ok=False, failure_reason=None) was silently classified as terminal. Pre-Fase-2 records (where failure_reason defaulted None) and any future code path that records a failure without categorising it (e.g. an uncaught extractor exception via `error="..."`) would stay invisibly stuck — exactly the silent-failure shape the spec invariant tries to prevent. - Fix: treat failure_reason=None as transient. The predicate becomes `failure_reason is None or failure_reason in _TRANSIENT_FAILURES`. Rationale: an `ok=False` without a categorised reason is anomalous; re-fetching gives it a chance to land on a categorised result rather than getting permanently stuck. - Docstring updated to explain the policy. pr-test-analyzer M2: missing test for mixed external_article + x_article. - Added `test_should_refetch_external_transient_alongside_xcom_source`: a transient-failed external_article + an x_article (any state) must retry on the external (x.com is fetch_x's job, must not block). pr-test-analyzer M3: missing test for "re-fetch replaces, not appends". - Added `test_fetch_pending_replaces_sources_does_not_append`: a transient-failed item with 1 source stays at 1 source after retry, not 2. Also added `test_should_refetch_uncategorized_failure_treated_as_transient` to pin the new failure_reason=None behaviour explicitly. Cosmetic cleanup (pr-test-analyzer + code-reviewer): - Hoisted `from xbrain.models import Content, ContentSource` and `from xbrain.fetch import _should_refetch` to the file top — dropped 4 inline imports inside test bodies + 1 redundant local wrapper. spec-compliance noted (NO change needed): the x-only + force case where the helper returns True but fetch_pending filters x-only items before consulting _should_refetch. User-visible behaviour matches the spec; helper alone returning True is a curiosity, not a bug. code-reviewer APPROVED with no blockers; radon B(8) on _should_refetch is unchanged from the helper's original shape. Total: 299 tests (up from 296), coverage 88% unchanged, all-green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

VGonPa and others added 2 commits May 22, 2026 18:43

VGonPa merged commit 2a614ee into develop May 22, 2026
1 check passed

VGonPa deleted the ws-19-retry-transient branch May 22, 2026 16:47

This was referenced May 22, 2026

fetch: retry transiently-failed items without --force #19

Closed

Wave: #19 transient retry + #13 hashtags + #18 xbrain diff + #20 tagged unions #30

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[#19] Retry transiently-failed fetches without --force#26

[#19] Retry transiently-failed fetches without --force#26
VGonPa merged 2 commits into
developfrom
ws-19-retry-transient

VGonPa commented May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

VGonPa commented May 22, 2026

Summary

What ships

Decisions (in the PRD)

Scope (out of scope intentionally)

Tests

Specs

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant