[#19] Retry transiently-failed fetches without --force #26
Merged
Today `xbrain fetch` skips every item whose `content` field is non-empty,
regardless of whether the recorded sources succeeded. To retry a single
network blip the user has to use `--force`, which re-fetches *everything*
(wasteful, rate-limit risk).
This change adds an automatic retry path for items whose only recorded
failures are TRANSIENT (timeout, dns_error). Items with success or with
TERMINAL failures (not_found, paywall, forbidden, js_required, empty_content)
stay skipped until `--force` is set.
Implementation:
- New constant `_TRANSIENT_FAILURES: frozenset[FailureReason] =
frozenset({"timeout", "dns_error"})` in `xbrain.fetch`.
- New pure helper `_should_refetch(content, force) -> bool` that
encapsulates the skip decision (truth-table-driven).
- One-line wiring in `fetch_pending`: the `if item.content is not None
and not force: continue` line is replaced with
`if not _should_refetch(item.content, force): continue`.
Scope:
- Only `xbrain.fetch` (external_article links). `fetch_x.py` (x.com
articles + threads) has a different failure-reason set and its own
skip logic — out of scope here, separate issue if needed.
- No new CLI flag — the new behaviour falls out of the existing
`xbrain fetch` invocation.
- `--force` semantics UNCHANGED (still refetches everything).
Spec decisions documented in the PRD:
- `dns_error` classified as transient (DNS provider blips happen; dead
domains will cost one extra HTTP probe per run — acceptable).
- `empty_content` classified as terminal (Trafilatura returned no body;
if extraction improves, the user can `--force`).
Tests (18 new):
- 13 truth-table tests on `_should_refetch` covering every cell from the
PRD §5 matrix (content=None, force ON/OFF, ok, transient, terminal,
mixed transient+terminal, mixed transient+success, only-x sources).
- 5 integration tests on `fetch_pending`:
  - timeout item retried without `--force`.
  - not_found item skipped without `--force`.
  - not_found item retried with `--force`.
  - successful fetch skipped without `--force`.
  - timeout item outside `since`/`until` still skipped (filter precedence).
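The filter-precedence case depends on the date window being applied before the refetch decision. A rough sketch of that ordering, with several assumptions: `Item`, its `created_at` field, the function name `select_for_fetch`, and the `should_refetch` callable parameter are hypothetical names for illustration, the bounds are taken as inclusive, and the window is applied even when `force` is set. None of those details are confirmed by this commit message.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, Optional


@dataclass
class Item:
    # Hypothetical item shape; `created_at` is an assumed field name.
    id: str
    created_at: date
    content: Optional[object] = None


def select_for_fetch(
    items: Iterable[Item],
    should_refetch: Callable[[Optional[object], bool], bool],
    *,
    since: Optional[date] = None,
    until: Optional[date] = None,
    force: bool = False,
) -> list[Item]:
    """Return the items a fetch run would attempt, in input order."""
    selected: list[Item] = []
    for item in items:
        # Date window first: an out-of-window item is never attempted,
        # even if its only recorded failure was transient.
        if since is not None and item.created_at < since:
            continue
        if until is not None and item.created_at > until:
            continue
        # Only then consult the skip decision.
        if not should_refetch(item.content, force):
            continue
        selected.append(item)
    return selected


items = [Item("old", date(2026, 5, 1)), Item("recent", date(2026, 5, 20))]
always_retry = lambda content, force: True  # stand-in for a transient-failed item
picked = select_for_fetch(items, always_retry, since=date(2026, 5, 10))
print([i.id for i in picked])  # ['recent']
```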
Total: 296 tests (up from 278), coverage 88%, `uv run poe check`
all-green.
Docs:
- README Commands table: `fetch` row gains a sentence describing the
transient/terminal distinction.
- ARCHITECTURE step-by-step `fetch` card: new "Transient retries" bullet.
PRD: vault/zz-support-files/docs/prds/2026-05-22-xbrain-19-retry-transient-fetches.md
Plan: vault/zz-support-files/docs/implementation-plans/2026-05-22-xbrain-19-retry-transient-fetches.md
Closes #19.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses every HIGH/MEDIUM finding from the 4-reviewer panel on PR #26
(code-reviewer + spec-compliance APPROVED; silent-failure-hunter and
pr-test-analyzer flagged the same issue plus minor cosmetics):
silent-failure-hunter HIGH (also pr-test-analyzer M1 + code-reviewer nit):
- ContentSource(ok=False, failure_reason=None) was silently classified as
terminal. Pre-Fase-2 records (where failure_reason defaulted None) and
any future code path that records a failure without categorising it
(e.g. an uncaught extractor exception via `error="..."`) would stay
invisibly stuck: exactly the silent-failure shape the spec invariant
tries to prevent.
- Fix: treat failure_reason=None as transient. The predicate becomes
`failure_reason is None or failure_reason in _TRANSIENT_FAILURES`.
Rationale: an `ok=False` without a categorised reason is anomalous;
re-fetching gives it a chance to land on a categorised result rather
than getting permanently stuck.
- Docstring updated to explain the policy.
pr-test-analyzer M2: missing test for mixed external_article + x_article.
- Added `test_should_refetch_external_transient_alongside_xcom_source`:
a transient-failed external_article + an x_article (any state) must
retry on the external (x.com is fetch_x's job, must not block).
pr-test-analyzer M3: missing test for "re-fetch replaces, not appends".
- Added `test_fetch_pending_replaces_sources_does_not_append`: a
transient-failed item with 1 source stays at 1 source after retry, not 2.
Also added `test_should_refetch_uncategorized_failure_treated_as_transient`
to pin the new failure_reason=None behaviour explicitly.
Cosmetic cleanup (pr-test-analyzer + code-reviewer):
- Hoisted `from xbrain.models import Content, ContentSource` and
`from xbrain.fetch import _should_refetch` to the file top; dropped 4
inline imports inside test bodies + 1 redundant local wrapper.
spec-compliance noted (NO change needed): the x-only + force case where
the helper returns True but fetch_pending filters x-only items before
consulting _should_refetch. User-visible behaviour matches the spec;
helper alone returning True is a curiosity, not a bug.
code-reviewer APPROVED with no blockers; radon B(8) on _should_refetch is
unchanged from the helper's original shape.
Total: 299 tests (up from 296), coverage 88% unchanged, all-green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #19.
Summary
`xbrain fetch` now auto-retries items whose only recorded failures were transient (`timeout`, `dns_error`). Items with success or with terminal failures (`not_found`, `paywall`, `forbidden`, `js_required`, `empty_content`) stay skipped until `--force` is set. The `--force` semantics are unchanged.
What ships
- `_TRANSIENT_FAILURES: frozenset[FailureReason] = frozenset({"timeout", "dns_error"})`
- `_should_refetch(content, force) -> bool`: pure helper, truth-table-driven.
- One-line wiring in `fetch_pending` replacing the old `if item.content is not None and not force: continue` skip.
Decisions (in the PRD)
- `dns_error` = transient. Cost of being wrong = one extra HTTP probe per run per truly-dead domain. Cheap.
- `empty_content` = terminal. Trafilatura returned no body; improving extraction is a code change, after which the user can `--force`.
- No new CLI flag: the new behaviour falls out of the existing `xbrain fetch` invocation.
Scope (out of scope intentionally)
- `fetch_x.py` (x.com articles + threads): different failure-reason set, its own skip logic; separate issue if needed.
- `ContentSource.attempts` integration: recorded but not consulted in the skip decision.
Tests
- 13 truth-table tests on `_should_refetch`
- 5 integration tests on `fetch_pending`
Total: 296 tests (up from 278), coverage 88%, `uv run poe check` all-green.
Specs
- PRD: `vault/zz-support-files/docs/prds/2026-05-22-xbrain-19-retry-transient-fetches.md`
- Plan: `vault/zz-support-files/docs/implementation-plans/2026-05-22-xbrain-19-retry-transient-fetches.md`
Test plan
- `--force` override / `since`/`until` precedence
- `uv run poe check` all-green
🤖 Generated with Claude Code