Skip to content

[#19] Retry transiently-failed fetches without --force#26

Merged
VGonPa merged 2 commits into
developfrom
ws-19-retry-transient
May 22, 2026
Merged

[#19] Retry transiently-failed fetches without --force#26
VGonPa merged 2 commits into
developfrom
ws-19-retry-transient

Conversation

@VGonPa
Copy link
Copy Markdown
Owner

@VGonPa VGonPa commented May 22, 2026

Closes #19.

Summary

xbrain fetch now auto-retries items whose only recorded failures were transient (timeout, dns_error). Items with success or with terminal failures (not_found, paywall, forbidden, js_required, empty_content) stay skipped until --force is set. The --force semantics are unchanged.

What ships

  • _TRANSIENT_FAILURES: frozenset[FailureReason] = frozenset({"timeout", "dns_error"})
  • _should_refetch(content, force) -> bool — pure helper, truth-table-driven.
  • One-line wiring in fetch_pending replacing the old if item.content is not None and not force: continue skip.
  • README + ARCHITECTURE updated.

Decisions (in the PRD)

  • dns_error = transient. Cost of being wrong = one extra HTTP probe per run per truly-dead domain. Cheap.
  • empty_content = terminal. Trafilatura returned no body — extraction improvement is a code change, the user can --force then.
  • No new CLI flag. Behaviour falls out of the existing xbrain fetch invocation.

Scope (out of scope intentionally)

  • fetch_x.py (x.com articles + threads) — different failure-reason set, its own skip logic; separate issue if needed.
  • ContentSource.attempts integration — recorded but not consulted in the skip decision.
  • No retry budget / backoff cap.

Tests

Suite Count What it covers
Truth-table on _should_refetch 13 Every cell from PRD §5 matrix
Integration on fetch_pending 5 Each destructive vs benign path + filter precedence

Total: 296 tests (up from 278), coverage 88%, uv run poe check all-green.

Specs

  • PRD: vault/zz-support-files/docs/prds/2026-05-22-xbrain-19-retry-transient-fetches.md
  • Plan: vault/zz-support-files/docs/implementation-plans/2026-05-22-xbrain-19-retry-transient-fetches.md

Test plan

  • Truth-table unit tests for every (content state × force) cell
  • Integration tests for transient retry / terminal skip / --force override / since/until precedence
  • uv run poe check all-green
  • README + ARCHITECTURE updated

🤖 Generated with Claude Code

VGonPa and others added 2 commits May 22, 2026 18:43
Today `xbrain fetch` skips every item whose `content` field is non-empty,
regardless of whether the recorded sources succeeded. To retry a single
network blip the user has to use `--force`, which re-fetches *everything*
(wasteful, rate-limit risk).

This change adds an automatic retry path for items whose only recorded
failures are TRANSIENT (timeout, dns_error). Items with success or with
TERMINAL failures (not_found, paywall, forbidden, js_required, empty_content)
stay skipped until `--force` is set.

Implementation:
- New constant `_TRANSIENT_FAILURES: frozenset[FailureReason] =
  frozenset({"timeout", "dns_error"})` in `xbrain.fetch`.
- New pure helper `_should_refetch(content, force) -> bool` that
  encapsulates the skip decision (truth-table-driven).
- One-line wiring in `fetch_pending`: the `if item.content is not None
  and not force: continue` line is replaced with
  `if not _should_refetch(item.content, force): continue`.

Scope:
- Only `xbrain.fetch` (external_article links). `fetch_x.py` (x.com
  articles + threads) has a different failure-reason set and its own
  skip logic — out of scope here, separate issue if needed.
- No new CLI flag — the new behaviour falls out of the existing
  `xbrain fetch` invocation.
- `--force` semantics UNCHANGED (still refetches everything).

Spec decisions documented in the PRD:
- `dns_error` classified as transient (DNS provider blips happen; dead
  domains will cost one extra HTTP probe per run — acceptable).
- `empty_content` classified as terminal (Trafilatura returned no body;
  if extraction improves, the user can `--force`).

Tests (18 new):
- 13 truth-table tests on `_should_refetch` covering every cell from the
  PRD §5 matrix (content=None, force ON/OFF, ok, transient, terminal,
  mixed transient+terminal, mixed transient+success, only-x sources).
- 5 integration tests on `fetch_pending`:
  - timeout item retried without `--force`.
  - not_found item skipped without `--force`.
  - not_found item retried with `--force`.
  - successful fetch skipped without `--force`.
  - timeout item outside `since`/`until` still skipped (filter precedence).

Total: 296 tests (up from 278), coverage 88%, `uv run poe check`
all-green.

Docs:
- README Commands table: `fetch` row gains a sentence describing the
  transient/terminal distinction.
- ARCHITECTURE step-by-step `fetch` card: new "Transient retries" bullet.

PRD:  vault/zz-support-files/docs/prds/2026-05-22-xbrain-19-retry-transient-fetches.md
Plan: vault/zz-support-files/docs/implementation-plans/2026-05-22-xbrain-19-retry-transient-fetches.md

Closes #19.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses every HIGH/MEDIUM finding from the 4-reviewer panel on PR #26
(code-reviewer + spec-compliance APPROVED; silent-failure-hunter and
pr-test-analyzer flagged the same issue plus minor cosmetics):

silent-failure-hunter HIGH (also pr-test-analyzer M1 + code-reviewer nit):
- ContentSource(ok=False, failure_reason=None) was silently classified
  as terminal. Pre-Fase-2 records (where failure_reason defaulted None)
  and any future code path that records a failure without categorising
  it (e.g. an uncaught extractor exception via `error="..."`) would stay
  invisibly stuck — exactly the silent-failure shape the spec invariant
  tries to prevent.
- Fix: treat failure_reason=None as transient. The predicate becomes
  `failure_reason is None or failure_reason in _TRANSIENT_FAILURES`.
  Rationale: an `ok=False` without a categorised reason is anomalous;
  re-fetching gives it a chance to land on a categorised result rather
  than getting permanently stuck.
- Docstring updated to explain the policy.

pr-test-analyzer M2: missing test for mixed external_article + x_article.
- Added `test_should_refetch_external_transient_alongside_xcom_source`:
  a transient-failed external_article + an x_article (any state) must
  retry on the external (x.com is fetch_x's job, must not block).

pr-test-analyzer M3: missing test for "re-fetch replaces, not appends".
- Added `test_fetch_pending_replaces_sources_does_not_append`: a
  transient-failed item with 1 source stays at 1 source after retry,
  not 2.

Also added `test_should_refetch_uncategorized_failure_treated_as_transient`
to pin the new failure_reason=None behaviour explicitly.

Cosmetic cleanup (pr-test-analyzer + code-reviewer):
- Hoisted `from xbrain.models import Content, ContentSource` and
  `from xbrain.fetch import _should_refetch` to the file top — dropped
  4 inline imports inside test bodies + 1 redundant local wrapper.

spec-compliance noted (NO change needed): the x-only + force case where
the helper returns True but fetch_pending filters x-only items before
consulting _should_refetch. User-visible behaviour matches the spec;
helper alone returning True is a curiosity, not a bug.

code-reviewer APPROVED with no blockers; radon B(8) on _should_refetch
is unchanged from the helper's original shape.

Total: 299 tests (up from 296), coverage 88% unchanged, all-green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@VGonPa VGonPa merged commit 2a614ee into develop May 22, 2026
1 check passed
@VGonPa VGonPa deleted the ws-19-retry-transient branch May 22, 2026 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant