Skip to content

feat(extractor): polite HTTP client with Retry-After + restart-loop fix#317

Merged
SimplicityGuy merged 2 commits intomainfrom
feature/polite-downloader
Apr 29, 2026
Merged

feat(extractor): polite HTTP client with Retry-After + restart-loop fix#317
SimplicityGuy merged 2 commits intomainfrom
feature/polite-downloader

Conversation

@SimplicityGuy
Copy link
Copy Markdown
Owner

Summary

Discogs and MusicBrainz both rate-limit aggressively (Discogs has been observed returning a 36-minute Retry-After). The extractor was issuing raw reqwest::get calls with no User-Agent and no 429 handling, so:

  1. A 429 response body became the "HTML" we scraped → no year directories → process::exit(1)
  2. Docker's restart: on-failure fired ~50s later, hammered the limiter again, and slid the cooldown window forward

This adds polite_http::PoliteClient, used at every upstream HTTP call site:

  • Sends User-Agent: discogsography-extractor/<version> (+https://github.com/...) on every request
  • Enforces a configurable minimum gap between requests (5s Discogs, 2s MusicBrainz)
  • Honors Retry-After on 429 / 503, sleeping in-process up to a configurable cap (2h Discogs, 30m MusicBrainz) — so the cooldown rides out within the same binary instead of triggering a restart loop
  • Falls back to growing default backoff (30s, 60s, 120s, …) when Retry-After is absent
  • Bumps download retries 3→5 and base delays 2s→30s

Belt-and-suspenders: main.rs now sleeps FAILURE_COOLDOWN_SECS (default 600s, override via env) before exiting on terminal failure, so docker can't flap us through a residual rate-limit window even if the polite client can't recover.

Files changed

  • extractor/src/polite_http.rs (new, 300 lines + 6 unit tests)
  • extractor/src/discogs_downloader.rs — wired client at 3 call sites, bumped retry knobs
  • extractor/src/musicbrainz_downloader.rs — wired client at 3 call sites, bumped retry knobs
  • extractor/src/main.rsmod polite_http, post-failure cooldown
  • extractor/src/lib.rspub mod polite_http export

Test plan

  • cargo test — 480 lib tests + 50 integration tests pass (was 474+50 baseline → +6 new polite_http tests)
  • cargo clippy --lib --all-features -- -D warnings clean
  • cargo fmt clean on changed files
  • New polite_http tests cover: gap enforcement, 429 retry, Retry-After cap, max retry exhaustion, missing-header fallback, concurrent serialization
  • Existing extractor_di_test::test_run_musicbrainz_loop_periodic_check_ok_true still passes (had to drop client timeout in favor of connect_timeout because tokio::test(start_paused = true) advances virtual time past wall-clock timeouts)

Operator notes

  • Default behavior is more polite than before; expect download cycles to take slightly longer (a few extra seconds total). Extraction dwarfs download time, so this is net invisible.
  • To shorten the post-failure cooldown for development, set FAILURE_COOLDOWN_SECS=0 in the extractor environment.

🤖 Generated with Claude Code

Discogs and MusicBrainz both rate-limit aggressively. The extractor was
issuing raw `reqwest::get` calls with no User-Agent and no 429 handling.
On a 429, the response body became the "HTML" we scraped → no year
directories → exit 1 → docker `restart: on-failure` fired ~50s later
and slid the limiter window forward.

This change introduces `polite_http::PoliteClient`:

* Sends `User-Agent: discogsography-extractor/<version>` on every request
* Enforces a configurable minimum gap between requests (5s Discogs, 2s MB)
* Honors `Retry-After` on 429 / 503, sleeping in-process up to a cap
  (2h Discogs, 30m MB) so cooldowns don't trigger restart-loop flapping
* Falls back to growing default backoff when `Retry-After` is absent

Both downloaders are converted to use the polite client at all 6 call
sites. Retry counts go 3→5 and base delays 2s→30s for download attempts.

Belt-and-suspenders: on terminal failure, `main.rs` sleeps
`FAILURE_COOLDOWN_SECS` (default 600s) before exiting so docker can't
flap us through a residual rate-limit window.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 96.68246% with 7 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
extractor/src/main.rs 58.82% 7 Missing ⚠️

📢 Thoughts on this report? Let us know!

…cooldown

Codecov flagged 18 uncovered lines on PR #317. This adds:

* extracted `parse_failure_cooldown` + `apply_failure_cooldown` helpers in
  main.rs so the env-var → duration mapping and the actual sleep are
  unit-testable without invoking `process::exit` (covers the 13 lines on
  the previously-untestable Err branch)
* discogs_downloader: tests for scrape main-page returning 500 (error
  surfaced) and one of two year directories returning 500 (warn-and-
  continue, other year still succeeds)
* musicbrainz_downloader: tests for index 500 and SHA256SUMS 404 paths

Lib tests 480 → 486 (+6); bin tests 479 → 484 (+5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

E2E Coverage (webkit)

Totals Coverage
Statements: 46.5% ( 1241 / 2669 )
Lines: 46.5% ( 1241 / 2669 )

StandWithUkraine

@github-actions
Copy link
Copy Markdown
Contributor

E2E Coverage (firefox)

Totals Coverage
Statements: 46.5% ( 1241 / 2669 )
Lines: 46.5% ( 1241 / 2669 )

StandWithUkraine

@github-actions
Copy link
Copy Markdown
Contributor

E2E Coverage (chromium)

Totals Coverage
Statements: 46.5% ( 1241 / 2669 )
Lines: 46.5% ( 1241 / 2669 )

StandWithUkraine

@github-actions
Copy link
Copy Markdown
Contributor

E2E Coverage (webkit - iPhone 15)

Totals Coverage
Statements: 46.5% ( 1241 / 2669 )
Lines: 46.5% ( 1241 / 2669 )

StandWithUkraine

@github-actions
Copy link
Copy Markdown
Contributor

E2E Coverage (webkit - iPad Pro 11)

Totals Coverage
Statements: 46.5% ( 1241 / 2669 )
Lines: 46.5% ( 1241 / 2669 )

StandWithUkraine

@SimplicityGuy SimplicityGuy merged commit c8dafc9 into main Apr 29, 2026
57 checks passed
@SimplicityGuy SimplicityGuy deleted the feature/polite-downloader branch April 29, 2026 02:34
SimplicityGuy added a commit that referenced this pull request Apr 29, 2026
…ane (#318)

PR #317 raised both downloaders' transport-error retry budget from 3
attempts at 2s base to 5 attempts at 30s base. The intent was rate-limit
politeness, but rate-limit handling actually lives in `polite_http` and
never reaches this loop — only post-connect transport errors do
(partial reads, flush/sync errors, mid-stream drops).

Side effect: integration tests in `tests/http_integration_test.rs` see
`cfg(not(test))` (the `#[cfg(test)]` override only applies when the
*library* itself is being unit-tested, not when external test binaries
link against it). Failing-download tests therefore paid the full
production backoff: 30+60+120+240 = 450s sleeping per test.

That blew up `http_integration_test` from ~50s on main to 460s locally,
and made `test-extractor` time out under `cargo-llvm-cov` in CI on
PR #317.

Restoring the original 3-retry / 2s-base — those are the right numbers
for transport-error retry. The polite client owns the rate-limit
backoff path with its own server-driven Retry-After handling and per-
provider caps (2h Discogs, 30m MusicBrainz).

Wall time: `cargo test --test http_integration_test` 460s → 75s
sequentially, 20s parallel.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant