fix: unblock MusicBrainz extractor during Discogs periodic wait by SimplicityGuy · Pull Request #303 · SimplicityGuy/discogsography

SimplicityGuy · 2026-04-15T06:22:35Z

Summary

Fixes a bug where the MusicBrainz extractor got stuck waiting forever for Discogs even though the Discogs extractor was idle (sleeping between 5-day periodic checks).

Two root causes:

process_discogs_data leaked Running status on early-return paths. Line 92 set extraction_status = Running at the top, but four early-returns (no files, Skip, empty data_files, empty pending_files) returned without resetting it. After a periodic check hit the Skip path, the live status stayed "running" for the entire 5-day sleep window, and the MusicBrainz extractor's wait_for_discogs_idle polled indefinitely.
"completed" conflated "just finished" with "sleeping on schedule". Even with the above fixed, operators looking at /health couldn't tell the difference between a freshly-finished run and one already back on the 5-day cycle.

What changed

Rust extractor (extractor/src/extractor.rs)

Fixed all four early-return paths to set Completed (mirrors process_musicbrainz_data which already did this)
New ExtractionStatus::Waiting variant with full doc comment
run_discogs_loop and run_musicbrainz_loop transition Completed → Waiting right before the periodic sleep; Failed is preserved through the sleep to keep the failure signal visible to operators
wait_for_discogs_idle already proceeds on anything ≠ "running", so it picks up "waiting" with zero logic changes

API tracker (api/routers/admin.py)

_track_extraction accepts both "completed" and "waiting" as terminal success
Writes the observed status value verbatim to extraction_history.status (no mapping) — operators can tell a just-finished run from one back on schedule

Dashboard (dashboard/static/admin.{html,js})

New .badge-waiting CSS class (blue, distinct from completed's green)
_statusBadgeClass handles the waiting value

Documentation (extractor/README.md)

New Extraction Status Lifecycle section with a Mermaid state diagram
Transition-point table mapping each state change to its code location
Design notes explaining why Completed is kept transient, why Waiting is dominant, and why Failed persists through the sleep window

Lifecycle

mermaid stateDiagram-v2 [*] --> Idle Idle --> Running : initial run or trigger Running --> Completed : Ok(true) Running --> Failed : Err or Ok(false) Completed --> Waiting : run_*_loop before sleep Waiting --> Running : periodic wake or trigger Failed --> Running : next periodic attempt or trigger

Test plan

`cargo test --manifest-path extractor/Cargo.toml --lib` → 474 passed (5 new)
`cargo clippy --manifest-path extractor/Cargo.toml --lib` → clean
`uv run pytest tests/api/test_admin_endpoints.py` → 73 passed (1 new + 1 strengthened)
`uv run ruff check .` → clean
`uv run mypy api/routers/admin.py` → clean
Pre-commit hooks on commit → all passed
Deploy to staging and verify MusicBrainz extractor starts promptly after a Discogs periodic check

🤖 Generated with Claude Code

The MusicBrainz extractor calls wait_for_discogs_idle before starting, which polls the Discogs extractor's /health endpoint and waits until extraction_status is not "running". Two bugs kept it stuck forever: 1. process_discogs_data set status=Running at line 92 but four early-return paths (no files, Skip, empty data_files, empty pending_files) never reset it. After a periodic check hit the Skip path, status stayed "running" for the entire 5-day sleep window and MusicBrainz polled indefinitely. 2. Even with that fixed, "completed" conflates "just finished a run" with "sleeping between periodic checks" — operators couldn't tell the difference from /health. Both are fixed: - Each early-return path in process_discogs_data now sets status=Completed (mirroring process_musicbrainz_data). - New ExtractionStatus::Waiting variant, set by run_discogs_loop and run_musicbrainz_loop immediately before the periodic sleep. Failed is preserved through the sleep to keep failure visible; only Completed transitions to Waiting. - wait_for_discogs_idle already proceeds on anything != "running", so it picks up "waiting" with zero additional logic. - api/routers/admin.py _track_extraction accepts both "completed" and "waiting" as terminal success and writes the observed value verbatim to extraction_history.status (no mapping). Dashboard badge adds a distinct blue "waiting" style so operators see at a glance whether a run just finished or is back on the 5-day schedule. - extractor/README.md documents the full lifecycle with a mermaid state diagram and a transition-point table. Tests: 474 Rust tests pass (5 new), 73 admin endpoint tests pass (1 new + 1 strengthened). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-04-15T06:25:49Z

Codecov Report

❌ Patch coverage is 78.94737% with 4 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
extractor/src/extractor.rs	76.47%	4 Missing ⚠️

📢 Thoughts on this report? Let us know!

The run_musicbrainz_loop tests spawn helper tasks that poll extraction_status waiting for Completed before advancing the test. With the Completed → Waiting transition added to the loop, those polling tasks never see Completed (the loop transitions to Waiting in the microsecond after process_musicbrainz_data returns) and spin forever — which manifests as hung test_run_musicbrainz_loop_* integration tests in CI. Both polling loops in extractor_di_test.rs now break on Completed OR Waiting. Also added the Waiting variant to the as_str exhaustiveness test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-15T16:03:24Z

E2E Coverage (webkit)

Totals
Statements:	47.64% ( 1241 / 2605 )
Lines:	47.64% ( 1241 / 2605 )

github-actions · 2026-04-15T16:03:41Z

E2E Coverage (chromium)

Totals
Statements:	47.64% ( 1241 / 2605 )
Lines:	47.64% ( 1241 / 2605 )

github-actions · 2026-04-15T16:04:31Z

E2E Coverage (firefox)

Totals
Statements:	47.64% ( 1241 / 2605 )
Lines:	47.64% ( 1241 / 2605 )

github-actions · 2026-04-15T16:06:09Z

E2E Coverage (webkit - iPad Pro 11)

Totals
Statements:	47.64% ( 1241 / 2605 )
Lines:	47.64% ( 1241 / 2605 )

github-actions · 2026-04-15T16:06:33Z

E2E Coverage (webkit - iPhone 15)

Totals
Statements:	47.64% ( 1241 / 2605 )
Lines:	47.64% ( 1241 / 2605 )

SimplicityGuy and others added 3 commits April 15, 2026 08:22

chore: increase timeout

fcbccfc

chore: update deps

eda4b23

SimplicityGuy merged commit 48f8172 into main Apr 15, 2026
57 checks passed

SimplicityGuy deleted the fix/extractor-waiting-status branch April 15, 2026 16:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: unblock MusicBrainz extractor during Discogs periodic wait#303

fix: unblock MusicBrainz extractor during Discogs periodic wait#303
SimplicityGuy merged 4 commits intomainfrom
fix/extractor-waiting-status

SimplicityGuy commented Apr 15, 2026

Uh oh!

codecov bot commented Apr 15, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 15, 2026

Uh oh!

github-actions bot commented Apr 15, 2026

Uh oh!

github-actions bot commented Apr 15, 2026

Uh oh!

github-actions bot commented Apr 15, 2026

Uh oh!

github-actions bot commented Apr 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

SimplicityGuy commented Apr 15, 2026

Summary

What changed

Lifecycle

Test plan

Uh oh!

codecov bot commented Apr 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

github-actions bot commented Apr 15, 2026

E2E Coverage (webkit)

Uh oh!

github-actions bot commented Apr 15, 2026

E2E Coverage (chromium)

Uh oh!

github-actions bot commented Apr 15, 2026

E2E Coverage (firefox)

Uh oh!

github-actions bot commented Apr 15, 2026

E2E Coverage (webkit - iPad Pro 11)

Uh oh!

github-actions bot commented Apr 15, 2026

E2E Coverage (webkit - iPhone 15)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov bot commented Apr 15, 2026 •

edited

Loading