fix: unblock MusicBrainz extractor during Discogs periodic wait #303
Merged
SimplicityGuy merged 4 commits into main on Apr 15, 2026
The MusicBrainz extractor calls wait_for_discogs_idle before starting, which polls the Discogs extractor's /health endpoint and waits until extraction_status is not "running". Two bugs kept it stuck forever:

1. process_discogs_data set status=Running at line 92, but four early-return paths (no files, Skip, empty data_files, empty pending_files) never reset it. After a periodic check hit the Skip path, status stayed "running" for the entire 5-day sleep window and MusicBrainz polled indefinitely.
2. Even with that fixed, "completed" conflates "just finished a run" with "sleeping between periodic checks", so operators couldn't tell the difference from /health.

Both are fixed:

- Each early-return path in process_discogs_data now sets status=Completed (mirroring process_musicbrainz_data).
- New ExtractionStatus::Waiting variant, set by run_discogs_loop and run_musicbrainz_loop immediately before the periodic sleep. Failed is preserved through the sleep to keep failure visible; only Completed transitions to Waiting.
- wait_for_discogs_idle already proceeds on anything != "running", so it picks up "waiting" with zero additional logic.
- api/routers/admin.py _track_extraction accepts both "completed" and "waiting" as terminal success and writes the observed value verbatim to extraction_history.status (no mapping). Dashboard badge adds a distinct blue "waiting" style so operators see at a glance whether a run just finished or is back on the 5-day schedule.
- extractor/README.md documents the full lifecycle with a mermaid state diagram and a transition-point table.

Tests: 474 Rust tests pass (5 new), 73 admin endpoint tests pass (1 new + 1 strengthened).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
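The two fixes described in this commit can be illustrated with a minimal, self-contained sketch. It is not the extractor's actual code: `process_sketch`, `status_before_sleep`, and `discogs_is_idle` are hypothetical names, the `pending_files` parameter stands in for inputs the commit message does not show, and the `"idle"` and `"failed"` strings are assumed by analogy with the `"running"`, `"completed"`, and `"waiting"` values mentioned above.

```rust
/// Hypothetical mirror of the status enum; variant names follow the commit message.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractionStatus {
    Idle,
    Running,
    Completed,
    Waiting,
    Failed,
}

impl ExtractionStatus {
    /// String form as it might appear in /health ("idle" and "failed" are assumed).
    pub fn as_str(self) -> &'static str {
        match self {
            ExtractionStatus::Idle => "idle",
            ExtractionStatus::Running => "running",
            ExtractionStatus::Completed => "completed",
            ExtractionStatus::Waiting => "waiting",
            ExtractionStatus::Failed => "failed",
        }
    }
}

/// Simplified stand-in for a processing function with an early return.
/// `pending_files` is a hypothetical parameter used only for illustration.
pub fn process_sketch(status: &mut ExtractionStatus, pending_files: &[&str]) -> bool {
    *status = ExtractionStatus::Running;
    if pending_files.is_empty() {
        // Fix 1: reset the status on the early return so a stale
        // "running" is never left behind for the sleep window.
        *status = ExtractionStatus::Completed;
        return true;
    }
    // ... real extraction work would happen here ...
    *status = ExtractionStatus::Completed;
    true
}

/// Fix 2: right before the periodic sleep, only Completed becomes Waiting;
/// every other status (notably Failed) is left unchanged.
pub fn status_before_sleep(current: ExtractionStatus) -> ExtractionStatus {
    match current {
        ExtractionStatus::Completed => ExtractionStatus::Waiting,
        other => other,
    }
}

/// The idle check described above: anything other than "running" counts as idle.
pub fn discogs_is_idle(reported: &str) -> bool {
    reported != "running"
}
```

Because the idle check is a negative match on `"running"`, adding the `Waiting` variant requires no change on the polling side, which is the point the commit message makes about `wait_for_discogs_idle`.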
The run_musicbrainz_loop tests spawn helper tasks that poll extraction_status waiting for Completed before advancing the test. With the Completed → Waiting transition added to the loop, those polling tasks never see Completed (the loop transitions to Waiting in the microsecond after process_musicbrainz_data returns) and spin forever, which manifests as hung test_run_musicbrainz_loop_* integration tests in CI.

Both polling loops in extractor_di_test.rs now break on Completed OR Waiting. Also added the Waiting variant to the as_str exhaustiveness test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
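The polling fix can be sketched as follows. This is an illustration rather than the code in extractor_di_test.rs: it uses standard-library threads and a `Mutex` instead of the spawned helper tasks the commit describes, and `is_finished`, `poll_until_finished`, and the `max_polls` bound are hypothetical additions (the bound ensures a missed state produces a test failure instead of a hang).

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical mirror of the status enum; variant names follow the commit message.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractionStatus {
    Idle,
    Running,
    Completed,
    Waiting,
    Failed,
}

/// A run counts as finished once it reports Completed OR Waiting, because
/// the loop may move from Completed to Waiting before the poller looks.
pub fn is_finished(status: ExtractionStatus) -> bool {
    matches!(status, ExtractionStatus::Completed | ExtractionStatus::Waiting)
}

/// Polls the shared status up to `max_polls` times and returns the first
/// finished status observed, or None if the bound is exhausted.
pub fn poll_until_finished(
    shared: &Arc<Mutex<ExtractionStatus>>,
    max_polls: usize,
    interval: Duration,
) -> Option<ExtractionStatus> {
    for _ in 0..max_polls {
        // Copy the value out so the lock is released before sleeping.
        let current = *shared.lock().unwrap();
        if is_finished(current) {
            return Some(current);
        }
        thread::sleep(interval);
    }
    None
}
```

A poller that matched only `Completed` would return `None` here whenever the loop had already advanced to `Waiting`, which corresponds to the hang described above.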
Summary
Fixes a bug where the MusicBrainz extractor got stuck waiting forever for Discogs even though the Discogs extractor was idle (sleeping between 5-day periodic checks).
Two root causes:
1. `process_discogs_data` leaked `Running` status on early-return paths. Line 92 set `extraction_status = Running` at the top, but four early returns (no files, `Skip`, empty `data_files`, empty `pending_files`) returned without resetting it. After a periodic check hit the `Skip` path, the live status stayed `"running"` for the entire 5-day sleep window, and the MusicBrainz extractor's `wait_for_discogs_idle` polled indefinitely.
2. `"completed"` conflated "just finished" with "sleeping on schedule". Even with the above fixed, operators looking at `/health` couldn't tell the difference between a freshly-finished run and one already back on the 5-day cycle.

What changed
Rust extractor (`extractor/src/extractor.rs`)

- Early-return paths in `process_discogs_data` now set `Completed` (mirrors `process_musicbrainz_data`, which already did this)
- New `ExtractionStatus::Waiting` variant with full doc comment
- `run_discogs_loop` and `run_musicbrainz_loop` transition `Completed → Waiting` right before the periodic sleep; `Failed` is preserved through the sleep to keep the failure signal visible to operators
- `wait_for_discogs_idle` already proceeds on anything ≠ `"running"`, so it picks up `"waiting"` with zero logic changes

API tracker (`api/routers/admin.py`)

- `_track_extraction` accepts both `"completed"` and `"waiting"` as terminal success
- Writes the observed value verbatim to `extraction_history.status` (no mapping), so operators can tell a just-finished run from one back on schedule

Dashboard (`dashboard/static/admin.{html,js}`)

- `.badge-waiting` CSS class (blue, distinct from completed's green)
- `_statusBadgeClass` handles the `waiting` value

Documentation (`extractor/README.md`)

- Documents the full lifecycle with a mermaid state diagram and a transition-point table
- Explains why `Completed` is kept transient, why `Waiting` is dominant, and why `Failed` persists through the sleep window

Lifecycle
```mermaid
stateDiagram-v2
    [*] --> Idle
    Idle --> Running : initial run or trigger
    Running --> Completed : Ok(true)
    Running --> Failed : Err or Ok(false)
    Completed --> Waiting : run_*_loop before sleep
    Waiting --> Running : periodic wake or trigger
    Failed --> Running : next periodic attempt or trigger
```

Test plan
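As an illustration only (not the PR's actual tests), the edges of the lifecycle diagram above can be encoded as a transition function that a test could assert against. The `Event` enum and the `transition` function are hypothetical names introduced for this sketch; only the states and edges come from the diagram.

```rust
/// Hypothetical mirror of the status enum; states follow the lifecycle diagram.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExtractionStatus {
    Idle,
    Running,
    Completed,
    Waiting,
    Failed,
}

/// Hypothetical events, one per edge label in the diagram.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    /// Initial run, periodic wake, next periodic attempt, or trigger.
    Start,
    /// The run returned Ok(true).
    RunSucceeded,
    /// The run returned Err or Ok(false).
    RunFailed,
    /// A run_*_loop is about to enter its periodic sleep.
    BeforeSleep,
}

/// Returns the next status for an edge that exists in the diagram,
/// or None when the diagram has no such edge (the status stays as it is).
pub fn transition(from: ExtractionStatus, event: Event) -> Option<ExtractionStatus> {
    use ExtractionStatus::*;
    match (from, event) {
        (Idle, Event::Start) | (Waiting, Event::Start) | (Failed, Event::Start) => Some(Running),
        (Running, Event::RunSucceeded) => Some(Completed),
        (Running, Event::RunFailed) => Some(Failed),
        (Completed, Event::BeforeSleep) => Some(Waiting),
        _ => None,
    }
}
```

In this encoding, `Failed` combined with `BeforeSleep` yields `None`, which matches the rule that a failure stays visible through the sleep window until the next attempt starts.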
🤖 Generated with Claude Code