fix: lake: race condition in monitor queue #13559
Merged
pull Bot pushed a commit to VitalyAnkh/lean4 that referenced this pull request on May 11, 2026
This PR re-enables all of the Lake tests by default. The previous flakiness appears to have been fixed by leanprover#13559, as multiple runs of leanprover#8580 demonstrate. The `LAKE_CI` CMake setting and the `lake-ci` label are kept so that expensive Lake tests can potentially be enabled in the future (e.g., the online tests that are not currently run in CI).
This PR fixes a race condition in the Lake build monitor's draining of the job queue.
Previously, the monitor drained the registered-jobs queue before checking task states. A job found finished in that scan may have registered new jobs in its body just before terminating. Those jobs landed in the queue after the drain, so when no other unfinished jobs remained, the monitor exited without ever observing them. If one of the missed jobs errored, this could produce an "uncaught top-level build failure" message alongside "Build completed successfully".
To fix this, the monitor loop is restructured so the queue is drained after task states are scanned, with an additional drain before the loop terminates.
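The ordering issue can be illustrated with a small deterministic model. This is a Python sketch, not Lake's Lean implementation: the `Job` and `Monitor` classes, the `between_steps` hook, and the scripted events are all hypothetical stand-ins, and the initial drain in `run_new` is an assumption of the model. The hook fires at the same point in both loops (after a drain, before the next scan), which is the window in which a finishing job registers new work.

```python
from collections import deque

class Job:
    def __init__(self, name):
        self.name = name
        self.finished = False

class Monitor:
    """Toy monitor: tracks active jobs pulled from a shared queue."""
    def __init__(self, queue):
        self.queue = queue    # shared registered-jobs queue
        self.active = []      # jobs currently tracked
        self.observed = []    # names of every job the monitor ever saw

    def drain_queue(self):
        while self.queue:
            job = self.queue.popleft()
            self.active.append(job)
            self.observed.append(job.name)

    def scan_jobs(self):
        # Drop jobs whose state has transitioned to finished.
        self.active = [j for j in self.active if not j.finished]

def run_old(monitor, between_steps):
    # Old order: drain, then scan. Work registered in between is missed.
    while True:
        monitor.drain_queue()
        between_steps()
        monitor.scan_jobs()
        if not monitor.active:
            return

def run_new(monitor, between_steps):
    # New order: scan, then drain, plus a final drain before exiting.
    monitor.drain_queue()  # initial drain (assumed here to seed tracking)
    while True:
        between_steps()
        monitor.scan_jobs()
        monitor.drain_queue()
        if not monitor.active:
            monitor.drain_queue()  # final drain before terminating
            if not monitor.active:
                return

def make_script(queue, parent):
    child = Job("child")
    def parent_finishes():
        queue.append(child)     # register new work first...
        parent.finished = True  # ...then transition to finished
    def child_finishes():
        child.finished = True
    events = deque([parent_finishes, child_finishes])
    def between_steps():
        if events:
            events.popleft()()
    return between_steps

def simulate(run):
    queue = deque()
    parent = Job("parent")
    queue.append(parent)
    monitor = Monitor(queue)
    run(monitor, make_script(queue, parent))
    return monitor.observed

print(simulate(run_old))
print(simulate(run_new))
```

In this model the old loop exits after observing only the parent, while the restructured loop also picks up the child, because any work a job registers before its state transitions is already in the queue by the time the post-scan drain runs.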
`drainQueue` and `scanJobs` are split out from `poll`, and `Monitor.main` performs an initial drain to capture any jobs registered before the monitor starts iterating. Finishing jobs cannot register further work once their state transitions, so a single post-scan drain is sufficient to close the window.

🤖 Prepared with Claude Code