fix: resolve LockException under concurrent multi-process execution #4804
Merged
greysonlalonde merged 11 commits into main on Mar 11, 2026
greysonlalonde
commented
Mar 11, 2026
joaomdmoura
reviewed
Mar 11, 2026
Cursor Bugbot has reviewed your changes and found 1 potential issue.
xdist isn't playing nice, testing locally
vinibrsl
approved these changes
Mar 11, 2026
Summary
- Increase portalocker.Lock timeout from the default 5s to 120s in the ChromaDB factory. The short timeout was the direct cause of LockException(BlockingIOError(11, 'Resource temporarily unavailable')) under concurrent Airflow workers.
- Enable PRAGMA journal_mode=WAL on both SQLite databases (kickoff task outputs + flow persistence) for concurrent read/write support.
- Add cross-process file locking (portalocker) around all LanceDB write paths to prevent contention across Airflow worker processes.
Test plan
- No LockException errors in logs
Note
Medium Risk
Touches multiple persistence backends (SQLite, LanceDB, ChromaDB) to change locking and journal settings; mistakes could cause deadlocks or data loss/corruption under load.
Overview
Improves robustness under concurrent multi-process execution by standardizing cross-process locking and increasing timeouts.
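To make the failure mode concrete, here is a minimal sketch of cross-process file locking with a wait timeout. It is not the code from this PR: the PR uses portalocker and a centralized lock_store, while this sketch uses the standard-library fcntl module (Unix only) as a stand-in. The names file_lock, timeout_s, and poll_s are hypothetical; the 120-second default only mirrors the value quoted in the summary.

```python
import fcntl
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def file_lock(path: str, timeout_s: float = 120.0, poll_s: float = 0.05):
    """Hold an exclusive advisory lock on `path`, waiting up to `timeout_s`.

    Hypothetical helper for illustration; not the PR's lock_store API.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    deadline = time.monotonic() + timeout_s
    try:
        while True:
            try:
                # Non-blocking attempt. If another holder has the lock this
                # raises BlockingIOError ("Resource temporarily unavailable").
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise TimeoutError(
                        f"could not lock {path} within {timeout_s}s"
                    )
                time.sleep(poll_s)
        yield
    finally:
        # Closing the descriptor also releases the flock if it was acquired.
        os.close(fd)


# Demonstration: a second acquisition with a short timeout fails while the
# lock is held, then succeeds once the first holder has released it.
lock_path = os.path.join(tempfile.mkdtemp(), "example.lock")

contended = False
with file_lock(lock_path):
    try:
        with file_lock(lock_path, timeout_s=0.2):
            pass
    except TimeoutError:
        contended = True

reacquired = False
with file_lock(lock_path, timeout_s=0.2):
    reacquired = True
```

A short wait budget turns ordinary contention into an error, which is why a longer timeout helps when many workers start at once. The trade-off is that a genuinely stuck holder now blocks waiters for longer before they fail.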
SQLite persistence (flow state + kickoff task outputs) now connects with a longer timeout and enables PRAGMA journal_mode=WAL to better support concurrent readers/writers.
LanceDB writes are now serialized across processes using a shared lock_store lock around table creation/indexing, compaction/optimize, and all write operations, while keeping commit-conflict retries. ChromaDB client creation switches from ad-hoc lockfiles to the new centralized lock_store (with a longer default lock wait and optional Redis-backed distributed locks).
Written by Cursor Bugbot for commit 8794776.
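The SQLite change described in the overview can be sketched with the standard-library sqlite3 module. This is an illustration under stated assumptions, not the PR's implementation: the helper name connect_wal, the 30-second timeout, the table name, and the temporary database path are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def connect_wal(db_path: str, timeout_s: float = 30.0) -> sqlite3.Connection:
    """Open a connection with a longer busy timeout and WAL journaling.

    Hypothetical helper; the timeout value is an arbitrary example.
    """
    # `timeout` is how long sqlite3 waits on a locked database before
    # raising sqlite3.OperationalError.
    conn = sqlite3.connect(db_path, timeout=timeout_s)
    # In WAL mode readers can proceed while a single writer appends to the
    # write-ahead log. The pragma returns the mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"WAL not enabled, journal_mode is {mode!r}")
    return conn


# Hypothetical database file in a fresh temporary directory.
db_file = str(Path(tempfile.mkdtemp()) / "example.db")

writer = connect_wal(db_file)
writer.execute("CREATE TABLE IF NOT EXISTS outputs (task_id TEXT, payload TEXT)")
writer.execute("INSERT INTO outputs VALUES (?, ?)", ("t1", "done"))
writer.commit()

# A second connection sees the committed row; WAL mode is stored in the
# database file, so it is already in effect for this connection.
reader = connect_wal(db_file)
rows = reader.execute("SELECT task_id, payload FROM outputs").fetchall()
journal_mode = reader.execute("PRAGMA journal_mode").fetchone()[0]

reader.close()
writer.close()
```

WAL still allows only one writer at a time, so the longer connection timeout remains necessary: it gives a waiting writer time to get its turn instead of failing immediately.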