Fix race condition in LuceneIndex GraphDirectory during store#628

Merged
fh-ms merged 2 commits into main from bugfix/lucene-index-synch on Mar 30, 2026

@fh-ms fh-ms commented Mar 30, 2026

This pull request refactors the initialization logic for the Lucene index in LuceneIndex.java, primarily to improve clarity and ensure thread safety when using a GraphDirectory. The main changes involve extracting logic for determining directory type and configuring merge schedulers to prevent concurrency issues.

Initialization and thread safety improvements:

  • Extracted the logic to detect if a GraphDirectory is used into a new usesGraphDirectory() method, improving readability and maintainability.
  • Updated createDirectory() to use usesGraphDirectory(), clarifying the choice between GraphDirectory and a custom directory creator.
  • In lazyInit(), when using a GraphDirectory, configured the IndexWriterConfig to use a SerialMergeScheduler instead of the default ConcurrentMergeScheduler, ensuring merges run on the caller's thread and preventing race conditions with GigaMap serialization.
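The scheduler choice described in the list above can be modeled without Lucene on the classpath. The following is a minimal sketch, not the code from this pull request: `Scheduler`, `SerialScheduler`, `BackgroundScheduler`, and `chooseScheduler` are hypothetical stand-ins for Lucene's merge schedulers and for the branch inside lazyInit(). It only illustrates that a serial scheduler runs the merge on the calling thread, while a concurrent one hands it to another thread.

```java
import java.util.concurrent.atomic.AtomicReference;

class MergeSchedulerSketch {
    /** Hypothetical stand-in for a merge scheduler; this is not Lucene's API. */
    interface Scheduler {
        void merge(Runnable mergeTask);
    }

    /** Runs the merge inline, on whichever thread asked for it. */
    static final class SerialScheduler implements Scheduler {
        @Override
        public void merge(Runnable mergeTask) {
            mergeTask.run();
        }
    }

    /** Runs the merge on a separate thread, as a concurrent scheduler would. */
    static final class BackgroundScheduler implements Scheduler {
        @Override
        public void merge(Runnable mergeTask) {
            Thread worker = new Thread(mergeTask, "background-merge");
            worker.start();
            try {
                worker.join(); // joined only to keep this demo deterministic
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    /** Assumed shape of the decision: serial merges only for the in-graph directory. */
    static Scheduler chooseScheduler(boolean usesGraphDirectory) {
        return usesGraphDirectory ? new SerialScheduler() : new BackgroundScheduler();
    }

    /** Returns the name of the thread that actually executed the merge task. */
    static String mergeThreadName(boolean usesGraphDirectory) {
        AtomicReference<String> name = new AtomicReference<>();
        chooseScheduler(usesGraphDirectory)
            .merge(() -> name.set(Thread.currentThread().getName()));
        return name.get();
    }

    public static void main(String[] args) {
        System.out.println("caller thread:    " + Thread.currentThread().getName());
        System.out.println("graph directory:  " + mergeThreadName(true));
        System.out.println("custom directory: " + mergeThreadName(false));
    }
}
```

In the real code the equivalent step would presumably go through Lucene's `IndexWriterConfig.setMergeScheduler(...)` with a `SerialMergeScheduler`; the sketch leaves that out so it stays runnable on its own.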

These changes help ensure that Lucene index merges are handled safely in environments where background thread concurrency could cause data corruption or race conditions.

…ization

Lucene's default ConcurrentMergeScheduler runs background merge threads that modify the GraphDirectory's fileEntries map concurrently with GigaMap#store, causing BinaryPersistenceException with inconsistent element counts. Use SerialMergeScheduler when GraphDirectory is active to ensure merges run on the caller's thread under the GigaMap lock.
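Why the calling thread matters can be shown with a plain Java monitor. This is an illustrative sketch under assumptions: the `lock` object is a hypothetical stand-in for whatever lock GigaMap holds during store, and `mergeHoldsLock` is an invented helper. A merge executed inline inherits the caller's lock ownership, whereas a merge on another thread does not own the lock and could therefore mutate shared state while serialization is in progress.

```java
class LockOwnershipSketch {
    // Hypothetical stand-in for the lock held while the map is being stored.
    private final Object lock = new Object();

    /**
     * Runs a "merge" while the caller holds the lock and reports whether the
     * thread executing the merge owns that lock.
     */
    boolean mergeHoldsLock(boolean serial) {
        boolean[] held = new boolean[1];
        Runnable merge = () -> held[0] = Thread.holdsLock(lock);

        synchronized (lock) {
            if (serial) {
                // Same thread as the caller, so the monitor is already owned.
                merge.run();
            } else {
                // Different thread: it never acquired the monitor.
                Thread worker = new Thread(merge, "background-merge");
                worker.start();
                try {
                    worker.join(); // join makes the write to held[0] visible here
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                }
            }
        }
        return held[0];
    }

    public static void main(String[] args) {
        LockOwnershipSketch sketch = new LockOwnershipSketch();
        System.out.println("serial merge holds lock:     " + sketch.mergeHoldsLock(true));
        System.out.println("background merge holds lock: " + sketch.mergeHoldsLock(false));
    }
}
```

The background case does not deadlock here because the worker only queries lock ownership and never tries to acquire the monitor.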
@fh-ms fh-ms added the bug and GigaMap labels Mar 30, 2026
@fh-ms fh-ms requested a review from Copilot March 30, 2026 14:45
@fh-ms fh-ms changed the title fix: race condition in LuceneIndex GraphDirectory during store serialization Fix race condition in LuceneIndex GraphDirectory during store Mar 30, 2026
@fh-ms fh-ms requested a review from zdenek-jonas March 30, 2026 14:46

Copilot AI left a comment

Pull request overview

Fixes a concurrency issue when using the in-graph Lucene GraphDirectory by preventing Lucene from mutating the persistent fileEntries map from background merge threads during GigaMap serialization.

Changes:

  • Refactors lazyInit() to build a reusable IndexWriterConfig and conditionally configure a SerialMergeScheduler.
  • Introduces usesGraphDirectory() and simplifies createDirectory() to centralize the “in-graph vs external directory” decision.

@fh-ms fh-ms merged commit 3753f00 into main Mar 30, 2026
13 checks passed
@fh-ms fh-ms deleted the bugfix/lucene-index-synch branch March 30, 2026 15:17
@fh-ms fh-ms added this to the 4.1.0 milestone Apr 16, 2026

Labels

bug (Something isn't working), GigaMap

3 participants