feat: use java virtual threads for virtual storage load on demand thread pool by default #19396
Conversation
Any benchmarks?
no, not really yet, we have the option to go back to the previous behavior if it is worse for any reason.
A comment about this in the code would be nice. Also: does running on a Java 25 runtime get this benefit now, or do we need to target Java 25 too?
capistrant
left a comment
I think the changes look good and I appreciate the emergency outlet to use the legacy path. one testing nit:
- Can we parameterize the tests that flex this code to use both the new default as well as the legacy thread model? Maybe overkill since the only diff is in the construction of the executor... which is why it feels nitty
FrankChen021
left a comment
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 0 |
| P2 | 1 |
| P3 | 0 |
| Total | 1 |
This is an automated review by Codex GPT-5
```java
if (config.isVirtualStorageUseVirtualThreads()) {
  log.info(
      "Using virtual storage mode with virtual threads - max concurrent on demand loads: [%d].",
      config.getVirtualStorageLoadThreads()
```
P2 Validate virtualStorageLoadThreads before using it as a semaphore limit
With virtualStorageUseVirtualThreads=true, setting druid.segmentCache.virtualStorageLoadThreads to 0 now creates a Semaphore with zero permits. Every on-demand load task then blocks forever in acquireUninterruptibly(), so queries wait until timeout instead of failing fast at startup. The old fixed-thread-pool path rejected 0 via Executors.newFixedThreadPool(0). Please validate virtualStorageLoadThreads > 0 for both modes, or otherwise preserve fail-fast behavior.
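A minimal sketch of the fail-fast guard the review is asking for (this is illustrative, not the actual Druid code; `LoadPermitsFactory` and `createLoadPermits` are hypothetical names): reject a non-positive limit at construction time instead of creating a zero-permit `Semaphore` that every load would block on forever.

```java
import java.util.concurrent.Semaphore;

public class LoadPermitsFactory {
  // Hypothetical fail-fast guard along the lines the review suggests: reject a
  // non-positive limit up front instead of building a zero-permit Semaphore
  // that would block every acquireUninterruptibly() call forever.
  public static Semaphore createLoadPermits(int virtualStorageLoadThreads) {
    if (virtualStorageLoadThreads <= 0) {
      throw new IllegalArgumentException(
          "virtualStorageLoadThreads must be > 0, got " + virtualStorageLoadThreads
      );
    }
    return new Semaphore(virtualStorageLoadThreads);
  }
}
```

This mirrors how the old fixed-pool path failed fast, since `Executors.newFixedThreadPool(0)` throws `IllegalArgumentException` at startup.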
I am unsure if it would have needed to target Java 25 to get the benefit if this PR instead was just making the part of the change of switching the pool to virtual but leaving the mount stuff using `synchronized`.
FrankChen021
left a comment
I have reviewed the code for correctness, edge cases, concurrency, and integration risks; no issues found.
This is an automated review by Codex GPT-5
Description
This PR switches the `SegmentLocalCacheManager` on-demand load executor (used in virtual-storage mode) from a fixed platform-thread pool to one virtual thread per task with a `Semaphore` for backpressure, and converts `SegmentCacheEntry` from `synchronized` to `ReentrantLock` so virtual threads park rather than pinning their carrier during the long-running mount path.
The on-demand load work is overwhelmingly socket wait against deep storage with a small CPU portion at the end (factorize/mmap setup). Virtual threads let us fan out hundreds of in-flight loads cheaply, with the semaphore providing a similar kind of backpressure to what the pool size used to give implicitly. This becomes especially relevant once partial loading lands and the pool starts handling many smaller per-internal-segment-file range-read calls instead of one big mount per segment.
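The executor-plus-semaphore pattern described above can be sketched as follows (a minimal illustration, not the actual Druid code; `OnDemandLoader`, `submitLoad`, and `loadAll` are made-up names):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class OnDemandLoader {
  // One virtual thread per submitted task (Java 21+); the Semaphore caps how
  // many loads actually run concurrently, replacing the implicit cap a fixed
  // platform-thread pool used to provide.
  private final ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor();
  private final Semaphore permits;

  public OnDemandLoader(int maxConcurrentLoads) {
    this.permits = new Semaphore(maxConcurrentLoads);
  }

  public void submitLoad(Runnable loadTask) {
    exec.submit(() -> {
      permits.acquireUninterruptibly();
      try {
        loadTask.run();
      } finally {
        permits.release();
      }
    });
  }

  // Convenience for demonstration: submit n no-op-friendly loads and wait for
  // all of them, returning the number that completed.
  public int loadAll(int n) {
    CountDownLatch done = new CountDownLatch(n);
    for (int i = 0; i < n; i++) {
      submitLoad(done::countDown);
    }
    try {
      done.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return n - (int) done.getCount();
  }
}
```

Spawning a virtual thread per task is cheap, so the semaphore rather than the thread count becomes the concurrency knob; blocked tasks park their virtual thread instead of occupying a pool worker.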
The `ReentrantLock` conversion in `SegmentCacheEntry` is what makes the virtual-threads switch actually pay off on Java 21. Without it, the entire `mount()` body runs inside `synchronized (this)`, pinning the carrier thread and effectively capping concurrent mounts at the carrier-pool size regardless of how many virtual threads are spawned. `ReentrantLock` parks the virtual thread properly. Once Druid's minimum is Java 24+, JEP 491 makes this conversion redundant, but it's a mechanical, minimum-risk change in the meantime.

changes:

* switch default `SegmentLocalCacheManager.virtualStorageLoadOnDemandExec` to virtual threads with a `Semaphore` for backpressure
* added `SegmentLoaderConfig.virtualStorageUseVirtualThreads` (`druid.segmentCache.virtualStorageUseVirtualThreads`) config that defaults to true, but allows opt-out via setting to false
* raise the `SegmentLoaderConfig.virtualStorageLoadThreads` default to `Math.max(32, 4 * cores)`, sized as ~4x lookahead per processing thread
* convert `SegmentCacheEntry` from `synchronized` to `ReentrantLock` so virtual threads park instead of pinning the carrier during mount
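The `synchronized`-to-`ReentrantLock` conversion can be sketched like this (illustrative only; `SegmentEntry` and its methods are simplified stand-ins for the real `SegmentCacheEntry`):

```java
import java.util.concurrent.locks.ReentrantLock;

public class SegmentEntry {
  // ReentrantLock instead of synchronized: on Java 21 a virtual thread that
  // blocks on lock() parks and frees its carrier, whereas blocking inside a
  // synchronized block pins the carrier thread for the whole mount.
  private final ReentrantLock lock = new ReentrantLock();
  private boolean mounted = false;

  // Before: public synchronized void mount() { ... }
  public void mount(Runnable slowIo) {
    lock.lock();
    try {
      if (!mounted) {
        slowIo.run(); // stands in for the deep-storage fetch / mmap setup
        mounted = true;
      }
    } finally {
      lock.unlock();
    }
  }

  public boolean isMounted() {
    lock.lock();
    try {
      return mounted;
    } finally {
      lock.unlock();
    }
  }
}
```

The lock/try/finally shape is a mechanical rewrite of the `synchronized` body, which is why the PR describes it as a minimum-risk change until JEP 491 (Java 24) removes the pinning problem for `synchronized` itself.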