-
Notifications
You must be signed in to change notification settings - Fork 1.5k
perf(sql): improve SQL latency by moving munmap() to a background job #6386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThis PR introduces asynchronous memory unmapping support to QuestDB's Cairo engine. It adds a new configuration property Changes
Sequence DiagramsequenceDiagram
participant Worker as AsyncMunmapJob
participant Cache as MmapCache
participant Queue as munmapTaskQueue
participant Consumer as MunmapTask Consumer
loop Periodic Execution
Worker->>Cache: asyncMunmap()
alt ASYNC_MUNMAP_ENABLED
Cache->>Queue: drain & process enqueued tasks
Queue->>Consumer: consume MunmapTask
Consumer->>Consumer: perform munmap() + memory accounting
Consumer-->>Cache: return unmapped count
else Direct Path
Cache->>Cache: fall back to direct munmap
end
Cache-->>Worker: return boolean (work done)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai please review |
|
@jerrinot: I'll review the changes in this PR for you. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/main/java/io/questdb/cairo/TableReader.java (1)
1352-1406: Update line 531 to handle the return value intentionally.The method signature change to return
booleannow requires explicit handling at all call sites. Line 531 ingoActiveAtTxn()discards the return value without explanation, while line 627 inreload()captures it asfastPathfor performance diagnostics. Either capture the return value for logging (matching thereload()pattern) or add a suppression comment explaining why it's intentionally ignored.
🧹 Nitpick comments (9)
core/src/main/java/io/questdb/std/MmapCache.java (2)
339-350: Queue-full fallback is correct; add telemetry to see pressure.The sync fallback when next() returns no slot is good. Consider incrementing a counter or log at a throttled rate to observe sustained queue pressure.
39-45: Make capacity configurable (optional).Hardcoded MUNMAP_QUEUE_CAPACITY (8k) may be too small/large depending on workload. Consider a config key (e.g., cairo.file.async.munmap.queue.capacity) with a sensible default.
core/src/main/java/io/questdb/cairo/TableReader.java (3)
614-636: Slow reload logging: consider configurable threshold.50ms is hardcoded. Expose via configuration (e.g., reader.slowReloadLogMs) to tune per deployment.
627-636: Slow reconcile logging: same threshold configurability applies.Apply the same configurable threshold to reconcile logs.
817-835: Per-column close timing: good; consider throttled logging.Logging only when >50ms is good. Ensure logger rate limiting is applied elsewhere to avoid bursts.
core/src/main/java/io/questdb/Bootstrap.java (1)
213-217: Flag wiring LGTM; consider surfacing in config report.Add an advisory log line in reportValidateConfig() to show async munmap state for operators.
@@ void reportValidateConfig() { - if (cairoConfig != null) { + if (cairoConfig != null) { log.advisoryW().$(" - attach partition suffix: ").$(cairoConfig.getAttachPartitionSuffix()).$(); log.advisoryW().$(" - open database [").$uuid(cairoConfig.getDatabaseIdLo(), cairoConfig.getDatabaseIdHi()).I$(); + log.advisoryW().$(" - async munmap: ").$(cairoConfig.getAsyncMunmapEnabled()).$();core/src/main/java/io/questdb/std/Files.java (2)
72-72: Consider volatile for the feature flag (optional).If toggled post-bootstrap (even accidentally), visibility across threads isn’t guaranteed. Marking volatile is a cheap safety net.
- public static boolean ASYNC_MUNMAP_ENABLED = false; + public static volatile boolean ASYNC_MUNMAP_ENABLED = false;
233-235: Avoid exposing the whole MmapCache; provide a narrow façade.Expose a convenience Files.asyncMunmapOnce() instead of returning the cache to reduce accidental misuse.
- public static MmapCache getMmapCache() { - return mmapCache; - } + public static MmapCache getMmapCache() { + return mmapCache; // keep for now if used elsewhere + } + + // Convenience façade for the background job + public static boolean asyncMunmapOnce() { + return mmapCache.asyncMunmap(); + }core/src/main/java/io/questdb/ServerMain.java (1)
299-303: Verify the pool selection for async munmap operations.The TODO comment raises a valid concern about using the shared query pool for munmap operations. During high query load, this pool may have significant contention, potentially defeating the purpose of offloading munmap to improve tail latency.
Consider whether:
- The shared query pool has sufficient capacity for this additional job
- A dedicated pool or the shared write pool would be more appropriate
- Munmap operations might interfere with query execution latency
You might want to run performance tests under high query load to validate this choice.
Minor: The TODO comment uses lowercase "todo:" instead of the conventional "TODO:" format.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
core/src/main/java/io/questdb/Bootstrap.java(1 hunks)core/src/main/java/io/questdb/PropServerConfiguration.java(3 hunks)core/src/main/java/io/questdb/PropertyKey.java(1 hunks)core/src/main/java/io/questdb/ServerMain.java(3 hunks)core/src/main/java/io/questdb/cairo/CairoConfiguration.java(1 hunks)core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java(1 hunks)core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java(1 hunks)core/src/main/java/io/questdb/cairo/TableReader.java(9 hunks)core/src/main/java/io/questdb/std/Files.java(2 hunks)core/src/main/java/io/questdb/std/MmapCache.java(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (SelfHosted Griffin And Fuzz tests on linux-x64-zfs)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (Check Changes Check changes)
- GitHub Check: build
- GitHub Check: build
🔇 Additional comments (14)
core/src/main/java/io/questdb/std/MmapCache.java (2)
330-337: Record pooling helper looks good.Reuses pooled records correctly and initializes via of(...).
48-53: Consumer scheduler confirmed—no action needed.Verification complete. AsyncMunmapJob exists (ServerMain.java:505), is scheduled to sharedPoolQuery when getAsyncMunmapEnabled() is true (line 299-302), and correctly calls asyncMunmap() (line 515). The configuration propagates through Bootstrap and Files. Sequences in MmapCache constructor are wired correctly as noted. Code is ready.
core/src/main/java/io/questdb/cairo/TableReader.java (3)
757-780: Deleted partition close: added timing/logs LGTM.Includes thread and duration; helpful for hotspots.
802-809: Partition close timing/logs LGTM.Clear visibility with thread and millis.
1727-1742: All call sites of reloadSlow(boolean) properly handled.Verification confirms:
- Line 615: return value captured and used for telemetry metrics
- Line 148: return value ignored during constructor initialization (acceptable)
core/src/main/java/io/questdb/cairo/CairoConfiguration.java (1)
129-130: New config accessor LGTM.API shape matches others.
core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java (1)
104-108: Delegation LGTM.Consistent with other getters.
core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java (1)
134-137: Default returns false — LGTM.Matches safe default behavior.
core/src/main/java/io/questdb/PropertyKey.java (1)
609-609: LGTM!The new configuration property follows established naming conventions and patterns. The property key is appropriately positioned among other Cairo file configuration properties.
core/src/main/java/io/questdb/ServerMain.java (2)
50-50: LGTM!The new imports are necessary for the
AsyncMunmapJobimplementation and are appropriately added.Also applies to: 52-52
505-517: LGTM!The
AsyncMunmapJobimplementation is clean and follows the established pattern for background jobs in this codebase:
- Properly extends
SynchronizedJobto serialize execution- Minimal, focused responsibility
- Consistent with other jobs like
EngineMaintenanceJobin terms of error handling approachcore/src/main/java/io/questdb/PropServerConfiguration.java (3)
149-149: LGTM: new async munmap flag fieldConsistent naming and placement; final + private is appropriate.
2982-2985: LGTM: getter exposed via CairoConfigurationGetter matches interface style and correctly returns the field.
1531-1531: API and property key plumbing verified—no issues found.All three
CairoConfigurationimplementors (PropCairoConfiguration,DefaultCairoConfiguration,CairoConfigurationWrapper) properly declare and implementgetAsyncMunmapEnabled(). The property keyCAIRO_FILE_ASYNC_MUNMAP_ENABLEDis defined inPropertyKey.javaand correctly read inPropServerConfiguration.java(line 1530) with a sensible default offalse. The value is properly wired throughBootstrap.java(line 214) toFiles.ASYNC_MUNMAP_ENABLEDand consumed byServerMain.java(line 299) to initializeAsyncMunmapJob. The configuration chain is complete and correctly integrated.
|
@coderabbitai review again please |
|
@jerrinot: I'll perform a full review of the changes in this PR again. ✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
core/src/main/java/io/questdb/std/Files.java (1)
233-235: Consider exposing specific operations rather than the cache instance.Returning the internal
mmapCachedirectly breaks encapsulation. Consider exposing only the specific operations needed by external components (e.g.,asyncMunmap(),unmap()) rather than the entire cache instance.core/src/main/java/io/questdb/std/MmapCache.java (1)
53-63: LGTM: ring buffer and sequences initialized correctly.Capacity is a power of two; MP→SC barrier chaining looks right; asyncMunmap() delegates to SC consumer.
Rename munmapProducesSequence → munmapProducerSequence for clarity (typo).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
core/src/main/java/io/questdb/Bootstrap.java(1 hunks)core/src/main/java/io/questdb/PropServerConfiguration.java(3 hunks)core/src/main/java/io/questdb/PropertyKey.java(1 hunks)core/src/main/java/io/questdb/ServerMain.java(4 hunks)core/src/main/java/io/questdb/cairo/CairoConfiguration.java(1 hunks)core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java(1 hunks)core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java(1 hunks)core/src/main/java/io/questdb/std/Files.java(2 hunks)core/src/main/java/io/questdb/std/MmapCache.java(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build
- GitHub Check: build
🔇 Additional comments (11)
core/src/main/java/io/questdb/cairo/CairoConfigurationWrapper.java (1)
104-107: LGTM!The delegation pattern is consistent with other methods in this wrapper class.
core/src/main/java/io/questdb/Bootstrap.java (1)
214-214: Initialization follows existing pattern.This initialization mirrors the existing
FS_CACHE_ENABLEDpattern on line 213, maintaining consistency. However, note the concerns raised in theFiles.javareview regarding mutable static state.core/src/main/java/io/questdb/cairo/CairoConfiguration.java (1)
129-129: LGTM!The interface method addition is straightforward and follows the existing contract pattern.
core/src/main/java/io/questdb/cairo/DefaultCairoConfiguration.java (1)
134-137: LGTM!The default value of
falseis conservative and safe, aligning with the feature opt-in approach.core/src/main/java/io/questdb/PropertyKey.java (1)
609-609: LGTM!The property key follows the established naming convention and is positioned appropriately in the enum.
core/src/main/java/io/questdb/PropServerConfiguration.java (3)
149-149: LGTM: config flag added.Field introduction is correct and immutable.
1531-1531: LGTM: property wiring.Reads CAIRO_FILE_ASYNC_MUNMAP_ENABLED with sane default (false).
Please confirm PropertyKey.CAIRO_FILE_ASYNC_MUNMAP_ENABLED is defined and exported in PropertyKey and that Bootstrap initializes Files.ASYNC_MUNMAP_ENABLED from it.
2982-2986: LGTM: API surfaced in Cairo configuration.Getter placement and visibility are consistent with other flags.
core/src/main/java/io/questdb/ServerMain.java (2)
191-197: LGTM: safe helper.Returns 0 when HTTP not started; avoids NPE.
504-516: Minor: guard against initialization races.If Files.getMmapCache() can return null early, this NPEs. If Files guarantees a non-null singleton here, ignore; otherwise add a null check or lazy retrieval in runSerially().
core/src/main/java/io/questdb/std/MmapCache.java (1)
350-372: Good non-blocking fallback; one note.When enqueue fails (queue full), falling back to synchronous munmap with exception parity is correct. This relies on the consumer actually running; see ServerMain fix to ensure the job is scheduled on an active pool.
Consider exporting a lightweight gauge for queue depth to spot sustained backlogs.
|
unrelated test failure, fixed by #6389 |
ideoma
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs more testing. Please make the new config option random-driven in WalWriterFuzzTest
|
@CodeRabbit review additional testing (and changes). Also other reviewers comments, i am concerned about assert Os.posix(). how's that passing on windows ? |
🧩 Analysis chain🏁 Script executed: Length of output: 22907 🏁 Script executed: Length of output: 34606 @bluestreak01: I'll review the additional testing and changes, and address your concern about the Analysis of
|
|
@jerrinot this is a user-facing change (perf), could you add PR description to allow this change to be documented in the release notes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
core/src/main/java/io/questdb/std/Files.java (1)
72-72: Make ASYNC_MUNMAP_ENABLED safely visible across threadsPublic mutable static without visibility guarantees risks stale reads. At minimum, mark it volatile.
- public static boolean ASYNC_MUNMAP_ENABLED = false; + public static volatile boolean ASYNC_MUNMAP_ENABLED = false;core/src/main/java/io/questdb/ServerMain.java (1)
500-512: Consider moving AsyncMunmapJob to a top‑level classKeeps ServerMain lean and aligns with other job classes. Functional change not required.
🧹 Nitpick comments (3)
core/src/test/java/io/questdb/test/tools/TestUtils.java (1)
2529-2529: Drain async munmap fully in LeakCheckEnsure the queue is empty before/after tests; loop until idle.
- Files.getMmapCache().asyncMunmap(); + while (Files.getMmapCache().asyncMunmap()) { /* drain */ } @@ - Files.getMmapCache().asyncMunmap(); + while (Files.getMmapCache().asyncMunmap()) { /* drain */ }Also applies to: 2569-2570
core/src/test/java/io/questdb/test/std/FilesCacheFuzzTest.java (1)
69-72: Test toggling of async munmap — minor ergonomics
- OS‑gated parameter matrix — good.
- Consider moving Files.ASYNC_MUNMAP_ENABLED assignment from constructor to @before and restore the previous value in @after to reduce inter‑test coupling if runner parallelism is enabled. Current @afterclass reset is fine if tests are strictly serial.
Also applies to: 74-84, 98-99
core/src/main/java/io/questdb/mp/WorkerPoolUtils.java (1)
50-60: Replace assert with runtime guard for OS checkAssertions may be disabled; make the constraint explicit to avoid accidental enablement on non‑POSIX.
- if (config.getAsyncMunmapEnabled()) { - assert Os.isPosix(); + if (config.getAsyncMunmapEnabled()) { + if (!Os.isPosix()) { + throw new UnsupportedOperationException("Async munmap is not supported on non-POSIX systems"); + } Files.ASYNC_MUNMAP_ENABLED = true; ServerMain.AsyncMunmapJob asyncMunmapJob = new ServerMain.AsyncMunmapJob(); pool.assign(asyncMunmapJob); } else { Files.ASYNC_MUNMAP_ENABLED = false; }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
core/src/main/java/io/questdb/PropServerConfiguration.java(3 hunks)core/src/main/java/io/questdb/ServerMain.java(4 hunks)core/src/main/java/io/questdb/mp/WorkerPoolUtils.java(2 hunks)core/src/main/java/io/questdb/std/Files.java(2 hunks)core/src/main/java/io/questdb/std/MmapCache.java(4 hunks)core/src/test/java/io/questdb/test/ServerMainTest.java(2 hunks)core/src/test/java/io/questdb/test/cairo/fuzz/AbstractFuzzTest.java(2 hunks)core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java(23 hunks)core/src/test/java/io/questdb/test/std/FilesCacheFuzzTest.java(3 hunks)core/src/test/java/io/questdb/test/tools/TestUtils.java(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- core/src/main/java/io/questdb/PropServerConfiguration.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (34)
- GitHub Check: New pull request (Coverage Report Coverage Report)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (17)
core/src/test/java/io/questdb/test/cairo/fuzz/AbstractFuzzTest.java (1)
198-198: Async munmap job wiring — LGTMJob is assigned before pool start; no side effects when disabled via config.
Also applies to: 215-215
core/src/test/java/io/questdb/test/ServerMainTest.java (2)
66-101: Async munmap bootstrap test — LGTMCovers POSIX vs Windows paths and resets static state in finally.
457-457: Parameters expectation update — LGTMAdding cairo.file.async.munmap.enabled with default false matches config exposure.
core/src/main/java/io/questdb/std/Files.java (1)
233-236: Singleton mmap cache accessor — LGTMGetter aligns usages and avoids multiple instances.
core/src/main/java/io/questdb/ServerMain.java (2)
191-196: Active connection count accessor — LGTMSafe default to 0 when HTTP server is not initialized.
299-299: Initialize async munmap before query jobs — LGTMOrdering ensures the consumer job is present prior to query workload.
core/src/test/java/io/questdb/test/cairo/fuzz/WalWriterFuzzTest.java (5)
28-28: LGTM: Clean addition of async munmap test infrastructure.The new imports and static field are properly scoped for testing the async munmap feature. Initializing
ASYNC_MUNMAPtofalseprovides a safe default.Also applies to: 35-35, 63-63
70-71: LGTM: Async munmap property correctly configured in test setup.The property is set before
AbstractCairoTest.setUpStatic(), ensuring the configuration takes effect during test infrastructure initialization.
75-79: LGTM: Proper test cleanup ensures isolation.Resetting
Files.ASYNC_MUNMAP_ENABLEDtofalsein the@AfterClassmethod prevents state leakage between test classes and ensures proper test isolation.
642-658: LGTM: Platform-aware test parameterization with proper state management.The method correctly guards async munmap enablement with
Os.isPosix()(line 644), preventing Windows compatibility issues. Conditional reinitialization (lines 648-657) efficiently applies configuration changes only when parameters differ, improving test performance while maintaining correctness.
102-102: LGTM: Comprehensive test coverage of async munmap across fuzz test scenarios.All parameterized tests consistently call
setTestParams(rnd), ensuring the async munmap feature is exercised across diverse test scenarios with randomized configurations.Also applies to: 162-162, 174-174, 185-185, 214-214, 266-266, 294-294, 304-304, 332-332, 362-362, 390-390, 400-400, 410-410, 447-447, 457-457, 468-468, 506-506, 536-536, 562-562, 617-617
core/src/main/java/io/questdb/std/MmapCache.java (6)
39-59: LGTM: Clean singleton pattern with properly initialized async munmap infrastructure.The singleton pattern is appropriate for a shared cache. The ring queue with 8K capacity and properly chained producer/consumer sequences provides a solid foundation for async munmap operations. The concurrency primitives used (MPSequence, SCSequence, RingQueue) are platform-independent.
61-69: LGTM: Clear async munmap processing with documented threading constraints.The method correctly drains all pending munmap tasks using
consumeAll. The documentation clearly states the single-threaded requirement, which is enforced by theSCSequence(single-consumer) design. The boolean return value enables callers to optimize scheduling based on whether work was performed.
336-348: LGTM: Proper async munmap error handling with errno logging.The consumer correctly logs errno on failure (line 341, 345), addressing previous review feedback. Memory accounting is only updated on success (lines 338-340), which is correct—a failed munmap means the mapping remains active. The critical log level appropriately signals silent failures in the background job.
359-388: LGTM: Well-designed async/sync munmap flow with appropriate fallbacks.The implementation properly handles multiple scenarios:
Async path (lines 360-378):
- CAS retry loop (lines 364-366) with
Os.pause()prevents tight spinning while handling transient contention- Successful enqueue offloads munmap to background thread (lines 368-374)
- Queue-full fallback to synchronous munmap (lines 375-377) ensures correctness under extreme load at the cost of latency benefit
Sync path (lines 379-388):
- Throws exceptions on failure, providing immediate feedback to callers
- Memory accounting updated only on success, consistent with async path
The asymmetry in error handling (async logs, sync throws) is appropriate given that async operations cannot propagate exceptions back to the original caller.
420-424: LGTM: Clean data holder for munmap tasks.Simple, focused data structure containing exactly the fields needed for munmap operations (address, size) and memory accounting (memoryTag).
360-360: Platform-specific initialization is correctly implemented — no fixes required.The configuration layer (PropServerConfiguration:1530-1533) validates the async munmap setting and throws a ServerConfigurationException if enabled on Windows. The setup layer (WorkerPoolUtils:52-54) includes an assertion
assert Os.isPosix()before enabling Files.ASYNC_MUNMAP_ENABLED. Tests properly guard async munmap on POSIX systems (WalWriterFuzzTest:644). Windows compatibility is well-protected.
[PR Coverage check]😍 pass : 60 / 73 (82.19%) file detail
|
Users running latency-sensitive workloads will experience more consistent query response times, particularly at p99 and above.
Previously, memory unmapping operations could cause unpredictable latency spikes when they occurred during query execution. This change moves these expensive unmapping operations to a background thread, eliminating their impact on query latency. The result is smoother, more predictable performance - especially important for applications with strict SLA requirements or real-time analytics dashboards.
This change is initially an opt-in, to enable it add the following to your server configuration:
cairo.file.async.munmap.enabled=trueBefore:
After:
The numbers above are from a test with ParallelGC. If predictable latency is more important than raw throughput then switching to ZGC has a positive effect too.