Share a single pooled ByteBuf allocator across Netty query transports instead of one per server channel #18905
Merged
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #18905 +/- ##
=========================================
Coverage 64.79% 64.80%
- Complexity 1318 1347 +29
=========================================
Files 3393 3392 -1
Lines 211566 211637 +71
Branches 33285 33295 +10
=========================================
+ Hits 137091 137154 +63
- Misses 63398 63423 +25
+ Partials 11077 11060 -17
gortiz
approved these changes
Jul 2, 2026
gortiz
left a comment
Contributor
We had issues with this configuration multiple times. I thought it was already fixed! Sad to find I was wrong, but thanks for fixing it once and for all!
Problem
#15335 introduced `PooledByteBufAllocatorWithLimits` to reduce the number of direct arenas in the Netty pooled allocators used by the broker <-> server query transport. However, the `ServerChannels.ServerChannel` constructor calls `PooledByteBufAllocatorWithLimits.getBufferAllocatorWithLimits(...)`, which creates a brand-new `PooledByteBufAllocator` on every call, and one `ServerChannel` exists per `ServerRoutingInstance` (per server, per table type). Prior to #15335, all server channels shared the single `PooledByteBufAllocator.DEFAULT`.

A broker connected to N servers therefore creates N independent pooled allocators, each with up to `min(2 * cores, maxDirectMemory / chunkSize / 5)` direct arenas:

- Brokers can hit `io.netty.util.internal.OutOfDirectMemoryError` at the `-XX:MaxDirectMemorySize` cap this way: long after the original response spike, even small request writes (`LengthFieldPrepender.encode` -> `PoolArena.newChunk`) fail intermittently because most of the direct memory budget is pinned across dozens of mostly-idle per-channel pools, and only a broker restart (which throws away the allocators) recovers. Related recovery-path fixes: #17684 (Fix DirectOOMHandler string matching bug that missed Netty direct memory OOM) and #17861 (Fix broker write failure handling in ServerChannels).
- The `NETTY_POOLED_*` broker gauges are wrong. Every new `ServerChannel` re-registers `BrokerGauge.NETTY_POOLED_USED_DIRECT_MEMORY` (and the other pooled-allocator gauges) against its own allocator's metric, overwriting the previous registration, so the gauge only ever reports the last-created allocator's usage and hides the true total. This made the incidents above much harder to diagnose.

The server side has a milder variant of the same problem: each `QueryServer` (plaintext and TLS listeners) creates its own limited allocator in `start()`.

Fix

- `PooledByteBufAllocatorWithLimits.getSharedBufferAllocatorWithLimits()` lazily creates a single process-wide arena-limited allocator; the per-call factory is now private, so the shared instance is the only way to obtain one.
- `ServerChannels` obtains it once in its constructor, uses it for every `ServerChannel` bootstrap, and registers the `NETTY_POOLED_*` gauges once against that shared allocator's metric (the gauges now also appear at broker startup instead of on the first query).
- `QueryServer` uses the same shared allocator, so servers running plaintext + TLS listeners (and combined-role JVMs like quickstarts) share one pool. This restores the pre-#15335 sharing behavior while keeping the reduced arena count, which now applies once per process instead of once per connection.
- `io.netty.allocator.numDirectArenas=0` is still honored, and the chosen sizing is logged at creation for diagnosability.

This only affects the unshaded Netty transports; the gRPC transports use shaded Netty classes with their own per-component (not per-connection) allocators and are unaffected.
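To make the arena arithmetic and the sharing pattern concrete, here is a minimal sketch in plain Java. It does not use Netty or the real Pinot classes: `SketchAllocator`, `SharedAllocatorSketch`, the 8-core / 1 GiB / 4 MiB sizing, and the 40-server count are all hypothetical stand-ins, and the lazy holder is just one common way to get a process-wide instance, not necessarily how the PR implements it.

```java
/** Stand-in for a pooled allocator. This is NOT Netty's PooledByteBufAllocator. */
final class SketchAllocator {
    final int directArenas;

    SketchAllocator(int directArenas) {
        this.directArenas = directArenas;
    }
}

final class SharedAllocatorSketch {
    // Assumed sizing for illustration only.
    static final int CORES = 8;
    static final long MAX_DIRECT_MEMORY = 1L << 30; // 1 GiB
    static final long CHUNK_SIZE = 4L << 20;        // 4 MiB

    private SharedAllocatorSketch() {
    }

    /** Arena cap from the description: min(2 * cores, maxDirectMemory / chunkSize / 5). */
    static int arenaLimit(int cores, long maxDirectMemory, long chunkSize) {
        return (int) Math.min(2L * cores, maxDirectMemory / chunkSize / 5);
    }

    /** Per-call factory: every call builds a new pool. Private, so callers must use shared(). */
    private static SketchAllocator newAllocator() {
        return new SketchAllocator(arenaLimit(CORES, MAX_DIRECT_MEMORY, CHUNK_SIZE));
    }

    /** Initialization-on-demand holder: the JVM initializes INSTANCE lazily and thread-safely. */
    private static final class Holder {
        static final SketchAllocator INSTANCE = newAllocator();
    }

    /** The single process-wide allocator. */
    static SketchAllocator shared() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        int servers = 40; // hypothetical number of server channels on one broker

        // Old behavior: one allocator per channel, so the arena cap multiplies by N.
        int perChannelArenas = 0;
        for (int i = 0; i < servers; i++) {
            perChannelArenas += newAllocator().directArenas;
        }
        System.out.println("per-channel arenas: " + perChannelArenas); // 640

        // New behavior: every channel gets the same instance, so the cap applies once.
        System.out.println("shared arenas: " + shared().directArenas); // 16
    }
}
```

Under these assumed numbers the cap is `min(16, 51) = 16` arenas per allocator, so 40 per-channel allocators could own up to 640 arenas while the shared allocator owns at most 16.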
Behavior notes for operators
- `BrokerGauge.NETTY_POOLED_*` values will step up after this change: they previously reported a single per-channel allocator (the most recently created one) and now report the whole shared pool. Alerts calibrated against the old values should be re-baselined; the new values are the accurate ones.
- In a combined-role JVM, `BrokerGauge.NETTY_POOLED_*` and `ServerGauge.NETTY_POOLED_*` now report the same underlying allocator, so summing them across roles double-counts.

Testing

- `ServerChannelsTest.testChannelsShareBufferAllocator` asserts that all server channels, including ones created by different `ServerChannels` instances (e.g. the TLS one), are bootstrapped with the same allocator instance, and that it is the shared allocator.
- `QueryServerTest` now asserts that both the server socket channel and the accepted child channels (which allocate the request/response buffers) are configured with the shared allocator.
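The gauge behavior described in the operator notes can be illustrated with a small, self-contained sketch. The `Registry` class, the gauge name, and the byte counts below are hypothetical and only model "re-registering a name replaces the previous supplier"; they are not Pinot's metrics API. The shared pool's usage is modeled as a plain sum of the per-channel figures, which is a simplification.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

final class GaugeOverwriteSketch {
    /** Hypothetical registry: registering an existing name replaces the old supplier. */
    static final class Registry {
        private final Map<String, LongSupplier> gauges = new HashMap<>();

        void register(String name, LongSupplier supplier) {
            gauges.put(name, supplier);
        }

        long read(String name) {
            return gauges.get(name).getAsLong();
        }
    }

    static final String GAUGE = "nettyPooledUsedDirectMemory"; // illustrative name

    /** Old behavior: each channel re-registers the gauge against its own allocator. */
    static long lastRegisteredWins(long[] usedPerAllocator) {
        Registry registry = new Registry();
        for (long used : usedPerAllocator) {
            registry.register(GAUGE, () -> used); // overwrites the previous registration
        }
        return registry.read(GAUGE);
    }

    /** New behavior: the gauge is registered once against the one shared pool. */
    static long sharedPool(long[] usedPerAllocator) {
        long total = 0;
        for (long used : usedPerAllocator) {
            total += used;
        }
        final long sharedUsed = total;
        Registry registry = new Registry();
        registry.register(GAUGE, () -> sharedUsed);
        return registry.read(GAUGE);
    }

    public static void main(String[] args) {
        long[] usedMiB = {64, 8, 2}; // hypothetical per-channel usage
        System.out.println("old gauge: " + lastRegisteredWins(usedMiB)); // 2
        System.out.println("new gauge: " + sharedPool(usedMiB));         // 74
    }
}
```

In this toy case the old gauge shows only the last allocator's 2 MiB while 74 MiB is actually in use, which is why the reported values step up once the gauge tracks the shared pool.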