Share a single pooled ByteBuf allocator across Netty query transports instead of one per server channel by yashmayya · Pull Request #18905 · apache/pinot

yashmayya · 2026-07-01T22:51:55Z

Problem

#15335 introduced PooledByteBufAllocatorWithLimits to reduce the number of direct arenas in the Netty pooled allocators used by the broker <-> server query transport. However, ServerChannels.ServerChannel's constructor calls PooledByteBufAllocatorWithLimits.getBufferAllocatorWithLimits(...), which creates a brand-new PooledByteBufAllocator on every call — and one ServerChannel exists per ServerRoutingInstance (per server, per table type). Prior to #15335, all server channels shared the single PooledByteBufAllocator.DEFAULT.

A broker connected to N servers therefore creates N independent pooled allocators, each with up to min(2 * cores, maxDirectMemory / chunkSize / 5) direct arenas:

Direct memory retention is amplified up to N times. Netty pooled arenas retain chunk memory after the buffers allocated from them are released, and free space in one allocator's pool can never serve another allocator's allocations. Large scatter-gather responses spike several channels' pools, each pool's retained chunk memory ratchets up independently, and the broker's total pinned direct memory grows toward N times the intended per-process footprint. We have seen production brokers repeatedly hit io.netty.util.internal.OutOfDirectMemoryError at the -XX:MaxDirectMemorySize cap this way: long after the original response spike, even small request writes (LengthFieldPrepender.encode -> PoolArena.newChunk) fail intermittently because most of the direct memory budget is pinned across dozens of mostly-idle per-channel pools, and only a broker restart (which throws away the allocators) recovers. Related recovery-path fixes: Fix DirectOOMHandler string matching bug that missed Netty direct memory OOM #17684, Fix broker write failure handling in ServerChannels #17861.
The NETTY_POOLED_* broker gauges are wrong. Every new ServerChannel re-registers BrokerGauge.NETTY_POOLED_USED_DIRECT_MEMORY (and the other pooled-allocator gauges) against its own allocator's metric, overwriting the previous registration, so the gauge only ever reports the last-created allocator's usage and hides the true total. This made the incidents above much harder to diagnose.

The server side has a milder variant of the same problem: each QueryServer (plaintext and TLS listeners) creates its own limited allocator in start().

Fix

Add PooledByteBufAllocatorWithLimits.getSharedBufferAllocatorWithLimits(), which lazily creates a single process-wide arena-limited allocator; the per-call factory is now private so the shared instance is the only way to obtain one.
ServerChannels obtains it once in its constructor, uses it for every ServerChannel bootstrap, and registers the NETTY_POOLED_* gauges once against that shared allocator's metric (the gauges now also appear at broker startup instead of on the first query).
QueryServer uses the same shared allocator, so servers running plaintext + TLS listeners (and combined-role JVMs like quickstarts) share one pool, restoring the pre-Fix Direct Memory OOM on Server #15335 sharing behavior while keeping the reduced arena count — which now applies once per process instead of once per connection.
Since the allocator is now created once and lives for the process lifetime, the computed arena-count default is floored at 1 so that a depleted direct-memory snapshot at creation time cannot permanently disable pooling (an explicit io.netty.allocator.numDirectArenas=0 is still honored), and the chosen sizing is logged at creation for diagnosability.

This only affects the unshaded Netty transports; the gRPC transports use shaded Netty classes with their own per-component (not per-connection) allocators and are unaffected.

Behavior notes for operators

BrokerGauge.NETTY_POOLED_* values will step up after this change: they previously reported a single (the most recently created) per-channel allocator and now report the whole shared pool. Alerts calibrated against the old values should be re-baselined — the new values are the accurate ones.
In combined-role JVMs (e.g. quickstarts), BrokerGauge.NETTY_POOLED_* and ServerGauge.NETTY_POOLED_* now report the same underlying allocator, so summing them across roles double-counts.

Testing

New ServerChannelsTest.testChannelsShareBufferAllocator asserts that all server channels — including ones created by different ServerChannels instances (e.g. the TLS one) — are bootstrapped with the same allocator instance, and that it is the shared allocator.
QueryServerTest now asserts both the server socket channel and the accepted child channels (which allocate the request/response buffers) are configured with the shared allocator.

… instead of one per server channel

codecov-commenter · 2026-07-01T23:45:57Z

Codecov Report

❌ Patch coverage is 96.42857% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 64.80%. Comparing base (3718bc3) to head (b136b4d).
⚠️ Report is 9 commits behind head on master.

Files with missing lines	Patch %	Lines
...re/transport/PooledByteBufAllocatorWithLimits.java	92.30%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff            @@
##             master   #18905   +/-   ##
=========================================
  Coverage     64.79%   64.80%           
- Complexity     1318     1347   +29     
=========================================
  Files          3393     3392    -1     
  Lines        211566   211637   +71     
  Branches      33285    33295   +10     
=========================================
+ Hits         137091   137154   +63     
- Misses        63398    63423   +25     
+ Partials      11077    11060   -17

Flag	Coverage Δ
custom-integration1	`100.00% <ø> (ø)`
integration	`100.00% <ø> (ø)`
integration1	`100.00% <ø> (ø)`
integration2	`0.00% <ø> (ø)`
java-21	`64.80% <96.42%> (+<0.01%)`	⬆️
temurin	`64.80% <96.42%> (+<0.01%)`	⬆️
unittests	`64.80% <96.42%> (+<0.01%)`	⬆️
unittests1	`56.99% <96.42%> (+0.01%)`	⬆️
unittests2	`37.15% <92.85%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

gortiz

We had issues with this configuration multiple times. I thought it was already fixed! Sad to find I was wrong, but thanks for fixing it once and for all!

Share a single pooled ByteBuf allocator across Netty query transports…

b136b4d

… instead of one per server channel

yashmayya added enhancement Improvement to existing functionality memory Related to memory usage or optimization labels Jul 1, 2026

yashmayya requested review from Jackie-Jiang, gortiz and xiangfu0 July 1, 2026 22:54

gortiz approved these changes Jul 2, 2026

View reviewed changes

yashmayya merged commit 545210b into apache:master Jul 2, 2026
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Share a single pooled ByteBuf allocator across Netty query transports instead of one per server channel#18905

Share a single pooled ByteBuf allocator across Netty query transports instead of one per server channel#18905
yashmayya merged 1 commit into
apache:masterfrom
yashmayya:netty-shared-pooled-allocator

yashmayya commented Jul 1, 2026

Uh oh!

codecov-commenter commented Jul 1, 2026 •

edited

Loading

Uh oh!

gortiz left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

yashmayya commented Jul 1, 2026

Problem

Fix

Behavior notes for operators

Testing

Uh oh!

codecov-commenter commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

gortiz left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov-commenter commented Jul 1, 2026 •

edited

Loading