[Netty 5] Default PooledByteBufAllocator configuration #8536

Closed
normanmaurer opened this issue Nov 13, 2018 · 3 comments · Fixed by #12108
Comments

@normanmaurer
Member

By default, each arena allocates 16 MB per chunk, and we create two arenas per core for direct buffers and another two per core for heap buffers. That can add up to a large amount of memory for applications that may not need it. We should consider reducing the default values to shrink the initial memory footprint; applications that need more chunks can still tune the allocator.
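For reference, the defaults can already be overridden per allocator; the commits below reduce them globally. A minimal sketch, assuming Netty 4.1's public PooledByteBufAllocator constructor and metrics API, with illustrative numbers rather than recommended values:

```java
// Illustrative tuning only: shrink the pooled allocator's up-front footprint
// with fewer arenas and smaller chunks. chunkSize = pageSize << maxOrder.
import io.netty.buffer.PooledByteBufAllocator;

public final class SmallFootprintAllocator {
    public static void main(String[] args) {
        PooledByteBufAllocator alloc = new PooledByteBufAllocator(
                true,  // preferDirect
                2,     // nHeapArena   (default is two per core)
                2,     // nDirectArena (default is two per core)
                8192,  // pageSize
                9);    // maxOrder: 8192 << 9 = 4 MiB chunks instead of 16 MiB
        System.out.println("chunk size:    " + alloc.metric().chunkSize() + " bytes");
        System.out.println("direct arenas: " + alloc.metric().numDirectArenas());
    }
}
```

The same knobs are exposed as io.netty.allocator.* system properties (numHeapArenas, numDirectArenas, pageSize, maxOrder), which is what the defaults discussed here feed into.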

normanmaurer added this to To do in Netty 5 via automation Nov 13, 2018
@vkostyukov
Contributor

👍

We override these in Finagle for the very same reason.

normanmaurer added a commit that referenced this issue Apr 1, 2019
Motivation:

We currently use a thread-local cache for all threads, which often surprises users: it may result in a lot of memory usage if they allocate buffers from outside the EventLoop on different threads. We should not do this by default, to keep surprises to a minimum. Users who need the performance and know what they are doing can still change this.

Modifications:

Change io.netty.allocator.useCacheForAllThreads to false by default

Result:

Related to #8536.
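As a hedged aside on the knob this commit flips: threads outside the EventLoop can opt back into per-thread caches via the same system property, set before PooledByteBufAllocator is first loaded. A small sketch, assuming Netty 4.1's defaultUseCacheForAllThreads() accessor:

```java
// Sketch: re-enable per-thread caches for all threads by setting the property
// on the command line before Netty's allocator class is initialized, e.g.
//   java -Dio.netty.allocator.useCacheForAllThreads=true ...
import io.netty.buffer.PooledByteBufAllocator;

public final class CacheOptIn {
    public static void main(String[] args) {
        // Report what this JVM resolved the default to.
        System.out.println("useCacheForAllThreads: "
                + PooledByteBufAllocator.defaultUseCacheForAllThreads());
    }
}
```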
normanmaurer added a commit that referenced this issue Apr 1, 2019
…8991)
@normanmaurer
Member Author

@chrisvest want to have a look?

@chrisvest
Contributor

@normanmaurer Yeah, I'll make a note to look at this soon.

chrisvest added a commit to chrisvest/netty that referenced this issue Feb 25, 2022
… MiB

Motivation:
By default we allocate 2 arenas per core, and each arena that is put to use will allocate a chunk.
If an application doesn't need a lot of memory, certainly not in proportion to the number of cores on the system, this takes up more memory than necessary, since each chunk is 16 MiB.
By reducing the chunk size to 4 MiB, we cut the minimum memory usage considerably in those cases where not much is needed.
The drawback is that we risk allocating more huge buffers, but this is a fair trade-off since Netty's use cases mostly involve very small buffers.

Modification:
Reduce the default max order from 11 to 9.
Also make similar configuration changes in PooledByteBufAllocatorTest, to reduce the memory usage during testing.

Result:
Netty now uses less memory when less memory is needed by the application.
This fixes netty#8536
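For clarity, a sketch of the arithmetic behind the 16 MiB to 4 MiB change, assuming the default 8 KiB page size (chunkSize = pageSize << maxOrder):

```java
// Chunk-size arithmetic: lowering maxOrder from 11 to 9 shrinks each chunk 4x.
public final class ChunkSizeMath {
    public static void main(String[] args) {
        int pageSize = 8192; // Netty's default page size
        System.out.println("maxOrder 11: " + ((pageSize << 11) >> 20) + " MiB"); // 16
        System.out.println("maxOrder  9: " + ((pageSize << 9) >> 20) + " MiB");  // 4
    }
}
```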
Netty 5 automation moved this from To do to Done Feb 25, 2022
chrisvest added a commit that referenced this issue Feb 25, 2022
… MiB (#12108)
chrisvest pushed a commit to chrisvest/netty that referenced this issue Feb 28, 2022
…etty#8991)
normanmaurer added a commit that referenced this issue Mar 3, 2022
…8991) (#12109)
raidyue pushed a commit to raidyue/netty that referenced this issue Jul 8, 2022
… MiB (netty#12108)
raidyue pushed a commit to raidyue/netty that referenced this issue Jul 8, 2022
…etty#8991) (netty#12109)
franz1981 pushed a commit to franz1981/netty that referenced this issue Aug 22, 2022
… MiB (netty#12108)
franz1981 pushed a commit to franz1981/netty that referenced this issue Aug 22, 2022
…etty#8991) (netty#12109)