Enable PCBC completionObjects autoShrink to reduce memory usage and gc #3913

Merged

Member

@wenbingshen wenbingshen commented Apr 11, 2023

Motivation

The completionObjects map in PerChannelBookieClient occupies a lot of heap space that cannot be reclaimed.
The figures below show a ConcurrentOpenHashMap table whose size is 0 while its internal array still has length 16384, which is a memory overhead of 65552 bytes.
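![image](https://user-images.githubusercontent.com/35599757/231114802-db90c49b-d295-46d7-b7db-785035b341f0.png)

![image](https://user-images.githubusercontent.com/35599757/231113930-bd9f3f54-9052-4c0b-9a3f-2fc493632e35.png)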

The default DefaultConcurrencyLevel of ConcurrentOpenHashMap is 16. We have hundreds of bookie nodes, and because the client writes to bookies in a round-robin fashion over long-lived connections, these maps stay alive. As a result, about 65552 * 16 * 1776 = 1.74GB of memory cannot be reclaimed, even though every one of these tables has size=0 (the topics owned by this broker had already moved to other brokers because of Full GC).
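![image](https://user-images.githubusercontent.com/35599757/231117087-08c80320-fa71-49c2-a199-cfee3d83ddc5.png)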

When the throughput of the Pulsar cluster increases and the bookie cluster expands, this memory usage also increases. Combined with the other memory inefficiencies in Pulsar that we already know about, this causes the Pulsar broker to run Full GC continuously.

Changes

I think enabling autoShrink on completionObjects can reduce this memory usage and the frequency of Full GC.
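
To illustrate the general idea behind autoShrink, here is a minimal, self-contained Java sketch. It is not BookKeeper's ConcurrentOpenHashMap: `ShrinkableLongSet` is a hypothetical toy open-addressing set, and its thresholds (grow above 50% load, shrink below 12.5% load, minimum capacity 16) are assumptions chosen only for the demonstration.

```java
/** Toy open-addressing set of positive longs. Hypothetical; NOT BookKeeper's ConcurrentOpenHashMap. */
final class ShrinkableLongSet {
    private static final int MIN_CAPACITY = 16;    // assumed floor for the table length
    private final boolean autoShrink;
    private long[] table = new long[MIN_CAPACITY]; // 0 marks an empty slot
    private int size;

    ShrinkableLongSet(boolean autoShrink) { this.autoShrink = autoShrink; }

    int size() { return size; }
    int capacity() { return table.length; }

    void add(long key) {
        if (key <= 0) throw new IllegalArgumentException("key must be positive");
        // Keep the load factor at or below 50% so probing always finds an empty slot.
        if ((size + 1) * 2 > table.length) resize(table.length * 2);
        insert(key);
    }

    boolean remove(long key) {
        int mask = table.length - 1;
        int i = slot(key);
        while (table[i] != 0 && table[i] != key) i = (i + 1) & mask;
        if (table[i] == 0) return false;
        table[i] = 0;
        size--;
        // Re-insert the rest of the probe cluster so later lookups still find those keys.
        for (int j = (i + 1) & mask; table[j] != 0; j = (j + 1) & mask) {
            long moved = table[j];
            table[j] = 0;
            size--;
            insert(moved);
        }
        // The autoShrink step: halve the table once it is mostly empty.
        if (autoShrink && table.length > MIN_CAPACITY && size * 8 < table.length) {
            resize(Math.max(MIN_CAPACITY, table.length / 2));
        }
        return true;
    }

    private void insert(long key) {
        int mask = table.length - 1;
        int i = slot(key);
        while (table[i] != 0) {
            if (table[i] == key) return; // already present
            i = (i + 1) & mask;
        }
        table[i] = key;
        size++;
    }

    private int slot(long key) { return Long.hashCode(key) & (table.length - 1); }

    private void resize(int newCapacity) {
        long[] old = table;
        table = new long[newCapacity];
        size = 0;
        for (long key : old) if (key != 0) insert(key);
    }
}

final class AutoShrinkDemo {
    public static void main(String[] args) {
        ShrinkableLongSet fixed = new ShrinkableLongSet(false);
        ShrinkableLongSet shrinking = new ShrinkableLongSet(true);
        for (long k = 1; k <= 10_000; k++) { fixed.add(k); shrinking.add(k); }
        for (long k = 10_000; k >= 1; k--) { fixed.remove(k); shrinking.remove(k); }
        System.out.println("without autoShrink: size=" + fixed.size() + " capacity=" + fixed.capacity());
        System.out.println("with autoShrink:    size=" + shrinking.size() + " capacity=" + shrinking.capacity());
    }
}
```

After a burst of entries is added and then removed, the set without shrinking keeps its grown table even though it is empty, while the shrinking set returns to its minimum capacity. That is the same effect this change aims for on completionObjects.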

@horizonzy
Member

The default DefaultConcurrencyLevel of ConcurrentOpenHashMap is 16. We have hundreds of bookie nodes, and because the client writes to bookies in a round-robin fashion over long-lived connections, these maps stay alive. As a result, about 65552 * 16 * 1776 = 1.74GB of memory cannot be reclaimed, even though every one of these tables has size=0 (the topics owned by this broker had already moved to other brokers because of Full GC).

I have a question about this. The memory is not occupied by the array itself. The keys (CompletionKey) and the values (CompletionValue) are what occupy it. As soon as a key and its value are removed, memory usage decreases.
Making the array autoShrink won't help GC.

@wenbingshen
Member Author

wenbingshen commented Apr 13, 2023

I have a question about this. The memory is not occupied by the array itself. The keys (CompletionKey) and the values (CompletionValue) are what occupy it. As soon as a key and its value are removed, memory usage decreases. Making the array autoShrink won't help GC.

@horizonzy With pointer compression enabled, the in-memory layout of an object array is:
array object header (8-byte mark word + 4-byte class pointer + 4-byte length field) + length * 4 (bytes per reference)

We have an object array of length 16384, so the space it occupies in memory is 8 + 4 + 4 + 4 * 16384 = 65552 bytes.
In our Pulsar broker, such arrays occupy about 65552 * 16 * 1776 = 1.74GB.
image

If we turn on autoShrink, the array shrinks to length 24 when size=0.
The space it occupies in memory is then 8 + 4 + 4 + 4 * 24 = 112 bytes.
In our Pulsar broker, such arrays would occupy about 112 * 16 * 1776 = 3108KB.
image

This way we can reclaim a lot of space in memory.
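
The arithmetic above can be reproduced with a short Java sketch. The layout constants assume a 64-bit HotSpot JVM with compressed pointers, as described in this comment; the class and method names are hypothetical, and the multipliers 16 and 1776 are taken directly from the figures quoted here.

```java
/** Hypothetical helper that reproduces the footprint arithmetic from this comment. */
final class ArrayFootprint {
    // Assumed layout: 64-bit HotSpot with compressed pointers.
    static final int MARK_WORD = 8;      // bytes
    static final int CLASS_POINTER = 4;  // bytes, compressed
    static final int LENGTH_FIELD = 4;   // bytes
    static final int REFERENCE = 4;      // bytes per element, compressed

    /** Bytes occupied by one object array of the given length. */
    static long arrayBytes(int length) {
        return MARK_WORD + CLASS_POINTER + LENGTH_FIELD + (long) REFERENCE * length;
    }

    /** Bytes occupied by such arrays across all sections and all instances. */
    static long totalBytes(int length, int sections, int instances) {
        return arrayBytes(length) * sections * instances;
    }

    public static void main(String[] args) {
        int sections = 16;     // DefaultConcurrencyLevel
        int instances = 1776;  // multiplier quoted in the comment
        long before = totalBytes(16384, sections, instances);
        long after = totalBytes(24, sections, instances);
        System.out.printf("length 16384: %d bytes per array, %d bytes total%n", arrayBytes(16384), before);
        System.out.printf("length 24:    %d bytes per array, %d bytes total (%d KB)%n",
                arrayBytes(24), after, after / 1024);
    }
}
```

With these assumptions the empty tables account for 1,862,725,632 bytes (roughly 1.7 GiB) before shrinking and 3,182,592 bytes (3108 KB) after.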

@horizonzy
Member

In our Pulsar broker, such arrays occupy about 65552 * 16 * 1776 = 1.74GB.

Why multiply by 16?

@wenbingshen
Member Author

Why multiply by 16?

@horizonzy The default DefaultConcurrencyLevel of ConcurrentOpenHashMap is 16, and each of the 16 sections has its own table array.

image
image

@horizonzy
Member

Thanks, I got it.

Member

@horizonzy horizonzy left a comment

Nice improvement. LGTM

Contributor

@hangc0276 hangc0276 left a comment

Nice Catch!

@wenbingshen
Member Author

@merlimat @eolivelli @dlg99 @zymap Could you please take a look at this PR? Thanks.

Contributor

@dlg99 dlg99 left a comment

LGTM

@hangc0276 hangc0276 merged commit ca33b31 into apache:master Apr 22, 2023
@wenbingshen wenbingshen deleted the wenbing/autoShrinkCompletionObjects branch April 23, 2023 02:55
zymap pushed a commit that referenced this pull request Jun 19, 2023
(cherry picked from commit ca33b31)
zymap pushed a commit that referenced this pull request Dec 6, 2023
(cherry picked from commit ca33b31)