Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not release the PinotDataBuffer when closing the index #5400

Merged
merged 1 commit into from
May 19, 2020

Conversation

Jackie-Jiang
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang commented May 16, 2020

The PinotDataBuffers are tracked and maintained inside SegmentDirectory
for ImmutableSegment and PinotDataBufferMemoryManager for MutableSegment.
They are created when loading the indexes and released when closing the
segment. If the PinotDataBuffer gets released when closing the indexes,
and if the buffer manager decides to reuse the buffer, the following read
on the buffer will cause JVM to crash. This can be triggered in
SegmentPreProcessor when the same indexes need to be opened twice in two
different preprocessors.

This PR standardize the behavior of indexes to not release (close) the
PinotDataBuffer when closing the index.
We don't need to worry about accessing the index after it is already
closed because that is already addressed in #4764

@Jackie-Jiang Jackie-Jiang force-pushed the fix_jvm_crash branch 2 times, most recently from 8ae9041 to f4704ae Compare May 16, 2020 21:56
Copy link
Member

@jackjlli jackjlli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor but LGTM.

@@ -60,11 +60,11 @@ static PinotDataBuffer getIndexBuffer(ColumnIndexDirectory columnDirectory, Stri
static void verifyMultipleReads(ColumnIndexDirectory columnDirectory, String column, int numIter)
throws Exception {
for (int ii = 0; ii < numIter; ii++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to rename the index variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done lol

The PinotDataBuffers are tracked and maintained inside SegmentDirectory
for ImmutableSegment and PinotDataBufferMemoryManager for MutableSegment.
They are created when loading the indexes and released when closing the
segment. If the PinotDataBuffer gets released when closing the indexes,
and if the buffer manager decides to reuse the buffer, the following read
on the buffer will cause JVM to crash. This can be triggered in
SegmentPreProcessor when the same indexes need to be opened twice in two
different preprocessors.

This PR standardize the behavior of indexes to not release (close) the
PinotDataBuffer when closing the index.
We don't need to worry about accessing the index after it is already
closed because that is already addressed in apache#4764
@Jackie-Jiang Jackie-Jiang merged commit 9b65c9e into apache:master May 19, 2020
@Jackie-Jiang Jackie-Jiang deleted the fix_jvm_crash branch May 19, 2020 00:07
Copy link
Contributor

@mcvsubbu mcvsubbu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the claim is that memory manager needs to be closed at the close of a segment. It will be good to put that comment in the memory manager interface. If for any reason the memory manager is retained across interfaces, a lot of memory will not be released.

Thanks for adding explicit close() calls with comments to NOT close buffers in there.

The 'reused later' comment needs some explanation. What may be the re-use of those buffers?

@@ -75,7 +67,7 @@ public BaseChunkSingleValueReader(PinotDataBuffer pinotDataBuffer) {
headerOffset += Integer.BYTES;

int dataHeaderStart = headerOffset;
if (_version > 1) {
if (version > 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version was made into a member variable so that when we add or bump the version it is available to derived classes. Not sure why you removed it.

Recently we made a version bump and it was painful and re-factoring had to be done. Some background in PR #5285

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not used anywhere, and IDE will have warning and can auto convert it to local.
I don't see why this can affect version bump as long as you have the version information inside the data buffer.

LOGGER.info("Allocating header buffer of size {} for: {}", _headerSize, _context);
_headerBuffer = _memoryManager.allocate(_headerSize, _context);
// NOTE: PinotDataBuffer is tracked in the PinotDataBufferMemoryManager. No need to track it inside the class.
PinotDataBuffer headerBuffer = _memoryManager.allocate(_headerSize, _context);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this. is headerBuffer(s) alllocated here closed anywhere? Same q with databBuffers . The reader and writer classes we construct here do not close these buffers, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MemoryManager allocates and tracks all the buffers. It releases them when the segment is closed.

PinotDataBuffer buffer = _memoryManager.allocate(_chunkSizeInBytes, _allocationContext);
_dataBuffers.add(buffer);
_writers.add(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when are these closed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In MutableSegmentImpl.destroy() where memory manager is closed as the last step.

@Jackie-Jiang
Copy link
Contributor Author

I guess the claim is that memory manager needs to be closed at the close of a segment. It will be good to put that comment in the memory manager interface. If for any reason the memory manager is retained across interfaces, a lot of memory will not be released.

Both segment directory and memory manager are closed when destroying the segment, which is the expected behavior. They are responsible of allocating and releasing the buffers, and might reuse the buffer across different calls.

The 'reused later' comment needs some explanation. What may be the re-use of those buffers?

In SegmentDirectory, we cache the buffer for each index and return the same buffer if the same buffer is requested twice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants