-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not release the PinotDataBuffer when closing the index #5400
Conversation
8ae9041
to
f4704ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor but LGTM.
@@ -60,11 +60,11 @@ static PinotDataBuffer getIndexBuffer(ColumnIndexDirectory columnDirectory, Stri | |||
static void verifyMultipleReads(ColumnIndexDirectory columnDirectory, String column, int numIter) | |||
throws Exception { | |||
for (int ii = 0; ii < numIter; ii++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to rename the index variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done lol
The PinotDataBuffers are tracked and maintained inside SegmentDirectory for ImmutableSegment and PinotDataBufferMemoryManager for MutableSegment. They are created when loading the indexes and released when closing the segment. If the PinotDataBuffer gets released when closing the indexes, and if the buffer manager decides to reuse the buffer, the following read on the buffer will cause JVM to crash. This can be triggered in SegmentPreProcessor when the same indexes need to be opened twice in two different preprocessors. This PR standardize the behavior of indexes to not release (close) the PinotDataBuffer when closing the index. We don't need to worry about accessing the index after it is already closed because that is already addressed in apache#4764
f4704ae
to
91d46b5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the claim is that memory manager needs to be closed at the close of a segment. It will be good to put that comment in the memory manager interface. If for any reason the memory manager is retained across interfaces, a lot of memory will not be released.
Thanks for adding explicit close() calls with comments to NOT close buffers in there.
The 'reused later' comment needs some explanation. What may be the re-use of those buffers?
@@ -75,7 +67,7 @@ public BaseChunkSingleValueReader(PinotDataBuffer pinotDataBuffer) { | |||
headerOffset += Integer.BYTES; | |||
|
|||
int dataHeaderStart = headerOffset; | |||
if (_version > 1) { | |||
if (version > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version was made into a member variable so that when we add or bump the version it is available to derived classes. Not sure why you removed it.
Recently we made a version bump and it was painful and re-factoring had to be done. Some background in PR #5285
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not used anywhere, and IDE will have warning and can auto convert it to local.
I don't see why this can affect version bump as long as you have the version information inside the data buffer.
LOGGER.info("Allocating header buffer of size {} for: {}", _headerSize, _context); | ||
_headerBuffer = _memoryManager.allocate(_headerSize, _context); | ||
// NOTE: PinotDataBuffer is tracked in the PinotDataBufferMemoryManager. No need to track it inside the class. | ||
PinotDataBuffer headerBuffer = _memoryManager.allocate(_headerSize, _context); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand this. is headerBuffer(s) alllocated here closed anywhere? Same q with databBuffers . The reader and writer classes we construct here do not close these buffers, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MemoryManager allocates and tracks all the buffers. It releases them when the segment is closed.
PinotDataBuffer buffer = _memoryManager.allocate(_chunkSizeInBytes, _allocationContext); | ||
_dataBuffers.add(buffer); | ||
_writers.add( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when are these closed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In MutableSegmentImpl.destroy()
where memory manager is closed as the last step.
Both segment directory and memory manager are closed when destroying the segment, which is the expected behavior. They are responsible of allocating and releasing the buffers, and might reuse the buffer across different calls.
In SegmentDirectory, we cache the buffer for each index and return the same buffer if the same buffer is requested twice. |
The PinotDataBuffers are tracked and maintained inside SegmentDirectory
for ImmutableSegment and PinotDataBufferMemoryManager for MutableSegment.
They are created when loading the indexes and released when closing the
segment. If the PinotDataBuffer gets released when closing the indexes,
and if the buffer manager decides to reuse the buffer, the following read
on the buffer will cause JVM to crash. This can be triggered in
SegmentPreProcessor when the same indexes need to be opened twice in two
different preprocessors.
This PR standardize the behavior of indexes to not release (close) the
PinotDataBuffer when closing the index.
We don't need to worry about accessing the index after it is already
closed because that is already addressed in #4764