Fix double memory allocation during operator setup #5619
Conversation
Amazing work! We definitely need a backward compatibility test for segments.
Can you check in a small segment file from 0.3.0 and add a test case on top of that?
@kishoreg, there is no backward incompatibility here. PR #5510 actually results in double memory allocation (as explained above in the description). As for backward-compatibility tests for the raw index reader/writer, we already have them: files from prior versions are checked in to ensure that current readers can read old versions, for both compressed and uncompressed data. I recently added more tests too; see VarByteChunkSingleValueWriterTest.java.
lgtm
Got it. I thought there was something wrong with the metadata in the segment header.
Good finding! That said, the real root cause of this issue is allocating direct memory within the reader context without freeing it explicitly. There is no guarantee when the direct memory will be released, so we need to figure out a way to free it explicitly.
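For illustration, here is a minimal sketch of what freeing direct memory explicitly can look like on the JVM. This is an assumption-laden example, not the mechanism this PR adopts: it leans on the JDK-internal `sun.misc.Unsafe.invokeCleaner` (available since Java 9), and `DirectBufferFreer`/`free` are hypothetical names introduced here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

// Hypothetical helper: frees a DirectByteBuffer's off-heap memory eagerly
// instead of waiting for the GC to collect the buffer object.
public final class DirectBufferFreer {

  public static void free(ByteBuffer buffer) throws Exception {
    if (!buffer.isDirect()) {
      return; // heap buffers are fully managed by the GC
    }
    // JDK-internal API (Java 9+); may need --add-opens on newer JVMs.
    Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
    Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
    theUnsafe.setAccessible(true);
    Object unsafe = theUnsafe.get(null);
    Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
    invokeCleaner.invoke(unsafe, buffer);
  }

  public static void main(String[] args) throws Exception {
    ByteBuffer chunk = ByteBuffer.allocateDirect(16 * 1024 * 1024);
    // ... use the chunk ...
    free(chunk); // off-heap memory is returned immediately
    // 'chunk' must not be used after this point: its memory has been freed.
  }
}
```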
LGTM
```diff
@@ -62,7 +62,7 @@ public DataFetcher(Map<String, DataSource> dataSourceMap) {
       DataSource dataSource = entry.getValue();
       _dictionaryMap.put(column, dataSource.getDictionary());
       DataSourceMetadata dataSourceMetadata = dataSource.getDataSourceMetadata();
-      BlockValSet blockValueSet = dataSource.nextBlock().getBlockValueSet();
+      BlockValSet blockValueSet = dataSourceMap.get(column).nextBlock().getBlockValueSet();
```
Revert this line
done
```diff
-  private final BlockValSet _blockValSet;
   private final SingleColumnMultiValueReader _reader;
   private final FieldSpec.DataType _dataType;
+  private BlockValSet _blockValSet;
```
(nit) put this line after line 37
done
```diff
@@ -32,13 +32,14 @@
 @SuppressWarnings("rawtypes")
 public final class SingleValueBlock implements Block {
   private final SingleColumnSingleValueReader _reader;
-  private final BlockValSet _blockValSet;
   private final FieldSpec.DataType _dataType;
+  private BlockValSet _blockValSet;
```
(nit) Put this line after line 37
done
```diff
@@ -47,6 +50,9 @@ public BlockDocIdSet getBlockDocIdSet() {

   @Override
   public BlockValSet getBlockValueSet() {
```
@Jackie-Jiang, I am concerned about this being invoked in a multi-threaded scenario. From the code, it doesn't look like that will happen, since each query creates its own. Thoughts?
@siddharthteotia There is no multi-threading for the blocks. All the blocks are processed by one thread.
Co-authored-by: Siddharth Teotia <steotia@steotia-mn1.linkedin.biz>
Motivation: Currently DataSource is modeled as an Operator, where values are returned as a block complying with the Operator interface. This is confusing for the following reasons:
- The block contains all the documents in the segment, instead of a block of at most 10000 documents as in the Projection layer.
- The values are always fetched with their document ids, instead of being fetched as a block. The BlockValSet interface currently has two sets of APIs because of this, which is confusing.
- The BlockValSet returned by the DataSource is not really a value set, but a value reader on top of the forward index.
- Extra BlockMetadata has to be maintained, which can cause unexpected problems (e.g. the issue fixed in apache#5619).

Changes:
- Make DataSource standalone, without implementing Operator.
- Add interfaces for the forward index (ForwardIndexReader, ForwardIndexWriter, ForwardIndexReaderWriter).
- Add a ColumnValueReader class to help read the forward index from DataSource.
- Remove the docId-based APIs from BlockValSet.
…5625)

Changes:
- Make `DataSource` independent of the query execution (not extend `Operator`).
- Add interfaces for the forward index.
- Make `ForwardIndexReaderContext` close-able so that resources (e.g. off-heap buffers) can be released.
- In `ForwardIndexReader`, always read with a `ForwardIndexReaderContext` and close it when the reading is done to prevent resource leaks.
- Remove `SingleValueSet` and `MultiValueSet` and change caller logic to directly use the forward index, preventing resource leaks.
- Make `DataFetcher` directly use the forward index and handle type conversion within the class. Similar to the Dictionary convention, type conversion among INT, LONG, FLOAT, DOUBLE, STRING, and between STRING and BYTES is supported.

Motivation: Currently `DataSource` is modeled as an `Operator`, where values are returned as a block complying with the Operator interface. This is confusing for the following reasons:
- The block contains all the documents in the segment, instead of a block of at most 10000 documents as in the `Projection` layer.
- The values are always fetched with their document ids, instead of being fetched as a block. The `BlockValSet` interface currently has 2 sets of APIs because of this, which is confusing.
- The `BlockValSet` returned by the `DataSource` is not really a value set, but a value reader on top of the forward index.
- Extra `BlockMetadata` has to be maintained, which can cause unexpected problems (e.g. the issue fixed in #5619).

TODO: There are 2 remaining places where `ForwardIndexReaderContext` needs to be closed:
- `ScanBasedDocIdIterator`
- `DataFetcher`

New inheritance hierarchy for the forward index classes:
- `ForwardIndexReader`
  - Dictionary-encoded unsorted SV: `FixedBitSVForwardIndexReader`
  - Dictionary-encoded sorted SV: `SortedIndexReaderImpl`
  - Dictionary-encoded MV: `FixedBitMVForwardIndexReader`
  - Raw fixed-byte SV: `FixedByteChunkSVForwardIndexReader`
  - Raw var-byte SV: `VarByteChunkSVForwardIndexReader`
- `MutableForwardIndex` extends `ForwardIndexReader`
  - Fixed-byte SV (for both dictionary-encoded and raw): `FixedByteSVMutableForwardIndex`
  - Var-byte SV: `VarByteSVMutableForwardIndex`
  - Fixed-byte MV (for both dictionary-encoded and raw): `FixedByteMVMutableForwardIndex`
- `ForwardIndexCreator`
  - Dictionary-encoded unsorted SV: `SingleValueUnsortedForwardIndexCreator`
  - Dictionary-encoded sorted SV: `SingleValueSortedForwardIndexCreator`
  - Dictionary-encoded MV: `MultiValueUnsortedForwardIndexCreator`
  - Raw fixed-byte SV: `SingleValueFixedByteRawIndexCreator`
  - Raw var-byte SV: `SingleValueVarByteRawIndexCreator`
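To make the close-able reader-context pattern above concrete, here is a minimal self-contained sketch. `ReaderContext`, `Reader`, `createContext`, and `getInt` are illustrative stand-ins, not the actual `ForwardIndexReader`/`ForwardIndexReaderContext` signatures:

```java
import java.nio.ByteBuffer;

public class CloseableContextSketch {

  // The context owns an off-heap chunk buffer for the duration of a read.
  static class ReaderContext implements AutoCloseable {
    final ByteBuffer chunkBuffer = ByteBuffer.allocateDirect(1 << 20);

    @Override
    public void close() {
      // Real code would free the direct buffer explicitly here; this toy
      // version just lets the reference die so the GC can reclaim it.
    }
  }

  static class Reader {
    ReaderContext createContext() {
      return new ReaderContext();
    }

    int getInt(int docId, ReaderContext context) {
      return context.chunkBuffer.getInt((docId % 1000) * Integer.BYTES);
    }
  }

  public static void main(String[] args) {
    Reader reader = new Reader();
    long sum = 0;
    // try-with-resources bounds the context (and its off-heap buffer) to the
    // read loop, instead of pinning it for the lifetime of the operator tree.
    try (ReaderContext context = reader.createContext()) {
      for (int docId = 0; docId < 1000; docId++) {
        sum += reader.getInt(docId, context);
      }
    }
    System.out.println(sum);
  }
}
```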
This is a fix for the issue seen in #5610.
The call stack for the OOM is:
This implies that the error happens during the execution setup phase, when the operators are being set up and have not yet begun execution by passing data between them.
As per @fx19880617, the call stack is consistently reproducible as soon as a client upgrades to 0.4.0 in production and runs a SELECT * query on segments created in 0.3.0. At the top of the stack is ChunkReaderContext, which allocates a chunk and fails with a direct memory OOM. @fx19880617 verified that the chunk size, numDocsPerChunk, etc. are the same, so it is not the case that 0.4.0 is suddenly allocating more memory in the ChunkReaderContext and failing for that reason.
So either there is a leak, or something else is consuming the extra direct memory.
As part of PR #5510, one change made to SingleValueSet (also part of the call stack) was that the reader context is now created in the constructor, as opposed to on every read call. There are two implications of this:
Earlier, since chunk reader context objects were created on a per-call basis, they were essentially short-lived and were probably collected while still in the young generation. Being garbage collected quickly also released the direct byte buffer reference inside them, freeing the direct memory -- this is essentially how direct memory is reclaimed in the JVM unless a cleaner is invoked explicitly. Now that they are allocated in the constructor, they have become long-lived objects and are unlikely to be GCed as quickly as in the previous code. Thus, there will be memory pressure.
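Here is a toy, self-contained illustration of that lifetime difference (not Pinot code; `ReaderContextLike` is a made-up stand-in for the post-#5510 reader context):

```java
import java.nio.ByteBuffer;

// The off-heap memory behind a DirectByteBuffer is reclaimed only when the
// buffer object itself becomes garbage and its internal cleaner runs (or
// when it is freed explicitly).
public class DirectMemoryLifetime {

  // Short-lived pattern (old code): each buffer is dropped right after use,
  // so a young-generation GC frees the off-heap memory fairly promptly. When
  // the direct-memory limit is hit, the JVM also forces a GC before throwing
  // "OutOfMemoryError: Direct buffer memory".
  static void shortLived() {
    for (int i = 0; i < 100; i++) {
      ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
      buffer.putLong(0, i); // use briefly, then let the reference die
    }
  }

  // Long-lived pattern (post-#5510 code): the buffer lives in a field of an
  // object that survives as long as the operator tree, so the off-heap
  // memory stays pinned and pressure accumulates across columns.
  static final class ReaderContextLike {
    final ByteBuffer chunkBuffer = ByteBuffer.allocateDirect(1 << 20);
  }

  public static void main(String[] args) {
    shortLived();
    ReaderContextLike context = new ReaderContextLike(); // pinned while reachable
    System.out.println("pinned bytes: " + context.chunkBuffer.capacity());
  }
}
```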
The second implication, explained below, is worse and is probably the actual root cause of the OOM reported by the customer.
See this code for the creation of ProjectionOperator (note that it is part of the call stack):
```java
_dataBlockMap.put(entry.getKey(), entry.getValue().nextBlock());
```

This creates a block by going down the path nextBlock() -> SingleValueBlock -> SingleValueSet -> constructor -> create reader context (direct memory is allocated with the new PR). We then create the DataFetcher with the dataSourceMap, and this code creates a block again:
```java
dataSourceMap.get(column).nextBlock().getBlockValueSet()
```

thus going down the same path again: SingleValueBlock -> SingleValueSet -> reader context -> allocating direct memory. So the memory is allocated twice, and somewhere in the middle of doing this for the DataFetcher for a given column, we fail with an OOM. A toy sketch of this pattern follows.
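Here is a self-contained toy sketch of the bug and of one reading of the fix. The class names are invented, and the lazy-initialization interpretation is my inference from the `final` to non-final `_blockValSet` change in the diffs above:

```java
import java.nio.ByteBuffer;

public class DoubleAllocationDemo {

  // Buggy shape: the off-heap chunk is allocated eagerly in the constructor,
  // so every nextBlock()-style construction costs direct memory even if the
  // value set is never read.
  static class EagerBlock {
    private final ByteBuffer _chunk = ByteBuffer.allocateDirect(1 << 20);

    ByteBuffer getBlockValueSet() {
      return _chunk;
    }
  }

  // Fixed shape: allocate lazily on first use, so constructing a second
  // block for the same column costs nothing until it is actually read.
  static class LazyBlock {
    private ByteBuffer _chunk;

    ByteBuffer getBlockValueSet() {
      if (_chunk == null) {
        _chunk = ByteBuffer.allocateDirect(1 << 20);
      }
      return _chunk;
    }
  }

  public static void main(String[] args) {
    // Buggy path: ProjectionOperator and DataFetcher each build a block for
    // the same column, so the chunk is allocated twice per column.
    EagerBlock projectionBlock = new EagerBlock(); // allocation #1
    EagerBlock fetcherBlock = new EagerBlock();    // allocation #2 (waste)

    // Fixed path: two blocks, but only the one that is read pays for a chunk.
    LazyBlock unused = new LazyBlock();
    LazyBlock used = new LazyBlock();
    System.out.println(used.getBlockValueSet().capacity()); // single allocation
  }
}
```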