Change PerFieldMapperCodec to use tsdb doc values codec for all fields. #105301
Conversation
The index needs to be in tsdb mode; all fields then use the tsdb doc values codec, except fields starting with a `_` (excluding `_tsid`, which also uses it). Before this change, PerFieldMapperCodec relied on MapperService to check whether a field needed to use the tsdb doc values codec, but we missed many field types (ip field type, scaled float field type, unsigned long field type, etc.). Instead we wanted to depend on the doc values type in FieldInfo, but that information is not available in PerFieldMapperCodec. So binary doc values support was added to ES87TSDBDocValuesFormat, which allows it to be used for any doc values field. The code was copied from Lucene90DocValuesFormat. Maybe we can add some simple compression here? On the other hand, binary doc values shouldn't be made complex, otherwise it defeats their purpose. None of that has been done yet in this change.
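For illustration, a minimal sketch of the selection rule described above, assuming hypothetical class and constructor names (this is not the actual PerFieldMapperCodec code):

```java
import org.apache.lucene.codecs.DocValuesFormat;
import org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat;

// Sketch only: the constructor arguments stand in for the tsdb and default
// doc values formats; the class name is hypothetical.
public class TsdbAwareDocValuesFormat extends PerFieldDocValuesFormat {

    private final boolean timeSeriesIndex;
    private final DocValuesFormat tsdbFormat;
    private final DocValuesFormat defaultFormat;

    public TsdbAwareDocValuesFormat(boolean timeSeriesIndex, DocValuesFormat tsdbFormat, DocValuesFormat defaultFormat) {
        this.timeSeriesIndex = timeSeriesIndex;
        this.tsdbFormat = tsdbFormat;
        this.defaultFormat = defaultFormat;
    }

    @Override
    public DocValuesFormat getDocValuesFormatForField(String field) {
        if (timeSeriesIndex == false) {
            return defaultFormat;
        }
        // Metadata fields start with '_' and keep the default format,
        // except _tsid, which also uses the tsdb encoding.
        if (field.startsWith("_") && field.equals("_tsid") == false) {
            return defaultFormat;
        }
        return tsdbFormat;
    }
}
```

The point of the simplification is that the decision no longer depends on MapperService or per-field-type checks; only the index mode and the field name matter.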
jpountz left a comment:
I like the simpler logic. We can always make the decision to compress binary doc values later on if we want. Just copying @iverase, who might have opinions since geo_shape fields use binary doc values internally.
Pinging @elastic/es-storage-engine (Team:StorageEngine)
this.addresses = addresses;
this.compressedData = compressedData;
// pre-allocate a byte array large enough for the biggest uncompressed block needed.
this.uncompressedBlock = new byte[biggestUncompressedBlockSize];
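For context, a sketch of how a reader holding these fields typically uses the pre-allocated buffer; the class, the 32-values-per-block constant (from the LUCENE-9211 approach) and the `decompressBlock` stub are assumptions for illustration, not the actual ES87TSDBDocValuesProducer code:

```java
import java.io.IOException;
import org.apache.lucene.store.IndexInput;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.LongValues;

// Illustrative reader for block-compressed binary doc values: the whole block
// is decompressed once into the reused buffer, then values are sliced out of it.
final class CompressedBinaryReader {

    private static final int VALUES_PER_BLOCK = 32;

    private final LongValues addresses;      // start offset of each compressed block
    private final IndexInput compressedData; // slice holding all compressed blocks
    private final byte[] uncompressedBlock;  // reused buffer, sized for the biggest block
    private final int[] offsets = new int[VALUES_PER_BLOCK + 1]; // value boundaries within a block
    private int currentBlock = -1;

    CompressedBinaryReader(LongValues addresses, IndexInput compressedData, int biggestUncompressedBlockSize) {
        this.addresses = addresses;
        this.compressedData = compressedData;
        this.uncompressedBlock = new byte[biggestUncompressedBlockSize];
    }

    BytesRef value(int valueIndex) throws IOException {
        int block = valueIndex / VALUES_PER_BLOCK;
        if (block != currentBlock) {
            // Seek to the block's start address and decompress it once into the
            // reused buffer; decompressBlock stands in for the LZ4 decompression
            // and per-value length decoding done by the real producer.
            compressedData.seek(addresses.get(block));
            decompressBlock(compressedData, uncompressedBlock, offsets);
            currentBlock = block;
        }
        int within = valueIndex % VALUES_PER_BLOCK;
        return new BytesRef(uncompressedBlock, offsets[within], offsets[within + 1] - offsets[within]);
    }

    private static void decompressBlock(IndexInput in, byte[] dest, int[] offsets) throws IOException {
        // Placeholder for the actual LZ4 decompression and length decoding.
        throw new UnsupportedOperationException("illustrative stub");
    }
}
```

Because `uncompressedBlock` is sized for the biggest block in the segment and kept for the lifetime of the reader, very large values (such as geo_shape) can make this allocation expensive, which is what the discussion below is about.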
@iverase I remember you discussing humongous allocations due to this sort of approach with very large fields such as geo-shapes; is there a better way?
I remember we had problems because of the size of this array in the past for geo_shape fields. It was partly due to a suboptimal implementation of the LatLonShapeDocValues query, which did not implement the scorer supplier, so we were allocating this array for each query.
Because doc values for geo_shape can be big and we compress a fixed number of documents per block, the array can grow pretty big and cause problems if we are reading doc values very often.
I wonder if we can build those blocks considering not only a fixed number of documents but also a maximum size, so that we keep that array under some sensible size.
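A rough sketch of that suggestion (hypothetical names, not part of this PR): flush a block when either the value count or the accumulated byte size crosses a threshold, so that the reader-side buffer stays bounded even for very large values:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.lucene.util.BytesRef;

// Hypothetical writer-side block builder: blocks are capped both by number of
// values and by accumulated bytes, whichever limit is hit first.
final class BoundedBlockWriter {

    private static final int MAX_VALUES_PER_BLOCK = 32;
    private static final int MAX_BLOCK_SIZE_IN_BYTES = 64 * 1024;

    private final ByteArrayOutputStream block = new ByteArrayOutputStream();
    private int valuesInBlock = 0;

    void addValue(BytesRef value) throws IOException {
        block.write(value.bytes, value.offset, value.length);
        valuesInBlock++;
        if (valuesInBlock >= MAX_VALUES_PER_BLOCK || block.size() >= MAX_BLOCK_SIZE_IN_BYTES) {
            flushBlock();
        }
    }

    private void flushBlock() throws IOException {
        // A real implementation would compress block.toByteArray(), record the
        // block's start address and the index of its first value, then reset.
        block.reset();
        valuesInBlock = 0;
    }
}
```

With a byte-size cap the reader no longer needs to budget for an arbitrarily large "biggest uncompressed block", at the cost of a slightly more involved value-index-to-block mapping.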
Thanks Ignacio, I think encoding the number of documents would be helpful here in order to avoid accidentally reading GBs of data, so that we can limit the number of uncompressed bytes we write into a block.
However, I currently don't have time to implement this before tomorrow, and I'd like to get this change in, as it allows us to consistently use the tsdb doc values format for all fields in a tsdb index. Without this change, scaled float fields, unsigned long fields and others don't use the tsdb doc values codec, and that is a waste.
I will try to implement the compression in a follow-up later. The downside is that we will have to add versioning logic.
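For illustration, the versioning such a follow-up implies could be as simple as bumping a format version constant and branching on the version read back from the segment metadata (hypothetical constants, not actual Elasticsearch code):

```java
// Hypothetical version constants for the tsdb doc values format: older
// segments keep using the uncompressed binary read path, newer ones the
// compressed one.
final class TsdbBinaryVersions {
    static final int VERSION_START = 0;              // uncompressed binary doc values
    static final int VERSION_COMPRESSED_BINARY = 1;  // compressed blocks (follow-up)
    static final int VERSION_CURRENT = VERSION_COMPRESSED_BINARY;

    static boolean readsCompressedBinary(int versionReadFromMeta) {
        return versionReadFromMeta >= VERSION_COMPRESSED_BINARY;
    }
}
```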
Fine, I don't think geo_shape is an important field type for tsdb indices, so it can wait.
Right, binary doc values aren't used that much for tsdb indices: mainly the range*, percolator, geo_shape, and counted_keyword field types.
It looks like enabling the tsdb doc values codec for all field types saves another ~21MB in total storage usage. Field-by-field analysis
The binary doc values implementation from LUCENE-9211 (apache/lucene-solr#1234) is used here, which stores the values in LZ4-compressed blocks in order to reduce storage usage (in the Lucene default doc values codec, binary doc values are stored without any compression). Follow-up from elastic#105301
Add compression for binary doc values using Zstd and blocks with a variable number of values. Block-wise LZ4 compression for binary doc values was previously added to Lucene in LUCENE-9211 and subsequently removed in LUCENE-9378 due to query performance issues. We investigated adding the original Lucene implementation to ES in #112416 and #105301. That approach used a constant number of values per block (specifically 32 values). This is nice because it makes it very easy to map a given value index (e.g. docId for dense values) to the block containing it, with blockId = docId / 32. Unfortunately, if values are very large we cannot reduce the number of values per block, and (de)compressing a block could cause an OOM. Also, since this is a concern, we have to keep the number of values lower than ideal. This PR instead stores a variable number of documents per block: a minimum of 1 document per block, and it stops adding values when the size of a block exceeds a threshold. Like the previous version, it stores an array of addresses for the start of each block. Additionally, it stores a parallel array with the value index at the start of each block. When looking up a given value index, if it is not in the current block, we binary search the array of value-index starts to find the blockId containing the value, then look up the address of the block.
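A minimal sketch of the lookup that commit message describes (hypothetical names, not the actual implementation): blocks hold a variable number of values, so a parallel array with the first value index of each block is binary-searched to find the containing block, and its start address is then read from the addresses array:

```java
import java.util.Arrays;

// Hypothetical lookup structure for variable-sized blocks: blockAddresses[i]
// is the file offset of block i, blockFirstValue[i] is the value index of the
// first value stored in block i.
final class VariableBlockLookup {

    private final long[] blockAddresses;
    private final long[] blockFirstValue;

    VariableBlockLookup(long[] blockAddresses, long[] blockFirstValue) {
        this.blockAddresses = blockAddresses;
        this.blockFirstValue = blockFirstValue;
    }

    /** Returns the id of the block containing the given value index. */
    int blockId(long valueIndex) {
        int idx = Arrays.binarySearch(blockFirstValue, valueIndex);
        // Exact hit: valueIndex is the first value of that block. Otherwise the
        // insertion point minus one is the block that contains it.
        return idx >= 0 ? idx : -idx - 2;
    }

    /** Returns the file offset where the containing block's compressed bytes start. */
    long blockAddress(long valueIndex) {
        return blockAddresses[blockId(valueIndex)];
    }
}
```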
The index needs to be in tsdb mode. All fields will use the tsdb codec, except fields starting with a `_` (other than `_tsid`, which also uses the tsdb codec).

Before this change, PerFieldMapperCodec relied on MapperService to check whether a field needed to use the tsdb doc values codec, but we missed many field types (ip field type, scaled float field type, unsigned long field type, etc.). Instead we wanted to depend on the doc values type in FieldInfo, but that information is not available in PerFieldMapperCodec.

Borrowed the binary doc values implementation from Lucene90DocValuesFormat. This allows the tsdb codec to be used for any doc values field. Maybe we can implement some simple compression here? On the other hand, binary doc values shouldn't be made complex, otherwise it defeats their purpose. None of that has been done yet in this change.

Binary doc values support is now added to the tsdb doc values codec, so that every field in a tsdb index can delegate to this codec. Only fields that start with `_` delegate to the default doc values codec (except `_tsid`). The binary doc values implementation from LUCENE-9211 (apache/lucene-solr#1234) is used here, which stores the values in LZ4-compressed blocks in order to reduce storage usage (in the Lucene default doc values codec, binary doc values are stored without any compression).

Followup on #99747