Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use BINARY doc values instead of SORTED_SET doc values to store numeric data #4518

Closed

Conversation

Projects
None yet
2 participants
@jpountz
Copy link
Contributor

commented Dec 19, 2013

Although SORTED_SET doc values make things like terms aggregations very fast
thanks to the use of ordinals, ordinals are usually not that useful on numeric
data. We are more interested in the values themselves in order to be able to
compute sums, averages, etc. on these values. However, SORTED_SET is quite slow
at accessing values, so BINARY doc values are better suited at storing numeric
data.

It is only allowed to have a single BINARY doc values field instance per field
name per document, which makes it quite challenging to use for multi-valued
fields since all values need to be buffered in memory and converted to a single
field instance in the end. In order to do so easily, all mappers (not only the
root mappers) now have a preParse and a postParse phase, which are called before
and after all fields have been visited for a single document. In the case of the
numeric field mappers, parse now takes care to buffer values and postParse takes
these values, sorts them and deduplicates them before encoding them into a
BINARY doc values field.

floats and doubles are encoded without compression with little-endian byte order
(so that it may be optimizable through sun.misc.Unsafe in the future given that
most computers nowadays use the little-endian byte order) and byte, short, int,
and long are encoded using vLong encoding: they first encode the minimum value
using zig-zag encoding (so that negative values become positive) and then deltas
between successive values.

I ran TermsAggregationSearchBenchmark to get an idea of the impact of this
change and results are promising:

Task Before this change After this change Difference
terms_agg_l_dv 235 145 38% faster
terms_agg_lm_dv 1705 714 58% faster

For reference, terms_agg_l_dv is an aggregation on a single-valued long field
stored in doc values while terms_agg_lm_dv is an aggregation on a multi-valued
long field (10 values per document) stored in doc values.

Close #3993

@s1monw

View changes

src/main/java/org/elasticsearch/common/util/ByteUtils.java Outdated


/** Utility methods to do byte-level encoding. These methods are biased towards little-endian byte order because it is the most
* common byte order and reading several bytes at once may be optimizable in the future with the help of sun.mist.Unsafe. */

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 20, 2013

Contributor

w00t

@s1monw

View changes

src/main/java/org/elasticsearch/index/fielddata/LongValues.java Outdated
@@ -39,6 +39,23 @@
*/
public abstract class LongValues {

/** Wrap a {@link DoubleValues} instance. */

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 20, 2013

Contributor

just a style think, can we have this method at the bottom of the file?

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 20, 2013

Author Contributor

Sure... what is the recommendation in general? static methods and inner classes at the end?

@s1monw

View changes

src/main/java/org/elasticsearch/index/fielddata/plain/BinaryDVNumericAtomicFieldData.java Outdated
values.get(docId, bytes);
in.reset(bytes.bytes, bytes.offset, bytes.length);
valueCount = 0;
if (!in.eof()) {

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 20, 2013

Contributor

I wonder if it'd make sense to encode the size in the first byte or so? that way the setDocument call is cheap.

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 20, 2013

Author Contributor

I wondered about it too and decided to do it this way, assuming there are not many cases when we need the number of values but not the values themselves?


@Override
public long nextValue() {
assert i < valueCount;

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 20, 2013

Contributor

should we have this assert in there I guess this is hot but small enough in terms of bytecodes

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 20, 2013

Author Contributor

I'd be happy to remove it if it proves to prevent inlining. But it has been quite useful when working on this PR. :)

@s1monw

View changes

src/main/java/org/elasticsearch/index/mapper/ParseContext.java Outdated

/**
*
*/
public class ParseContext {

public static class CacheKey<T> {

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 20, 2013

Contributor

I guess thsi change is ok but @kimchy should look at it as well?

@s1monw

View changes

src/main/java/org/elasticsearch/index/mapper/core/AbstractIntegerFieldMapper.java Outdated
@@ -0,0 +1,126 @@
/*

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 20, 2013

Contributor

so after reviewing this and trying to understand what it does I think I got it. But I am not sure if this is really the way to go. I rather think we should do this in a kind of a map&reduce process based on the context document. At this point we are using Document as the abstraction but we can just use our own that allows to group fields by key or anything like that which allows us to just preprocess the document and each mapper can just put the values in the document directly instead of going the indirection of the context cache? one way of implementing this would be a dedicated field that we can append on that then returns a byte value?

really I am just thinking out loud here :)

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 20, 2013

Author Contributor

I agree this approach would make things much simpler! I didn't want to change document parsing too much in the context in this PR, do you think it is something that could be done later?

@jpountz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 20, 2013

Simon and I discussed about a better way to do the buffering and I'll explore a different approach that would use the Document object instead of the field mapper to do the buffering.

Use BINARY doc values instead of SORTED_SET doc values to store numer…
…ic data.

Although SORTED_SET doc values make things like terms aggregations very fast
thanks to the use of ordinals, ordinals are usually not that useful on numeric
data. We are more interested in the values themselves in order to be able to
compute sums, averages, etc. on these values. However, SORTED_SET is quite slow
at accessing values, so BINARY doc values are better suited at storing numeric
data.

floats and doubles are encoded without compression with little-endian byte order
(so that it may be optimizable through sun.misc.Unsafe in the future given that
most computers nowadays use the little-endian byte order) and byte, short, int,
and long are encoded using vLong encoding: they first encode the minimum value
using zig-zag encoding (so that negative values become positive) and then deltas
between successive values.

Close #3993
@jpountz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 23, 2013

Here is a new approach that does the buffering at the document level (see ParseContext.Document). Since many field mappers were just reverted to go back to what they are in master, I squashed the commits to make it easier to review.


/**
*
*/
public class ParseContext {

/** Fork of {@link org.apache.lucene.document.Document} with additional functionality. */
public static class Document implements Iterable<IndexableField> {

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 23, 2013

Contributor

cool stuff - I wonder if we should only add fields to the multimap that are explicitly added like via add(IndexableField field, String key) for the most of the field this is not needed at all though.

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 23, 2013

Author Contributor

I thought about it as well but on the other hand I wasn't unhappy that Document.get/getField/getBinaryField/... performs in constant time instead of linear time as in Lucene?

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 24, 2013

Contributor

I am not sure though it adds a lot of additional objects per document possibly completely useless. Where do we call these methods?

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 24, 2013

Author Contributor

Mainly in tests, but also in some field mappers. However when it happens in mappers, it is mostly for meta fields (_uid, ...) which are among the first fields so it should be OK.

@@ -474,7 +473,7 @@ public ParsedDocument parse(SourceToParse source, @Nullable ParseListener listen
if (parser == null) {
parser = XContentHelper.createParser(source.source());
}
context.reset(parser, new Document(), source, listener);
context.reset(parser, new ParseContext.Document(), source, listener);

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 23, 2013

Contributor

hmm maybe it would make more sense to remove this argument and add a reset call tot he document as well so we can reuse the array list etc?

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 23, 2013

Author Contributor

Looking at DocumentMapper.java:271, it looks like it works this way on purpose

         // reset the context to free up memory
        context.reset(null, null, null, null);
in.reset(bytes.bytes, bytes.offset, bytes.length);
if (!in.eof()) {
// first value uses vLong on top of zig-zag encoding, then deltas are encoded using vLong
long previousValue = longs[0] = ByteUtils.zigZagDecode(ByteUtils.readVLong(in));

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 23, 2013

Contributor

I really like the zigZag idea here. Decoding is very fast IMO and compression might pay off very well since we only have positive deltas then. +1

@@ -309,8 +313,14 @@ protected void innerParseCreateField(ParseContext context, List<Field> fields) t
field.setBoost(boost);
fields.add(field);
}
if (hasDocValues()) {
fields.add(toDocValues(value));
if (hasDocValues() && !Double.isNaN(value)) {

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 23, 2013

Contributor

hmm what happens if this is NaN?

@Override
public BytesRef binaryValue() {
// sort
new InPlaceMergeSorter() {

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 23, 2013

Contributor

I wonder if those in place sorting could be in a utils class I can see that being used elsewhere and the code would look simpler here?

@s1monw

This comment has been minimized.

Copy link
Contributor

commented Dec 23, 2013

I left some minor comments but this looks awesome. +1 in general I think the next iter goes in though!

@jpountz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 23, 2013

New commit pushed:

  • Behavior of NaN is now consistent with uninverted field data, appearing at most once and always at the last position (compares greater than POSITIVE_INFINITY).
  • Sorting and deduplication utility methods moved to a utility class.

@Override
public Iterator<IndexableField> iterator() {
return fieldList.iterator();

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 24, 2013

Contributor

It doesn't make sense to return unmodifiableList in getFields but return a mutable iterator here, I guess it's ok to get a mutable list or we should maybe make both unmodifiable but I don't think we should add yet another object wrapper here WDYT?

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 24, 2013

Author Contributor

Right, I had forgotten that iterators also have a remove method. The unmodifiable wrapper was mainly useful for me to understand what consumers do with it. I'll remove the wrappers.

@@ -1 +1 @@
Subproject commit 2f5f78f24d8fbacf69c83ab7545654c83965e846
Subproject commit b3ab72486fae1b5c5a5397356a3e113bf72eb6d5

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 25, 2013

Contributor

hmm I guess this one will be updated once you rebase :)


@Override
protected int compare(int i, int j) {
return CollectionUtils.compare(array[i], array[j]);

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 25, 2013

Contributor

once we move to java 7 we can just use Long.compareTo() ;) good to have it encapsulated - we should check if we use it across the board one this is in.

}

@Override
public long getNumberUniqueValues() {

This comment has been minimized.

Copy link
@s1monw

s1monw Dec 25, 2013

Contributor

if this is unknown should we retrun -1? I mean we would need to redefine this in the interface since it's documented as an upper bound. so that is ok but I wonder if we pass this to array initializers or so? not sure what this is used for

This comment has been minimized.

Copy link
@jpountz

jpountz Dec 26, 2013

Author Contributor

Uri recently pushed a commit to use it in order to do the initial sizing of data-structures for aggregations: 7b13f19 But we cap it anyway: the query may only see a few values, and even when there are lots of unique values it may be faster to grow on demand.

Would -1 really be a better value when unknown? I think the common use for this number is to return the maximum cardinality of a segment. If -1 is the value meaning unknown, we would need to compute something like:

long maxCardinality = 0;
for (AtomicReaderContext readerContext : topLevelReader.leaves()) {
    AtomicFieldData fd = [...];
    if (fd.getNumberUniqueValues() == -1) {
      return -1;
    } else {
      maxCardinality = Math.max(fd.getNumberUniqueValues(), maxCardinality);
    }
}
return maxCardinality;

On the other hand, with MAX_VALUE meaning unknown, it becomes simpler:

long maxCardinality = 0;
for (AtomicReaderContext readerContext : topLevelReader.leaves()) {
    AtomicFieldData fd = [...];
    maxCardinality = Math.max(fd.getNumberUniqueValues(), maxCardinality);
}
return maxCardinality;
@s1monw

This comment has been minimized.

Copy link
Contributor

commented Dec 25, 2013

did another round and left some minor comments that can be handled once its pushed. +1 LGTM

@jpountz

This comment has been minimized.

Copy link
Contributor Author

commented Dec 26, 2013

Thanks for the reviews, Simon. Much appreciated!

@jpountz jpountz closed this Dec 26, 2013

@jpountz jpountz deleted the jpountz:enhancement/numeric_doc_values branch Dec 26, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.