Switch fielddata to use Lucene doc values APIs. #6908

Closed
wants to merge 9 commits into from

Commits on Jul 18, 2014

  1. Fielddata: Switch to Lucene DV APIs.

    This commit removes BytesValues/LongValues/DoubleValues/... and tries to use
    Lucene's APIs such as NumericDocValues or RandomAccessOrds instead whenever
    possible.
    
    The next step would be to take advantage of the fact that APIs are the same in
    Lucene and Elasticsearch in order to remove our custom comparators and use
    Lucene's.
    
    There are a few side-effects to this change:
     - GeoDistanceComparator has been removed, DoubleValuesComparator is used instead
       on top of dynamically computed values (was easier than migrating
       GeoDistanceComparator).
     - SortedNumericDocValues doesn't guarantee uniqueness so long/double terms
       aggregators have been updated to make sure a document cannot fall twice in
       the same bucket.
     - Sorting by maximum value of a field or running a `max` aggregation is
       potentially significantly faster thanks to the random-access API.
    
    Our aggs and p/c aggregations benchmarks don't report differences with this
    change on uninverted field data. However, the fact that doc values no longer
    need to be wrapped seems to help a lot. For example,
    TermsAggregationSearchBenchmark reports ~30% faster terms aggregations on doc
    values on string fields with this change, which are now only ~18% slower than
    uninverted field data despite being stored on disk.
    jpountz committed Jul 18, 2014
    8f45f46
  2. 47ad575
  3. 03a3e4a
  4. Minor changes.

    jpountz committed Jul 18, 2014
    3528cc2
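
The first commit message notes that SortedNumericDocValues does not guarantee uniqueness, so the terms aggregators must avoid counting a document twice in the same bucket, and that a random-access API can make `max` cheaper. The following is a minimal sketch of both ideas, not the code from this pull request. `SortedValues` is a hypothetical array-backed stand-in that only mirrors the `setDocument`/`count`/`valueAt` shape of a sorted multi-valued view, and every class and method name here is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SortedNumericSketch {

    /** Hypothetical stand-in for a per-document, sorted, multi-valued numeric view. */
    interface SortedValues {
        void setDocument(int doc);
        int count();
        long valueAt(int index);
    }

    /** Array-backed implementation; each row is assumed to be sorted ascending. */
    static SortedValues fromArrays(final long[][] perDoc) {
        return new SortedValues() {
            private long[] current = new long[0];

            @Override
            public void setDocument(int doc) {
                current = perDoc[doc];
            }

            @Override
            public int count() {
                return current.length;
            }

            @Override
            public long valueAt(int index) {
                return current[index];
            }
        };
    }

    /**
     * Returns the distinct values of a document. Because values are sorted,
     * duplicates are adjacent, so comparing with the previous value is enough
     * to ensure a document contributes at most once per bucket.
     */
    static List<Long> distinctValues(SortedValues values, int doc) {
        values.setDocument(doc);
        List<Long> distinct = new ArrayList<>();
        long previous = 0;
        for (int i = 0; i < values.count(); i++) {
            long value = values.valueAt(i);
            if (i == 0 || value != previous) {
                distinct.add(value);
            }
            previous = value;
        }
        return distinct;
    }

    /** Reads the maximum directly from the last slot instead of scanning all values. */
    static long maxValue(SortedValues values, int doc) {
        values.setDocument(doc);
        int n = values.count();
        if (n == 0) {
            throw new IllegalArgumentException("document " + doc + " has no values");
        }
        return values.valueAt(n - 1);
    }

    public static void main(String[] args) {
        SortedValues values = fromArrays(new long[][] { {1, 3, 3, 7}, {}, {5, 5} });
        System.out.println(distinctValues(values, 0)); // [1, 3, 7]
        System.out.println(maxValue(values, 0));       // 7
        System.out.println(distinctValues(values, 2)); // [5]
    }
}
```

With an iterator-style API, finding the maximum would mean visiting every value of the document. Here it is a single indexed read, which is one plausible reading of why the commit expects sorting by maximum value to get faster.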

Commits on Jul 21, 2014

  1. Have an internal MurmurHash3Values class instead of reusing SortedNumericDocValues since order is not guaranteed.

    jpountz committed Jul 21, 2014
    2094bca
  2. Remove TODO.

    jpountz committed Jul 21, 2014
    14f82a1
  3. Add assert.

    jpountz committed Jul 21, 2014
    4809047
  4. Address comments.

    jpountz committed Jul 21, 2014
    7084de0
  5. Remove BytesRefSorter.

    jpountz committed Jul 21, 2014
    7c2c14c