Enable numeric sort optimisation for few numerical sort types #6321

gashutos · 2023-02-14T18:27:08Z

Description

Enabling numeric sort optimisation for below 4 numeric ypes.

DATE
DATE_NANOSECONDS
LONG
DOUBLE

For above 4 types, we have same sort type and point type. So should not be any harm doing like that.

Lucene gives us in-built ability to optimise sorting on certain sort field types where its point type is matching. i.e for fields with data type Date or Long, we will able to use this optimisation.
There already exists a check in our code to enable this optimisation for only Date/Long data types. This was introduced as part of PR https://github.com/opensearch-project/OpenSearch/pull/1974/files
As part of PR (during upgrade of Lucene 9.0.0) https://github.com/opensearch-project/OpenSearch/pull/1109/files, we have removed the numeric sort optimisations thus causing a regression in sort performance. I think this has been removed due to its deprecated method.
but Lucene didn't say it's numeric optimisation would be deprecated, but the method introduced newly to enable/disable optimisation would be deprecated. apache/lucene@cc58c51. Since these optimisations are enabled by default, earlier these optimisations were disabled by default.

I tried adding back the code in https://github.com/opensearch-project/OpenSearch/pull/1974/files with sortField.setOptimizeSortWithPoints(true); for Date and Long data types and with that we were able to achieve similar performance (before the removal) in Open Search 2.3.

Below are the results from OS_2.3. (ran on managed AOS)
Units in ms.

Data Node Type.	Sort Mode	OS 2.3	OS 2.3 (before)
r6g.2xlarge	asc	59	349
r6g.2xlarge	desc	170	1646

Issues Resolved

[OpenSearch-Project-5534]

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- New functionality has javadoc added
Commits are signed per the DCO using --signoff
Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2023-02-14T18:44:24Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/11134/
CommitID: 4530ec5
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

Bukhtawar

Thanks Chaitanya, Can we add some tests around this change. Also might be good to see some perf numbers around this change

server/src/main/java/org/opensearch/index/fielddata/IndexNumericFieldData.java

andrross

@gashutos Can you fix the spotless errors?

server/src/main/java/org/opensearch/index/fielddata/IndexNumericFieldData.java

nknize · 2023-02-15T05:39:07Z

Bleh. This is a long standing todo from the Lucene 9 upgrade!

The problem really originates from LUCENE-9280 which introduced an optimization that skips non-competitive docs by using the BKD index when sorting by fields other than _score. The reason the sortField.setOptimizeSortWithPoints line was introduced is because Lucene 9 expects the BYTE size for the BKD index and docvalues to match along with the byte size of the SortField when comparing for sort. OpenSearch violates this assumption by using the wider byte size for every SortField.Type in NumericType. We saw this same problem in several test failures when working on the Lucene 9 upgrade PR.

The wider fix that @reta is referring to is much more involved as it's going to require logic for handling types when merging results coming from different indexes. So I'm +1 for a quick patch (w/ bugfix backport) to temporarily fix critical performance regressions like sorting by non _score fields in already index sorted indices (e.g., time series). (gripe: it's SUPER frustrating not having nightly benchmarks to expose these regressions to the wider audience so I may just go ahead and adapt LuceneBench python scripts to use OpenSearch-Benchmarks - unless someone else out there wants to pick up that task, hint hint!!!!!)

gashutos · 2023-02-15T06:07:35Z

Thanks Chaitanya, Can we add some tests around this change. Also might be good to see some perf numbers around this change

Thanks for the review Bukhtawar ! The results I've added already in description of this PR. But they were hard to read since I didnt format those :)

Below are the results from OS_2.3. (ran on managed AOS)
Units in ms.

Data Node Type.         Sort Mode             OS 2.3             OS 2.3(before)
r6g.2xlarge.               asc                264                  428
r6g.2xlarge.               desc               639                  2422

gashutos · 2023-02-15T15:14:57Z

Not often (fingers crossed) but it might, u64 could be considered somewhat esoteric but it would allow people to migrate more easily from Elasticsearch. Also, I think we could go as far as supporting BigInteger for example at some point. To conclude - it might happen, and since we know there are such decisions points buried in code - making them explicit would 100% help later on.

The worst case, newly introduced type would not able to get sort optimisation advantage. Correctness would be still intact. The plan is anyway to support all field types some time later on.

gashutos · 2023-02-15T15:17:13Z

Holding up on the merge until we sort out the optimization flag decision.

What about this option: add applySortOptimization(SortField field) to numeric type contract, the boolean is gone and each numeric type is in control of any possible optimization it could do?

Where will be the decider logic in that case ? if to apply optimisation or not ? You mean move the switch code block in that contract method ?

reta · 2023-02-15T15:24:06Z

Where will be the decider logic in that case ? if to apply optimisation or not ? You mean move the switch code block in that contract method ?

There won't be switch because each numeric type (LONG/INT/...) would implement the method, the types which do not support optimization(s) would do nothing.

github-actions · 2023-02-15T15:24:15Z

Gradle Check (Jenkins) Run Completed with:

RESULT: UNSTABLE ❕
TEST FAILURES:

      1 org.opensearch.indices.replication.SegmentReplicationStatsIT.testSegmentReplicationStatsResponse

URL: https://build.ci.opensearch.org/job/gradle-check/11182/
CommitID: 1617be2
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

github-actions · 2023-02-15T15:28:44Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/11183/
CommitID: 4c19a9b

gashutos · 2023-02-15T15:38:40Z

Where will be the decider logic in that case ? if to apply optimisation or not ? You mean move the switch code block in that contract method ?

There won't be switch because each numeric type (LONG/INT/...) would implement the method, the types which do not support optimization(s) would do nothing.

that would be similar to boolean in ctor right ? Its just 50% of types wont implement the methods. (depending on how many types we support at that point of time).

How about adding a static set in NumericType,

// Lucene optimization enabled numeric types
public static final Set<NumericType> SORTOPTIMIZED_TYPES = new HashSet<>(Arrays.asList(DATE, 
        DATE_NANOSECONDS, 
        DOUBLE, 
        LONG));

And we will refer this here
sortField.setOptimizeSortWithPoints(NumericType.SORTOPTIMIZED_TYPES.contains(getNumericType()));

gashutos · 2023-02-15T15:40:53Z

Where will be the decider logic in that case ? if to apply optimisation or not ? You mean move the switch code block in that contract method ?

There won't be switch because each numeric type (LONG/INT/...) would implement the method, the types which do not support optimization(s) would do nothing.

that would be similar to boolean in ctor right ? Its just 50% of types wont implement the methods. (depending on how many types we support at that point of time).

How about adding a static set in NumericType,

// Lucene optimization enabled numeric types
public static final Set SORTOPTIMIZED_TYPES = new HashSet<>(Arrays.asList(DATE,
DATE_NANOSECONDS,
DOUBLE,
LONG));
And we will refer this here
sortField.setOptimizeSortWithPoints(NumericType.SORTOPTIMIZED_TYPES.contains(getNumericType()));

The comment here would easily hint dev who is adding new numerical type.

reta · 2023-02-15T15:43:09Z

Where will be the decider logic in that case ? if to apply optimisation or not ? You mean move the switch code block in that contract method ?

There won't be switch because each numeric type (LONG/INT/...) would implement the method, the types which do not support optimization(s) would do nothing.

that would be similar to boolean in ctor right ? Its just 50% of types wont implement the methods. (depending on how many types we support at that point of time).

How about adding a static set in NumericType,
// Lucene optimization enabled numeric types
public static final Set<NumericType> SORTOPTIMIZED_TYPES = new HashSet<>(Arrays.asList(DATE, 
        DATE_NANOSECONDS, 
        DOUBLE, 
        LONG)); 
And we will refer this here sortField.setOptimizeSortWithPoints(NumericType.SORTOPTIMIZED_TYPES.contains(getNumericType()));

I was thinking about something along these lines: sortField.setOptimizeSortWithPoints(NumericType.SORTOPTIMIZED_TYPES.contains(getNumericType())); -> getNumericType().applySortOptimization(sortField)

It is similar to the flag but a) we don't have this boolean flag anymore b) it becomes cleaner in a sense that sort optimization are delegated directly to numeric type

reta · 2023-02-15T15:55:50Z

The comment here would easily hint dev who is adding new numerical type.

The difference between the comment and contract: the dev has to actually run into this comment somehow vs contract - the new numeric type just will have to implement the method, and it will force the dev to do something about it.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

nknize · 2023-02-15T16:18:57Z

The difference between the comment and contract: the dev has to actually run into this comment somehow vs contract

I agree with this. I pushed a commit to add a deprecated abstract shouldApplyPointSortOptimization method numeric data types have to implement. If set to false the optimization is explicitly disabled. I can add some better javadocs around this but wanted to get it in to see if y'all are good w/ this approach. I think it's less cryptic than the ctor boolean, while accomplishing @reta point that new types have to explicitly think about disabling or enabling the optimization.

gashutos · 2023-02-15T16:21:44Z

The difference between the comment and contract: the dev has to actually run into this comment somehow vs contract

I agree with this. I pushed a commit to add a deprecated abstract shouldApplyPointSortOptimization method numeric data types have to implement. If set to false the optimization is explicitly disabled. I can add some better javadocs around this but wanted to get it in to see if y'all are good w/ this approach. I think it's less cryptic than the ctor boolean, while accomplishing @reta point that new types have to explicitly think about disabling or enabling the optimization.

Looks good to me. I was just afraid about the readability part, but that's fine.

server/src/main/java/org/opensearch/index/fielddata/IndexNumericFieldData.java

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

github-actions · 2023-02-15T16:39:21Z

Gradle Check (Jenkins) Run Completed with:

RESULT: FAILURE ❌
URL: https://build.ci.opensearch.org/job/gradle-check/11185/
CommitID: 9d732aa
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green.
Is the failure a flaky test unrelated to your change?

nknize

I like this balance of numeric API while keeping the tech debt guardrails in place! Thanks to everyone that helped weigh in but a HUGE thanks to @gashutos for the benchmarks and jumping on this long running todo!! Huge contribution!

gashutos · 2023-02-15T17:05:38Z

I like this balance of numeric API while keeping the tech debt guardrails in place! Thanks to everyone that helped weigh in but a HUGE thanks to @gashutos for the benchmarks and jumping on this long running todo!! Huge contribution!

Thank you @reta & @nknize @andrross @Bukhtawar for all in this PR. I will keep you guys posted about my finding for rest of the numeric types which we are not supporting. The optimization we are gaining is pretty good to enable rest of types as well.

github-actions · 2023-02-15T17:10:55Z

Gradle Check (Jenkins) Run Completed with:

RESULT: SUCCESS ✅
URL: https://build.ci.opensearch.org/job/gradle-check/11186/
CommitID: 78f2f6a

This commit restores the sort optimization to use BKD to skip non-competitive docs for numeric types whose BYTES size match between the BKD leaf and doc values encoding. For now this is only LONG, DOUBLE, DATE, and DATE_NANOSECONDS as the remaining NumericTypes use 64bit docvalue encoding while the BKD uses smaller byte encoded space. This also updates the QueryPhase to remove the long time unnecessary in order doc id check and minDoc boolean query for skipping non-competitive docs that is handled by all Lucene 7.0+ sorted indexes. Existing tests are updated. Signed-off-by: Nicholas Walter Knize <nknize@apache.org> Signed-off-by: gashutos <gashutos@amazon.com> Co-authored-by: Nicholas Walter Knize <nknize@apache.org> Co-authored-by: Chaitanya Gohel <gashutos@amazon.com> (cherry picked from commit 6bb9e3e) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

…#6330) This commit restores the sort optimization to use BKD to skip non-competitive docs for numeric types whose BYTES size match between the BKD leaf and doc values encoding. For now this is only LONG, DOUBLE, DATE, and DATE_NANOSECONDS as the remaining NumericTypes use 64bit docvalue encoding while the BKD uses smaller byte encoded space. This also updates the QueryPhase to remove the long time unnecessary in order doc id check and minDoc boolean query for skipping non-competitive docs that is handled by all Lucene 7.0+ sorted indexes. Existing tests are updated. (cherry picked from commit 6bb9e3e) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Nicholas Walter Knize <nknize@apache.org> Co-authored-by: Chaitanya Gohel <gashutos@amazon.com>

Bukhtawar reviewed Feb 14, 2023

View reviewed changes

andrross reviewed Feb 14, 2023

View reviewed changes

server/src/main/java/org/opensearch/index/fielddata/IndexNumericFieldData.java Outdated Show resolved Hide resolved

reta requested changes Feb 14, 2023

View reviewed changes

server/src/main/java/org/opensearch/index/fielddata/IndexNumericFieldData.java Outdated Show resolved Hide resolved

andrross reviewed Feb 14, 2023

View reviewed changes

gashutos changed the title ~~Enable numeric sort optimisation for few numerica sort types~~ Enable numeric sort optimisation for few numerical sort types Feb 15, 2023

add deprecated method to control points optimization on numeric types

9d732aa

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

andrross reviewed Feb 15, 2023

View reviewed changes

server/src/main/java/org/opensearch/index/fielddata/IndexNumericFieldData.java Outdated Show resolved Hide resolved

be more explicit with the deprecated points sort numeric type API

78f2f6a

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>

reta approved these changes Feb 15, 2023

View reviewed changes

nknize approved these changes Feb 15, 2023

View reviewed changes

nknize merged commit 6bb9e3e into opensearch-project:main Feb 15, 2023

nknize added the backport 2.x Backport to 2.x branch label Feb 15, 2023

opensearch-trigger-bot bot mentioned this pull request Feb 15, 2023

[Backport 2.x] Enable numeric sort optimisation for few numerical sort types #6330

Merged

nknize mentioned this pull request Feb 15, 2023

[BUG] OpenSearch sort query performance regression #5534

Closed

This was referenced Feb 21, 2023

Enable sort optimization for all NumericTypes #6420

Closed

Enable numeric sort optimization support for all numeric types #6424

Merged

gashutos mentioned this pull request Mar 23, 2023

Time series based workload sort optimization through Lucene segment sorting in DocumentReader #6814

Closed

This was referenced Jun 20, 2023

Enable point based sort optimization for all custom comparators #8167

Closed

Enable Point based optimization for custom comparators #8168

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable numeric sort optimisation for few numerical sort types #6321

Enable numeric sort optimisation for few numerical sort types #6321

gashutos commented Feb 14, 2023 •

edited

github-actions bot commented Feb 14, 2023

Bukhtawar left a comment

andrross left a comment

nknize commented Feb 15, 2023 •

edited

gashutos commented Feb 15, 2023 •

edited

gashutos commented Feb 15, 2023

gashutos commented Feb 15, 2023 •

edited

reta commented Feb 15, 2023 •

edited

github-actions bot commented Feb 15, 2023

github-actions bot commented Feb 15, 2023

gashutos commented Feb 15, 2023

gashutos commented Feb 15, 2023

reta commented Feb 15, 2023

reta commented Feb 15, 2023

nknize commented Feb 15, 2023

gashutos commented Feb 15, 2023

github-actions bot commented Feb 15, 2023

nknize left a comment

gashutos commented Feb 15, 2023 •

edited

github-actions bot commented Feb 15, 2023

Enable numeric sort optimisation for few numerical sort types #6321

Enable numeric sort optimisation for few numerical sort types #6321

Conversation

gashutos commented Feb 14, 2023 • edited

Description

Issues Resolved

Check List

github-actions bot commented Feb 14, 2023

Gradle Check (Jenkins) Run Completed with:

Bukhtawar left a comment

Choose a reason for hiding this comment

andrross left a comment

Choose a reason for hiding this comment

nknize commented Feb 15, 2023 • edited

gashutos commented Feb 15, 2023 • edited

gashutos commented Feb 15, 2023

gashutos commented Feb 15, 2023 • edited

reta commented Feb 15, 2023 • edited

github-actions bot commented Feb 15, 2023

Gradle Check (Jenkins) Run Completed with:

github-actions bot commented Feb 15, 2023

Gradle Check (Jenkins) Run Completed with:

gashutos commented Feb 15, 2023

gashutos commented Feb 15, 2023

reta commented Feb 15, 2023

reta commented Feb 15, 2023

nknize commented Feb 15, 2023

gashutos commented Feb 15, 2023

github-actions bot commented Feb 15, 2023

Gradle Check (Jenkins) Run Completed with:

nknize left a comment

Choose a reason for hiding this comment

gashutos commented Feb 15, 2023 • edited

github-actions bot commented Feb 15, 2023

Gradle Check (Jenkins) Run Completed with:

gashutos commented Feb 14, 2023 •

edited

nknize commented Feb 15, 2023 •

edited

gashutos commented Feb 15, 2023 •

edited

gashutos commented Feb 15, 2023 •

edited

reta commented Feb 15, 2023 •

edited

gashutos commented Feb 15, 2023 •

edited