Enable point based sort optimization for all custom comparators #8167

Closed
gashutos opened this issue Jun 20, 2023 · 5 comments
Labels
enhancement Enhancement or improvement to existing feature or request Indexing & Search Performance This is for any performance related enhancements or bugs v2.9.0 'Issues and PRs related to version v2.9.0'

Comments

@gashutos
Contributor

gashutos commented Jun 20, 2023

As part of #6321 and #6424, we enabled Lucene's numeric sort optimization, which is based on BKD point-value skipping logic.
We can enable it for our custom comparators as well. It was disabled for them during the upgrade to Lucene 9.1.0 in PR 2487.

There is no harm in adding this back. Queries like the one below on nyc_taxis show a 15x improvement on scaled_float numeric types; total_amount is a scaled_float field.

GET nyc_taxis/_search?size=1000
{
  "sort": [
    {
      "total_amount": {
        "order": "asc"
      }
    }
  ]
}

Edit: The above query takes ~1350 ms without the optimization and ~75 ms with it.
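For readers unfamiliar with the skipping idea, here is a minimal Python sketch of why point-based skipping helps a top-N sort. It is a toy model, not Lucene's implementation: `blocks` is a hypothetical stand-in for BKD leaf blocks, and the function name is made up for illustration. Once the top-N queue is full, any block whose minimum value cannot beat the current N-th best hit is skipped without visiting its documents.

```python
import heapq


def top_k_with_block_skipping(blocks, k):
    """Return (k smallest values in ascending order, number of blocks skipped).

    `blocks` is a list of non-empty lists standing in for BKD leaf blocks.
    This layout is an assumption for illustration, not Lucene's format.
    """
    heap = []    # max-heap via negated values: the k best hits seen so far
    skipped = 0
    for block in blocks:
        # In a real index the per-block minimum is precomputed metadata;
        # here it is recomputed to keep the sketch self-contained.
        block_min = min(block)
        if len(heap) == k and block_min >= -heap[0]:
            # Nothing in this block can beat the current k-th best hit.
            skipped += 1
            continue
        for value in block:
            if len(heap) < k:
                heapq.heappush(heap, -value)
            elif value < -heap[0]:
                heapq.heapreplace(heap, -value)
    return sorted(-v for v in heap), skipped


blocks = [[5.0, 1.0, 3.0], [9.0, 8.0, 7.0], [2.0, 6.0], [10.0, 12.0]]
top, skipped = top_k_with_block_skipping(blocks, k=3)
```

In this toy input two of the four blocks are never scanned. With size=1000 over a large index, most blocks can be ruled out the same way, which is consistent with the kind of speedup reported above.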

@gashutos gashutos added enhancement Enhancement or improvement to existing feature or request untriaged labels Jun 20, 2023
@anasalkouz
Member

Hi @gashutos, could you share the performance testing results?

@reta
Collaborator

reta commented Jun 27, 2023

#8168

@gashutos
Contributor Author

@anasalkouz In OSB, we don't have any numeric-type queries that use custom comparators.
But if you run asc/desc sort queries on nyc_taxis like the one below, they take 1300-1400 ms without this optimization and about 80 ms with it.

GET nyc_taxis/_search?size=1000
{
  "sort": [
    {
      "total_amount": {
        "order": "asc"
      }
    }
  ]
}
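To reproduce the timings, something like the following Python sketch could be used. The base URL `http://localhost:9200`, the helper names, and the absence of authentication are assumptions for illustration; only the index name, field, size, and sort body come from the query above. It reports both wall-clock time and the `took` value from the search response.

```python
import json
import time
import urllib.request


def build_sort_body(field, order):
    """Build the sort request body used in the query above."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {"sort": [{field: {"order": order}}]}


def time_search(base_url, index, body, size=1000):
    """Send one search; return (wall-clock ms, server-reported 'took' ms)."""
    url = f"{base_url}/{index}/_search?size={size}"
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, payload.get("took")


def run_both_orders(base_url="http://localhost:9200", index="nyc_taxis",
                    field="total_amount"):
    """Time the asc and desc variants once each (assumed local cluster)."""
    results = {}
    for order in ("asc", "desc"):
        results[order] = time_search(base_url, index, build_sort_body(field, order))
    return results
```

Calling `run_both_orders()` against a cluster with nyc_taxis loaded, once with and once without the change, gives a rough comparison. A single run is noisy, so repeating each query several times and discarding warm-up runs would be more reliable.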

@gashutos gashutos self-assigned this Jun 28, 2023
@gashutos gashutos added the Performance This is for any performance related enhancements or bugs label Jun 28, 2023
@gashutos gashutos changed the title Enable point based sort optimization for all cusom comparators Enable point based sort optimization for all custom comparators Jun 28, 2023
@reta
Collaborator

reta commented Jun 28, 2023

> @anasalkouz In OSB, we dont have such numeric types queries where it use custom comparators.

@gashutos @anasalkouz we could actually come up with relevant benchmarks now that we have unsigned_long, which implements a custom comparator. Added opensearch-project/opensearch-benchmark-workloads#85
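As background on why a type like unsigned_long needs its own comparator, here is a small Python sketch. It assumes, purely for illustration, that values are held as signed 64-bit bit patterns; it does not describe how OpenSearch actually stores or compares the field, and the helper names are hypothetical. A plain signed comparison misorders every value at or above 2^63.

```python
MASK_64 = 2**64 - 1


def to_signed_64(value):
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit integer."""
    if not 0 <= value <= MASK_64:
        raise ValueError("value out of unsigned 64-bit range")
    return value - 2**64 if value >= 2**63 else value


def unsigned_sort_key(signed_value):
    """Map a signed 64-bit bit pattern to a key that sorts in unsigned order."""
    return signed_value & MASK_64


values = [1, 2**63, 5, 2**64 - 1]
stored = [to_signed_64(v) for v in values]          # assumed storage form
naive = sorted(stored)                              # signed order: wrong
correct = sorted(stored, key=unsigned_sort_key)     # unsigned order: right
```

Here `naive` places the two largest unsigned values first, while `correct` keeps them last, which is the ordering a sort on such a field has to produce.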

@reta reta added the v2.9.0 'Issues and PRs related to version v2.9.0' label Jun 29, 2023
@reta
Collaborator

reta commented Jun 29, 2023

Fixed by #8168

4 participants