Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to Lucene's new IntField/LongField/FloatField/DoubleField. #93165

Merged
merged 7 commits into from
Jan 31, 2023

Conversation

jpountz
Copy link
Contributor

@jpountz jpountz commented Jan 23, 2023

Lucene introduced new numeric fields that index both points and doc values. This has the same semantics as indexing one field for points and another one for doc values as we did before, but covering both data structures in a single field yielded a speedup in Lucene's nightly benchmarks (see annotation AH) which would be interesting to get too.

This commit does not switch to factory methods for queries such as LongField#newRangeQuery for now, we'll need to look into it in a follow-up.

Lucene introduced new numeric fields that index both points and doc values.
This has the same semantics as indexing one field for points and another one
for doc values as we did before, but covering both data structures in a single
field yielded a speedup in Lucene's nightly benchmarks (see annotation
[AH](http://people.apache.org/~mikemccand/lucenebench/sparseResults.html#index_throughput))
which would be interesting to get too.

This commit does not switch to factory methods for queries such as
`LongField#newRangeQuery` for now, we'll need to look into it in a follow-up.
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v8.7.0 labels Jan 23, 2023
@jpountz jpountz added >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) labels Jan 23, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Jan 23, 2023
@elasticsearchmachine
Copy link
Collaborator

Hi @jpountz, I've created a changelog YAML for you.

@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Jan 23, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

Copy link
Member

@benwtrent benwtrent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add to UnsignedLongFieldMapper as well I think. It also has doc values and is indexed.

Comment on lines 919 to 921
} else {
context.addToFieldNames(fieldType().name());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we only called addToFieldNames if at least store was true. Now, we call addToFieldNames no matter the store value. Is this intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this, I had in my mind that we should add to field names whenever doc values is false, but this field and others do indeed only do this if the field is either indexed or stored as well.

Copy link
Member

@javanna javanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@javanna
Copy link
Member

javanna commented Jan 27, 2023

the bwc test failure is one that has happened before, does not look like it's caused by your test, see #92058 . The percolator failure seems legit :)

@jpountz
Copy link
Contributor Author

jpountz commented Jan 27, 2023

The percolator failure seems legit

Absolutely, it's caused by apache/lucene#12109. The test failure should go away when we upgrade to the final release.

@javanna
Copy link
Member

javanna commented Jan 31, 2023

@elasticmachine update branch

@elasticsearchmachine elasticsearchmachine merged commit c21ee47 into elastic:main Jan 31, 2023
@jpountz jpountz deleted the lucene_new_fields branch January 31, 2023 21:10
@javanna
Copy link
Member

javanna commented Jan 31, 2023

@jpountz you were right about the percolator test failure, also auto-merge for the win here :)

jpountz added a commit to jpountz/elasticsearch that referenced this pull request Feb 2, 2023
…astic#93165)

Lucene introduced new numeric fields that index both points and doc
values. This has the same semantics as indexing one field for points and
another one for doc values as we did before, but covering both data
structures in a single field yielded a speedup in Lucene's nightly
benchmarks (see annotation
[AH](http://people.apache.org/~mikemccand/lucenebench/sparseResults.html#index_throughput))
which would be interesting to get too.

This commit does not switch to factory methods for queries such as
`LongField#newRangeQuery` for now, we'll need to look into it in a
follow-up.
@jpountz jpountz mentioned this pull request Feb 27, 2023
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-merge Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants