Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow doc-values only search on geo_point fields #83395

Merged
merged 8 commits into from Feb 2, 2022

Conversation

ywelsch
Copy link
Contributor

@ywelsch ywelsch commented Feb 2, 2022

Similar to #82409, but for geo_point fields.

Allows searching on geo_point fields when those fields are not indexed (index: false) but just doc values are enabled.

Also adds distance feature query support for date fields (bringing date field to feature parity with runtime fields)

This enables searches on archive data, which has access to doc values but not index structures. When combined with searchable snapshots, it allows downloading only data for a given (doc value) field to quickly filter down to a select set of documents.

Relates #81210 and #52728

@ywelsch ywelsch added >enhancement :Search/Mapping Index mappings, including merging and defining field types labels Feb 2, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @ywelsch, I've created a changelog YAML for you.

@ywelsch ywelsch changed the title Date script doc values Allow doc-values only search on geo_point fields Feb 2, 2022
@ywelsch ywelsch mentioned this pull request Feb 2, 2022
32 tasks
@ywelsch ywelsch marked this pull request as ready for review February 2, 2022 09:26
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Feb 2, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @ywelsch

@ywelsch ywelsch merged commit eec2682 into elastic:master Feb 2, 2022
@ywelsch
Copy link
Contributor Author

ywelsch commented Feb 2, 2022

Thank you @romseygeek!

@ywelsch ywelsch mentioned this pull request Feb 2, 2022
@@ -596,10 +601,22 @@ public static long parseToLong(

@Override
public Query distanceFeatureQuery(Object origin, String pivot, SearchExecutionContext context) {
failIfNotIndexedNorDocValuesFallback(context);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

were we silently not returning any result when the field was not indexed? Or failing later with some other exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we were silently not returning any result...
We have this kind of bug in some other places unfortunately as well

return LongPoint.newDistanceFeatureQuery(name(), 1.0f, originLong, pivotLong);
} else {
return new LongScriptFieldDistanceFeatureQuery(
new Script(""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes me a bit nervous. We use the original script only in equals/hashcode, and we use the empty script already for script-less runtime fields that load from _source: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/index/mapper/AbstractScriptFieldType.java#L212 . I am not entirely sure what this may cause down the line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script is actually used nowhere. The reason it's there is because AbstractScriptFieldQuery requires a non-null script. I think we could also make it nullable there and make it clear to callers that a script is optional. I don't see your worry though, how this would interact badly in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Luca is right, it's used in equals/hashcode which is further used by the query cache. So if we're not careful here we could end up with a source-only field and a doc-values field stepping on each other's cache values. I don't think that would actually be a problem here because the source-only field for this fieldname should return the same values as the doc-values field (we also check the query class in equals) but I need to think harder about it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, thanks for pointing that out @romseygeek. I wasn't aware of this dependency.

tlrx pushed a commit to tlrx/elasticsearch that referenced this pull request Mar 3, 2022
Similar to elastic#82409, but for geo_point fields.

Allows searching on geo_point fields when those fields are not indexed (index: false) but just doc values are enabled.

Also adds distance feature query support for date fields (bringing date field to feature parity with runtime fields)

This enables searches on archive data, which has access to doc values but not index structures. When combined with
searchable snapshots, it allows downloading only data for a given (doc value) field to quickly filter down to a select set
of documents.

Relates elastic#81210 and elastic#52728
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>enhancement :Search/Mapping Index mappings, including merging and defining field types Team:Search Meta label for search team v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants