Implement distance_feature for runtime dates #60851

Merged

Conversation

nik9000
Member

@nik9000 nik9000 commented Aug 6, 2020

This implements distance_feature for date-valued runtime_scripts. It produces the same numbers as running against an indexed date, but it doesn't have the same performance characteristics at all, which is normal for runtime_scripts. distance_feature against an indexed field does a lot of work to refine the query as it goes, limiting the number of documents that it has to visit. We can't do that because we don't have an index, so we just spit out the same numbers and hope it is good enough.
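
For readers who want to see what "the same numbers" means, here is a minimal sketch of the documented distance_feature scoring formula, boost * pivot / (pivot + distance). The class and method names are hypothetical and are not taken from this PR; the overflow guard is an assumption about how one would keep the distance non-negative for extreme long values.

```java
public final class DistanceFeatureSketch {
    private DistanceFeatureSketch() {}

    /** Absolute distance between two longs, saturating at Long.MAX_VALUE on overflow. */
    public static long distance(long value, long origin) {
        long d = Math.max(value, origin) - Math.min(value, origin);
        // A negative result means the subtraction overflowed, so clamp it.
        return d < 0 ? Long.MAX_VALUE : d;
    }

    /** Score decays from boost (at the origin) to boost / 2 (one pivot away). */
    public static float score(float boost, long pivot, long origin, long value) {
        double d = distance(value, origin);
        return (float) (boost * (pivot / (pivot + d)));
    }

    public static void main(String[] args) {
        // Hypothetical values: origin at 5000 ms, pivot of 1000 ms, boost of 2.
        System.out.println(score(2f, 1000L, 5000L, 5000L)); // at the origin
        System.out.println(score(2f, 1000L, 5000L, 6000L)); // one pivot away
    }
}
```

A value exactly at the origin scores the full boost, and a value one pivot away scores half of it, regardless of whether the value came from an index or from a script.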

@nik9000 nik9000 added the :Search/Search Search-related issues that do not fall into other categories label Aug 6, 2020
@nik9000 nik9000 requested a review from javanna August 6, 2020 21:23
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Aug 6, 2020
@javanna javanna mentioned this pull request Aug 6, 2020
30 tasks
Contributor

@mayya-sharipova mayya-sharipova left a comment

Thanks @nik9000, I have left a couple of comments


@Override
public float getMaxScore(int upTo) throws IOException {
    return boost;
Contributor

Should the potential max score be weight instead of boost?
Also, this method and several other methods don't seem to throw IOException.

Member Author

Yes! good catch.


@Override
public int hashCode() {
    return Objects.hash(super.hashCode(), origin, pivot);
Contributor

Should we incorporate the initial boost into hashCode, equals, and toString?

Member Author

Yes.
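
A rough sketch of what including the boost in all three methods could look like. The class name and field set here are hypothetical stand-ins, not the query class from this PR, which also folds in super.hashCode().

```java
import java.util.Objects;

public final class DistanceQuerySketch {
    private final String field;
    private final long origin;
    private final long pivot;
    private final float boost;

    public DistanceQuerySketch(String field, long origin, long pivot, float boost) {
        this.field = field;
        this.origin = origin;
        this.pivot = pivot;
        this.boost = boost;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        DistanceQuerySketch other = (DistanceQuerySketch) obj;
        // Float.compare avoids the NaN and -0.0 pitfalls of ==.
        return field.equals(other.field)
            && origin == other.origin
            && pivot == other.pivot
            && Float.compare(boost, other.boost) == 0;
    }

    @Override
    public int hashCode() {
        // boost participates so that queries differing only in boost hash apart.
        return Objects.hash(field, origin, pivot, boost);
    }

    @Override
    public String toString() {
        return "DistanceQuerySketch(field=" + field + ",origin=" + origin
            + ",pivot=" + pivot + ",boost=" + boost + ")";
    }
}
```

Without the boost in equals and hashCode, two queries that differ only in boost would compare equal, which matters anywhere queries are used as cache keys.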

protected DistanceScorer(Weight weight, AbstractLongScriptFieldScript script, int maxDoc, float boost) {
    super(weight);
    this.script = script;
    twoPhase = new TwoPhaseIterator(DocIdSetIterator.all(maxDoc)) {
Contributor

I am not familiar with runtime fields, but I am wondering if we intend always to create an iterator across all documents? Do we plan to add support to limit number of docs (e.g. only docs returned by a top filter)?

Member Author

I believe we're hoping for bool queries to AND together a "normal" query and a runtime field query. I've experimented with this for our term and match style queries and it seems to work pretty well. If the "normal" query is selective then, for most documents, the runtime query won't even be asked whether it matches. On the flip side, if the runtime field query is non-selective then we'll quickly fill up the 10,000 hits and terminate early.
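
To make the two cases concrete, here is a toy model in plain Java. It is not how Lucene implements conjunctions or early termination; the names are hypothetical, the "normal" query is reduced to a sorted array of candidate doc ids, and the runtime field check is reduced to a predicate whose invocations are counted.

```java
import java.util.function.IntPredicate;

public final class ConjunctionSketch {
    private ConjunctionSketch() {}

    /** How many docs matched and how many times the expensive check ran. */
    public static final class Result {
        public final int hits;
        public final int checks;

        Result(int hits, int checks) {
            this.hits = hits;
            this.checks = checks;
        }
    }

    /**
     * Walks the candidates from the leading (cheap) query and only asks the
     * runtime predicate about those docs, stopping once maxHits is reached.
     */
    public static Result collect(int[] candidates, IntPredicate runtimeMatches, int maxHits) {
        int hits = 0;
        int checks = 0;
        for (int doc : candidates) {
            if (hits >= maxHits) {
                break; // enough hits collected, stop early
            }
            checks++;
            if (runtimeMatches.test(doc)) {
                hits++;
            }
        }
        return new Result(hits, checks);
    }

    /** Candidate list for the "no leading query" case: every doc id in [0, maxDoc). */
    public static int[] allDocs(int maxDoc) {
        int[] docs = new int[maxDoc];
        for (int i = 0; i < maxDoc; i++) {
            docs[i] = i;
        }
        return docs;
    }
}
```

With a selective leading query that yields three candidates, the predicate runs three times no matter how large the index is. With every document as a candidate and a predicate that always matches, the loop stops after maxHits checks rather than visiting the whole index.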

Member Author

@nik9000 nik9000 left a comment

I'll push a patch to address your comments soon!


@Override
public float getMaxScore(int upTo) throws IOException {
return boost;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! good catch.


@Override
public int hashCode() {
return Objects.hash(super.hashCode(), origin, pivot);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

protected DistanceScorer(Weight weight, AbstractLongScriptFieldScript script, int maxDoc, float boost) {
super(weight);
this.script = script;
twoPhase = new TwoPhaseIterator(DocIdSetIterator.all(maxDoc)) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we're hoping for bool queries to AND together a "normal" query and a runtime field query. I've experimented with this for our term and match style queries and it seems to work pretty well. If the "normal" query is selective then the runtime query won't be asked if it matches most documents. On the flip side, if the runtime field query non-selective then we'll quickly fill up the 10,000 hits and terminate early.

boost
);
});
}
Member

I wonder what the plan is for the instanceof checks in DistanceFeatureQueryBuilder#doToQuery. Are we ok with keeping those?

Member

Oh, actually those are gone upstream, great! Sorry for the noise then, you already did what I would have asked you to do.

Member

@javanna javanna left a comment

LGTM

@nik9000 nik9000 merged commit f3b65eb into elastic:feature/runtime_fields Aug 17, 2020
Labels
:Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team
4 participants