Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up named queries. #18470

Merged
merged 1 commit into from May 23, 2016

Conversation

Projects
None yet
3 participants
@jpountz
Copy link
Contributor

commented May 19, 2016

Named queries have a performance bug when they are used with expensive queries
that need to perform a lot of work up-front like fuzzy or range queries
(including with points). The reason is that they currently re-create the weight
and scorer for every hit. Instead we should create weights exactly once and
use a single Scorer for all documents that are on the same segment.

@martijnvg

This comment has been minimized.

Copy link
Member

commented May 19, 2016

Nice speedup! LGTM

@s1monw

View changes

...src/main/java/org/elasticsearch/search/fetch/matchedqueries/MatchedQueriesFetchSubPhase.java Outdated
return;
}
hits = hits.clone(); // don't modify the incoming hits
Arrays.sort(hits, (a, b) -> a.docId() - b.docId());

This comment has been minimized.

Copy link
@s1monw

s1monw May 20, 2016

Contributor

should we use Integer.compareTo() here instead? I wonder if that is correct on very large segments?

This comment has been minimized.

Copy link
@jpountz

jpountz May 23, 2016

Author Contributor

It is correct, but I can switch to compareTo if you like it better

}
}
for (int i = 0; i < hits.length; ++i) {
hits[i].matchedQueries(matchedQueries[i].toArray(new String[0]));

This comment has been minimized.

Copy link
@s1monw

s1monw May 20, 2016

Contributor

unrelated but maybe we can clean this further up and rename matchedQueries to setMatchedQueries and make it use a List instead of a String array.

This comment has been minimized.

Copy link
@jpountz

jpountz May 23, 2016

Author Contributor

I'd like it to be done in another change that would change all InternalSearchHit setters in one go.

String name = entry.getKey();
Query filter = entry.getValue();
@Override
public boolean hitExecutionNeeded(SearchContext context) {

This comment has been minimized.

Copy link
@s1monw

s1monw May 20, 2016

Contributor

maybe use default in FetchSubPhase?

This comment has been minimized.

Copy link
@jpountz

jpountz May 23, 2016

Author Contributor

actually I have plans to remove FetchSubPhase.hitExecute/hitExecutionNeeded and make all impls implement hitsExecute (note the additional s after hit) instead

}
}
@Override
public void hitExecute(SearchContext context, HitContext hitContext) {

This comment has been minimized.

Copy link
@s1monw

s1monw May 20, 2016

Contributor

same here?

@s1monw

This comment has been minimized.

Copy link
Contributor

commented May 20, 2016

left some minors but LGTM too

Speed up named queries. #18470
Named queries have a performance bug when they are used with expensive queries
that need to perform a lot of work up-front like fuzzy or range queries
(including with points). The reason is that they currently re-create the weight
and scorer for every hit. Instead we should create weights exactly once and
use a single Scorer for all documents that are on the same segment.

@jpountz jpountz force-pushed the jpountz:fix/speed_up_named_queries branch to cb2cfdd May 23, 2016

@jpountz jpountz merged commit cb2cfdd into elastic:master May 23, 2016

1 check passed

CLA Commit author is a member of Elasticsearch
Details

@jpountz jpountz deleted the jpountz:fix/speed_up_named_queries branch May 23, 2016

jpountz added a commit that referenced this pull request May 23, 2016

Speed up named queries. #18470
Named queries have a performance bug when they are used with expensive queries
that need to perform a lot of work up-front like fuzzy or range queries
(including with points). The reason is that they currently re-create the weight
and scorer for every hit. Instead we should create weights exactly once and
use a single Scorer for all documents that are on the same segment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.