Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search enhancement: pinned queries #44345

Merged
merged 2 commits into from
Aug 16, 2019
Merged

Conversation

markharwood
Copy link
Contributor

New query type allows selected documents to be promoted above “any organic” search results.

This is the first feature in a new module search-business-rules which will house licensed (non OSS) logic for rewriting queries according to business rules.
The new PinnedQueryBuilder class offers a new pinned query in the DSL that takes an array of promoted IDs and an “organic” query and ensures the documents with the promoted IDs rank higher than those matched purely by the organic query.

This first commit contains the core query and tests - docs to follow

Closes #44074

@markharwood markharwood requested a review from jpountz July 15, 2019 13:19
@markharwood markharwood self-assigned this Jul 15, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

classname 'org.elasticsearch.xpack.searchbusinessrules.SearchBusinessRules'
extendedPlugins = ['x-pack-core']
}
archivesBaseName = 'x-pack-searchbusinessrules'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's based on a cut-and-paste of vectors' build.gradle with modifications where it seemed appropriate.


@Override
public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost) throws IOException {
final Weight innerWeight = searcher.createWeight(query, ScoreMode.COMPLETE, 1f);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we pass scoreMode rather than hardcode ScoreMode.COMPLETE, and boost rather than 1f?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apologies, this was carried over from ConstantScoreQuery too.

organicAndPinned.add(pinnedQueries.build());
// Cap the scores of the organic query
organicAndPinned.add(new CappedScoreQuery(organicQuery.toQuery(context), MAX_ORGANIC_SCORE));
return new DisjunctionMaxQuery(organicAndPinned, 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI seeing this made me notice that disjunction max queries are a bit inefficient at retrieving top hits at the moment, even though there are some easy wins, especially when the tie-break multiplier is 0, so I opened https://issues.apache.org/jira/browse/LUCENE-8922

@markharwood
Copy link
Contributor Author

@hub-cap This basic-licensed Query will presumably need an Apache-licensed client-side equivalent for use in HLRC. Much code would be duplicated between client and server-side classes so my assumption is server-side QueryBuilder with the execution logic should extend a client-side QueryBuilder that holds the search criteria. The server would override this method in the client-side class:

public class PinnedQueryBuilder extends AbstractQueryBuilder<PinnedQueryBuilder> {
...
@Override
protected Query doToQuery(QueryShardContext context) throws IOException {
    throw new UnsupportedOperationException("Client side-only class for use in HLRC");
}

Does this seem a reasonable way forward?

@javanna
Copy link
Member

javanna commented Jul 24, 2019

heya @markharwood it is correct that client and server code should diverge, especially in this case. I am not sure about server code extending client code though. We don't want server to depend on HLRC? I think the way forward may be to have two separate classes.

@markharwood
Copy link
Contributor Author

markharwood commented Jul 24, 2019

We don't want server to depend on HLRC?

My concern with duplicating code is the potential to diverge e.g. it then relies on 2 classes agreeing on search criteria choices like:

  1. the name of the query
  2. the validation of params (eg if we reject an empty set for IDs or not)

I guess adding test code helps ensure they don't diverge but it may require more HLRC tests to guarantee that.
I suppose the fundamental challenge in organising code here is that the Basic-license code can depend on Apache licensed classes but not vice-versa. Maybe some of the constants and validation logic that would be useful to share needs to be put somewhere common to both HLRC and server-side?

I am not sure about server code extending client code though

To be clear in this case it's a QueryBuilder class - and they currently all exist in package org.elasticsearch.index.query in server. I'm guessing the long-term plan is to move them out of there to break HLRC dependencies on server code

@javanna
Copy link
Member

javanna commented Jul 25, 2019

I'm guessing the long-term plan is to move them out of there to break HLRC dependencies on server code

heya @markharwood indeed, you should chat with @hub-cap to get more details on this though. Not sure if we plan on having some common library shared between server and client. Probably not in the short term.

@jpountz
Copy link
Contributor

jpountz commented Jul 25, 2019

Agreed with @javanna I think it's unlikely we'll get a shared library in the short term. There are already a number of request objects that we duplicated between server and client, I'm leaning towards doing the same here, at least in the initial version. Furthermore I don't think we have to have the validation logic on the client side, we could only have it on the server side to address the concern that validation logic diverges across the client and the server?

@markharwood
Copy link
Contributor Author

OK - I'll put a copy of the existing PinnedQueryBuilder into server's org.elasticsearch.index.query package (minus the "toQuery" implementation) for use by HLRC.
Thanks for the comments @jpountz and @javanna

@jpountz
Copy link
Contributor

jpountz commented Jul 25, 2019

Is there a reason why it couldn't go directly into the client module?

@markharwood
Copy link
Contributor Author

Is there a reason why it couldn't go directly into the client module?

I was putting it with all the other existing QueryBuilders in server but I can break with that tradition if you like

@markharwood
Copy link
Contributor Author

@hub-cap would appreciate your ruling on this packaging issue - new Basic-licensed query (a first) and where to put the Apache-licensed QueryBuilder for use in HLRC

@polyfractal
Copy link
Contributor

Registering interest in this discussion :) @imotov, @nknize and I are running into the same issue on other in-flight PRs (query or agg in Basic+ that needs to be exposed in HLRC). So there will at least be a handful of functionality that will go down whatever route is decided.

@markharwood
Copy link
Contributor Author

@hub-cap added you as a reviewer to check out the HLRC approach to packaging this basic-licensed query.

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one comment, I think we should duplicate the request/response like we did for other APIs in the client module and not in server.

@markharwood
Copy link
Contributor Author

markharwood commented Aug 13, 2019

Following a conversation with Jim we've decided to pull out HLRC support from this PR to allow us to get it into master and will then open a separate discussion to figure out the right approach for HLRC

@markharwood markharwood force-pushed the fix/44074 branch 2 times, most recently from de74c31 to 73c479d Compare August 13, 2019 15:11
@markharwood
Copy link
Contributor Author

Good to go, @jimczi ?
I pulled the HLRC stuff and got a green build here.

Copy link
Contributor

@jimczi jimczi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -0,0 +1 @@
/bin/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed ?

@@ -722,5 +722,5 @@ public static GeoShapeQueryBuilder geoDisjointQuery(String name, String indexedS
*/
public static ExistsQueryBuilder existsQuery(String name) {
return new ExistsQueryBuilder(name);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you revert this ?

…promoted above “any organic” search results.

This is the first feature in a new module `search-business-rules` which will house licensed (non OSS) logic for rewriting queries according to business rules.
The PinnedQueryBuilder class offers a new `pinned` query in the DSL that takes an array of promoted IDs and an “organic” query and ensures the documents with the promoted IDs rank higher than the organic matches.

Closes elastic#44074
@markharwood markharwood merged commit f0dbc68 into elastic:master Aug 16, 2019
markharwood added a commit to markharwood/elasticsearch that referenced this pull request Aug 20, 2019
Search enhancement: - new query type allows selected documents to be promoted above any "organic” search results.
This is the first feature in a new module `search-business-rules` which will house licensed (non OSS) logic for rewriting queries according to business rules.
The PinnedQueryBuilder class offers a new `pinned` query in the DSL that takes an array of promoted IDs and an “organic” query and ensures the documents with the promoted IDs rank higher than the organic matches.

Closes elastic#44074
markharwood added a commit that referenced this pull request Aug 20, 2019
* Search enhancement: pinned queries (#44345)

Search enhancement: - new query type allows selected documents to be promoted above any "organic” search results.
This is the first feature in a new module `search-business-rules` which will house licensed (non OSS) logic for rewriting queries according to business rules.
The PinnedQueryBuilder class offers a new `pinned` query in the DSL that takes an array of promoted IDs and an “organic” query and ensures the documents with the promoted IDs rank higher than the organic matches.

Closes #44074
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Oct 14, 2019
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Oct 15, 2019
* Search enhancement: pinned queries

Addresses elastic/elasticsearch#44345

* update docs in test

* remove params overloads on Ids that already convert to Ids

* add note to IdsQuery
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Oct 15, 2019
* Search enhancement: pinned queries

Addresses elastic/elasticsearch#44345

* update docs in test

* remove params overloads on Ids that already convert to Ids

* add note to IdsQuery
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Search result pinning
8 participants