Query rewrite context and search execution context refactoring #96353

Contributor

@salvatore-campagna salvatore-campagna commented May 25, 2023

The idea is to clean up the code so that we have a better structure around how we use QueryRewriteContext and SearchExecutionContext. Both are used by the rewrite logic to simplify queries and to reduce query execution latency and resource usage.

We would like to enable the following three scenarios:

  • Rewrite happening on the coordinator, which requires the QueryRewriteContext and some additional information but not a SearchExecutionContext. This is the case for rewrite operations which happen entirely on the coordinator node and spare us from executing the query on the data node.
  • Rewrite happening on the data node which requires the QueryRewriteContext and some information available in the SearchExecutionContext. In some scenarios an IndexSearcher is not required. Not using it saves us a few unnecessary IO operations and allows us to complete query execution before we even need to read the index.
  • Rewrite happening on the data node which requires the QueryRewriteContext and the full SearchExecutionContext. In this scenario an IndexSearcher is required for the rewrite logic to be fully exploited.

For this reason we will refactor the SearchExecutionContext, pulling up a few fields/methods (such as getFieldType and getIndexSettings) into the QueryRewriteContext. Moreover, we will need to revisit the implementation of doRewrite for subclasses of AbstractQueryBuilder so as to enable the three rewrite branches described above.
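
As a rough illustration of the pull-up, here is a minimal sketch in which the mapping lookup lives in the base context, so a rewrite that only needs mappings works with or without a searcher. All class names are hypothetical stand-ins, not the real Elasticsearch classes, and the "constant value per index" field is an assumption made only to give the rewrite something to decide on.

```java
import java.util.Map;

// Hypothetical, heavily simplified stand-ins; not the real Elasticsearch classes.
final class FieldTypeSketch {
    final String name;
    final String constantValue; // assumption: null unless the field holds one constant value per index

    FieldTypeSketch(String name, String constantValue) {
        this.name = name;
        this.constantValue = constantValue;
    }
}

// Base context: mapping lookups live here after the pull-up, so a rewrite
// that only needs mappings does not need a searcher at all.
class RewriteContextSketch {
    private final Map<String, FieldTypeSketch> mappings;

    RewriteContextSketch(Map<String, FieldTypeSketch> mappings) {
        this.mappings = mappings;
    }

    FieldTypeSketch getFieldType(String fieldName) {
        return mappings.get(fieldName);
    }
}

// Full data-node context: inherits the lookups and adds searcher access
// (modelled as an opaque Object to keep the sketch dependency-free).
class SearchExecutionContextSketch extends RewriteContextSketch {
    private final Object searcher;

    SearchExecutionContextSketch(Map<String, FieldTypeSketch> mappings, Object searcher) {
        super(mappings);
        this.searcher = searcher;
    }

    Object searcher() {
        return searcher;
    }
}

final class TermRewriteSketch {
    // Decides from mappings alone whether a term query can match nothing;
    // works with either context because only getFieldType is used.
    static boolean canSkip(RewriteContextSketch context, String field, String value) {
        FieldTypeSketch type = context.getFieldType(field);
        return type != null && type.constantValue != null && !type.constantValue.equals(value);
    }
}
```

Because canSkip only depends on the base type, the same logic can run in the second scenario (no IndexSearcher) and in the third (full context) without changes.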

Resolves #96280

@salvatore-campagna salvatore-campagna self-assigned this May 25, 2023
@salvatore-campagna salvatore-campagna added :Search/Search Search-related issues that do not fall into other categories >refactoring v8.9.0 test-full-bwc Trigger full BWC version matrix tests labels May 25, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label May 25, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@salvatore-campagna
Contributor Author

@elasticsearchmachine run elasticsearch-ci/full-bwc

Member

@javanna javanna left a comment

left a small comment, LGTM otherwise

queryRewriteContext
);
final MappedFieldType fieldType = queryRewriteContext.getFieldType(fieldName);
if (queryRewriteContext instanceof final CoordinatorRewriteContext coordinatorRewriteContext
Member

maybe the convertToCoordinatorRewriteContext was better in that it let us avoid this instanceof check?

Contributor Author

Yes, I was considering a few more small changes before asking for a review. I wanted to run the full CI on it, including full bwc (even if I don't expect bwc issues). Thanks.

Contributor Author

I also need to go through the JavaDoc to make sure I update everything to reflect the refactoring.

Member

thanks! ping me for review when ready ;)

@salvatore-campagna
Contributor Author

@elasticsearchmachine run elasticsearch-ci/full-bwc

@salvatore-campagna
Contributor Author

salvatore-campagna commented May 26, 2023

As it is now, a QueryBuilder can call both convertToCoordinatorRewriteContext and convertToSearchExecutionContext, similar to what we do in RangeQueryBuilder#doRewrite and RangeQueryBuilder#getRelation. This way we can enable rewriting of all queries on the coordinator node and on the data node (after updating PR #96161). We will still need to actually do the rewrite in the SearchService.

@salvatore-campagna
Contributor Author

I was wondering whether I can move the logic that calls the two rewrites (coordinator and data node rewrite) up into the AbstractQueryBuilder and then have subclasses override two methods: doCoordinatorRewrite and doSearchRewrite.
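
A minimal sketch of what that could look like, assuming the base builder picks the branch through the two convert methods and subclasses only override the hook they care about. The class names are hypothetical stand-ins (only the method names convertToCoordinatorRewriteContext, convertToSearchExecutionContext, doCoordinatorRewrite and doSearchRewrite come from this thread), and the actual change may be structured differently.

```java
// Hypothetical, simplified stand-ins; not the real Elasticsearch classes.
class BaseContext {
    // Default conversions return null: a plain context is neither specialised kind.
    CoordinatorContext convertToCoordinatorRewriteContext() { return null; }
    ShardContext convertToSearchExecutionContext() { return null; }
}

class CoordinatorContext extends BaseContext {
    @Override
    CoordinatorContext convertToCoordinatorRewriteContext() { return this; }
}

class ShardContext extends BaseContext {
    @Override
    ShardContext convertToSearchExecutionContext() { return this; }
}

abstract class QueryBuilderSketch {
    // Template method: picks the branch once, so subclasses never inspect the context type.
    final QueryBuilderSketch rewrite(BaseContext context) {
        CoordinatorContext coordinator = context.convertToCoordinatorRewriteContext();
        if (coordinator != null) {
            return doCoordinatorRewrite(coordinator);
        }
        ShardContext shard = context.convertToSearchExecutionContext();
        if (shard != null) {
            return doSearchRewrite(shard);
        }
        return this; // no specialised context: nothing to rewrite
    }

    // Both hooks default to "no rewrite"; subclasses override only what they support.
    protected QueryBuilderSketch doCoordinatorRewrite(CoordinatorContext context) { return this; }
    protected QueryBuilderSketch doSearchRewrite(ShardContext context) { return this; }
}

final class MatchNoneSketch extends QueryBuilderSketch {}

final class RangeSketch extends QueryBuilderSketch {
    @Override
    protected QueryBuilderSketch doCoordinatorRewrite(CoordinatorContext context) {
        return new MatchNoneSketch(); // pretend the coordinator proved no shard can match
    }
}
```

With this shape, `new RangeSketch().rewrite(new CoordinatorContext())` yields a MatchNoneSketch, while the same query rewritten against a ShardContext or a plain BaseContext is returned unchanged.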

Contributor

@romseygeek romseygeek left a comment

Thanks, this looks great. I like the idea of explicit coordinatorRewrite and searchRewrite methods as well. Maybe a canMatchRewrite too?

}
}

public void setAllowUnmappedFields(boolean allowUnmappedFields) {
Contributor

I wonder if we can remove these two by having a special PercolatorRewriteContext that wraps an incoming rewrite context and overrides failIfFieldMappingNotFound()? For a follow-up maybe.
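
To make the suggestion concrete, a rough sketch of such a wrapper follows. Everything here is hypothetical: the class names, the resolveField helper and the single-argument signature of failIfFieldMappingNotFound are assumptions for illustration and do not mirror the real method.

```java
import java.util.Set;

// Hypothetical, simplified stand-ins; not the real Elasticsearch classes.
class MappingContextSketch {
    private final Set<String> mappedFields;

    MappingContextSketch(Set<String> mappedFields) { this.mappedFields = mappedFields; }

    boolean isMapped(String field) { return mappedFields.contains(field); }

    // Returns the field name when mapped, otherwise defers to the overridable hook.
    final String resolveField(String field) {
        return isMapped(field) ? field : failIfFieldMappingNotFound(field);
    }

    // Strict by default: unmapped fields are an error.
    protected String failIfFieldMappingNotFound(String field) {
        throw new IllegalArgumentException("No field mapping found for [" + field + "]");
    }
}

// Wraps an incoming context and relaxes only the unmapped-field behaviour,
// so the wrapped context needs no mutable setAllowUnmappedFields flag.
final class PercolatorContextSketch extends MappingContextSketch {
    private final MappingContextSketch delegate;

    PercolatorContextSketch(MappingContextSketch delegate) {
        super(Set.of());
        this.delegate = delegate;
    }

    @Override
    boolean isMapped(String field) { return delegate.isMapped(field); }

    @Override
    protected String failIfFieldMappingNotFound(String field) {
        return field; // lenient: accept the unmapped field instead of throwing
    }
}
```

The point of the design is that leniency becomes a property of the context type rather than a flag that has to be set and reset on a shared instance.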

Contributor Author

Need to understand how to deal with SearchExecutionContext#reset calling setAllowUnmappedFields...maybe we can just get rid of reset() by having a specific PercolatorExecutionContext?

Contributor Author

To do in a followup PR.

Member

this method (setAllowUnmappedFields) has been bugging me for a while :) sounds good to tackle in a follow-up

Member

@martijnvg martijnvg left a comment

LGTM, looks like a clean move of a number of methods and fields to QueryRewriteContext.


public QueryRewriteContext(XContentParserConfiguration parserConfiguration, Client client, LongSupplier nowInMillis) {
public QueryRewriteContext(
Member

Is the idea to create a new instance of this class with mapper service etc. in #96161?

Contributor Author

@salvatore-campagna salvatore-campagna May 26, 2023

The idea here is to just move stuff and later on for the other PR just:

  • remove unnecessary things that we have here as a result of moving fields into the QueryRewriteContext
  • include the rewrite in the SearchService so as to actually trigger the rewrite in the data node
  • eventually change the code for at least MatchPhraseQueryBuilder to actually implement the shard skipping logic for the query used by Observability integrations.

Not sure why we need to create a new class here.

Contributor Author

Ideally here I would like to have everything excluding what is strictly needed for skipping shards. By refactoring I mean "change structure without changing behavior". Ideally this means green tests -> refactor -> green tests (even if I will probably include a couple of tests for the two new methods doCoordinatorRewrite and doSearchRewrite).

Contributor Author

The main difference will be in creating a QueryRewriteContext where in the other PR we create a SearchExecutionContext. This way we will not need to set the IndexSearcher to null, which was one of the issues there.

Member

@javanna javanna left a comment

LGTM it's great to see this cleanup. Thanks!

@salvatore-campagna
Contributor Author

@elasticsearchmachine run elasticsearch-ci/full-bwc

@salvatore-campagna
Contributor Author

@javanna @romseygeek do you see any reason why the full bwc tests (specifically against version 8.8.0) fail? I ran them "just to be sure" without expecting any of them to fail. The test fails because of a timeout waiting for the cluster to go from yellow to green...

@salvatore-campagna salvatore-campagna removed the test-full-bwc Trigger full BWC version matrix tests label May 30, 2023
@salvatore-campagna
Contributor Author

@elasticsearchmachine test this please

@salvatore-campagna
Contributor Author

salvatore-campagna commented May 30, 2023

So it looks like the PercolatorExecutionContext refactoring breaks BWC. I didn't understand at which point serialization happens. I discussed it with Martijn and I understand serialization kicks in when we store the percolator query. I will revert the Percolator changes and leave them out of this PR. We can take care of it in another PR.

@salvatore-campagna
Contributor Author

@elasticsearchmachine run elasticsearch-ci/part-1 please

Member

@javanna javanna left a comment

left a couple of questions, LGTM otherwise.

@@ -286,6 +286,36 @@ public final QueryBuilder rewrite(QueryRewriteContext queryRewriteContext) throw
}

protected QueryBuilder doRewrite(QueryRewriteContext queryRewriteContext) throws IOException {
if (queryRewriteContext == null) {
Member

when can this be null?

Contributor Author

There is at least one case where this can be null, and it is uncovered by a test that deals with fuzzy queries.

./gradlew ':rest-api-spec:yamlRestTest' --tests "org.elasticsearch.test.rest.ClientYamlTestSuiteIT.test {yaml=search/320_disallow_queries/Test disallow expensive queries}" -Dtests.seed=C6EE7229D969ECFD -Dtests.locale=es-SV -Dtests.timezone=Asia/Novosibirsk -Druntime.java=20

I will try to debug it and see if I can understand why it is null in such a scenario.

Contributor Author

@salvatore-campagna salvatore-campagna May 31, 2023

Also, when we call convertToSearchExecutionContext, the default implementation of this method returns null. Probably, if a rewrite is tried later, after another rewrite somewhere else has obtained null (because there is no override), we might end up in a situation where this is null.

Contributor Author

Unfortunately the default implementation of convertToSearchExecutionContext cannot return this, because a SearchExecutionContext is expected as the return value instead of a QueryRewriteContext...so we need to return null there.
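
A small sketch of how a null context could reach doRewrite under that constraint. The class names are hypothetical stand-ins and the call pattern is an assumption about one way this can happen, not a trace of the failing test.

```java
// Hypothetical, simplified stand-ins; not the real Elasticsearch classes.
class PlainContextSketch {
    // `return this;` would not compile here: a plain context is not a
    // DataNodeContextSketch, so the only safe default is null.
    DataNodeContextSketch convertToSearchExecutionContext() { return null; }
}

class DataNodeContextSketch extends PlainContextSketch {
    @Override
    DataNodeContextSketch convertToSearchExecutionContext() { return this; }
}

class FuzzySketch {
    FuzzySketch doRewrite(PlainContextSketch context) {
        // The context can be null when a caller forwards the result of a
        // conversion that was not overridden; treat that as "nothing to rewrite".
        if (context == null) {
            return this;
        }
        return new RewrittenFuzzySketch();
    }
}

final class RewrittenFuzzySketch extends FuzzySketch {}
```

Here `query.doRewrite(new PlainContextSketch().convertToSearchExecutionContext())` passes null into doRewrite, and the guard returns the query unchanged instead of failing.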

Member

ok, maybe add a comment on why this is needed? Seems like it's a question we may ask ourselves again soon :)

);
}
}
// Overridable for testing onl
Member

missing y at the end of only

import java.util.function.BiConsumer;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/**
* Context object used to rewrite {@link QueryBuilder} instances into simplified version.
*/
Member

It would be great (as a followup) to clarify javadocs for these 3 contexts. e.g. what's the difference between coordinator rewrite context and query rewrite context if both may happen on the coordinating node?

@salvatore-campagna
Contributor Author

@elasticsearchmachine run elasticsearch-ci/part-1 please

Member

@martijnvg martijnvg left a comment

LGTM

@salvatore-campagna
Contributor Author

@elasticsearchmachine run elasticsearch-ci/part-1 please

@salvatore-campagna
Contributor Author

@elasticsearchmachine update branch

@salvatore-campagna
Contributor Author

@elasticsearchmachine run elasticsearch-ci/part-1 please

Labels
>refactoring :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v8.9.0
Development

Successfully merging this pull request may close these issues.

Query rewrite refactoring
5 participants