-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query rewrite context and search execution context refactoring #96353
Query rewrite context and search execution context refactoring #96353
Conversation
Pinging @elastic/es-search (Team:Search) |
@elasticsearchmachine run elasticsearch-ci/full-bwc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a small comment, LGTM otherwise
queryRewriteContext | ||
); | ||
final MappedFieldType fieldType = queryRewriteContext.getFieldType(fieldName); | ||
if (queryRewriteContext instanceof final CoordinatorRewriteContext coordinatorRewriteContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the convertToCoordinatorRewriteContext was better in that it let us avoid this instanceof check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I was considering a few more small changes before asking for a review...I wanted to run the full CI on it including full bwc (even if I don't expect bwc issues). Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also need to go through the JavaDoc to make sure I update everything to reflect the refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! ping me for review when ready ;)
@elasticsearchmachine run elasticsearch-ci/full-bwc |
As it is now it is possible in the QueryBuilder to call both |
I was wondering I can move the logic to call the two rewrite (coordinator and data node rewrite) up in to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks great. I like the idea of explicit coordinatorRewrite
and searchRewrite
methods as well. Maybe a canMatchRewrite
too?
} | ||
} | ||
|
||
public void setAllowUnmappedFields(boolean allowUnmappedFields) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can remove these two by having a special PercolatorRewriteContext that wraps an incoming rewrite context and overrides failIfFieldMappingNotFound()
? For a follow-up maybe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to understand how to deal with SearchExecutionContext#reset
calling setAllowUnmappedFields
...maybe we can just get rid of reset()
by having a specific PercolatorExecutionContext
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To do in a followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method (setAllowUnmappedFields) has been bugging me for a while :) sounds good to tackle in a follow-up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, looks like a clean move of a number of methods and fields to QueryRewriteContext
.
|
||
public QueryRewriteContext(XContentParserConfiguration parserConfiguration, Client client, LongSupplier nowInMillis) { | ||
public QueryRewriteContext( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea to create a new instance of this class with mapper service etc. in #96161?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here is to just move stuff and later on for the other PR just:
- remove unnecessary things that we have here as a result of moving fields into the
QueryRewriteContext
- include the rewrite in the SearchService so to actually trigger the rewrite in the data node
- eventually change the code for at least
MatchPhaseQueryBuilder
to actually implement the shard skipping logic for the query used by Observability integrations.
Not sure why we need to create a new class here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally here I would like to have everything excluding what is strictly needed for skipping shards. I intend a refactor as "change structure without changing behavior". Ideally this means green tests -> refactor -> green tests (even if I will probably include a couple of tests to test the two new methods doCoordinatorRewrite
and doSearchRewrite
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main difference will be in creating a QueryRewriteContext
where in the other PR we create a SearchExecutionContext
. This way we will not need to set the IndexSearcher
to null, which was one of the issues there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM it's great to see this cleanup. Thanks!
@elasticsearchmachine run elasticsearch-ci/full-bwc |
@javanna @romseygeek do you see any reason why full bwc test (specifically against version 8.8.0) fail? I did that "just to be sure" without expecting any of them to fail...the test fails because of a timeout waiting for the cluster to to go from yellow to green... |
@elasticsearchmachine test this please |
So it looks like the |
This reverts commit 4d079da.
This reverts commit 63c85df.
…tionContext" This reverts commit 209ef04.
This reverts commit e9f7770.
This reverts commit 65dda6b.
This reverts commit acb0a29.
This reverts commit f825743.
This reverts commit 1068b4b.
…ring from SearchExecutionContext" This reverts commit cca67ac.
@elasticsearchmachine run elasticsearch-ci/part-1 please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a couple of questions, LGTM otherwise.
@@ -286,6 +286,36 @@ public final QueryBuilder rewrite(QueryRewriteContext queryRewriteContext) throw | |||
} | |||
|
|||
protected QueryBuilder doRewrite(QueryRewriteContext queryRewriteContext) throws IOException { | |||
if (queryRewriteContext == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when can this be null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is at least case when this can be null and it is uncovered by a test that deals with fuzzy queries.
./gradlew ':rest-api-spec:yamlRestTest' --tests "org.elasticsearch.test.rest.ClientYamlTestSuiteIT.test {yaml=search/320_disallow_queries/Test disallow expensive queries}" -Dtests.seed=C6EE7229D969ECFD -Dtests.locale=es-SV -Dtests.timezone=Asia/Novosibirsk -Druntime.java=20
I will try to debug it and see if I can understand why it is null in such a scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also when we do convertToSearchExecutionContext
the default implementation of this method returns null. Probably if a rewrite is tried later, after invoking another rewrite somewhere else which returns null (because of no override) we might end up in a situation where this is null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately the dafult implementation of convertToSearchExecutionContext
cannot return this
because a SearchExecutionContext
is expected instead of a QueryRewriteContext
as a return value...so we need to return null there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, maybe add a comment on why this is needed? Seems like it's a question we may ask ourselves again soon :)
} | ||
} | ||
|
||
public void setAllowUnmappedFields(boolean allowUnmappedFields) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method (setAllowUnmappedFields) has been bugging me for a while :) sounds good to tackle in a follow-up
); | ||
} | ||
} | ||
// Overridable for testing onl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing y at the end of only
import java.util.function.BiConsumer; | ||
import java.util.function.LongSupplier; | ||
import java.util.function.Predicate; | ||
|
||
/** | ||
* Context object used to rewrite {@link QueryBuilder} instances into simplified version. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great (as a followup) to clarify javadocs for these 3 contexts. e.g. what's the difference between coordinator rewrite context and query rewrite context if both may happen on the coordinating node?
@elasticsearchmachine run elasticsearch-ci/part-1 please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@elasticsearchmachine run elasticsearch-ci/part-1 please |
@elasticsearchmachine update branch |
@elasticsearchmachine run elasticsearch-ci/part-1 please |
The idea is to cleanup the code in order to make sure we have a better structure around the way we use
QueryRewriteContext
andSearchExecutionContext
. Both of them are used by the rewrite logic to simplify queries and improve query execution latency and resource usage.We would like to enable the following three scenarios:
QueryRewriteContext
and a few more information but not requiring aSearchExecutionContext
. This is the case for rewrite operations which happen entirely on the coordinator node and spare us from executing the query on the data node.QueryRewriteContext
and some information available in theSearchExecutionContext
. In some scenarios anIndexSearcher
is not required. Not using it saves us a few unnecessary IO operations and allows us to complete query execution before we even need to read the index.QueryRewriteContext
and the fullSearchExecutionContext
. In this scenario anIndexSearcher
is required for the rewrite logic to be fully exploited.For this reason we will refactor the
SearchExecutionContext
pulling up into theQueryRewriteContext
a few fields/methods (likegetFieldType
andgetIndexSettings
and so on). Moreover, we will need to revisit the implementation ofdoRewrite
for subclasses ofAbstractQueryBuilder
in such a way to enable the three rewrite logic branches described above.Resolves #96280