-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Supporting filters in the left base table for join datasources #10697
Supporting filters in the left base table for join datasources #10697
Conversation
9c58008
to
8fe0229
Compare
8fe0229
to
538244a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
this.preJoinableClauses = preJoinableClauses; | ||
} | ||
|
||
public static DataSourceAnalysis forDataSource(final DataSource dataSource) | ||
{ | ||
// Strip outer queries, retaining querySegmentSpecs as we go down (lowest will become the 'baseQuerySegmentSpec'). | ||
// Strip outer queries, retaining querySegmentSpecs as we go down (lowest will become the 'baseQuerySegmentSpec'o). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was probably an accident?
return Pair.of(dataSource, toFiltration(filter, virtualColumnRegistry)); | ||
} | ||
//TODO: We should avoid promoting the time filter as interval for right outer and full outer joins. This is not | ||
// donw now as we apply the intervals to left base table today irrespective of the join type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo donw
-> done
?
processing/src/main/java/org/apache/druid/segment/join/HashJoinSegmentStorageAdapter.java
Show resolved
Hide resolved
final List<RexNode> newProjectExprs = new ArrayList<>(); | ||
|
||
// Already verified to be present in "matches", so just call "get". | ||
// Can't be final, because we're going to reassign it up to a couple of times. | ||
ConditionAnalysis conditionAnalysis = analyzeCondition(join.getCondition(), join.getLeft().getRowType()).get(); | ||
|
||
if (left.getPartialDruidQuery().stage() == PartialDruidQuery.Stage.SELECT_PROJECT | ||
&& left.getPartialDruidQuery().getWhereFilter() == null) { | ||
if (left.getPartialDruidQuery().stage() == PartialDruidQuery.Stage.SELECT_PROJECT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may not work right if left
is a DruidJoinQueryRel or DruidOuterQueryRel instead of a DruidQueryRel (since leftFilter
is only supported for regular tables).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops. good catch. will fix this.
As of now, if the left query in the JOIN has a filter present, we plan it as the join between two query data sources even if the left source is a base table. This is of course not efficient. We had even introduced some custom calcite rules so that filters on join query are not pushed into the children tables. This issue is more significant for correlated queries since the generated physical plan is a left join with a filter on the left base table. Hence, the execution of correlation queries is not efficient.
This PR adds the support for pushing filters into the left child. We would add the support for the right child in a different PR and also get rid of the custom calcite rules once that happens.
What does not work
This PR has:
Key changed/added classes in this PR