Skip to content

[WIP] Introduce PinotLogicalJoin to encapsulate join strategy#14524

Draft
Jackie-Jiang wants to merge 1 commit intoapache:masterfrom
Jackie-Jiang:v2_lookup_join_followup
Draft

[WIP] Introduce PinotLogicalJoin to encapsulate join strategy#14524
Jackie-Jiang wants to merge 1 commit intoapache:masterfrom
Jackie-Jiang:v2_lookup_join_followup

Conversation

@Jackie-Jiang
Copy link
Copy Markdown
Contributor

Add PinotLogicalJoin to encapsulate join strategy instead of parsing it from hints in the rules

@Jackie-Jiang Jackie-Jiang added cleanup Code cleanup or removal of dead code multi-stage Related to the multi-stage query engine labels Nov 23, 2024
@Jackie-Jiang Jackie-Jiang force-pushed the v2_lookup_join_followup branch from 9603684 to d0d506a Compare November 23, 2024 01:45
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Nov 23, 2024

Codecov Report

❌ Patch coverage is 47.16981% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.55%. Comparing base (59551e4) to head (ca857ca).
⚠️ Report is 3839 commits behind head on master.

Files with missing lines Patch % Lines
...he/pinot/calcite/rel/logical/PinotLogicalJoin.java 0.00% 21 Missing ⚠️
...t/calcite/rel/logical/PinotLogicalJoinFactory.java 6.66% 14 Missing ⚠️
...lcite/rel/rules/PinotRelDistributionTraitRule.java 0.00% 4 Missing and 1 partial ⚠️
.../query/planner/logical/RelToPlanNodeConverter.java 0.00% 4 Missing ⚠️
...inot/calcite/rel/logical/PinotLogicalExchange.java 0.00% 3 Missing ⚠️
...ite/rel/rules/PinotJoinToDynamicBroadcastRule.java 0.00% 3 Missing ⚠️
...ite/rel/rules/PinotJoinExchangeNodeInsertRule.java 0.00% 2 Missing ⚠️
...lcite/rel/rules/PinotProjectJoinTransposeRule.java 60.00% 2 Missing ⚠️
...e/pinot/calcite/rel/rules/PinotFilterJoinRule.java 80.00% 1 Missing ⚠️
.../apache/pinot/query/planner/plannode/JoinNode.java 0.00% 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (59551e4) and HEAD (ca857ca). Click for more details.

HEAD has 43 uploads less than BASE
Flag BASE (59551e4) HEAD (ca857ca)
integration 7 0
integration2 3 0
temurin 12 3
java-21 7 2
skip-bytebuffers-true 3 1
skip-bytebuffers-false 7 2
unittests 5 3
unittests1 2 0
java-11 5 1
integration1 2 0
custom-integration1 2 0
Additional details and impacted files
@@              Coverage Diff              @@
##             master   #14524       +/-   ##
=============================================
- Coverage     61.75%   34.55%   -27.20%     
- Complexity      207      773      +566     
=============================================
  Files          2436     2677      +241     
  Lines        133233   147042    +13809     
  Branches      20636    22563     +1927     
=============================================
- Hits          82274    50814    -31460     
- Misses        44911    92117    +47206     
+ Partials       6048     4111     -1937     
Flag Coverage Δ
custom-integration1 ?
integration ?
integration1 ?
integration2 ?
java-11 34.54% <47.16%> (-27.17%) ⬇️
java-21 34.54% <47.16%> (-27.08%) ⬇️
skip-bytebuffers-false 34.54% <47.16%> (-27.20%) ⬇️
skip-bytebuffers-true 34.54% <47.16%> (+6.81%) ⬆️
temurin 34.55% <47.16%> (-27.20%) ⬇️
unittests 34.55% <47.16%> (-27.19%) ⬇️
unittests1 ?
unittests2 34.55% <47.16%> (+6.82%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@gortiz
Copy link
Copy Markdown
Contributor

gortiz commented Nov 25, 2024

This approach is interesting, but I can see why you said it requires a lot of changes.

IICU here you propose to generate PinotLogicalJoins instead of joins, which means we need to change (and even worse, copy) a lot of code from Calcite.

My suggestion is different. I would keep the code as it is in master right now. We keep using PinotJoin and therefore for example a filter will be pushed down into the right hand. In parallel, we create our own LookupPinotJoin that extends RelNode and a rule that transforms LogicalJoin into LookupPinotJoin in the specific conditions (ie, hint enabled, one of the sides is a dim table, etc).

This rule should be applied in the latest phases of the rule pipeline and could also transform a LogicalJoin + LogicalFilter into a single LookupPinotJoin, which could also keep the optional filter. This node therefore won't be logical but closer to physical given it would be semantically more complex.

Finally we could have a LookupJoinOperator whose nextBlock should be something like:

leftBlock = read block from left
if (!isEos(block))
  return ...
end if
resultBlock = new empty block
for each row in leftBlock
  row = execute the lookup
  if (filter == null or filter.accept(row))
    resultBlock += row
  end if
end for
return resultBlock

By doing that we can keep using all the standard Calcite rules but end up producing our own nodes at the end of the pipeline, where all standard relational logic optimizations have been applied. We still may need to modify some calcite rules (for example, we may don't want to push a group by into a logical join in order to use lookup join) but that would be something closer to what we have in #14523, which would be even simpler if we can contribute to Calcite to make it easier to implement (without copying code)

@Jackie-Jiang
Copy link
Copy Markdown
Contributor Author

I see. I was trying to avoid the per rule hint parsing, but some calcite rules are looking for LogicalJoin instead of general Join.

@Jackie-Jiang Jackie-Jiang force-pushed the v2_lookup_join_followup branch from d0d506a to ca857ca Compare November 25, 2024 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cleanup Code cleanup or removal of dead code multi-stage Related to the multi-stage query engine

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants