TAJO-1352: Improve the join order algorithm to consider missed cases of associative join operators #593
jihoonson wants to merge 99 commits
…into TAJO-1352_4
…into TAJO-1352_4
Conflicts:
    tajo-plan/src/main/java/org/apache/tajo/plan/expr/RowConstantEval.java
    tajo-plan/src/main/java/org/apache/tajo/plan/joinorder/GreedyHeuristicJoinOrderAlgorithm.java
Here is a simple evaluation result.
Query
Data
Cluster
Performance comparison
Query plan
The query execution time is reduced due to the improved query plan, as follows.
Before patch
After patch

I expect performance to improve further once our scheduler is implemented and multiple execution blocks can be executed simultaneously.
…into TAJO-1352_4
Conflicts:
    tajo-core/src/main/java/org/apache/tajo/master/exec/DDLExecutor.java
    tajo-storage/tajo-storage-common/src/main/java/org/apache/tajo/storage/StorageUtil.java
…into TAJO-1352_4
Conflicts:
    CHANGES
This seems to need a comment, because it is hard to imagine the return value from the function name.
Also, estimateRowSize rather than schema may be more intuitive.
GitHub does not show all of the changes in the diff because there are so many, so I am leaving some comments here.
I'm still reviewing the patch. I'll give more comments soon.

I'm leaving additional comments.

Additional comments:
@hyunsik thanks for your review. I've addressed all your comments and added more comments to aid understanding.

Thank you for your work. I'm leaving some additional trivial comments.
The patch seems ready to be committed. I'll finish the review after your answer.

Thank you for the detailed review.

Test failures are fixed.

+1 The latest patch looks good to me.
The main changes are in the GreedyHeuristicJoinOrderAlgorithm class. The findBestOrder() function finds the best relation pair among the remaining join candidates based on join commutativity and associativity.
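
To illustrate the general idea of greedily picking the best pair among the remaining join candidates, here is a minimal sketch. It is not the code from this patch: the class GreedyJoinOrderSketch, the Rel type, the greedyOrder() method, and the cost model (output rows = left rows * right rows * a single uniform selectivity) are all hypothetical and chosen only for illustration. The sketch models inner joins only, where commutativity and associativity allow any two remaining candidates to be paired; it does not model the restrictions that other join types impose.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified sketch of greedy join ordering; not Tajo's implementation. */
class GreedyJoinOrderSketch {

  /** A base relation or an already-joined group, with an estimated row count. */
  static final class Rel {
    final String name;
    final double rows;

    Rel(String name, double rows) {
      this.name = name;
      this.rows = rows;
    }
  }

  /** Assumed cost model: estimated output rows of joining two candidates. */
  static double joinRows(Rel left, Rel right, double selectivity) {
    return left.rows * right.rows * selectivity;
  }

  /**
   * Repeatedly joins the pair with the smallest estimated output until one
   * candidate remains, and returns the resulting join tree as a string.
   * Assumes inner joins only, so every pair of candidates is joinable.
   */
  static String greedyOrder(List<Rel> relations, double selectivity) {
    if (relations.isEmpty()) {
      throw new IllegalArgumentException("at least one relation is required");
    }
    List<Rel> remaining = new ArrayList<>(relations);
    while (remaining.size() > 1) {
      int bestI = 0;
      int bestJ = 1;
      double bestRows = Double.MAX_VALUE;
      // Commutativity: (i, j) and (j, i) are the same candidate, so only i < j is scanned.
      for (int i = 0; i < remaining.size(); i++) {
        for (int j = i + 1; j < remaining.size(); j++) {
          double rows = joinRows(remaining.get(i), remaining.get(j), selectivity);
          if (rows < bestRows) {
            bestRows = rows;
            bestI = i;
            bestJ = j;
          }
        }
      }
      Rel left = remaining.get(bestI);
      Rel right = remaining.get(bestJ);
      Rel joined = new Rel("(" + left.name + " JOIN " + right.name + ")", bestRows);
      // Remove the higher index first so the lower index stays valid.
      remaining.remove(bestJ);
      remaining.remove(bestI);
      // Associativity: the joined group becomes a candidate for later pairings.
      remaining.add(joined);
    }
    return remaining.get(0).name;
  }
}
```

For example, with relations A (1000 rows), B (10 rows), and C (100 rows) and an assumed selectivity of 0.1, the sketch joins B and C first and returns "(A JOIN (B JOIN C))".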