Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix explain plan ALL_SEGMENTS_PRUNED_ON_SERVER node #9572

Merged
merged 4 commits into from Oct 12, 2022

Conversation

somandal
Copy link
Contributor

@somandal somandal commented Oct 11, 2022

Today EXPLAIN PLAN sets the operator ID and parent ID of ALL_SEGMENTS_PRUNED_ON_SERVER as 2 and 1 respectively. This PR modifies ALL_SEGMENTS_PRUNED_ON_SERVER to use the operator ID and parent ID as 3 and 2 respectively instead. This is to bring consistency with the other explain plan nodes. The root is BROKER_REDUCE which has the ID 1 and parent ID 0, followed by the COMBINE node which has ID 2 and parent ID 1.

For scenarios where all segments were pruned across all servers, added special handling to keep the ALL_SEGMENTS_PRUNED_ON_SERVER operator ID and parent ID as 2 and 1 respectively. This is because for such scenarios no COMBINE node is present.

@somandal
Copy link
Contributor Author

@codecov-commenter
Copy link

codecov-commenter commented Oct 12, 2022

Codecov Report

Merging #9572 (b001ae4) into master (03121cf) will increase coverage by 9.62%.
The diff coverage is 80.00%.

@@             Coverage Diff              @@
##             master    #9572      +/-   ##
============================================
+ Coverage     60.34%   69.97%   +9.62%     
- Complexity     5072     5217     +145     
============================================
  Files          1919     1931      +12     
  Lines        102764   103119     +355     
  Branches      15616    15655      +39     
============================================
+ Hits          62017    72154   +10137     
+ Misses        36089    25889   -10200     
- Partials       4658     5076     +418     
Flag Coverage Δ
integration1 25.97% <80.00%> (?)
integration2 24.66% <80.00%> (-0.04%) ⬇️
unittests1 67.30% <80.00%> (-0.01%) ⬇️
unittests2 15.64% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...core/query/reduce/ExplainPlanDataTableReducer.java 82.99% <75.00%> (-0.46%) ⬇️
...core/query/executor/ServerQueryExecutorV1Impl.java 82.85% <100.00%> (ø)
...ore/query/scheduler/resources/ResourceManager.java 84.00% <0.00%> (-12.00%) ⬇️
.../pinot/core/query/scheduler/PriorityScheduler.java 77.77% <0.00%> (-5.56%) ⬇️
.../impl/dictionary/BaseOffHeapMutableDictionary.java 81.33% <0.00%> (-3.34%) ⬇️
...perator/filter/SortedIndexBasedFilterOperator.java 85.71% <0.00%> (-0.96%) ⬇️
...ion/function/DistinctCountAggregationFunction.java 56.50% <0.00%> (ø)
...rg/apache/pinot/ingestion/jobs/BaseSegmentJob.java 31.57% <0.00%> (ø)
...not/ingestion/common/DefaultControllerRestApi.java 0.00% <0.00%> (ø)
.../pinot/ingestion/jobs/SegmentPreprocessingJob.java 0.00% <0.00%> (ø)
... and 366 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@vvivekiyer
Copy link
Contributor

vvivekiyer commented Oct 12, 2022

As discussed offline, we should probably have a dummy passthrough node in the "all segments pruned" case so that we don't show a disconnected tree.

+ "[\"PLAN_START(numSegmentsForThisPlan:12)\",-1,-1],[\"ALL_SEGMENTS_PRUNED_ON_SERVER\",2,1]]}");
In the above example, we'll not have any node with operatorId=2 and parentId=1 unless we have the dummy node.

@somandal
Copy link
Contributor Author

somandal commented Oct 12, 2022

As discussed offline, we should probably have a dummy passthrough node in the "all segments pruned" case so that we don't show a disconnected tree.

+ "[\"PLAN_START(numSegmentsForThisPlan:12)\",-1,-1],[\"ALL_SEGMENTS_PRUNED_ON_SERVER\",2,1]]}");

In the above example, we'll not have any node with operatorId=2 and parentId=1 unless we have the dummy node.

good catch. i've fixed this now. also updated the code to handle selecting the correct COMBINE node depending on if we have any which aren't PASSTHROUGH or not. (e.g. if a better node like COMBINE_AGGREGATE is available then that should be used instead of COMBINE_PASSTHROUGH in the final plan)

Copy link
Contributor

@vvivekiyer vvivekiyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@siddharthteotia siddharthteotia changed the title Fix explain plan to set id and parent id of ALL_SEGMENTS_PRUNED_ON_SERVER to 3 and 2 respectively Fix explain plan ALL_SEGMENTS_PRUNED_ON_SERVER node Oct 12, 2022
@@ -1614,7 +1614,7 @@ public void testSelectColumnsVariationsOfAndOperatorsVerbose() {
7, 6});
result4.add(new Object[]{"PLAN_START(numSegmentsForThisPlan:2)", ExplainPlanRows.PLAN_START_IDS,
ExplainPlanRows.PLAN_START_IDS});
result4.add(new Object[]{"ALL_SEGMENTS_PRUNED_ON_SERVER", 2, 1});
result4.add(new Object[]{"ALL_SEGMENTS_PRUNED_ON_SERVER", 3, 2});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test for the scenario where everything got pruned across all servers and thus it should be 2, 1 because COMBINE will be absent.

Copy link
Contributor Author

@somandal somandal Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siddharthteotia this file already has tests for this scenario. e.g.

    // All segments are pruned
    String query10 = "EXPLAIN PLAN FOR SELECT count(*) FROM testTable WHERE invertedIndexCol3 = 'roadrunner' AND "
        + "noIndexCol1 = 100 LIMIT 100";
    List<Object[]> result10 = new ArrayList<>();
    result10.add(new Object[]{"BROKER_REDUCE(limit:100)", 1, 0});
    result10.add(new Object[]{"PLAN_START(numSegmentsForThisPlan:4)", ExplainPlanRows.PLAN_START_IDS,
        ExplainPlanRows.PLAN_START_IDS});
    result10.add(new Object[]{"ALL_SEGMENTS_PRUNED_ON_SERVER", 2, 1});
    check(query10, new ResultTable(DATA_SCHEMA, result10));

This doesn't show up in the diff since the original values for the operator ID and parent ID are the same

@somandal somandal requested review from siddharthteotia and removed request for vvivekiyer October 12, 2022 02:28
@siddharthteotia siddharthteotia merged commit a71a2b9 into apache:master Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants