Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect query semantic analysis query statistic #1084

Merged
merged 6 commits into from Aug 17, 2019

Conversation

@kokosing
Copy link
Member

commented Jul 4, 2019

Collect query semantic analysis query statistic

@cla-bot cla-bot bot added the cla-signed label Jul 4, 2019

@kokosing kokosing requested a review from dain Jul 4, 2019

@kokosing kokosing force-pushed the kokosing:origin/master/136_analysis_time branch from 1e46f48 to 22cb75c Jul 4, 2019

return doAnalyzeQuery();
}
catch (StackOverflowError e) {
throw new PrestoException(NOT_SUPPORTED, "statement is too large (stack overflow during analysis)", e);

This comment has been minimized.

Copy link
@dain

dain Jul 28, 2019

Member

I'm ok with this, but want to check with @martint and @electrum

This comment has been minimized.

Copy link
@martint

martint Jul 28, 2019

Member

Is this being replaced elsewhere?

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

Is this being replaced elsewhere?

No, it is just removal. As I understand it was here because it was wrapping antlr parser in the past. But now parsing is done earlier and here we only analyse the AST. Then only place where StackOverflowError could be thrown here are visitors in the plan. Do you think it is possible that query can pass parser but then fail during analysis?

Anyway, this commit is not much important in this PR so I extracted it to: #1205

@@ -311,7 +311,7 @@ public void start()
}

// analyze query

This comment has been minimized.

Copy link
@dain

dain Jul 28, 2019

Member

This comment is wrong

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

Let me remove such comments.

@@ -358,10 +358,10 @@ public void addFinalQueryInfoListener(StateChangeListener<QueryInfo> stateChange
stateMachine.addQueryInfoStateChangeListener(stateChangeListener);
}

private PlanRoot analyzeQuery()
private PlanRoot logicalPlan()
{
// time analysis phase

This comment has been minimized.

Copy link
@dain

dain Jul 28, 2019

Member

same problem with comments here also

@@ -358,10 +358,10 @@ public void addFinalQueryInfoListener(StateChangeListener<QueryInfo> stateChange
stateMachine.addQueryInfoStateChangeListener(stateChangeListener);
}

private PlanRoot analyzeQuery()
private PlanRoot logicalPlan()

This comment has been minimized.

Copy link
@dain

dain Jul 28, 2019

Member

I'm not sure logicalPlan is the right name here? I'm not super familiar with the terms used for planning, but method captures the initial plan building and all of the optimizations. The only thing sit doesn't capture is analysis and distributed planning (i.e., fragmentation). @martint or @findepi thoughts?

This comment has been minimized.

Copy link
@martint

martint Jul 28, 2019

Member

It should just be called “planQuery”. The stat should be called “planningTime”

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

The problem here is that there is also PLANNING query state so planning time is currently measured and it tracks time spent in this state. Notice that during that state we do analyzeQuery/logicalPlan and planDistribution.

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

Maybe we could remove this measurement at all, because it is already captured by time spent in PLANNING state?

Then what about measuring time for planDistribution. In the first place I don't know why it was captured. Where there any issues with this?

This comment has been minimized.

Copy link
@martint

martint Aug 13, 2019

Member

Then what about measuring time for planDistribution. In the first place I don't know why it was captured. Where there any issues with this?

Originally. it included some interactions with HDFS for enumerating splits, so that's why we measured it. I don't think it matters anymore, so we could remove it.

@@ -63,6 +63,7 @@ public void testConstructor()
Duration.valueOf("35m"),
Duration.valueOf("44m"),
Duration.valueOf("9m"),
Duration.valueOf("10s"),

This comment has been minimized.

Copy link
@dain

dain Jul 28, 2019

Member

Maybe pick a different, unique, value.

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

Any suggestion? What is wrong here. It would be nice that all the values here would be unique and follow some pattern so it could be ease to validate the assertion. However, because all the changes around this code the pattern is no longer visible. So I tried to use an unique value.

This comment has been minimized.

Copy link
@dain

dain Aug 1, 2019

Member

99?

@dain dain removed their assignment Jul 28, 2019

@kokosing kokosing force-pushed the kokosing:origin/master/136_analysis_time branch 2 times, most recently from ea12b1b to 34dbcae Jul 29, 2019

@kokosing
Copy link
Member Author

left a comment

Comments addressed

return doAnalyzeQuery();
}
catch (StackOverflowError e) {
throw new PrestoException(NOT_SUPPORTED, "statement is too large (stack overflow during analysis)", e);

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

Is this being replaced elsewhere?

No, it is just removal. As I understand it was here because it was wrapping antlr parser in the past. But now parsing is done earlier and here we only analyse the AST. Then only place where StackOverflowError could be thrown here are visitors in the plan. Do you think it is possible that query can pass parser but then fail during analysis?

Anyway, this commit is not much important in this PR so I extracted it to: #1205

@@ -358,10 +358,10 @@ public void addFinalQueryInfoListener(StateChangeListener<QueryInfo> stateChange
stateMachine.addQueryInfoStateChangeListener(stateChangeListener);
}

private PlanRoot analyzeQuery()
private PlanRoot logicalPlan()

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

The problem here is that there is also PLANNING query state so planning time is currently measured and it tracks time spent in this state. Notice that during that state we do analyzeQuery/logicalPlan and planDistribution.

@@ -63,6 +63,7 @@ public void testConstructor()
Duration.valueOf("35m"),
Duration.valueOf("44m"),
Duration.valueOf("9m"),
Duration.valueOf("10s"),

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

Any suggestion? What is wrong here. It would be nice that all the values here would be unique and follow some pattern so it could be ease to validate the assertion. However, because all the changes around this code the pattern is no longer visible. So I tried to use an unique value.

@@ -311,7 +311,7 @@ public void start()
}

// analyze query

This comment has been minimized.

Copy link
@kokosing

kokosing Jul 29, 2019

Author Member

Let me remove such comments.

@dain
Copy link
Member

left a comment

Looks good to me, but you and @martint need to resolve the name of the planning timer.

}
catch (StackOverflowError e) {
throw new PrestoException(NOT_SUPPORTED, "statement is too large (stack overflow during analysis)", e);
}
}

private PlanRoot doAnalyzeQuery()
private PlanRoot doLogicalPlann()

This comment has been minimized.

Copy link
@dain

dain Aug 1, 2019

Member

typo: doLogicalPlanning

@@ -63,6 +63,7 @@ public void testConstructor()
Duration.valueOf("35m"),
Duration.valueOf("44m"),
Duration.valueOf("9m"),
Duration.valueOf("10s"),

This comment has been minimized.

Copy link
@dain

dain Aug 1, 2019

Member

99?

@kokosing kokosing force-pushed the kokosing:origin/master/136_analysis_time branch from 34dbcae to 35a4744 Aug 2, 2019

@kokosing

This comment has been minimized.

Copy link
Member Author

commented Aug 2, 2019

@dain Thanks. @martint I will assign to this to you.

@kokosing kokosing assigned martint and unassigned dain Aug 2, 2019

@kokosing kokosing force-pushed the kokosing:origin/master/136_analysis_time branch 2 times, most recently from 46ec97e to 33efe6c Aug 14, 2019

@kokosing

This comment has been minimized.

Copy link
Member Author

commented Aug 14, 2019

@dain, @martint Removed planDistributionTime and so there is single planningTime and analysisTime which is clear and simple.

@@ -307,21 +307,17 @@ public void start()
{
try (SetThreadName ignored = new SetThreadName("Query-%s", stateMachine.getQueryId())) {
try {
// transition to planning

This comment has been minimized.

Copy link
@dain

dain Aug 15, 2019

Member

Typo in commit message coments => comments

@dain

This comment has been minimized.

Copy link
Member

commented Aug 15, 2019

I think the one thing to finishup is the naming of the logicalPlan. @martint had suggested a new name.

kokosing added some commits Jul 29, 2019

Remove analysisTime and distributedPlanningTime
 - analysisTime is no longer measuring query analysis time, but time
 that is spent in logical planner
 - distributedPlanningTime no longer measures anything meaningful

@kokosing kokosing force-pushed the kokosing:origin/master/136_analysis_time branch from 33efe6c to 50c4a34 Aug 16, 2019

@kokosing

This comment has been minimized.

Copy link
Member Author

commented Aug 16, 2019

I think the one thing to finishup is the naming of the logicalPlan

Thanks, I missed that. Fixed.

@dain

dain approved these changes Aug 16, 2019

@dain

This comment has been minimized.

Copy link
Member

commented Aug 16, 2019

Looks good

@kokosing kokosing merged commit 8a9bcf0 into prestosql:master Aug 17, 2019

2 checks passed

Travis CI - Pull Request Build Passed
Details
verification/cla-signed
Details

@kokosing kokosing deleted the kokosing:origin/master/136_analysis_time branch Aug 17, 2019

@kokosing kokosing referenced this pull request Aug 17, 2019

Open

Release notes for 318 #1238

1 of 7 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants
You can’t perform that action at this time.