Remove redundant TypeProvider creation at PlanPrinter #8650
@@ -386,6 +386,16 @@ private static String formatFragment(
        return builder.toString();
    }

    private static TypeProvider getTypeProvider(List<PlanFragment> fragments)
    {
        // somewhat faster than java stream
This is true, but we optimize coordinator code for readability.
While we have a rule not to use Streams in performance-sensitive code (like query execution), the planner uses streams a lot.
Is this place special in any way?
Ok, I see. I think we could use the streams here.
        // somewhat faster than java stream
        Map<Symbol, Type> symbols = new HashMap<>();
        for (PlanFragment fragment : fragments) {
            fragment.getSymbols().forEach(symbols::put);
The previous code used distinct on entries, so any duplicates were "coherent".
Here, you overwrite entries in symbols without checking that the symbol types agree.
Please verify that.
The best way to do it is to use the toImmutableMap collector or ImmutableMap.builder().
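The concern above can be illustrated with a minimal sketch. It is not the PR's code: `Symbol` and `Type` are stood in for by `String`, the class and method names are hypothetical, and the JDK's `Collectors.toMap` is used as a stand-in for Guava's `toImmutableMap` so that the example has no dependencies. Identical (symbol, type) entries are collapsed by `distinct()`, and any remaining duplicate key, which can only mean one symbol with two different types, makes the collector throw instead of silently overwriting.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SymbolMergeSketch
{
    // Hypothetical stand-in: String keys and values replace Symbol and Type.
    static Map<String, String> mergeSymbols(List<Map<String, String>> fragmentSymbols)
    {
        return fragmentSymbols.stream()
                .flatMap(symbols -> symbols.entrySet().stream())
                // Identical (symbol, type) pairs collapse into a single entry.
                .distinct()
                // Collectors.toMap throws IllegalStateException on a duplicate key,
                // which at this point means one symbol was given two different types.
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args)
    {
        Map<String, String> first = Map.of("x", "bigint", "y", "varchar");
        Map<String, String> second = Map.of("x", "bigint", "z", "double");
        System.out.println(mergeSymbols(List.of(first, second)));

        try {
            mergeSymbols(List.of(first, Map.of("x", "varchar")));
        }
        catch (IllegalStateException e) {
            System.out.println("conflicting types detected: " + e.getMessage());
        }
    }
}
```

A plain `HashMap.put` loop, by contrast, would keep whichever type was written last and hide the disagreement.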
@kokosing looks good to merge?
I think so.
We sometimes hit high coordinator CPU usage when generating the text query plan. We found that tens of threads were generating the text query plan at the query-completion event. More specifically, each of them was building the same TypeProvider from all fragments once per stage. When a query has many fragments and symbols, building a TypeProvider from all symbols can be a heavy task, so it should be built once and reused.
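The build-once-and-reuse idea described here can be sketched as follows. This is only an illustration under stated assumptions: the class, method, and field names are hypothetical, plain `Map<String, String>` stands in for both a fragment's symbols and the TypeProvider, and the type-conflict check is left out for brevity. A counter makes it visible that the expensive map is built once per plan rather than once per fragment.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PlanTextSketch
{
    // Counts how many times the combined symbol map is built (for illustration only).
    static int buildCount = 0;

    // Hypothetical stand-in for building a TypeProvider from all fragments.
    static Map<String, String> buildTypes(List<Map<String, String>> fragments)
    {
        buildCount++;
        Map<String, String> types = new HashMap<>();
        for (Map<String, String> fragment : fragments) {
            types.putAll(fragment); // conflict checking omitted in this sketch
        }
        return types;
    }

    static List<String> formatAll(List<Map<String, String>> fragments)
    {
        // Built once, before the loop, and passed to every fragment.
        Map<String, String> types = buildTypes(fragments);
        List<String> lines = new ArrayList<>();
        for (Map<String, String> fragment : fragments) {
            lines.add(formatFragment(fragment, types));
        }
        return lines;
    }

    static String formatFragment(Map<String, String> fragment, Map<String, String> types)
    {
        return "fragment uses " + fragment.size() + " of " + types.size() + " symbols";
    }

    public static void main(String[] args)
    {
        List<Map<String, String>> fragments = List.of(
                Map.of("a", "bigint"),
                Map.of("b", "varchar", "c", "double"));
        formatAll(fragments).forEach(System.out::println);
        System.out.println("types built " + buildCount + " time(s)");
    }
}
```

Calling `buildTypes` inside `formatFragment` instead would repeat the same work for every fragment, which is the pattern the description identifies as the source of the CPU load.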