Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove redundant TypeProvider creation at PlanPrinter #8650

Merged
merged 1 commit into from Oct 29, 2021

Conversation

miniway
Copy link
Contributor

@miniway miniway commented Jul 23, 2021

We sometimes hit high coordinator CPU usage at generating text query plan. We found that tens of threads were generating test query plan at query-completion event. More specifically, it was building the same TypeProvider from fragments per each stage. When there're huge fragments and symbols in a query, building TypeProvider from all symbols could be heavy task so it should be generated once and reused.

"dispatcher-query-3176" #102392 daemon prio=5 os_prio=0 cpu=5758130.66ms elapsed=9524.91s tid=0x0000ffceec0e6860 nid=0x2486 runnable  [0x0000ffcdc8e73000]
   java.lang.Thread.State: RUNNABLE
    at com.google.common.collect.Hashing.smearedHash(Hashing.java:54)
    at com.google.common.collect.RegularImmutableSet.contains(RegularImmutableSet.java:56)
    at com.google.common.base.Predicates$InPredicate.apply(Predicates.java:556)
    at com.google.common.base.Predicates$CompositionPredicate.apply(Predicates.java:596)
    at com.google.common.base.Predicate.test(Predicate.java:79)
    at com.google.common.collect.CollectSpliterators$1Splitr.tryAdvance(CollectSpliterators.java:154)
    at java.util.Spliterator.forEachRemaining(java.base@11.0.10/Spliterator.java:326)
    at java.util.stream.ReferencePipeline$Head.forEach(java.base@11.0.10/ReferencePipeline.java:658)
    at java.util.stream.ReferencePipeline$7$1.accept(java.base@11.0.10/ReferencePipeline.java:274)
    at java.util.Spliterators$ArraySpliterator.forEachRemaining(java.base@11.0.10/Spliterators.java:948)
    at java.util.stream.AbstractPipeline.copyInto(java.base@11.0.10/AbstractPipeline.java:484)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(java.base@11.0.10/AbstractPipeline.java:474)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(java.base@11.0.10/ReduceOps.java:913)
    at java.util.stream.AbstractPipeline.evaluate(java.base@11.0.10/AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(java.base@11.0.10/ReferencePipeline.java:578)
    at io.prestosql.sql.planner.planprinter.PlanPrinter.formatFragment(PlanPrinter.java:331)
    at io.prestosql.sql.planner.planprinter.PlanPrinter.textDistributedPlan(PlanPrinter.java:239)
    at io.prestosql.event.QueryMonitor.createTextQueryPlan(QueryMonitor.java:282)
    at io.prestosql.event.QueryMonitor.createQueryMetadata(QueryMonitor.java:211)
    at io.prestosql.event.QueryMonitor.queryCompletedEvent(QueryMonitor.java:189)
    at io.prestosql.execution.SqlQueryManager.lambda$createQuery$4(SqlQueryManager.java:229)
    at io.prestosql.execution.SqlQueryManager$$Lambda$2680/0x0000ffce6b3aecb0.stateChanged(Unknown Source)
    at io.prestosql.execution.QueryStateMachine.lambda$addQueryInfoStateChangeListener$15(QueryStateMachine.java:882)
    at io.prestosql.execution.QueryStateMachine$$Lambda$2681/0x0000ffce6b3ad0b0.stateChanged(Unknown Source)
    at io.prestosql.execution.StateMachine.fireStateChangedListener(StateMachine.java:229)
    at io.prestosql.execution.StateMachine.lambda$fireStateChanged$0(StateMachine.java:221)
    at io.prestosql.execution.StateMachine$$Lambda$2671/0x0000ffce6b3a1508.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.10/ThreadPoolExecutor.java:1128)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.10/ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(java.base@11.0.10/Thread.java:829)

@cla-bot cla-bot bot added the cla-signed label Jul 23, 2021
@@ -386,6 +386,16 @@ private static String formatFragment(
return builder.toString();
}

private static TypeProvider getTypeProvider(List<PlanFragment> fragments)
{
// somewhat faster than java stream
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true, but we optimize coordinator code for readability.
While we have a rule not to use Streams in performance sensitive code (like query execution), planner uses streams a lot.

Is this place any special?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see. I think we could use the streams here.

// somewhat faster than java stream
Map<Symbol, Type> symbols = new HashMap<>();
for (PlanFragment fragment : fragments) {
fragment.getSymbols().forEach(symbols::put);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previous code used distinct on entries, so any duplicates where "coherent"

Here, you overwrite entries in symbols without checking that symbol types do agree.
Please verify that.
Best way to do it is by using toImmutableMap collector or ImmutableMap.builder().

@hashhar
Copy link
Member

hashhar commented Oct 28, 2021

@kokosing looks good to merge?

@kokosing
Copy link
Member

I think so.

@kokosing kokosing merged commit 8d3d4c1 into trinodb:master Oct 29, 2021
@github-actions github-actions bot added this to the 364 milestone Oct 29, 2021
@kokosing kokosing mentioned this pull request Oct 29, 2021
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants