Skip to content

[GLUTEN-4170][VL] Decouple partitions from plan to avoid driver stalled#4177

Merged
ulysses-you merged 12 commits into
apache:mainfrom
Yohahaha:decouple-partition
Jan 18, 2024
Merged

[GLUTEN-4170][VL] Decouple partitions from plan to avoid driver stalled#4177
ulysses-you merged 12 commits into
apache:mainfrom
Yohahaha:decouple-partition

Conversation

@Yohahaha
Copy link
Copy Markdown
Contributor

@Yohahaha Yohahaha commented Dec 25, 2023

What changes were proposed in this pull request?

There are two parts will lead driver stalled when scan contains lots of partitions,

  1. plan serialization happens in every GlutenPartition construction.
  2. GlutenWholeStageColumnarRDD#getPartitions
22374 partitions 44611 partitions
before #1 880ms 1352ms
#2 3662ms 17186ms
after #1 21ms 106ms
#2 6ms 25ms

This patch decouple scan splitInfo(LocalFileNodes) from ReadRel to avoid serialize substrait plan for each partition in Driver, when the plan is complex or the number of partitions is particularly large, the cost of this serialization cannot be ignored.

Stream splitInfo(inputIterator) still kept in ReadRel for now.

(Fixes: #4170)

How was this patch tested?

@github-actions
Copy link
Copy Markdown

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@Yohahaha Yohahaha changed the title [GLUTEN-][VL] Decouple partitions from plan to avoid driver hang [GLUTEN-4170][VL] Decouple partitions from plan to avoid driver hang Dec 25, 2023
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@github-actions
Copy link
Copy Markdown

#4170

@Yohahaha Yohahaha changed the title [GLUTEN-4170][VL] Decouple partitions from plan to avoid driver hang [GLUTEN-4170][VL] Decouple partitions from plan to avoid driver stalled Dec 25, 2023
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Copy Markdown
Contributor Author

Yohahaha commented Dec 25, 2023

@FelixYBW @rui-mo @philo-he @ulysses-you
let's discuss the solution based on current patch, the key is decouple scan partition from plan when serialization. I know this patch may seems tricky, and open to accept more better advices.

Copy link
Copy Markdown
Contributor

@ulysses-you ulysses-you left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @Yohahaha for the idea, looks reasonable to me. One concern is about the name, can we avoid using streamXxx ? I think iteratorXxx is easy to follow. We are using iterator in both java and native side.

Comment thread gluten-core/src/main/resources/substrait/proto/substrait/algebra.proto Outdated
Comment thread gluten-core/src/main/java/io/glutenproject/substrait/rel/LocalFilesNode.java Outdated
Comment thread cpp/velox/substrait/SubstraitToVeloxPlan.h Outdated
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@rui-mo rui-mo requested a review from zzcclp January 2, 2024 05:41
Copy link
Copy Markdown
Contributor

@rui-mo rui-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the improvement.

Comment thread cpp/velox/benchmarks/common/BenchmarkUtils.h Outdated
/// File schema
NamedStruct schema = 17;
/// File schema
NamedStruct schema = 17;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems these changes are not needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's needed, there should be 6 leading space.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it about the format? How do you meet these errors?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just found this format issue and fix it.

Comment thread cpp/velox/substrait/SubstraitToVeloxPlan.cc
Comment thread cpp/velox/substrait/SubstraitToVeloxPlan.cc Outdated
Comment thread cpp/velox/substrait/SubstraitToVeloxPlan.cc
case (splitInfos, index) =>
wsCtx.substraitContext.initSplitInfosIndex(0)
wsCtx.substraitContext.setSplitInfos(splitInfos)
val substraitPlan = wsCtx.root.toProtobuf
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming the proposed optimization can also be applied for CH backend. If so, it will need some follow-up work from CH engineer. @baibaichen

@Yohahaha
Copy link
Copy Markdown
Contributor Author

Yohahaha commented Jan 4, 2024

I assume this solution looks well for all your guys.

@zhouyuan
Copy link
Copy Markdown
Member

zhouyuan commented Jan 4, 2024

@lgbo-ustc hi, this patch tries to refactor on gen file partitions, please take a look if it will impact CK backend

thanks, -yuan

@Yohahaha Yohahaha force-pushed the decouple-partition branch from 996bccc to 422a220 Compare January 4, 2024 04:03
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jan 4, 2024

Run Gluten Clickhouse CI

3 similar comments
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jan 4, 2024

Run Gluten Clickhouse CI

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jan 4, 2024

Run Gluten Clickhouse CI

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jan 4, 2024

Run Gluten Clickhouse CI

@Yohahaha
Copy link
Copy Markdown
Contributor Author

Yohahaha commented Jan 4, 2024

I see QueryBenchmark is similar with GenericBenchmark and QueryBenchmark does not covered by CI, could we remove it? @marin-ma @jinchengchenghh

@marin-ma
Copy link
Copy Markdown
Contributor

marin-ma commented Jan 4, 2024

I see QueryBenchmark is similar with GenericBenchmark and QueryBenchmark does not covered by CI, could we remove it? @marin-ma @jinchengchenghh

cc: @rui-mo

@rui-mo
Copy link
Copy Markdown
Contributor

rui-mo commented Jan 4, 2024

I see QueryBenchmark is similar with GenericBenchmark and QueryBenchmark does not covered by CI, could we remove it?

@Yohahaha GenericBenchmark uses arrow to read files, while QueryBenchmark uses Velox. So QueryBenchmark is useful when we want to test Velox TableScan. I think the better option is to enable QueryBenchmark on CI. @marin-ma Please help to confirm, thanks.

@marin-ma
Copy link
Copy Markdown
Contributor

marin-ma commented Jan 4, 2024

I see QueryBenchmark is similar with GenericBenchmark and QueryBenchmark does not covered by CI, could we remove it?

@Yohahaha GenericBenchmark uses arrow to read files, while QueryBenchmark uses Velox. So QueryBenchmark is useful when we want to test Velox TableScan. I think the better option is to enable QueryBenchmark on CI. @marin-ma Please help to confirm, thanks.

@rui-mo If input is from middle stage, GenericBenchmark will use arrow reader to load the input iterator. If input is from first stage, the whole pipeline is offloaded including table scan. Here's the doc https://github.com/oap-project/gluten/blob/main/docs/developers/MicroBenchmarks.md#generate-substrait-plan-and-input-for-any-query

Comment thread cpp/velox/benchmarks/GenericBenchmark.cc Outdated
Comment thread cpp/velox/compute/VeloxPlanConverter.h Outdated
Comment thread cpp/velox/substrait/SubstraitToVeloxPlan.h Outdated
@rui-mo
Copy link
Copy Markdown
Contributor

rui-mo commented Jan 4, 2024

@marin-ma Thanks for confirming. @Yohahaha We can remove QueryBenchmark because its functionality is covered by GenericBenchmark.

@Yohahaha Yohahaha force-pushed the decouple-partition branch from 375baed to c9fd781 Compare January 4, 2024 08:14
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

Copy link
Copy Markdown
Member

@zhztheplayer zhztheplayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

Comment thread cpp/core/jni/JniWrapper.cc Outdated
Comment thread cpp/core/jni/JniWrapper.cc Outdated
Comment thread cpp/velox/compute/VeloxPlanConverter.cc Outdated
Comment thread cpp/velox/substrait/SubstraitToVeloxPlan.cc Outdated
@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

ulysses-you
ulysses-you previously approved these changes Jan 18, 2024
Copy link
Copy Markdown
Contributor

@ulysses-you ulysses-you left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thank you @Yohahaha

Copy link
Copy Markdown
Contributor

@marin-ma marin-ma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this work! Could you please also update the micro benchmark documentation? Noticed that we need to specify split files for first stages.

@github-actions
Copy link
Copy Markdown

Run Gluten Clickhouse CI

@marin-ma
Copy link
Copy Markdown
Contributor

marin-ma commented Jan 18, 2024

LGTM. Thanks!

@zzcclp Do you have any further comments?

@ulysses-you
Copy link
Copy Markdown
Contributor

We can create followups if there are some new finding issues

@ulysses-you ulysses-you merged commit 2fc4503 into apache:main Jan 18, 2024
@Yohahaha Yohahaha deleted the decouple-partition branch January 18, 2024 10:19
@GlutenPerfBot
Copy link
Copy Markdown
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_4177_time.csv log/native_master_01_17_2024_6e070aee2_time.csv difference percentage
q1 32.83 32.53 -0.299 99.09%
q2 23.86 25.15 1.288 105.40%
q3 37.16 35.63 -1.529 95.89%
q4 38.23 39.47 1.240 103.24%
q5 69.55 69.91 0.365 100.52%
q6 6.74 7.16 0.425 106.31%
q7 80.76 83.43 2.667 103.30%
q8 84.40 86.98 2.576 103.05%
q9 118.49 125.57 7.081 105.98%
q10 40.96 42.09 1.132 102.76%
q11 19.53 20.23 0.697 103.57%
q12 25.49 27.47 1.981 107.77%
q13 44.31 44.85 0.538 101.21%
q14 20.65 17.86 -2.795 86.47%
q15 26.55 29.77 3.213 112.10%
q16 12.55 13.95 1.398 111.13%
q17 101.78 100.92 -0.855 99.16%
q18 147.60 146.37 -1.232 99.17%
q19 12.47 13.91 1.442 111.56%
q20 28.27 26.50 -1.771 93.74%
q21 222.72 226.25 3.531 101.59%
q22 13.74 13.71 -0.030 99.78%
total 1208.65 1229.72 21.063 101.74%

// Get the input schema of this iterator.
uint64_t colNum = 0;
std::vector<TypePtr> veloxTypeList;
if (readRel.has_base_schema()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find here you change the input iterator names, and comment ValueStreamNode in Velox does not support name change, but here updates the output type based on output type, do you meet any problem? @Yohahaha
https://github.com/apache/incubator-gluten/blob/c5b7e59335201f960cda49dff7edc315b36ed05e/cpp/velox/operators/plannodes/RowVectorStream.h#L100

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this update only take effect on the top level columns, what if there is nested column?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is used for constructing ValueStreamNode outputType, may also need to update nested column

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[VL] driver stalled before first job starts