[SPARK-21509][SQL] Add a config to enable adaptive query execution only for the last que…#18713

Closed
jinxing64 wants to merge 1 commit into apache:master from jinxing64:SPARK-21509

jinxing64 commented Jul 22, 2017

What changes were proposed in this pull request?

Adaptive query execution is a good way to avoid generating too many small files on HDFS, as mentioned in SPARK-16188.
When adaptive query execution is enabled, all shuffles are coordinated. This has drawbacks:

  1. It's hard to balance the number of reducers (which determines processing speed) against file size on HDFS;
  2. It generates additional shuffles (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala#L101);
  3. It generates many jobs, which adds scheduling overhead.

We can add a config that enables adaptive query execution only for the last shuffle.
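
To illustrate drawback 1, here is a minimal sketch in plain Python of how a target post-shuffle size trades reducer count against output size. It is not code from this patch and not Spark's actual implementation; it assumes a simple greedy packing of contiguous shuffle partitions, and the names `coalesce_partitions`, `sizes`, and `target_size` are hypothetical.

```python
from typing import List


def coalesce_partitions(sizes: List[int], target_size: int) -> List[List[int]]:
    """Greedily pack contiguous shuffle partitions into groups.

    Hypothetical helper for illustration only: a group is closed as soon as
    adding the next partition would push it past target_size bytes.
    Each returned group corresponds to one reducer (and one output file).
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(sizes):
        # Close the current group if this partition would overflow the target.
        if current and current_size + size > target_size:
            groups.append(current)
            current = []
            current_size = 0
        current.append(index)
        current_size += size
    if current:
        groups.append(current)
    return groups


MB = 1024 * 1024
# Assumed map-output sizes per shuffle partition (made-up numbers).
sizes = [10 * MB, 20 * MB, 40 * MB, 5 * MB, 60 * MB, 3 * MB]

small_target = coalesce_partitions(sizes, 32 * MB)   # more reducers, smaller files
large_target = coalesce_partitions(sizes, 128 * MB)  # fewer reducers, larger files

print(len(small_target), small_target)
print(len(large_target), large_target)
```

With a single target size applied to every shuffle, a small value keeps parallelism high but produces many small output files, while a large value does the opposite. Restricting coordination to the last shuffle would let the file-size target affect only the stage that writes output.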

How was this patch tested?

Unit test.

SparkQA commented Jul 22, 2017

Test build #79869 has finished for PR 18713 at commit efb8cd3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Author

jinxing64 commented Jul 24, 2017

cc @cloud-fan @jiangxb1987
Could you please comment on this?

@cloud-fan
Contributor

Adaptive query execution is still an experimental and incomplete feature. I'm hesitant to modify it unless we have a holistic plan for how to improve it.

@jinxing64
Author

Ok, I will close this for now.

@jinxing64 jinxing64 closed this Jul 27, 2017
@jinxing64
Author

cc @cenyuhai
As we discussed offline, you may be interested in this.
