[SPARK-35332][SQL] Make cache plan disable configs configurable #32482
Conversation
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #138303 has finished for PR 32482 at commit
cc: @cloud-fan @c21
.doc("Configurations needs to be turned off, to avoid regression for cached query, so that " +
  "the outputPartitioning of the underlying cached query plan can be leveraged later.")
I think this comment is for developers, so it is difficult for a user to understand. Could you brush it up more?
.checkValue(_.forall(v => sqlConfEntries.containsKey(v) &&
  sqlConfEntries.get(v).defaultValue.exists(_.isInstanceOf[Boolean])),
  "config should be boolean type")
.createWithDefault(Seq(ADAPTIVE_EXECUTION_ENABLED.key, AUTO_BUCKETED_SCAN_ENABLED.key))
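For readers following along, the `checkValue` above rejects any listed key that is not a registered boolean conf. A minimal stand-alone sketch of that predicate (a plain Scala model with an illustrative registry map, not Spark's actual `sqlConfEntries`):

```scala
object ConfValidationSketch {
  // key -> declared default value; stands in for Spark's conf-entry registry.
  // These entries are illustrative, not Spark's real registry contents.
  val entries: Map[String, Option[Any]] = Map(
    "spark.sql.adaptive.enabled" -> Some(true),
    "spark.sql.sources.bucketing.autoBucketedScan.enabled" -> Some(true),
    "spark.sql.shuffle.partitions" -> Some(200) // Int default: not allowed
  )

  // Mirrors the checkValue predicate: every key must exist in the registry
  // and its default value must be a Boolean.
  def isValid(keys: Seq[String]): Boolean =
    keys.forall(k => entries.contains(k) && entries(k).exists(_.isInstanceOf[Boolean]))

  def main(args: Array[String]): Unit = {
    assert(isValid(Seq("spark.sql.adaptive.enabled")))
    // A non-boolean conf would trigger the "config should be boolean type" error.
    assert(!isValid(Seq("spark.sql.shuffle.partitions")))
    assert(!isValid(Seq("no.such.conf")))
    println("validation sketch ok")
  }
}
```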
Any use case to turn off these rules separately? I think it's okay just to use a boolean flag for this, though.
+1, I agree with starting with simpler config format if possible.
@@ -1554,4 +1554,39 @@ class CachedTableSuite extends QueryTest with SQLTestUtils
      assert(!spark.catalog.isCached(viewName))
    }
  }
test("SPARK-35332: Make cache plan disable configs configurable") {
Could you add tests for more patterns? e.g.,
sql("""SET spark.sql.cache.disableConfigs=spark.sql.adaptive.enabled""")
sql("CACHE TABLE test_table1 AS <query 1>")
spark.table("test_table1").explain(true) <= AQE disabled
sql("""SET spark.sql.cache.disableConfigs=""")
sql("CACHE TABLE test_table2 AS <query 2>")
spark.table("test_table2").explain(true) <= AQE enabled
spark.table("test_table1").explain(true) <= AQE disabled
sql("CACHE TABLE test_table3 AS <query 1>")
spark.table("test_table3").explain(true) <= AQE disabled
@@ -1090,6 +1090,18 @@ object SQLConf {
    .booleanConf
    .createWithDefault(true)

  val CACHE_DISABLE_CONFIGS =
    buildConf("spark.sql.cache.disableConfigs")
      .doc("Configurations needs to be turned off, to avoid regression for cached query, so that " +
Maybe, internal?
.internal()
When we expose something to the users, we have to consider the side effects when the users make a mistake. What happens when they touch this config in a wrong way? For example, the case of removing ADAPTIVE_EXECUTION_ENABLED from the conf. Is it okay and safe to expose this?
Also, cc @sunchao since this is related to the bucketed scan too.
}.getMessage
assert(msg.contains("config should be boolean type"))

Seq(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, "").foreach { disableConfig =>
could we also test for auto bucketed scan? thanks.
+1, I agree with starting with simpler config format if possible.
Thank you @maropu @c21 @dongjoon-hyun. Agreed, the current config seems like overkill for users; it's better to just make it a boolean flag. Refactored this PR to address the comments.
@@ -1175,7 +1175,7 @@ class CachedTableSuite extends QueryTest with SQLTestUtils
  }

  test("cache supports for intervals") {
    withTable("interval_cache") {
    withTable("interval_cache", "t1") {
Not related to this PR, but it affects the newly added test with t1.
In general, I think it's better to optimize the cached plan more aggressively for better performance, even though it may cause a perf regression due to output partitioning changes, which should be rare. About the config name, how about
cc @viirya as well, as he found the bug and raised the concern about auto bucketed scan for cached queries.
.doc(s"When true, some configs are disabled during executing cache plan that is to avoid " +
  s"performance regression if other queries hit the cached plan. Currently, the disabled " +
nit: don't need s"".
 */
private[sql] def getOrCloneSessionWithConfigsOff(
    session: SparkSession,
    configurations: Seq[ConfigEntry[Boolean]]): SparkSession = {
  val configsEnabled = configurations.filter(session.sessionState.conf.getConf(_))
  if (configsEnabled.isEmpty) {
  if (!session.sessionState.conf.getConf(SQLConf.CACHE_DISABLE_CONFIGS_ENABLED)) {
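For context on the helper being reviewed here: it reuses the session when none of the listed boolean configs are enabled, and otherwise clones the session with those configs turned off. A minimal stand-alone sketch of that clone-or-reuse decision, with a plain `Map` standing in for the session conf (names are illustrative, not Spark's API):

```scala
object SessionCloneSketch {
  // A Map stands in for SparkSession's conf; returning the same reference
  // models "reuse the original session", a modified copy models "clone
  // with the configs turned off".
  def getOrCloneWithConfigsOff(
      conf: Map[String, Boolean],
      configsToDisable: Seq[String]): Map[String, Boolean] = {
    val enabled = configsToDisable.filter(k => conf.getOrElse(k, false))
    if (enabled.isEmpty) conf // nothing to turn off: reuse the original
    else conf ++ enabled.map(_ -> false) // clone with those configs off
  }

  def main(args: Array[String]): Unit = {
    val conf = Map("spark.sql.adaptive.enabled" -> true, "other.flag" -> true)
    val cloned = getOrCloneWithConfigsOff(conf, Seq("spark.sql.adaptive.enabled"))
    assert(!cloned("spark.sql.adaptive.enabled") && cloned("other.flag"))
    // Nothing left enabled: the original reference comes back untouched.
    val same = getOrCloneWithConfigsOff(cloned, Seq("spark.sql.adaptive.enabled"))
    assert(same eq cloned)
    println("session clone sketch ok")
  }
}
```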
If getOrCloneSessionWithConfigsOff is used for disabling other configs, is it also automatically under the control of CACHE_DISABLE_CONFIGS_ENABLED?
Agree, it would introduce a potential issue in the future. Moved to CacheManager.
@@ -1069,21 +1069,26 @@ object SparkSession extends Logging {
  }

  /**
   * Returns a cloned SparkSession with all specified configurations disabled, or
   * the original SparkSession if all configurations are already disabled.
   * When CACHE_DISABLE_CONFIGS_ENABLED is enabled, returns a cloned SparkSession with all
This getOrCloneSessionWithConfigsOff is not claimed to be used only for the cache plan. It is a bit logically wrong to have a so-called cache disable config here.
So it sounds like this wants to have an aggressive option when preparing the cache plan, right? The config name can actually be refined; at first glance it confused me.
sql("CACHE TABLE t1 as SELECT /*+ REPARTITION */ * FROM values(1) as t(c)")
assert(spark.table("t1").rdd.partitions.length == 2)

sql(s"SET ${SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING.key} = true")
let's turn each of these SET ... into withSQLConf
Looks fine otherwise.
val CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING =
  buildConf("spark.sql.optimizer.canChangeCachedPlanOutputPartitioning")
    .internal()
    .doc(s"When false, some configs are disabled during executing cache plan that is to avoid " +
How about this?
Whether to forcibly enable some optimization rules that can change the output partitioning
of a cached query when executing it for caching. If it is set to true,
queries may need an extra shuffle to read the cached data. This configuration is disabled by default.
Currently, the optimization rules enabled by this configuration are
${ADAPTIVE_EXECUTION_ENABLED.key} and ${AUTO_BUCKETED_SCAN_ENABLED.key}.
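Folding that suggested wording into the entry, the conf would look roughly like this (a sketch following the conf DSL already shown in this diff, not the exact merged text):

```scala
val CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING =
  buildConf("spark.sql.optimizer.canChangeCachedPlanOutputPartitioning")
    .internal()
    .doc("Whether to forcibly enable some optimization rules that can change the output " +
      "partitioning of a cached query when executing it for caching. If it is set to true, " +
      "queries may need an extra shuffle to read the cached data. This configuration is " +
      "disabled by default. Currently, the optimization rules enabled by this configuration " +
      s"are ${ADAPTIVE_EXECUTION_ENABLED.key} and ${AUTO_BUCKETED_SCAN_ENABLED.key}.")
    .booleanConf
    .createWithDefault(false)
```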
SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true") {

withTempView("t1", "t2", "t3") {
  withCache("t1", "t2", "t3") {
withTempView drops caches, too?
yes, removed it.
sql("CACHE TABLE t2 as SELECT /*+ REPARTITION */ * FROM values(2) as t(c)")
assert(spark.table("t2").rdd.partitions.length == 1)

withSQLConf(SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING.key -> "false") {
why is this one nested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm expecting something like
withSQLConf(SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING.key -> "false") {
test1
}
withSQLConf(SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING.key -> "true") {
test2
}
withSQLConf(SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING.key -> "false") {
test3
}
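For readers unfamiliar with the helper being requested here: withSQLConf sets the given entries, runs the body, and restores the previous values on exit, which is why three sibling blocks are cleaner than nesting. A stand-alone sketch of that set/run/restore pattern over a plain mutable map (names are illustrative; Spark's real helper lives in its test utilities):

```scala
import scala.collection.mutable

object WithConfSketch {
  val conf = mutable.Map[String, String]()

  // Set the given pairs for the duration of `body`, then restore the previous
  // values, mirroring how withSQLConf isolates each test case.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body finally previous.foreach {
      case (k, Some(v)) => conf(k) = v
      case (k, None)    => conf.remove(k)
    }
  }

  def main(args: Array[String]): Unit = {
    val inside = withConf("x" -> "false") { conf("x") }
    assert(inside == "false")
    assert(!conf.contains("x")) // restored after the block exits
    println("withConf sketch ok")
  }
}
```

Because each block restores state on exit, consecutive sibling blocks start from the same baseline, so nesting them would only obscure which setting each test depends on.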
Updated it.
sql("CACHE TABLE t2")
assert(spark.table("t2").rdd.partitions.length == 1)

withSQLConf(SQLConf.CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING.key -> "false") {
ditto
thanks, merging to master!
@@ -328,4 +326,15 @@ class CacheManager extends Logging with AdaptiveSparkPlanHelper {
    if (needToRefresh) fileIndex.refresh()
    needToRefresh
  }

  /**
   * If CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING is disabled, just return original session.
"If CAN_CHANGE_CACHED_PLAN_OUTPUT_PARTITIONING is disabled" -> do we mean enabled?
good catch!
Sorry, forgot to fix the comment, create followup #32543.
lgtm with one minor comment.
What changes were proposed in this pull request?
Add a new config to make cache plan disable configs configurable.
Why are the changes needed?
The cache plan disables some configs to avoid performance regressions, but not every query is slower than before with AQE or bucketed scan enabled. It's useful to add a new config so that users can decide whether some configs should be disabled during cache planning.
Does this PR introduce any user-facing change?
Yes, a new config.
How was this patch tested?
Add test.