[SPARK-35792][SQL] View should not capture configs used in `RelationConversions`

### What changes were proposed in this pull request?
`RelationConversions` is actually an optimization rule, yet it is executed during the analysis phase.
Views are designed to capture only semantic configs, so the optimization configs that happen to be
used during the analysis phase should be ignored.

This PR also fixes the issue that view resolution always uses the default value for any uncaptured config.

### Why are the changes needed?
Bugfix

### Does this PR introduce _any_ user-facing change?
Yes. After this PR, view resolution will respect the values set in the current session for the following configs:
```
"spark.sql.hive.convertMetastoreParquet"
"spark.sql.hive.convertMetastoreOrc"
"spark.sql.hive.convertInsertingPartitionedTable"
"spark.sql.hive.convertMetastoreCtas"
```

### How was this patch tested?
By running the new unit test:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *HiveSQLViewSuite"
```

Closes #32941 from linhongliu-db/SPARK-35792-ignore-convert-configs.

Authored-by: Linhong Liu <linhong.liu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit b86a69f)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
linhongliu-db authored and cloud-fan committed Jun 17, 2021
1 parent 7338ab3 commit 33ee3d9
Showing 3 changed files with 40 additions and 2 deletions.
@@ -500,7 +500,18 @@ object View {
if (activeConf.useCurrentSQLConfigsForView && !isTempView) return activeConf

val sqlConf = new SQLConf()
// We retain below configs from current session because they are not captured by view
// as optimization configs but they are still needed during the view resolution.
// TODO: remove this `retainedConfigs` after the `RelationConversions` is moved to
// optimization phase.
val retainedConfigs = activeConf.getAllConfs.filterKeys(key =>
Seq(
"spark.sql.hive.convertMetastoreParquet",
"spark.sql.hive.convertMetastoreOrc",
"spark.sql.hive.convertInsertingPartitionedTable",
"spark.sql.hive.convertMetastoreCtas"
).contains(key))
for ((k, v) <- configs ++ retainedConfigs) {
sqlConf.settings.put(k, v)
}
sqlConf
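The `configs ++ retainedConfigs` merge above relies on `Map`'s `++` precedence: when the same key appears in both operands, the right-hand value wins, so even a pre-existing view that did capture one of these configs will resolve with the current session's value. A minimal plain-Scala sketch of that behavior (plain `Map`s stand in for `SQLConf`; the object and method names here are hypothetical):

```scala
object ViewConfMerge {
  // The four keys that view resolution takes from the active session
  // rather than from the view's captured properties.
  val retainedKeys: Set[String] = Set(
    "spark.sql.hive.convertMetastoreParquet",
    "spark.sql.hive.convertMetastoreOrc",
    "spark.sql.hive.convertInsertingPartitionedTable",
    "spark.sql.hive.convertMetastoreCtas")

  // Effective configs for view resolution: the captured configs overlaid
  // with the current session's values for the retained keys.
  def effectiveConfigs(
      captured: Map[String, String],
      session: Map[String, String]): Map[String, String] = {
    val retained = session.filter { case (k, _) => retainedKeys.contains(k) }
    // `++` keeps the right operand's value on duplicate keys,
    // so the session value overrides any stale captured value.
    captured ++ retained
  }
}
```

This mirrors why the fix is backward-safe for old views: their stale captured values for these keys are simply shadowed by the session's values.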
@@ -352,6 +352,11 @@ object ViewHelper {
"spark.sql.execution.",
"spark.sql.shuffle.",
"spark.sql.adaptive.",
// ignore optimization configs used in `RelationConversions`
"spark.sql.hive.convertMetastoreParquet",
"spark.sql.hive.convertMetastoreOrc",
"spark.sql.hive.convertInsertingPartitionedTable",
"spark.sql.hive.convertMetastoreCtas",
SQLConf.ADDITIONAL_REMOTE_REPOSITORIES.key)

private val configAllowList = Seq(
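The deny list above mixes whole prefixes (such as `spark.sql.execution.`) with the four exact `RelationConversions` keys; a config is skipped at view-creation time if it matches either. A hedged plain-Scala sketch of that filter (the real `ViewHelper` also consults an allow list; the names here are hypothetical):

```scala
object CaptureFilter {
  // Prefixes whose configs are never captured into a view's properties.
  private val denyPrefixes = Seq(
    "spark.sql.execution.",
    "spark.sql.shuffle.",
    "spark.sql.adaptive.")

  // Exact keys added by SPARK-35792: optimization configs used in
  // `RelationConversions` that view resolution takes from the session.
  private val denyKeys = Set(
    "spark.sql.hive.convertMetastoreParquet",
    "spark.sql.hive.convertMetastoreOrc",
    "spark.sql.hive.convertInsertingPartitionedTable",
    "spark.sql.hive.convertMetastoreCtas")

  // Capture a config only if it matches neither an exact denied key
  // nor a denied prefix.
  def shouldCapture(key: String): Boolean =
    !denyKeys.contains(key) && !denyPrefixes.exists(p => key.startsWith(p))
}
```

With this filter in place, new views never persist the four conversion configs, which is what makes retaining them from the session safe.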
@@ -19,8 +19,9 @@ package org.apache.spark.sql.hive.execution

import org.apache.spark.sql.{AnalysisException, Row}
import org.apache.spark.sql.catalyst.TableIdentifier
import org.apache.spark.sql.catalyst.catalog.{CatalogStorageFormat, CatalogTable, CatalogTableType, HiveTableRelation}
import org.apache.spark.sql.execution.SQLViewSuite
import org.apache.spark.sql.hive.HiveUtils
import org.apache.spark.sql.hive.test.TestHiveSingleton
import org.apache.spark.sql.types.{NullType, StructType}

@@ -157,4 +158,25 @@ class HiveSQLViewSuite extends SQLViewSuite with TestHiveSingleton {
)
}
}

test("SPARK-35792: ignore optimization configs used in RelationConversions") {
withTable("t_orc") {
withView("v_orc") {
withSQLConf(HiveUtils.CONVERT_METASTORE_ORC.key -> "true") {
spark.sql("create table t_orc stored as orc as select 1 as a, 2 as b")
spark.sql("create view v_orc as select * from t_orc")
}
withSQLConf(HiveUtils.CONVERT_METASTORE_ORC.key -> "false") {
val relationInTable = sql("select * from t_orc").queryExecution.analyzed.collect {
case r: HiveTableRelation => r
}.headOption
val relationInView = sql("select * from v_orc").queryExecution.analyzed.collect {
case r: HiveTableRelation => r
}.headOption
assert(relationInTable.isDefined)
assert(relationInView.isDefined)
}
}
}
}
}
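The test's detection trick is the `collect { ... }.headOption` pattern: walk the analyzed plan, keep every node of the wanted type, and assert on `isDefined`/`isEmpty`. A minimal plain-Scala analogue over a toy tree (Catalyst's `TreeNode.collect` scans the plan in the same spirit; the node types here are hypothetical stand-ins, not Spark classes):

```scala
// A toy plan tree standing in for Catalyst logical plans.
sealed trait Node { def children: Seq[Node] }
case class Relation(name: String, children: Seq[Node] = Nil) extends Node
case class Project(children: Seq[Node]) extends Node

object PlanCollect {
  // Pre-order traversal collecting every node the partial function accepts,
  // mirroring how the test scans queryExecution.analyzed for HiveTableRelation.
  def collect[A](node: Node)(pf: PartialFunction[Node, A]): Seq[A] =
    pf.lift(node).toSeq ++ node.children.flatMap(collect(_)(pf))
}
```

In the suite above, finding a `HiveTableRelation` in both plans under `convertMetastoreOrc=false` proves the view respected the session value instead of a captured or default one.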
