[SPARK-34359][SQL] Add a legacy config to restore the output schema of SHOW DATABASES

### What changes were proposed in this pull request?

This is a followup of #26006.

In #26006, we merged the v1 and v2 SHOW DATABASES/NAMESPACES commands, but we missed a behavior change: the output schema of SHOW DATABASES became different.

This PR adds a legacy config to restore the old schema, with a migration guide item to mention this behavior change.

### Why are the changes needed?

Improve backward compatibility.

### Does this PR introduce any user-facing change?

No (the legacy config is false by default).

### How was this patch tested?

A new test.

Closes #31474 from cloud-fan/command-schema.

Lead-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Co-authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan and cloud-fan committed Feb 5, 2021
1 parent 5f3b8b8 commit 4534b51
Showing 6 changed files with 33 additions and 5 deletions.
`docs/sql-migration-guide.md` (2 additions, 0 deletions):

```diff
@@ -72,6 +72,8 @@ license: |
 
 - In Spark 3.0.2, `PARTITION(col=null)` is always parsed as a null literal in the partition spec. In Spark 3.0.1 or earlier, it is parsed as a string literal of its text representation, e.g., string "null", if the partition column is string type. To restore the legacy behavior, you can set `spark.sql.legacy.parseNullPartitionSpecAsStringLiteral` as true.
 
+- In Spark 3.0.0, the output schema of `SHOW DATABASES` becomes `namespace: string`. In Spark version 2.4 and earlier, the schema was `databaseName: string`. Since Spark 3.0.2, you can restore the old schema by setting `spark.sql.legacy.keepCommandOutputSchema` to `true`.
+
 ## Upgrading from Spark SQL 3.0 to 3.0.1
 
 - In Spark 3.0, JSON datasource and JSON function `schema_of_json` infer TimestampType from string values if they match to the pattern defined by the JSON option `timestampFormat`. Since version 3.0.1, the timestamp type inference is disabled by default. Set the JSON option `inferTimestamp` to `true` to enable such type inference.
```
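For illustration, a minimal spark-shell sketch of the difference the new migration-guide item describes, assuming an active session bound to `spark` (the column names match what this PR's test asserts):

```scala
// Default since Spark 3.0: the output column is named `namespace`.
spark.sql("SHOW DATABASES").schema.fieldNames    // Array(namespace)

// With the legacy flag added by this PR, the Spark 2.4 name comes back.
spark.conf.set("spark.sql.legacy.keepCommandOutputSchema", "true")
spark.sql("SHOW DATABASES").schema.fieldNames    // Array(databaseName)
```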
`ShowNamespaces` logical plan:

```diff
@@ -321,11 +321,14 @@ case class AlterNamespaceSetLocation(
  */
 case class ShowNamespaces(
     namespace: LogicalPlan,
-    pattern: Option[String]) extends Command {
+    pattern: Option[String],
+    override val output: Seq[Attribute] = ShowNamespaces.OUTPUT) extends Command {
   override def children: Seq[LogicalPlan] = Seq(namespace)
+  override def producedAttributes: AttributeSet = outputSet
+}
 
-  override val output: Seq[Attribute] = Seq(
-    AttributeReference("namespace", StringType, nullable = false)())
+object ShowNamespaces {
+  val OUTPUT = Seq(AttributeReference("namespace", StringType, nullable = false)())
 }
 
 /**
```
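The refactor makes `output` a constructor field (defaulting to `ShowNamespaces.OUTPUT`) so that resolution rules can swap in a renamed attribute via `copy`. A minimal sketch, assuming the catalyst module on the classpath, of why renaming through `withName` is safe: it copies the attribute but keeps its `exprId`, so attribute identity is unchanged:

```scala
import org.apache.spark.sql.catalyst.expressions.AttributeReference
import org.apache.spark.sql.types.StringType

// The default output attribute, built the same way as ShowNamespaces.OUTPUT.
val ns = AttributeReference("namespace", StringType, nullable = false)()

// withName returns a copy with a new display name but the same exprId,
// so any plan referencing this attribute still resolves against it.
val legacy = ns.withName("databaseName")
assert(legacy.exprId == ns.exprId)
assert(legacy.name == "databaseName")
```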
`SQLConf`:

```diff
@@ -3017,6 +3017,15 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
 
+  val LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA =
+    buildConf("spark.sql.legacy.keepCommandOutputSchema")
+      .internal()
+      .doc("When true, Spark will keep the output schema of commands such as SHOW DATABASES " +
+        "unchanged, for v1 catalog and/or table.")
+      .version("3.0.2")
+      .booleanConf
+      .createWithDefault(false)
+
   /**
    * Holds information about keys that have been deprecated.
    *
```
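A small sketch of reading the new entry; `.internal()` only hides it from the generated configuration docs, and the flag stays settable like any other conf (assumes code running on a thread with an active session so `SQLConf.get` resolves):

```scala
import org.apache.spark.sql.internal.SQLConf

// Typed read; returns false unless the user overrides the default.
val keepOld: Boolean = SQLConf.get.getConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA)
```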
`ResolveSessionCatalog`:

```diff
@@ -237,6 +237,14 @@ class ResolveSessionCatalog(
       }
       AlterDatabaseSetLocationCommand(ns.head, location)
 
+    case s @ ShowNamespaces(ResolvedNamespace(cata, _), _, output) if isSessionCatalog(cata) =>
+      if (conf.getConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA)) {
+        assert(output.length == 1)
+        s.copy(output = Seq(output.head.withName("databaseName")))
+      } else {
+        s
+      }
+
     // v1 RENAME TABLE supports temp view.
     case RenameTableStatement(TempViewOrV1Table(oldName), newName, isView) =>
       AlterTableRenameCommand(oldName.asTableIdentifier, newName.asTableIdentifier, isView)
```
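Note the `isSessionCatalog(cata)` guard: only the v1 session catalog gets the legacy column name, while v2 catalogs keep `namespace` even with the flag on. A hedged sketch, where `testcat` is a hypothetical registered v2 catalog (not part of this PR):

```scala
spark.conf.set("spark.sql.legacy.keepCommandOutputSchema", "true")

// Session catalog: ResolveSessionCatalog rewrites the attribute name.
spark.sql("SHOW NAMESPACES").schema.fieldNames             // Array(databaseName)

// v2 catalog: the guard does not match, so the schema keeps `namespace`.
spark.sql("SHOW NAMESPACES IN testcat").schema.fieldNames  // Array(namespace)
```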
`DataSourceV2Strategy`:

```diff
@@ -315,8 +315,8 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat
     case DropNamespace(ResolvedNamespace(catalog, ns), ifExists, cascade) =>
       DropNamespaceExec(catalog, ns, ifExists, cascade) :: Nil
 
-    case r @ ShowNamespaces(ResolvedNamespace(catalog, ns), pattern) =>
-      ShowNamespacesExec(r.output, catalog.asNamespaceCatalog, ns, pattern) :: Nil
+    case ShowNamespaces(ResolvedNamespace(catalog, ns), pattern, output) =>
+      ShowNamespacesExec(output, catalog.asNamespaceCatalog, ns, pattern) :: Nil
 
     case r @ ShowTables(ResolvedNamespace(catalog, ns), pattern) =>
       ShowTablesExec(r.output, catalog.asTableCatalog, ns, pattern) :: Nil
```
`DDLSuite`:

```diff
@@ -1324,6 +1324,12 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
       Nil)
   }
 
+  test("SPARK-34359: keep the legacy output schema") {
+    withSQLConf(SQLConf.LEGACY_KEEP_COMMAND_OUTPUT_SCHEMA.key -> "true") {
+      assert(sql("SHOW NAMESPACES").schema.fieldNames.toSeq == Seq("databaseName"))
+    }
+  }
+
   test("drop view - temporary view") {
     val catalog = spark.sessionState.catalog
     sql(
```
