[SPARK-42398][SQL] Refine default column value framework #39942
Conversation
Force-pushed from 775cf24 to 2e5f661.
@@ -118,7 +119,7 @@ private[v2] trait V2JDBCNamespaceTest extends SharedSparkSession with DockerInte
    // Drop non empty namespace without cascade
    catalog.createNamespace(Array("foo"), commentMap.asJava)
    assert(catalog.namespaceExists(Array("foo")) === true)
    catalog.createTable(ident1, schema, Array.empty, emptyProps)
    catalog.createTable(ident1, schema, Array.empty[Transform], emptyProps)
Scala compiler type inference doesn't work well with arrays and function overloads. This is not a breaking change, as users only implement `createTable` and Spark calls it.
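For illustration, a minimal, self-contained sketch (toy types, not the real catalog API) of why the explicit element type is needed when an `Array` argument must pick between overloads:

```scala
object OverloadDemo extends App {
  trait Transform
  trait Column

  // Two overloads that differ only in the array element type, similar in
  // shape to the createTable overloads touched by this PR.
  def createTable(name: String, partitions: Array[Transform]): Unit =
    println(s"$name: partition overload")
  def createTable(name: String, columns: Array[Column]): Unit =
    println(s"$name: column overload")

  // createTable("t", Array.empty) // does not compile: ambiguous overload,
  //                               // the element type cannot be inferred
  createTable("t", Array.empty[Transform]) // OK: element type picks the overload
}
```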
dataType = typedVisit[DataType](ctx.dataType),
nullable = ctx.NULL == null,
comment = Option(ctx.commentSpec()).map(visitCommentSpec),
path = if (name.length > 1) UnresolvedFieldName(name.init) else RootTableSchema,
This change is necessary to make `QualifiedColType` an expression; otherwise we can't properly implement `withChildren`, as `QualifiedColType` has 2 optional children.
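Roughly, the issue is that rebuilding a node positionally from its children sequence is ambiguous when several children are optional. A toy sketch (not the actual Catalyst classes) of why a fixed placeholder like `RootTableSchema` helps:

```scala
// Toy stand-ins for expressions.
sealed trait Expr
case object RootTableSchema extends Expr
final case class FieldPath(names: Seq[String]) extends Expr
final case class Default(sql: String) extends Expr

// Two optional children: a children Seq of length 1 is ambiguous, because a
// single returned child could be either the path or the default value.
final case class Ambiguous(path: Option[Expr], default: Option[Expr]) {
  def children: Seq[Expr] = path.toSeq ++ default.toSeq
}

// Filling the absent path with RootTableSchema keeps positions stable:
// children(0) is always the path, so rebuilding from new children is safe.
final case class Stable(path: Expr, default: Option[Expr]) {
  def children: Seq[Expr] = path +: default.toSeq
}
```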
Sounds good, it is convenient if `QualifiedColType` is an expression.
@@ -195,6 +196,10 @@ case class ResolvedFieldName(path: Seq[String], field: StructField) extends Fiel
  def name: Seq[String] = path :+ field.name
}

case object RootTableSchema extends FieldName {
The reason to add it: https://github.com/apache/spark/pull/39942/files#r1102393297
val defaultExpression: Option[Expression] = if (action.defaultExpression != null) {
  Some(visitDefaultExpression(action.defaultExpression))
} else if (action.dropDefault != null) {
  Some(DropDefaultColumnValue)
We could use `Some(null)` to indicate removing the default value, but I think this is better.
Yes, this is more explicit.
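For context, a minimal sketch of the sentinel approach under discussion (toy types, not the actual parser code):

```scala
sealed trait Expr
final case class DefaultExpr(sql: String) extends Expr
// A dedicated sentinel is self-documenting and pattern-matchable, and cannot
// be confused with a genuinely missing value the way Some(null) could be.
case object DropDefaultColumnValue extends Expr

def describeChange(default: Option[Expr]): String = default match {
  case Some(DropDefaultColumnValue) => "drop the existing default"
  case Some(DefaultExpr(sql))       => s"set the default to $sql"
  case None                         => "leave the default unchanged"
}
```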
@@ -3091,12 +3091,11 @@ object SQLConf {
  buildConf("spark.sql.defaultColumn.allowedProviders")
    .internal()
    .doc("List of table providers wherein SQL commands are permitted to assign DEFAULT column " +
      "values. Comma-separated list, whitespace ignored, case-insensitive. If an asterisk " +
      "appears after any table provider in this list, any command may assign DEFAULT column " +
      "except `ALTER TABLE ... ADD COLUMN`. Otherwise, if no asterisk appears, all commands " +
I don't think it is reasonable to allow a builtin data source to partially support column default values. Custom v2 catalogs can reject added columns by themselves in `TableCatalog.alterTable`, and builtin file sources must fully support default columns, as ADD COLUMN is not the only issue: people can do `CREATE TABLE t (with default value) USING parquet LOCATION ...` where the existing data files do not have the columns with default values.
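A hedged illustration of that scenario (table name and path are made up, and a `spark` session is assumed to be in scope):

```scala
// The Parquet files at this location were written before column c2 existed,
// so reads must surface the declared default for old rows. A source that
// supported defaults only partially would silently return NULLs instead.
spark.sql("""
  CREATE TABLE t (c1 INT, c2 INT DEFAULT 42)
  USING parquet LOCATION '/tmp/existing_parquet_data'
""")
spark.sql("SELECT c2 FROM t") // old rows should show 42, not NULL
```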
The argument makes sense in spirit, but at least one data source has already used this functionality of the default column providers which would be broken with this change. Let's talk offline.
// Analyze the default column values.
val statementType = "CREATE TABLE"
assert(ResolveDefaultColumns.analyze(columnA, statementType).sql == "42")
no need to test this specific analyzer any more. The default expression will be resolved together with the commands by the main analyzer.
@@ -1043,9 +1043,8 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {

test("SPARK-38336 INSERT INTO statements with tables with default columns: negative tests") {
  object Errors {
    val COMMON_SUBSTRING = " has a DEFAULT value"
    val COMMON_SUBSTRING = "has an invalid DEFAULT value"
TODO: we should move default value tests to a new suite and use the new error framework to check errors.
  }
}

test("SPARK-39844 Restrict adding DEFAULT columns for existing tables to certain sources") {
Per response to that comment, it looks likely we'll have to retain this functionality.
Force-pushed from 92e8fb0 to 0be5856.
@@ -48,18 +48,16 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu

def notSupportsTableComment: Boolean = false

val defaultMetadata = new MetadataBuilder().putLong("scale", 0).build()
With the new `Column` abstraction, JDBC tables no longer need to expose the internal special field metadata, which is only used by `JDBCRDD`.
Force-pushed from 486288b to a24edc8.
cc @dtenedor
Thanks @cloud-fan for working on this, I reviewed the first 20/68 files (up through and including sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/Column.java).
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
    case _ =>
  }
}

def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
This helper could go in the `ResolveDefaultColumns` object in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala.
def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
  // Do not check anything if the children are not resolved yet.
  if (!plan.childrenResolved) return
Optional: instead of the `return`, we could just add a `case _ if !plan.childrenResolved => return` below. This might be generally safer in the vicinity of Catalyst rules which accept partial functions as arguments.
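A small sketch of the two styles (simplified; `check` stands in for the real method):

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Early return in the method body:
def check(plan: LogicalPlan): Unit = {
  if (!plan.childrenResolved) return
  // ... perform the default-value checks ...
}

// Guard case at the top of the match: this shape also survives being
// converted into a partial function handed to a rule executor.
def checkViaMatch(plan: LogicalPlan): Unit = plan match {
  case p if !p.childrenResolved => // skip until children are resolved
  case p => // ... perform the default-value checks ...
}
```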
    checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)

  case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
    val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
Should we bind this in the pattern match instead of using `asInstanceOf`? Same on L1459 below.
It will make the `case` statement very long and hard to read. Let me improve it a little bit.
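To illustrate the trade-off with toy types (not the real plan nodes):

```scala
sealed trait Plan
final case class ResolvedIdentifier(catalog: String, ident: String) extends Plan
final case class CreateTablePlan(name: Plan, hasDefaults: Boolean) extends Plan

// Cast after matching, as in the current PR:
def handleWithCast(plan: Plan): Unit = plan match {
  case cmd: CreateTablePlan if cmd.hasDefaults =>
    val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
    println(ident)
  case _ =>
}

// Binding inside the pattern avoids the cast but lengthens the case head:
def handleWithBinding(plan: Plan): Unit = plan match {
  case CreateTablePlan(ident: ResolvedIdentifier, true) =>
    println(ident)
  case _ =>
}
```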
    provider: Option[String]): Unit = {
  // We only need to check table provider for the session catalog. Other custom v2 catalogs
  // can check table providers in their implementations of createTable, alterTable, etc.
  if (!CatalogV2Util.isSessionCatalog(catalog)) return
Same as above: we should probably reverse the logic and put the code below in the `if` block body instead.
case names =>
  replace(schema, names.init, parent => parent.dataType match {
    case parentType: StructType =>
      val field = StructField(names.last, add.dataType, nullable = add.isNullable)
      val fieldWithDefault: StructField =
        Option(add.defaultValue).map(field.withCurrentDefaultValue).getOrElse(field)
      val fieldWithDefault: StructField = Option(add.defaultValue).map { default =>
this is duplicated with L145-148 above, should we dedup into one place?
if (!e.resolved || !e.foldable) {
  // This should not happen as we do validation before setting column default values, but
  // in case something goes wrong, we treat it as no default value.
  null
We should throw an exception here instead of returning NULL, since that would constitute an incorrect result.
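A sketch of the suggested fix; `SparkException.internalError` exists in recent Spark versions, but treat the exact call and message as assumptions:

```scala
import org.apache.spark.SparkException

if (!e.resolved || !e.foldable) {
  // Validation should have rejected this earlier; failing loudly is safer
  // than silently treating the column as having no default value.
  throw SparkException.internalError(
    s"Default value expression is not resolved or not foldable: $e")
}
```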
Force-pushed from 793e2ee to 9c78055.
@@ -21,7 +21,7 @@
    }
  }],
  "scalarScalaUdf": {
"payload": "rO0ABXNyAC1vcmcuYXBhY2hlLnNwYXJrLnNxbC5jb25uZWN0LmNvbW1vbi5VZGZQYWNrZXR7DRDpFmMz1QIAA0wACGZ1bmN0aW9udAASTGphdmEvbGFuZy9PYmplY3Q7TAANaW5wdXRFbmNvZGVyc3QAFkxzY2FsYS9jb2xsZWN0aW9uL1NlcTtMAA1vdXRwdXRFbmNvZGVydAA4TG9yZy9hcGFjaGUvc3Bhcmsvc3FsL2NhdGFseXN0L2VuY29kZXJzL0Fnbm9zdGljRW5jb2Rlcjt4cHNyACVvcmcuYXBhY2hlLnNwYXJrLnNxbC5UZXN0VURGcyQkYW5vbiQyWQlE7Ce2cPkCAAB4cHNyACRzY2FsYS5jb2xsZWN0aW9uLm11dGFibGUuQXJyYXlCdWZmZXIVOLBTg4KOcwIAA0kAC2luaXRpYWxTaXplSQAFc2l6ZTBbAAVhcnJheXQAE1tMamF2YS9sYW5nL09iamVjdDt4cAAAABAAAAABdXIAE1tMamF2YS5sYW5nLk9iamVjdDuQzlifEHMpbAIAAHhwAAAAAXNyAE1vcmcuYXBhY2hlLnNwYXJrLnNxbC5jYXRhbHlzdC5lbmNvZGVycy5BZ25vc3RpY0VuY29kZXJzJFByaW1pdGl2ZUxvbmdFbmNvZGVyJE3e2W1O/wUaAgAAeHIATG9yZy5hcGFjaGUuc3Bhcmsuc3FsLmNhdGFseXN0LmVuY29kZXJzLkFnbm9zdGljRW5jb2RlcnMkUHJpbWl0aXZlTGVhZkVuY29kZXLf4z0LufMQmwIAAHhyAENvcmcuYXBhY2hlLnNwYXJrLnNxbC5jYXRhbHlzdC5lbmNvZGVycy5BZ25vc3RpY0VuY29kZXJzJExlYWZFbmNvZGVyKtoOee2SqKMCAANaAAtpc1ByaW1pdGl2ZUwABmNsc1RhZ3QAGExzY2FsYS9yZWZsZWN0L0NsYXNzVGFnO0wACGRhdGFUeXBldAAlTG9yZy9hcGFjaGUvc3Bhcmsvc3FsL3R5cGVzL0RhdGFUeXBlO3hwAXNyACpzY2FsYS5yZWZsZWN0Lk1hbmlmZXN0RmFjdG9yeSRMb25nTWFuaWZlc3QAAAAAAAAAAQIAAHhyABxzY2FsYS5yZWZsZWN0LkFueVZhbE1hbmlmZXN0AAAAAAAAAAECAAFMAAh0b1N0cmluZ3QAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAETG9uZ3NyACRvcmcuYXBhY2hlLnNwYXJrLnNxbC50eXBlcy5Mb25nVHlwZSStAg1XC0z3OwIAAHhwc3IARm9yZy5hcGFjaGUuc3Bhcmsuc3FsLmNhdGFseXN0LmVuY29kZXJzLkFnbm9zdGljRW5jb2RlcnMkUHJvZHVjdEVuY29kZXIcqKluUDodYQIAA0wABmNsc1RhZ3EAfgAPTAAGZmllbGRzcQB+AAJMAAZzY2hlbWF0ACdMb3JnL2FwYWNoZS9zcGFyay9zcWwvdHlwZXMvU3RydWN0VHlwZTt4cHNyACZzY2FsYS5yZWZsZWN0LkNsYXNzVGFnJEdlbmVyaWNDbGFzc1RhZwAAAAAAAAABAgABTAAMcnVudGltZUNsYXNzdAARTGphdmEvbGFuZy9DbGFzczt4cHZyAAxzY2FsYS5UdXBsZTHPDUf18JuzPAIAAUwAAl8xcQB+AAF4cHNyADJzY2FsYS5jb2xsZWN0aW9uLmltbXV0YWJsZS5MaXN0JFNlcmlhbGl6YXRpb25Qcm94eQAAAAAAAAABAwAAeHBzcgBEb3JnLmFwYWNoZS5zcGFyay5zcWwuY2F0YWx5c3QuZW5jb2RlcnMuQWdub3N0aWNFbmNvZGVycyRFbmNvZGVyRmllbGTkpQPl0rttUgIABloACG51bGxhYmxlTAADZW5jcQB+AANMAAhtZXRhZGF0YXQAJUxvcmcvYXBhY2hlL3NwYXJrL3NxbC90eXBlcy9NZXRhZGF0YTtMAARuYW1lcQB+ABRMAApyZWFkTWV0aG9kdAAOTHNjYWxhL09wdGlvbjtMAAt3cml0ZU1ldGhvZHEAfgAleHAAcQB+ABFzcgAjb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuTWV0YWRhdGFtjWLvldkl+wIAA0kACV9oYXNoQ29kZVoACGJpdG1hcCQwTAADbWFwdAAgTHNjYWxhL2NvbGxlY3Rpb24vaW1tdXRhYmxlL01hcDt4cAAAAAAAc3IAKHNjYWxhLmNvbGxlY3Rpb24uaW1tdXRhYmxlLk1hcCRFbXB0eU1hcCSx6xuFbkKAywIAAHhwdAACXzFzcgALc2NhbGEuTm9uZSRGUCT2U8qUrAIAAHhyAAxzY2FsYS5PcHRpb27+aTf92w5mdAIAAHhwcQB+AC9zcgAsc2NhbGEuY29sbGVjdGlvbi5pbW11dGFibGUuTGlzdFNlcmlhbGl6ZUVuZCSKXGNb91MLbQIAAHhweHNyACVvcmcuYXBhY2hlLnNwYXJrLnNxbC50eXBlcy5TdHJ1Y3RUeXBl9I34EN1tjkUCAAlJAAlfaGFzaENvZGVCAAhiaXRtYXAkMFoAGWhhc0V4aXN0ZW5jZURlZmF1bHRWYWx1ZXNbABZleGlzdGVuY2VEZWZhdWx0VmFsdWVzcQB+AAhbABhleGlzdGVuY2VEZWZhdWx0c0JpdG1hc2t0AAJbWkwADWZpZWxkTmFtZXNTZXR0ACBMc2NhbGEvY29sbGVjdGlvbi9pbW11dGFibGUvU2V0O1sABmZpZWxkc3QAKVtMb3JnL2FwYWNoZS9zcGFyay9zcWwvdHlwZXMvU3RydWN0RmllbGQ7TAALbmFtZVRvRmllbGR0ABZMc2NhbGEvY29sbGVjdGlvbi9NYXA7TAALbmFtZVRvSW5kZXhxAH4ANnhwAAAAAAAAcHBwdXIAKVtMb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuU3RydWN0RmllbGQ7tWPEaGAaDUcCAAB4cAAAAAFzcgAmb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuU3RydWN0RmllbGQrgSSJZ9l3nwIABFoACG51bGxhYmxlTAAIZGF0YVR5cGVxAH4AEEwACG1ldGFkYXRhcQB+ACRMAARuYW1lcQB+ABR4cABxAH4AGHEAfgApcQB+ACxwcA==", |
This change is needed because some methods in `StructType` are now marked as private.
Reviewed the rest of the files. Thanks again for doing this.
private[sql] def toColumns: Array[Column] = fields.map { f =>
  val defaultValue = f.getCurrentDefaultValue().map { sql =>
    val existDefaultOpt = f.getExistenceDefaultValue()
    assert(existDefaultOpt.isDefined, "current and exist default must be both set or neither")
Technically, the existence default value can be set while the current default value is absent, if we do `create table t (col int default 42) using parquet` and then `alter table t alter column col drop default`. Maybe reword this message to "if the current default value is set, the existence default value must also be set"?
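The sequence in question, spelled out (assuming a `spark` session is in scope):

```scala
// Creating the column sets BOTH the current default (for future inserts)
// and the existence default (what rows written before the column existed
// should surface on read).
spark.sql("CREATE TABLE t (col INT DEFAULT 42) USING parquet")
// Dropping the default removes only the current default; the existence
// default must survive so previously written files still read as 42.
spark.sql("ALTER TABLE t ALTER COLUMN col DROP DEFAULT")
```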
val defaultValue = f.getCurrentDefaultValue().map { sql =>
  val existDefaultOpt = f.getExistenceDefaultValue()
  assert(existDefaultOpt.isDefined, "current and exist default must be both set or neither")
  val e = CatalystSqlParser.parseExpression(f.getExistenceDefaultValue().get)
Suggested change:
-    val e = CatalystSqlParser.parseExpression(f.getExistenceDefaultValue().get)
+    val e = CatalystSqlParser.parseExpression(existDefaultOpt.get)
@@ -471,34 +471,22 @@ class StructTypeSuite extends SparkFunSuite with SQLHelper {
  assert(source1.existenceDefaultValues(1) == UTF8String.fromString("abc"))
  assert(source1.existenceDefaultValues(2) == null)

  // Positive test: StructType.defaultValues works because the existence default value parses and
  // resolves successfully, then evaluates to a non-literal expression: this is constant-folded at
do we still test this constant-folding somewhere?
This is done in a more type-safe way, via `DefaultValueExpression`. The analyzer makes sure `DefaultValueExpression.child` is foldable, and at execution time we require `DefaultValueExpression.child` to be a literal before passing it to catalogs.
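A simplified sketch of that wrapper pattern (the real `DefaultValueExpression` lives in Catalyst and carries more information, e.g. the original SQL text; treat the exact shape as approximate):

```scala
import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, UnaryExpression, Unevaluable}
import org.apache.spark.sql.types.DataType

// Marks its child as a column default. Never evaluated directly: the analyzer
// resolves the child, and constant folding should reduce it to a Literal.
case class DefaultValueExpression(child: Expression)
    extends UnaryExpression with Unevaluable {
  override def dataType: DataType = child.dataType
  override protected def withNewChildInternal(newChild: Expression): Expression =
    copy(child = newChild)

  // At execution time, require a literal before handing it to a catalog.
  def toLiteral: Literal = child match {
    case lit: Literal => lit
    case other => throw new IllegalStateException(s"not a literal: $other")
  }
}
```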
// Check default values before converting to v1 command.
DefaultColumnUtil.checkDefaultValuesInPlan(a, isForV1 = true)
val cols = if (colsToAdd.exists(_.column.defaultValue.isDefined)) {
  // Do a constant-folding, as we need to store the expression SQL string which should be in
we should probably put this into a util method in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala for better clarity of the code
@@ -37,6 +38,7 @@ case class DescribeTableExec(
  addPartitioning(rows)

  if (isExtended) {
    addColumnDefaultValue(rows)
does this add the column default information earlier in the result than before? that might be OK...
yea, I think this is more important than general table properties like owner and location.
  createTable(ident, CatalogV2Util.v2ColumnsToStructType(columns), partitions, properties)
}

// TODO: remove it when no tests calling this deprecated method.
There are a lot of method changes like this in this PR. Would it be better to clean up the code by updating the call sites? Or should we do that in a separate PR?
There are too many call sites in the tests, I'd like to clean it up in a separate PR.
@@ -1058,21 +1057,21 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {
  assert(intercept[AnalysisException] {
    sql("create table t(i boolean, s bigint default (select min(x) from badtable)) " +
      "using parquet")
  }.getMessage.contains(Errors.BAD_SUBQUERY))
this no longer returns an error that subquery expressions are not allowed in default values. Can we have a test that checks this case?
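One possible shape for such a test (the expected message fragment is an assumption; the exact error class should come from the new error framework):

```scala
test("SPARK-42398: subquery expressions are not allowed in DEFAULT values") {
  val e = intercept[AnalysisException] {
    sql("create table t(i boolean, s bigint default (select 42)) using parquet")
  }
  assert(e.getMessage.toLowerCase.contains("subquery"))
}
```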
Force-pushed from 06550d9 to 2e92f2e.
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
The current column default value framework has a few problems:
1. The DDL commands keep the column definitions as `StructType`. We must store the default value as SQL text in `StructField` metadata, then parse and resolve it with a hacky analyzer later.
2. The v2 `createTable` API only takes `StructType`, so implementations must know the special metadata key of the default value to access it. The `TableChange` API has the default value as an individual field.

This PR refines the default value framework to solve these problems:
1. The DDL commands now keep the column definitions as a new `Column` class, which keeps the default value as a catalyst `Expression`. It can then be resolved together with the DDL commands by the main analyzer, and validated in `CheckAnalysis`.
2. The v2 `createTable` API now takes `Column` (different from the catalyst `Column`), which holds the default value as a v2 `Literal` as well as the original SQL text. This avoids the need for a special metadata key and is also more extensible when adding more special columns like generated columns. It is also type-safe and makes sure the default value is a literal. The implementation is free to decide how to encode and store default values. Note: backward compatibility is taken care of.

For custom v2 catalogs, the workflow is:
1. The parser parses the default value SQL text into a catalyst expression and keeps it in the column definition.
2. The main analyzer resolves the default value expression together with the DDL command.
3. `CheckAnalysis` validates the default value expression and makes sure it's a literal.
4. The default value is passed to the catalog as a v2 `Literal` via the new `Column` API (see the sketch after this description).

For the v1 catalog (the Hive catalog), the workflow shares steps 1 and 2. After step 2, we have 2 branches:
1. `ResolveSessionCatalog` turns some DDL commands into their v1 versions. It invokes `CheckAnalysis` to do the validation, invokes `ConstantFolding` to create constants, and puts the default value constants as SQL text in `StructField` metadata.
2. The other branch still runs as v2 commands: `CheckAnalysis` validates the default value expression and makes sure it's a literal, and `V2SessionCatalog`, the v2 API adaptor of the v1 catalog, turns the v2 literal into SQL text and puts it in `StructField` metadata.

TODO (in followup PRs): the consumer side (DML commands) should only parse the default value text and use the main analyzer to resolve it. Then we can remove the hacky analyzer completely.
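As referenced in the workflow above, a rough sketch of how a v2 catalog implementation might consume the new API (method names follow this PR's `Column` and `ColumnDefaultValue` interfaces; treat exact signatures as approximate):

```scala
import org.apache.spark.sql.connector.catalog.{Column, ColumnDefaultValue}

// The catalog receives each default already validated and constant-folded:
// the original SQL text plus a v2 literal, with no metadata-key lookups.
def describeDefaults(columns: Array[Column]): Unit = columns.foreach { col =>
  Option(col.defaultValue()) match {
    case Some(d) =>
      println(s"${col.name()} DEFAULT ${d.getSql()} (folded literal: ${d.getValue()})")
    case None =>
      println(s"${col.name()} has no default value")
  }
}
```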
Why are the changes needed?
Refine the framework before releasing it
Does this PR introduce any user-facing change?
no
How was this patch tested?
existing tests