[SPARK-42398][SQL] Refine default column value framework #39942
Conversation
Force-pushed from 775cf24 to 2e5f661.
@@ -118,7 +119,7 @@ private[v2] trait V2JDBCNamespaceTest extends SharedSparkSession with DockerInte
    // Drop non empty namespace without cascade
    catalog.createNamespace(Array("foo"), commentMap.asJava)
    assert(catalog.namespaceExists(Array("foo")) === true)
    catalog.createTable(ident1, schema, Array.empty, emptyProps)
    catalog.createTable(ident1, schema, Array.empty[Transform], emptyProps)
Scala compiler type inference doesn't work well with arrays and function overloads. This is not a breaking change, as users only implement `createTable` and Spark calls it.
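For illustration, a minimal, self-contained sketch (toy types, not the real catalog API) of why the explicit element type is needed when an `Array` argument must pick between overloads:

```scala
object OverloadDemo extends App {
  trait Transform
  trait Column

  // Two overloads that differ only in the array element type, similar in
  // shape to the createTable overloads touched by this PR.
  def createTable(name: String, partitions: Array[Transform]): Unit =
    println(s"$name: partition overload")
  def createTable(name: String, columns: Array[Column]): Unit =
    println(s"$name: column overload")

  // createTable("t", Array.empty) // does not compile: ambiguous overload,
  //                               // the element type cannot be inferred
  createTable("t", Array.empty[Transform]) // OK: element type picks the overload
}
```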
dataType = typedVisit[DataType](ctx.dataType),
nullable = ctx.NULL == null,
comment = Option(ctx.commentSpec()).map(visitCommentSpec),
path = if (name.length > 1) UnresolvedFieldName(name.init) else RootTableSchema,
This change is necessary to make `QualifiedColType` an expression; otherwise we can't properly implement `withChildren`, as `QualifiedColType` has 2 optional children.
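Roughly, the issue is that rebuilding a node positionally from its children sequence is ambiguous when several children are optional. A toy sketch (not the actual Catalyst classes) of why a fixed placeholder like `RootTableSchema` helps:

```scala
// Toy stand-ins for expressions.
sealed trait Expr
case object RootTableSchema extends Expr
final case class FieldPath(names: Seq[String]) extends Expr
final case class Default(sql: String) extends Expr

// Two optional children: a children Seq of length 1 is ambiguous, because a
// single returned child could be either the path or the default value.
final case class Ambiguous(path: Option[Expr], default: Option[Expr]) {
  def children: Seq[Expr] = path.toSeq ++ default.toSeq
}

// Filling the absent path with RootTableSchema keeps positions stable:
// children(0) is always the path, so rebuilding from new children is safe.
final case class Stable(path: Expr, default: Option[Expr]) {
  def children: Seq[Expr] = path +: default.toSeq
}
```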
Sounds good, it is convenient if `QualifiedColType` is an expression.
@@ -195,6 +196,10 @@ case class ResolvedFieldName(path: Seq[String], field: StructField) extends Fiel
  def name: Seq[String] = path :+ field.name
}

case object RootTableSchema extends FieldName {
The reason to add it: https://github.com/apache/spark/pull/39942/files#r1102393297
val defaultExpression: Option[Expression] = if (action.defaultExpression != null) {
  Some(visitDefaultExpression(action.defaultExpression))
} else if (action.dropDefault != null) {
  Some(DropDefaultColumnValue)
We could use `Some(null)` to indicate removing the default value, but I think this is better.
Yes, this is more explicit.
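For context, a minimal sketch of the sentinel approach under discussion (toy types, not the actual parser code):

```scala
sealed trait Expr
final case class DefaultExpr(sql: String) extends Expr
// A dedicated sentinel is self-documenting and pattern-matchable, and cannot
// be confused with a genuinely missing value the way Some(null) could be.
case object DropDefaultColumnValue extends Expr

def describeChange(default: Option[Expr]): String = default match {
  case Some(DropDefaultColumnValue) => "drop the existing default"
  case Some(DefaultExpr(sql))       => s"set the default to $sql"
  case None                         => "leave the default unchanged"
}
```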
@@ -3091,12 +3091,11 @@ object SQLConf {
  buildConf("spark.sql.defaultColumn.allowedProviders")
    .internal()
    .doc("List of table providers wherein SQL commands are permitted to assign DEFAULT column " +
      "values. Comma-separated list, whitespace ignored, case-insensitive. If an asterisk " +
      "appears after any table provider in this list, any command may assign DEFAULT column " +
      "except `ALTER TABLE ... ADD COLUMN`. Otherwise, if no asterisk appears, all commands " +
I don't think it is reasonable to allow a builtin data source to partially support column default values. Custom v2 catalogs can reject added columns by themselves in `TableCatalog.alterTable`, and builtin file sources must fully support default columns, as ADD COLUMN is not the only issue: people can do `CREATE TABLE t (with default value) USING parquet LOCATION ...` where the existing data files do not have the columns with default values.
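A hedged illustration of that scenario (table name and path are made up, and a `spark` session is assumed to be in scope):

```scala
// The Parquet files at this location were written before column c2 existed,
// so reads must surface the declared default for old rows. A source that
// supported defaults only partially would silently return NULLs instead.
spark.sql("""
  CREATE TABLE t (c1 INT, c2 INT DEFAULT 42)
  USING parquet LOCATION '/tmp/existing_parquet_data'
""")
spark.sql("SELECT c2 FROM t") // old rows should show 42, not NULL
```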
The argument makes sense in spirit, but at least one data source has already used this functionality of the default column providers which would be broken with this change. Let's talk offline.
// Analyze the default column values.
val statementType = "CREATE TABLE"
assert(ResolveDefaultColumns.analyze(columnA, statementType).sql == "42")
no need to test this specific analyzer any more. The default expression will be resolved together with the commands by the main analyzer.
@@ -1043,9 +1043,8 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {

test("SPARK-38336 INSERT INTO statements with tables with default columns: negative tests") {
  object Errors {
    val COMMON_SUBSTRING = " has a DEFAULT value"
    val COMMON_SUBSTRING = "has an invalid DEFAULT value"
TODO: we should move default value tests to a new suite and use the new error framework to check errors.
  }
}

test("SPARK-39844 Restrict adding DEFAULT columns for existing tables to certain sources") {
Per response to that comment, it looks likely we'll have to retain this functionality.
Force-pushed from 92e8fb0 to 0be5856.
@@ -48,18 +48,16 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu

def notSupportsTableComment: Boolean = false

val defaultMetadata = new MetadataBuilder().putLong("scale", 0).build()
With the new `Column` abstraction, JDBC tables no longer need to expose the internal special field metadata, which is only used by `JDBCRDD`.
Force-pushed from 486288b to a24edc8.
cc @dtenedor
Thanks @cloud-fan for working on this, I reviewed the first 20/68 files (up through and including sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/Column.java).
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
    case _ =>
  }
}

def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
This helper could go in the `ResolveDefaultColumns` object in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala.
def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
  // Do not check anything if the children are not resolved yet.
  if (!plan.childrenResolved) return
Optional: instead of the `return`, we could just add a `case _ if !plan.childrenResolved => return` below. This might be generally safer in the vicinity of Catalyst rules which accept partial functions as arguments.
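A small sketch of the two styles (simplified; `check` stands in for the real method):

```scala
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Early return in the method body:
def check(plan: LogicalPlan): Unit = {
  if (!plan.childrenResolved) return
  // ... perform the default-value checks ...
}

// Guard case at the top of the match: this shape also survives being
// converted into a partial function handed to a rule executor.
def checkViaMatch(plan: LogicalPlan): Unit = plan match {
  case p if !p.childrenResolved => // skip until children are resolved
  case p => // ... perform the default-value checks ...
}
```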
    checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)

  case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
    val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
Should we bind this in the pattern match instead of using `asInstanceOf`? Same on L1459 below.
It will make the `case` statement very long and hard to read. Let me improve it a little bit.
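To illustrate the trade-off with toy types (not the real plan nodes):

```scala
sealed trait Plan
final case class ResolvedIdentifier(catalog: String, ident: String) extends Plan
final case class CreateTablePlan(name: Plan, hasDefaults: Boolean) extends Plan

// Cast after matching, as in the current PR:
def handleWithCast(plan: Plan): Unit = plan match {
  case cmd: CreateTablePlan if cmd.hasDefaults =>
    val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
    println(ident)
  case _ =>
}

// Binding inside the pattern avoids the cast but lengthens the case head:
def handleWithBinding(plan: Plan): Unit = plan match {
  case CreateTablePlan(ident: ResolvedIdentifier, true) =>
    println(ident)
  case _ =>
}
```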
    provider: Option[String]): Unit = {
  // We only need to check table provider for the session catalog. Other custom v2 catalogs
  // can check table providers in their implementations of createTable, alterTable, etc.
  if (!CatalogV2Util.isSessionCatalog(catalog)) return
Same as above: we should probably reverse the logic and put the code below in the `if` block body instead.
case names =>
  replace(schema, names.init, parent => parent.dataType match {
    case parentType: StructType =>
      val field = StructField(names.last, add.dataType, nullable = add.isNullable)
      val fieldWithDefault: StructField =
        Option(add.defaultValue).map(field.withCurrentDefaultValue).getOrElse(field)
      val fieldWithDefault: StructField = Option(add.defaultValue).map { default =>
this is duplicated with L145-148 above, should we dedup into one place?
if (!e.resolved || !e.foldable) {
  // This should not happen as we do validation before setting column default values, but
  // in case something goes wrong, we treat it as no default value.
  null
We should throw an exception here instead of returning NULL, since that would constitute an incorrect result.
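A sketch of the suggested fix; `SparkException.internalError` exists in recent Spark versions, but treat the exact call and message as assumptions:

```scala
import org.apache.spark.SparkException

if (!e.resolved || !e.foldable) {
  // Validation should have rejected this earlier; failing loudly is safer
  // than silently treating the column as having no default value.
  throw SparkException.internalError(
    s"Default value expression is not resolved or not foldable: $e")
}
```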
Force-pushed from 793e2ee to 9c78055.
@@ -21,7 +21,7 @@
    }
  }],
  "scalarScalaUdf": {
"payload": "rO0ABXNyAC1vcmcuYXBhY2hlLnNwYXJrLnNxbC5jb25uZWN0LmNvbW1vbi5VZGZQYWNrZXR7DRDpFmMz1QIAA0wACGZ1bmN0aW9udAASTGphdmEvbGFuZy9PYmplY3Q7TAANaW5wdXRFbmNvZGVyc3QAFkxzY2FsYS9jb2xsZWN0aW9uL1NlcTtMAA1vdXRwdXRFbmNvZGVydAA4TG9yZy9hcGFjaGUvc3Bhcmsvc3FsL2NhdGFseXN0L2VuY29kZXJzL0Fnbm9zdGljRW5jb2Rlcjt4cHNyACVvcmcuYXBhY2hlLnNwYXJrLnNxbC5UZXN0VURGcyQkYW5vbiQyWQlE7Ce2cPkCAAB4cHNyACRzY2FsYS5jb2xsZWN0aW9uLm11dGFibGUuQXJyYXlCdWZmZXIVOLBTg4KOcwIAA0kAC2luaXRpYWxTaXplSQAFc2l6ZTBbAAVhcnJheXQAE1tMamF2YS9sYW5nL09iamVjdDt4cAAAABAAAAABdXIAE1tMamF2YS5sYW5nLk9iamVjdDuQzlifEHMpbAIAAHhwAAAAAXNyAE1vcmcuYXBhY2hlLnNwYXJrLnNxbC5jYXRhbHlzdC5lbmNvZGVycy5BZ25vc3RpY0VuY29kZXJzJFByaW1pdGl2ZUxvbmdFbmNvZGVyJE3e2W1O/wUaAgAAeHIATG9yZy5hcGFjaGUuc3Bhcmsuc3FsLmNhdGFseXN0LmVuY29kZXJzLkFnbm9zdGljRW5jb2RlcnMkUHJpbWl0aXZlTGVhZkVuY29kZXLf4z0LufMQmwIAAHhyAENvcmcuYXBhY2hlLnNwYXJrLnNxbC5jYXRhbHlzdC5lbmNvZGVycy5BZ25vc3RpY0VuY29kZXJzJExlYWZFbmNvZGVyKtoOee2SqKMCAANaAAtpc1ByaW1pdGl2ZUwABmNsc1RhZ3QAGExzY2FsYS9yZWZsZWN0L0NsYXNzVGFnO0wACGRhdGFUeXBldAAlTG9yZy9hcGFjaGUvc3Bhcmsvc3FsL3R5cGVzL0RhdGFUeXBlO3hwAXNyACpzY2FsYS5yZWZsZWN0Lk1hbmlmZXN0RmFjdG9yeSRMb25nTWFuaWZlc3QAAAAAAAAAAQIAAHhyABxzY2FsYS5yZWZsZWN0LkFueVZhbE1hbmlmZXN0AAAAAAAAAAECAAFMAAh0b1N0cmluZ3QAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAETG9uZ3NyACRvcmcuYXBhY2hlLnNwYXJrLnNxbC50eXBlcy5Mb25nVHlwZSStAg1XC0z3OwIAAHhwc3IARm9yZy5hcGFjaGUuc3Bhcmsuc3FsLmNhdGFseXN0LmVuY29kZXJzLkFnbm9zdGljRW5jb2RlcnMkUHJvZHVjdEVuY29kZXIcqKluUDodYQIAA0wABmNsc1RhZ3EAfgAPTAAGZmllbGRzcQB+AAJMAAZzY2hlbWF0ACdMb3JnL2FwYWNoZS9zcGFyay9zcWwvdHlwZXMvU3RydWN0VHlwZTt4cHNyACZzY2FsYS5yZWZsZWN0LkNsYXNzVGFnJEdlbmVyaWNDbGFzc1RhZwAAAAAAAAABAgABTAAMcnVudGltZUNsYXNzdAARTGphdmEvbGFuZy9DbGFzczt4cHZyAAxzY2FsYS5UdXBsZTHPDUf18JuzPAIAAUwAAl8xcQB+AAF4cHNyADJzY2FsYS5jb2xsZWN0aW9uLmltbXV0YWJsZS5MaXN0JFNlcmlhbGl6YXRpb25Qcm94eQAAAAAAAAABAwAAeHBzcgBEb3JnLmFwYWNoZS5zcGFyay5zcWwuY2F0YWx5c3QuZW5jb2RlcnMuQWdub3N0aWNFbmNvZGVycyRFbmNvZGVyRmllbGTkpQPl0rttUgIABloACG51bGxhYmxlTAADZW5jcQB+AANMAAhtZXRhZGF0YXQAJUxvcmcvYXBhY2hlL3NwYXJrL3NxbC90eXBlcy9NZXRhZGF0YTtMAARuYW1lcQB+ABRMAApyZWFkTWV0aG9kdAAOTHNjYWxhL09wdGlvbjtMAAt3cml0ZU1ldGhvZHEAfgAleHAAcQB+ABFzcgAjb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuTWV0YWRhdGFtjWLvldkl+wIAA0kACV9oYXNoQ29kZVoACGJpdG1hcCQwTAADbWFwdAAgTHNjYWxhL2NvbGxlY3Rpb24vaW1tdXRhYmxlL01hcDt4cAAAAAAAc3IAKHNjYWxhLmNvbGxlY3Rpb24uaW1tdXRhYmxlLk1hcCRFbXB0eU1hcCSx6xuFbkKAywIAAHhwdAACXzFzcgALc2NhbGEuTm9uZSRGUCT2U8qUrAIAAHhyAAxzY2FsYS5PcHRpb27+aTf92w5mdAIAAHhwcQB+AC9zcgAsc2NhbGEuY29sbGVjdGlvbi5pbW11dGFibGUuTGlzdFNlcmlhbGl6ZUVuZCSKXGNb91MLbQIAAHhweHNyACVvcmcuYXBhY2hlLnNwYXJrLnNxbC50eXBlcy5TdHJ1Y3RUeXBl9I34EN1tjkUCAAlJAAlfaGFzaENvZGVCAAhiaXRtYXAkMFoAGWhhc0V4aXN0ZW5jZURlZmF1bHRWYWx1ZXNbABZleGlzdGVuY2VEZWZhdWx0VmFsdWVzcQB+AAhbABhleGlzdGVuY2VEZWZhdWx0c0JpdG1hc2t0AAJbWkwADWZpZWxkTmFtZXNTZXR0ACBMc2NhbGEvY29sbGVjdGlvbi9pbW11dGFibGUvU2V0O1sABmZpZWxkc3QAKVtMb3JnL2FwYWNoZS9zcGFyay9zcWwvdHlwZXMvU3RydWN0RmllbGQ7TAALbmFtZVRvRmllbGR0ABZMc2NhbGEvY29sbGVjdGlvbi9NYXA7TAALbmFtZVRvSW5kZXhxAH4ANnhwAAAAAAAAcHBwdXIAKVtMb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuU3RydWN0RmllbGQ7tWPEaGAaDUcCAAB4cAAAAAFzcgAmb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuU3RydWN0RmllbGQrgSSJZ9l3nwIABFoACG51bGxhYmxlTAAIZGF0YVR5cGVxAH4AEEwACG1ldGFkYXRhcQB+ACRMAARuYW1lcQB+ABR4cABxAH4AGHEAfgApcQB+ACxwcA==", |
This change is needed because some methods in `StructType` are now marked as private.
Reviewed the rest of the files. Thanks again for doing this.
private[sql] def toColumns: Array[Column] = fields.map { f =>
  val defaultValue = f.getCurrentDefaultValue().map { sql =>
    val existDefaultOpt = f.getExistenceDefaultValue()
    assert(existDefaultOpt.isDefined, "current and exist default must be both set or neither")
Technically, the existence default value can be set while the current default value is absent, if we do `create table t (col int default 42) using parquet` and then `alter table t alter column col drop default`. Maybe reword this message to "if the current default value is set, the existence default value must also be set"?
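The sequence in question, spelled out (assuming a `spark` session is in scope):

```scala
// Creating the column sets BOTH the current default (for future inserts)
// and the existence default (what rows written before the column existed
// should surface on read).
spark.sql("CREATE TABLE t (col INT DEFAULT 42) USING parquet")
// Dropping the default removes only the current default; the existence
// default must survive so previously written files still read as 42.
spark.sql("ALTER TABLE t ALTER COLUMN col DROP DEFAULT")
```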
val defaultValue = f.getCurrentDefaultValue().map { sql =>
  val existDefaultOpt = f.getExistenceDefaultValue()
  assert(existDefaultOpt.isDefined, "current and exist default must be both set or neither")
  val e = CatalystSqlParser.parseExpression(f.getExistenceDefaultValue().get)
Suggested change:
-    val e = CatalystSqlParser.parseExpression(f.getExistenceDefaultValue().get)
+    val e = CatalystSqlParser.parseExpression(existDefaultOpt.get)
@@ -471,34 +471,22 @@ class StructTypeSuite extends SparkFunSuite with SQLHelper {
  assert(source1.existenceDefaultValues(1) == UTF8String.fromString("abc"))
  assert(source1.existenceDefaultValues(2) == null)

  // Positive test: StructType.defaultValues works because the existence default value parses and
  // resolves successfully, then evaluates to a non-literal expression: this is constant-folded at
do we still test this constant-folding somewhere?
This is done in a more type-safe way, via `DefaultValueExpression`. The analyzer makes sure `DefaultValueExpression.child` is foldable, and at execution time we require `DefaultValueExpression.child` to be a literal before passing it to catalogs.
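A simplified sketch of that wrapper pattern (the real `DefaultValueExpression` lives in Catalyst and carries more information, e.g. the original SQL text; treat the exact shape as approximate):

```scala
import org.apache.spark.sql.catalyst.expressions.{Expression, Literal, UnaryExpression, Unevaluable}
import org.apache.spark.sql.types.DataType

// Marks its child as a column default. Never evaluated directly: the analyzer
// resolves the child, and constant folding should reduce it to a Literal.
case class DefaultValueExpression(child: Expression)
    extends UnaryExpression with Unevaluable {
  override def dataType: DataType = child.dataType
  override protected def withNewChildInternal(newChild: Expression): Expression =
    copy(child = newChild)

  // At execution time, require a literal before handing it to a catalog.
  def toLiteral: Literal = child match {
    case lit: Literal => lit
    case other => throw new IllegalStateException(s"not a literal: $other")
  }
}
```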
// Check default values before converting to v1 command.
DefaultColumnUtil.checkDefaultValuesInPlan(a, isForV1 = true)
val cols = if (colsToAdd.exists(_.column.defaultValue.isDefined)) {
  // Do a constant-folding, as we need to store the expression SQL string which should be in
we should probably put this into a util method in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala for better clarity of the code
@@ -37,6 +38,7 @@ case class DescribeTableExec(
  addPartitioning(rows)

  if (isExtended) {
    addColumnDefaultValue(rows)
does this add the column default information earlier in the result than before? that might be OK...
yea, I think this is more important than general table properties like owner and location.
  createTable(ident, CatalogV2Util.v2ColumnsToStructType(columns), partitions, properties)
}

// TODO: remove it when no tests calling this deprecated method.
There are a lot of method changes like this in this PR. Would it be better to clean up the code by updating the call sites? Or should we do that in a separate PR?
There are too many call sites in the tests, I'd like to clean it up in a separate PR.
@@ -1058,21 +1057,21 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {
  assert(intercept[AnalysisException] {
    sql("create table t(i boolean, s bigint default (select min(x) from badtable)) " +
      "using parquet")
  }.getMessage.contains(Errors.BAD_SUBQUERY))
this no longer returns an error that subquery expressions are not allowed in default values. Can we have a test that checks this case?
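One possible shape for such a test (the expected message fragment is an assumption; the exact error class should come from the new error framework):

```scala
test("SPARK-42398: subquery expressions are not allowed in DEFAULT values") {
  val e = intercept[AnalysisException] {
    sql("create table t(i boolean, s bigint default (select 42)) using parquet")
  }
  assert(e.getMessage.toLowerCase.contains("subquery"))
}
```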
Force-pushed from 06550d9 to 2e92f2e.
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
The current column default value framework has a few problems:
1. The DDL commands keep the column definitions as `StructType`. We must store the default value as SQL text in `StructField` metadata, then parse and resolve it with a hacky analyzer later.
2. The v2 `createTable` API only takes `StructType`, so implementations must know the special metadata key of the default value to access it. The `TableChange` API has the default value as an individual field.

This PR refines the default value framework to solve these problems:
1. The DDL commands now keep the column definitions as a new `Column` class, which keeps the default value as a catalyst `Expression`. It can then be resolved together with the DDL commands by the main analyzer, and validated in `CheckAnalysis`.
2. The v2 `createTable` API now takes `Column` (different from the catalyst `Column`), which holds the default value as a v2 `Literal` as well as the original SQL text. This avoids the need for a special metadata key and is also more extensible when adding more special columns like generated columns. It is also type-safe and makes sure the default value is a literal. The implementation is free to decide how to encode and store default values. Note: backward compatibility is taken care of.

For custom v2 catalogs, the workflow is:
1. The parser parses the default value SQL text into a catalyst expression and keeps it in the column definition.
2. The main analyzer resolves the default value expression together with the DDL command.
3. `CheckAnalysis` validates the default value expression and makes sure it's a literal.
4. The default value is passed to the catalog as a v2 `Literal` via the new `Column` API (see the sketch after this description).

For the v1 catalog (the Hive catalog), the workflow shares steps 1 and 2. After step 2, we have 2 branches:
1. `ResolveSessionCatalog` turns some DDL commands into their v1 versions. It invokes `CheckAnalysis` to do the validation, invokes `ConstantFolding` to create constants, and puts the default value constants as SQL text in `StructField` metadata.
2. The other branch still runs as v2 commands: `CheckAnalysis` validates the default value expression and makes sure it's a literal, and `V2SessionCatalog`, the v2 API adaptor of the v1 catalog, turns the v2 literal into SQL text and puts it in `StructField` metadata.

TODO (in followup PRs): the consumer side (DML commands) should only parse the default value text and use the main analyzer to resolve it. Then we can remove the hacky analyzer completely.
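As referenced in the workflow above, a rough sketch of how a v2 catalog implementation might consume the new API (method names follow this PR's `Column` and `ColumnDefaultValue` interfaces; treat exact signatures as approximate):

```scala
import org.apache.spark.sql.connector.catalog.{Column, ColumnDefaultValue}

// The catalog receives each default already validated and constant-folded:
// the original SQL text plus a v2 literal, with no metadata-key lookups.
def describeDefaults(columns: Array[Column]): Unit = columns.foreach { col =>
  Option(col.defaultValue()) match {
    case Some(d) =>
      println(s"${col.name()} DEFAULT ${d.getSql()} (folded literal: ${d.getValue()})")
    case None =>
      println(s"${col.name()} has no default value")
  }
}
```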
Why are the changes needed?
Refine the framework before releasing it
Does this PR introduce any user-facing change?
no
How was this patch tested?
existing tests