Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-29680][SQL] Remove ALTER TABLE CHANGE COLUMN syntax #26338

Closed
wants to merge 10 commits into from

Conversation

viirya
Copy link
Member

@viirya viirya commented Oct 31, 2019

What changes were proposed in this pull request?

This patch removes v1 ALTER TABLE CHANGE COLUMN syntax.

Why are the changes needed?

Since in v2 we have ALTER TABLE CHANGE COLUMN and ALTER TABLE RENAME COLUMN, this old syntax is not necessary now and can be confusing.

The v2 ALTER TABLE CHANGE COLUMN should fallback to v1 AlterTableChangeColumnCommand (#26354).

Does this PR introduce any user-facing change?

Yes, the old v1 ALTER TABLE CHANGE COLUMN syntax is removed.

How was this patch tested?

Unit tests.

@@ -369,6 +369,13 @@ class ResolveSessionCatalog(
AlterTableRecoverPartitionsCommand(
v1TableName.asTableIdentifier,
"ALTER TABLE RECOVER PARTITIONS")

case AlterTableChangeColumnStatement(tableName, columnName, newColumn) =>
val v1TableName = parseV1Table(tableName, "ALTER TABLE CHANGE COLUMN")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this command can also work on V2, if old column name and new column name is the same. cc @cloud-fan

@@ -151,7 +151,7 @@ statement
| ALTER TABLE multipartIdentifier
(ALTER | CHANGE) COLUMN? qualifiedName
(TYPE dataType)? (COMMENT comment=STRING)? colPosition? #alterTableColumn
| ALTER TABLE tableIdentifier partitionSpec?
| ALTER TABLE multipartIdentifier partitionSpec?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, the parser rule above seems to include this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the above one is for v2 command. two rules have a bit difference. v1 rule can specify new column name with data type (i.e., colType), but v2 rule can only specify data type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not combine two rules and two statements. A possible combine might have few Option fields and a logic to interpret it to v1/v2 cases, a bit mess it sounds like.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The v2 rule

(ALTER | CHANGE) COLUMN? qualifiedName (TYPE dataType)? (COMMENT comment=STRING)?

The colType

colName=errorCapturingIdentifier dataType (COMMENT STRING)?

Seems like the v1 rule is more powerful and can rename a column and change data type together. If other DBs support it as well, maybe Spark should also support it. Otherwise, maybe 3.0 is a good time to drop this syntax.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we do not really support it, though we allow such syntax:

// Find the origin column from dataSchema by column name.
val originColumn = findColumnByName(table.dataSchema, columnName, resolver)
// Throw an AnalysisException if the column name/dataType is changed.
if (!columnEqual(originColumn, newColumn, resolver)) {
throw new AnalysisException(
"ALTER TABLE CHANGE COLUMN is not supported for changing column " +
s"'${originColumn.name}' with type '${originColumn.dataType}' to " +
s"'${newColumn.name}' with type '${newColumn.dataType}'")
}

Copy link
Member Author

@viirya viirya Nov 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what if the original plan was to support that in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's simply remove this syntax. We have ALTER TABLE CHANGE COLUMN and ALTER TABLE RENAME COLUMN, which is good enough to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. let's remove it.

@SparkQA
Copy link

SparkQA commented Oct 31, 2019

Test build #113006 has finished for PR 26338 at commit 91c8e58.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class AlterTableChangeColumnStatement(

@viirya viirya changed the title [SPARK-29680][SQL] ALTER TABLE CHANGE COLUMN should look up catalog/table like v2 commands [SPARK-29680][SQL] Remove ALTER TABLE CHANGE COLUMN syntax Nov 2, 2019
@SparkQA
Copy link

SparkQA commented Nov 2, 2019

Test build #113122 has finished for PR 26338 at commit a432da7.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class AlterTableRenamePartitionStatement(
  • case class AlterTableDropPartitionStatement(
  • case class OptimizeLocalShuffleReader(conf: SQLConf) extends Rule[SparkPlan]

@SparkQA
Copy link

SparkQA commented Nov 2, 2019

Test build #113123 has finished for PR 26338 at commit 05f0584.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -2,54 +2,38 @@
CREATE TABLE test_change(a INT, b STRING, c INT) using parquet;
DESC test_change;

-- Change column name (not supported yet)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a test file for RENAME COLUMN? If not can we add some tests here to keep the test coverage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

@SparkQA
Copy link

SparkQA commented Nov 4, 2019

Test build #113192 has finished for PR 26338 at commit 17a34c2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 5, 2019

Test build #113238 has finished for PR 26338 at commit 9e7a678.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 5, 2019

Test build #113237 has finished for PR 26338 at commit fb7564a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 5, 2019

Test build #113240 has finished for PR 26338 at commit ac76822.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class AlterTableSerDePropertiesStatement(

@SparkQA
Copy link

SparkQA commented Nov 5, 2019

Test build #113245 has finished for PR 26338 at commit 64a68fc.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Nov 5, 2019

retest this please.

@SparkQA
Copy link

SparkQA commented Nov 5, 2019

Test build #113249 has finished for PR 26338 at commit 64a68fc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -149,11 +149,8 @@ statement
| ALTER (TABLE | VIEW) multipartIdentifier
UNSET TBLPROPERTIES (IF EXISTS)? tablePropertyList #unsetTableProperties
| ALTER TABLE multipartIdentifier
(ALTER | CHANGE) COLUMN? qualifiedName
(ALTER | CHANGE) COLUMN? multipartIdentifier
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we use qualifiedName: identifier ('.' identifier)* to capture column name.

This conflicts a test in ErrorParserSuite that test-col is not allowed in ALTER TABLE t CHANGE COLUMN test-col TYPE BIGINT.

The column name should be multiple errorCapturingIdentifier. So I changed it to multipartIdentifier:

multipartIdentifier
    : parts+=errorCapturingIdentifier ('.' parts+=errorCapturingIdentifier)*
    ;

@cloud-fan

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like we should replace qualifiedName with multipartIdentifier in all other places. We can do it in a followup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. will make a followup later.

@SparkQA
Copy link

SparkQA commented Nov 5, 2019

Test build #113275 has finished for PR 26338 at commit b2acddb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -149,11 +149,8 @@ statement
| ALTER (TABLE | VIEW) multipartIdentifier
UNSET TBLPROPERTIES (IF EXISTS)? tablePropertyList #unsetTableProperties
| ALTER TABLE multipartIdentifier
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can distinguish the two multipartIdentifier

ALTER TABLE table=multipartIdentifier ... COLUMN? column=multipartIdentifier

You can fix it in your followup

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 6233958 Nov 6, 2019
cloud-fan added a commit that referenced this pull request Jan 2, 2020
### What changes were proposed in this pull request?

Revert #26338 , as the syntax is actually the [hive style ALTER COLUMN](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment).

This PR brings it back, and make it support multi-catalog:
1. renaming is not allowed as `AlterTableAlterColumnStatement` can't do renaming.
2. column name should be multi-part

### Why are the changes needed?

to not break hive compatibility.

### Does this PR introduce any user-facing change?

no, as the removal was merged in 3.0.

### How was this patch tested?

new parser tests

Closes #27076 from cloud-fan/alter.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
rdblue pushed a commit to Netflix/spark that referenced this pull request Jan 21, 2020
Revert apache/spark#26338 , as the syntax is actually the [hive style ALTER COLUMN](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment).

This PR brings it back, and make it support multi-catalog:
1. renaming is not allowed as `AlterTableAlterColumnStatement` can't do renaming.
2. column name should be multi-part

to not break hive compatibility.

no, as the removal was merged in 3.0.

new parser tests

Closes #27076 from cloud-fan/alter.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@viirya viirya deleted the SPARK-29680 branch December 27, 2023 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants