[SPARK-32402][SQL][FOLLOW-UP] Use quoted column name for JDBCTableCatalog.alterTable #30041
Test build #129744 has finished for PR 30041 at commit
@cloud-fan @MaxGekk Could you please take a look? Thanks!
Can we add a test?
  t = spark.table("h2.test.alt_table")
- expectedSchema = expectedSchema.add("C3", DoubleType)
+ expectedSchema = expectedSchema.add("c3", DoubleType)
Without fix (with unquoted column name), assert(t.schema === expectedSchema) throws the following Exception:
org.scalatest.exceptions.TestFailedException: StructType(StructField(ID,IntegerType,true), StructField(C1,IntegerType,true), StructField(C2,StringType,true), StructField(C3,DoubleType,true)) did not equal StructType(StructField(ID,IntegerType,true), StructField(C1,IntegerType,true), StructField(C2,StringType,true), StructField(c3,DoubleType,true))
at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
- sql("CREATE TABLE h2.test.alt_table (ID INTEGER, C0 INTEGER) USING _")
- sql("ALTER TABLE h2.test.alt_table RENAME COLUMN ID TO C")
+ sql("CREATE TABLE h2.test.alt_table (id INTEGER, C0 INTEGER) USING _")
+ sql("ALTER TABLE h2.test.alt_table RENAME COLUMN id TO C")
Without fix (with unquoted column name), this throws the following Exception:
Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)
Caused by: org.h2.jdbc.JdbcSQLException: Column "ID" not found; SQL statement:
ALTER TABLE "test"."alt_table" RENAME COLUMN id TO "C" [42122-195]
It should still work if we write ID, right? Spark normalizes the columns according to the table schema, so case sensitivity flag can still apply. See ResolveAlterTableChanges.
It would be better if we can add tests for case insensitive column resolution.
Yes, upper case ID still works here. I will add tests for case insensitive column resolution.
Thanks for reviewing and merging my PR!
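The normalization discussed in this thread can be illustrated with a small standalone sketch. It is not Spark's ResolveAlterTableChanges code: `resolveColumn` and its parameters are hypothetical names chosen for illustration, and the sketch only assumes that a requested column name is matched against the table schema, with or without case sensitivity, and that the schema's own spelling is what gets passed on.

```scala
// Illustrative sketch only: `resolveColumn` is a hypothetical helper,
// not Spark's ResolveAlterTableChanges implementation.
object ColumnResolutionSketch {
  // Look up the requested name in the table schema and return the
  // schema's own spelling, so later steps can quote that exact name.
  def resolveColumn(
      schemaFields: Seq[String],
      requested: String,
      caseSensitive: Boolean): Option[String] = {
    if (caseSensitive) schemaFields.find(_ == requested)
    else schemaFields.find(_.equalsIgnoreCase(requested))
  }
}
```

Under these assumptions, `ColumnResolutionSketch.resolveColumn(Seq("id", "C0"), "ID", caseSensitive = false)` yields `Some("id")`, so writing ID still reaches the column created as id; with `caseSensitive = true` the same lookup yields `None`.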
- sql("ALTER TABLE h2.test.alt_table DROP COLUMN C1")
+ sql("ALTER TABLE h2.test.alt_table DROP COLUMN c3")
Without fix (with unquoted column name), this throws the following Exception:
Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)
Caused by: org.h2.jdbc.JdbcSQLException: Column "C3" not found; SQL statement:
ALTER TABLE "test"."alt_table" DROP COLUMN c3 [42122-195]
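The error above says Column "C3" not found even though the statement names c3. A toy model of why, assuming the database upper-cases unquoted identifiers (H2's default behaviour) while keeping quoted ones verbatim; the object and method names are hypothetical and this is not H2 code:

```scala
// A toy model, not H2 code. Assumes unquoted identifiers are folded
// to upper case and quoted identifiers are kept exactly as written.
object IdentifierFoldingSketch {
  // How the database interprets an identifier token from a SQL statement.
  def effectiveName(token: String): String =
    if (token.length >= 2 && token.startsWith("\"") && token.endsWith("\""))
      // Quoted: strip the outer quotes and un-double embedded quotes.
      token.substring(1, token.length - 1).replace("\"\"", "\"")
    else
      // Unquoted: folded to upper case.
      token.toUpperCase(java.util.Locale.ROOT)

  // True if the token refers to one of the stored column names.
  def columnExists(storedColumns: Set[String], token: String): Boolean =
    storedColumns.contains(effectiveName(token))
}
```

With stored columns `Set("ID", "C1", "C2", "c3")`, the unquoted token `c3` is looked up as C3 and not found, while the quoted token `"c3"` matches the stored name.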
- sql("ALTER TABLE h2.test.alt_table ALTER COLUMN id TYPE DOUBLE")
+ sql("ALTER TABLE h2.test.alt_table ALTER COLUMN deptno TYPE DOUBLE")
Without fix (with unquoted column name), this throws the following Exception:
Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)
Caused by: org.h2.jdbc.JdbcSQLException: Column "DEPTNO" not found; SQL statement:
ALTER TABLE "test"."alt_table" ALTER COLUMN deptno DOUBLE PRECISION [42122-195]
@@ -272,10 +274,12 @@ class JDBCTableCatalogSuite extends QueryTest with SharedSparkSession {

  test("alter table ... update column nullability") {
    withTable("h2.test.alt_table") {
-     sql("CREATE TABLE h2.test.alt_table (ID INTEGER NOT NULL) USING _")
+     sql("CREATE TABLE h2.test.alt_table (ID INTEGER NOT NULL, deptno INTEGER NOT NULL) USING _")
Without fix (with unquoted column name), this throws the following Exception:
Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)
Caused by: org.h2.jdbc.JdbcSQLException: Column "DEPTNO" not found; SQL statement:
ALTER TABLE "test"."alt_table" ALTER COLUMN deptno SET NULL [42122-195]
I actually get a little confused now: do I also need to take consideration of
I think there was a discussion a long time ago. For table/column lookup that happens inside custom catalogs, Spark can't control it.
Test build #129808 has finished for PR 30041 at commit
Kubernetes integration test starting
Kubernetes integration test status success
Thanks, merging to master!
What changes were proposed in this pull request?
We currently have unquoted column names in alter table, e.g.
ALTER TABLE "test"."alt_table" DROP COLUMN c1
This should change to a quoted column name:
ALTER TABLE "test"."alt_table" DROP COLUMN "c1"
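A minimal sketch of the difference between the two statements, written as standalone Scala. Spark's JdbcDialect exposes a quoteIdentifier method for this purpose; the `quoteIdentifier` below is a simplified stand-in, and `dropColumnUnquoted` and `dropColumnQuoted` are hypothetical names used only for illustration, not the code in this PR.

```scala
// Hypothetical helper names; an illustrative sketch, not Spark's actual code.
object QuotedAlterSketch {
  // Wrap an identifier in double quotes, doubling any embedded quote.
  def quoteIdentifier(name: String): String =
    "\"" + name.replace("\"", "\"\"") + "\""

  // Before: the column name is spliced into the statement as-is.
  def dropColumnUnquoted(table: String, column: String): String =
    s"ALTER TABLE $table DROP COLUMN $column"

  // After: the column name is quoted, so the database keeps its exact case.
  def dropColumnQuoted(table: String, column: String): String =
    s"ALTER TABLE $table DROP COLUMN ${quoteIdentifier(column)}"
}
```

Calling `QuotedAlterSketch.dropColumnQuoted("\"test\".\"alt_table\"", "c1")` produces the quoted form of the statement, while `dropColumnUnquoted` with the same arguments produces the unquoted one.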
Why are the changes needed?
We should always use quoted identifiers in JDBC SQL statements, e.g.
CREATE TABLE "test"."abc" ("col" INTEGER )
or
INSERT INTO "test"."abc" ("col") VALUES (?)
Using an unquoted column name in alterTable causes problems, for example:
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests