
[SPARK-32402][SQL][FOLLOW-UP] Use quoted column name for JDBCTableCatalog.alterTable #30041

Closed
huaxingao wants to merge 2 commits

Conversation

huaxingao (Contributor)

What changes were proposed in this pull request?

Currently we generate unquoted column names in ALTER TABLE statements, e.g. ALTER TABLE "test"."alt_table" DROP COLUMN c1.
This should change to use a quoted column name: ALTER TABLE "test"."alt_table" DROP COLUMN "c1".

Why are the changes needed?

We should always use quoted identifiers in the SQL statements we send over JDBC, e.g. CREATE TABLE "test"."abc" ("col" INTEGER) or INSERT INTO "test"."abc" ("col") VALUES (?). Using unquoted column names in alterTable causes problems, for example:

sql("CREATE TABLE h2.test.alt_table (c1 INTEGER, c2 INTEGER) USING _")
sql("ALTER TABLE h2.test.alt_table DROP COLUMN c1")

org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
......

Caused by: org.h2.jdbc.JdbcSQLException: Column "C1" not found; SQL statement:
ALTER TABLE "test"."alt_table" DROP COLUMN c1 [42122-195]
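The fix is to run every column name through the dialect's identifier quoting before embedding it in the generated SQL. A minimal, self-contained sketch of that behavior (illustrative helper names, not Spark's actual JdbcDialect code):

```scala
// Sketch of JDBC identifier quoting: wrap the name in double quotes,
// doubling any embedded double quote per the ANSI SQL convention.
object QuotingSketch {
  def quoteIdentifier(colName: String): String =
    "\"" + colName.replace("\"", "\"\"") + "\""

  // With the quoted name, H2 matches the column exactly as written,
  // so a lower-case c1 is found instead of being upper-cased to C1.
  def dropColumnSql(quotedTable: String, col: String): String =
    s"ALTER TABLE $quotedTable DROP COLUMN ${quoteIdentifier(col)}"

  def main(args: Array[String]): Unit =
    println(dropColumnSql("\"test\".\"alt_table\"", "c1"))
}
```

Without the quoting step, H2 normalizes the bare c1 to C1 and reports Column "C1" not found, as in the stack trace above.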


Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests

@SparkQA commented Oct 15, 2020

Test build #129744 has finished for PR 30041 at commit cb0de33.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao (Contributor, Author)

@cloud-fan @MaxGekk Could you please take a look? Thanks!

@cloud-fan (Contributor)

can we add a test?

t = spark.table("h2.test.alt_table")
expectedSchema = expectedSchema.add("C3", DoubleType)
expectedSchema = expectedSchema.add("c3", DoubleType)
huaxingao (Contributor, Author):

Without the fix (i.e. with an unquoted column name), assert(t.schema === expectedSchema) throws the following exception:

org.scalatest.exceptions.TestFailedException: StructType(StructField(ID,IntegerType,true), StructField(C1,IntegerType,true), StructField(C2,StringType,true), StructField(C3,DoubleType,true)) did not equal StructType(StructField(ID,IntegerType,true), StructField(C1,IntegerType,true), StructField(C2,StringType,true), StructField(c3,DoubleType,true))
	at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)

sql("CREATE TABLE h2.test.alt_table (ID INTEGER, C0 INTEGER) USING _")
sql("ALTER TABLE h2.test.alt_table RENAME COLUMN ID TO C")
sql("CREATE TABLE h2.test.alt_table (id INTEGER, C0 INTEGER) USING _")
sql("ALTER TABLE h2.test.alt_table RENAME COLUMN id TO C")
huaxingao (Contributor, Author):

Without the fix (i.e. with an unquoted column name), this throws the following exception:

Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
	at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)

Caused by: org.h2.jdbc.JdbcSQLException: Column "ID" not found; SQL statement:
ALTER TABLE "test"."alt_table" RENAME COLUMN id TO "C" [42122-195]

Contributor (reviewer):

It should still work if we write ID, right? Spark normalizes the columns according to the table schema, so the case sensitivity flag can still apply. See ResolveAlterTableChanges.

It would be better if we could add tests for case-insensitive column resolution.
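The case-insensitive resolution being discussed amounts to normalizing a user-supplied name to the spelling stored in the table schema. A hypothetical standalone sketch of that normalization (illustrative names, not Spark's actual ResolveAlterTableChanges API):

```scala
// Sketch: resolve a user-supplied column name against the names stored
// in the table schema. Under spark.sql.caseSensitive=false the lookup is
// case-insensitive and returns the stored spelling, so the generated
// ALTER TABLE uses the exact name the database knows.
object ResolutionSketch {
  def resolveColumn(schemaFields: Seq[String],
                    userName: String,
                    caseSensitive: Boolean): Option[String] =
    if (caseSensitive) schemaFields.find(_ == userName)
    else schemaFields.find(_.equalsIgnoreCase(userName))
}
```

Here resolveColumn(Seq("ID", "C0"), "id", caseSensitive = false) yields Some("ID"), while the case-sensitive lookup yields None.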

huaxingao (Contributor, Author):

Yes, the upper-case ID still works here. I will add tests for case-insensitive column resolution.
Thanks for reviewing and merging my PR!

sql("ALTER TABLE h2.test.alt_table DROP COLUMN C1")
sql("ALTER TABLE h2.test.alt_table DROP COLUMN c3")
huaxingao (Contributor, Author):

Without the fix (i.e. with an unquoted column name), this throws the following exception:

Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
	at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)
Caused by: org.h2.jdbc.JdbcSQLException: Column "C3" not found; SQL statement:
ALTER TABLE "test"."alt_table" DROP COLUMN c3 [42122-195]

sql("ALTER TABLE h2.test.alt_table ALTER COLUMN id TYPE DOUBLE")
sql("ALTER TABLE h2.test.alt_table ALTER COLUMN deptno TYPE DOUBLE")
huaxingao (Contributor, Author):

Without the fix (i.e. with an unquoted column name), this throws the following exception:

Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
	at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)
  Caused by: org.h2.jdbc.JdbcSQLException: Column "DEPTNO" not found; SQL statement:
ALTER TABLE "test"."alt_table" ALTER COLUMN deptno DOUBLE PRECISION [42122-195]

@@ -272,10 +274,12 @@ class JDBCTableCatalogSuite extends QueryTest with SharedSparkSession {

test("alter table ... update column nullability") {
withTable("h2.test.alt_table") {
sql("CREATE TABLE h2.test.alt_table (ID INTEGER NOT NULL) USING _")
sql("CREATE TABLE h2.test.alt_table (ID INTEGER NOT NULL, deptno INTEGER NOT NULL) USING _")
huaxingao (Contributor, Author):

Without the fix (i.e. with an unquoted column name), this throws the following exception:

Failed table altering: test.alt_table;
org.apache.spark.sql.AnalysisException: Failed table altering: test.alt_table;
	at org.apache.spark.sql.jdbc.JdbcDialect.classifyException(JdbcDialects.scala:268)
  Caused by: org.h2.jdbc.JdbcSQLException: Column "DEPTNO" not found; SQL statement:
ALTER TABLE "test"."alt_table" ALTER COLUMN deptno SET NULL [42122-195]

@huaxingao (Contributor, Author)

I'm actually a little confused now: do I also need to take spark.sql.caseSensitive into consideration?
I will take a look at this tomorrow.

@cloud-fan (Contributor)

> I'm actually a little confused now: do I also need to take spark.sql.caseSensitive into consideration?

I think there was a discussion about this a long time ago. For table/column lookups that happen inside custom catalogs, Spark can't control the behavior. spark.sql.caseSensitive only controls lookup behavior inside Spark itself.

@SparkQA commented Oct 15, 2020

Test build #129808 has finished for PR 30041 at commit 7e8098c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Oct 15, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34413/

@SparkQA commented Oct 15, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34413/

@cloud-fan (Contributor)

thanks, merging to master!
