-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-19724][SQL]create a managed table with an existed default table should throw an exception #17272
Conversation
…e path should throw an exception
@@ -187,38 +187,32 @@ class InMemoryCatalog( | |||
val db = tableDefinition.identifier.database.get | |||
requireDbExists(db) | |||
val table = tableDefinition.identifier.table | |||
if (tableExists(db, table)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have check if the table exists in SessionCatalog.createtable
Test build #74436 has finished for PR 17272 at commit
|
@@ -164,15 +164,6 @@ abstract class ExternalCatalogSuite extends SparkFunSuite with BeforeAndAfterEac | |||
assert(actual.tableType === CatalogTableType.EXTERNAL) | |||
} | |||
|
|||
test("create table when the table already exists") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since in InMemoryCatalog
and HiveExternalCatalog
does not check if the table exists, here move the test case to SessionCatalogSuite
(using InMemoryCatalog) and HiveExternalCatalogSuite
@@ -155,6 +154,8 @@ case class CreateDataSourceTableAsSelectCommand( | |||
} else { | |||
table.storage.locationUri | |||
} | |||
|
|||
sparkSession.sessionState.catalog.checkTableOrPathExists(table, ignoreIfExists = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because in following saveDataIntoTable
will create the table path and store data in it, we should not check the path in following createTable
, we check it here before saveDataIntoTable
Test build #74437 has finished for PR 17272 at commit
|
Test build #74441 has finished for PR 17272 at commit
|
retest this please |
Test build #74445 has finished for PR 17272 at commit
|
retest this please |
Test build #74468 has finished for PR 17272 at commit
|
Test build #74472 has finished for PR 17272 at commit
|
Test build #74493 has started for PR 17272 at commit |
retest this please |
Test build #74499 has finished for PR 17272 at commit
|
2ac70b4
to
cd4a091
Compare
python/pyspark/sql/tests.py
Outdated
@@ -1892,7 +1892,7 @@ def test_list_tables(self): | |||
self.assertEquals(spark.catalog.listTables(), []) | |||
self.assertEquals(spark.catalog.listTables("some_db"), []) | |||
spark.createDataFrame([(1, 1)]).createOrReplaceTempView("temp_tab") | |||
spark.sql("CREATE TABLE tab1 (name STRING, age INT) USING parquet") | |||
spark.sql("CREATE TABLE tbl1 (name STRING, age INT) USING parquet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python test failed
the location('file:/home/jenkins/workspace/SparkPullRequestBuilder/spark-warehouse/tab1') of table('`default`.`tab1`') already exists.;
the location of tab1 does not deleted , I have tried to found out in which test case it forget to delete it(search all test cases containing tab1
and run it), but they are all ok to delete the location of tab1, so here we change the table name to work around it.
Test build #74516 has finished for PR 17272 at commit
|
retest this please |
Test build #74524 has finished for PR 17272 at commit
|
516c4e4
to
739f207
Compare
Test build #74535 has finished for PR 17272 at commit
|
Test build #74604 has finished for PR 17272 at commit
|
@windpiger is this still active? |
cc @ltnwgl Could you please take this over? Thanks! |
cc @dongjoon-hyun Are you interested in this PR? Just take it over? |
@gatorsmile I will take it over :) |
…le should throw an exception ## What changes were proposed in this pull request? This PR is to finish #17272 This JIRA is a follow up work after SPARK-19583 As we discussed in that PR The following DDL for a managed table with an existed default location should throw an exception: CREATE TABLE ... (PARTITIONED BY ...) AS SELECT ... CREATE TABLE ... (PARTITIONED BY ...) Currently there are some situations which are not consist with above logic: CREATE TABLE ... (PARTITIONED BY ...) succeed with an existed default location situation: for both hive/datasource(with HiveExternalCatalog/InMemoryCatalog) CREATE TABLE ... (PARTITIONED BY ...) AS SELECT ... situation: hive table succeed with an existed default location This PR is going to make above two situations consist with the logic that it should throw an exception with an existed default location. ## How was this patch tested? unit test added Author: Gengliang Wang <gengliang.wang@databricks.com> Closes #20886 from gengliangwang/pr-17272.
…le should throw an exception ## What changes were proposed in this pull request? This PR is to finish apache#17272 This JIRA is a follow up work after SPARK-19583 As we discussed in that PR The following DDL for a managed table with an existed default location should throw an exception: CREATE TABLE ... (PARTITIONED BY ...) AS SELECT ... CREATE TABLE ... (PARTITIONED BY ...) Currently there are some situations which are not consist with above logic: CREATE TABLE ... (PARTITIONED BY ...) succeed with an existed default location situation: for both hive/datasource(with HiveExternalCatalog/InMemoryCatalog) CREATE TABLE ... (PARTITIONED BY ...) AS SELECT ... situation: hive table succeed with an existed default location This PR is going to make above two situations consist with the logic that it should throw an exception with an existed default location. ## How was this patch tested? unit test added Author: Gengliang Wang <gengliang.wang@databricks.com> Closes apache#20886 from gengliangwang/pr-17272.
What changes were proposed in this pull request?
This JIRA is a follow up work after SPARK-19583
As we discussed in that PR
The following DDL for a managed table with an existed default location should throw an exception:
Currently there are some situations which are not consist with above logic:
CREATE TABLE ... (PARTITIONED BY ...) succeed with an existed default location
situation: for both hive/datasource(with HiveExternalCatalog/InMemoryCatalog)
CREATE TABLE ... (PARTITIONED BY ...) AS SELECT ...
situation: hive table succeed with an existed default location
This PR is going to make above two situations consist with the logic that it should throw an exception
with an existed default location.
How was this patch tested?
unit test added