
[SPARK-37545][SQL] V2 CreateTableAsSelect command should qualify location#34806

Closed
imback82 wants to merge 3 commits into apache:master from imback82:v2_ctas_qualified_loc

Conversation

imback82 (Contributor) commented Dec 4, 2021

What changes were proposed in this pull request?

Currently, the v2 CTAS command doesn't qualify the location:

spark.sql("CREATE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
spark.sql("DESCRIBE EXTENDED testcat.t").filter("col_name = 'Location'").show

+--------+-------------+-------+
|col_name|    data_type|comment|
+--------+-------------+-------+
|Location|/tmp/foo     |       |
+--------+-------------+-------+

The v1 command, by contrast, qualifies the location as file:/tmp/foo. That is the correct behavior, since the default filesystem can differ between sessions.

Why are the changes needed?

This PR proposes to store the qualified location, so that the stored location still points to the same place when a later session uses a different default filesystem.
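To illustrate why an unqualified location is ambiguous, here is a minimal sketch that uses only `java.net.URI`. The `QualifyDemo` object, its `qualify` helper, and the `hdfs://namenode:8020` address are hypothetical names chosen for illustration. This is not Spark's or Hadoop's actual implementation, and it only handles absolute paths without special characters.

```scala
import java.net.URI

object QualifyDemo {
  // Hypothetical helper (not Spark's code): attach the default filesystem's
  // scheme and authority to a location that has no scheme of its own.
  // Assumes `location` is an absolute path with no characters needing escaping.
  def qualify(location: String, defaultFs: URI): URI = {
    val uri = new URI(location)
    if (uri.getScheme != null) uri // already qualified, leave it untouched
    else new URI(defaultFs.getScheme, defaultFs.getAuthority, uri.getPath, null, null)
  }

  def main(args: Array[String]): Unit = {
    // The same unqualified string resolves differently per default filesystem.
    println(qualify("/tmp/foo", new URI("file:///")))             // file:/tmp/foo
    println(qualify("/tmp/foo", new URI("hdfs://namenode:8020"))) // hdfs://namenode:8020/tmp/foo
    // A location that is already qualified is unaffected by the default.
    println(qualify("file:/tmp/foo", new URI("hdfs://namenode:8020"))) // file:/tmp/foo
  }
}
```

If only `/tmp/foo` is stored, a session whose default filesystem is HDFS would look for the table data in a different place than the session that created it. Storing `file:/tmp/foo` removes that dependence on session configuration.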

Does this PR introduce any user-facing change?

Yes. The v2 CTAS command now stores the qualified location:

+--------+-------------+-------+
|col_name|    data_type|comment|
+--------+-------------+-------+
|Location|file:/tmp/foo|       |
+--------+-------------+-------+

How was this patch tested?

Added a new test.

github-actions bot added the SQL label Dec 4, 2021

SparkQA commented Dec 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50396/


SparkQA commented Dec 4, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50396/


SparkQA commented Dec 4, 2021

Test build #145920 has finished for PR 34806 at commit e3c2624.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Dec 4, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50398/


SparkQA commented Dec 4, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50398/


SparkQA commented Dec 4, 2021

Test build #145922 has finished for PR 34806 at commit fb771b7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

huaxingao (Contributor) commented:

I didn't qualify path in ReplaceTableAsSelect. I will submit a PR shortly to fix it.

huaxingao closed this in feba5ac Dec 5, 2021
huaxingao (Contributor) commented:

Merged to master. Thanks! @imback82

cc @cloud-fan

cloud-fan (Contributor) commented:

late LGTM
