Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-37546][SQL] V2 ReplaceTableAsSelect command should qualify location #34807

Closed
wants to merge 4 commits into from

Conversation

huaxingao
Copy link
Contributor

What changes were proposed in this pull request?

Currently, v2 RTAS command doesn't qualify the location:

spark.sql("REPLACE TABLE testcat.t USING foo LOCATION '/tmp/foo' AS SELECT id FROM source")
spark.sql("DESCRIBE EXTENDED testcat.t").filter("col_name = 'Location'").show

+--------+-------------+-------+
|col_name|    data_type|comment|
+--------+-------------+-------+
|Location|/tmp/foo     |       |
+--------+-------------+-------+

, whereas v1 command qualifies the location as file:/tmp/foo which is the correct behavior since the default filesystem can change for different sessions.

Why are the changes needed?

This PR proposes to store the qualified location in order to prevent the issue where default filesystem changes for different sessions.

Does this PR introduce any user-facing change?

Yes, now, v2 RTAS command will store qualified location:

+--------+-------------+-------+
|col_name|    data_type|comment|
+--------+-------------+-------+
|Location|file:/tmp/foo|       |
+--------+-------------+-------+

How was this patch tested?

new test

@github-actions github-actions bot added the SQL label Dec 4, 2021
@huaxingao
Copy link
Contributor Author

cc @imback82

@SparkQA
Copy link

SparkQA commented Dec 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50399/

@SparkQA
Copy link

SparkQA commented Dec 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50399/

Copy link
Contributor

@imback82 imback82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if tests pass. Thanks @huaxingao!

@SparkQA
Copy link

SparkQA commented Dec 5, 2021

Test build #145923 has finished for PR 34807 at commit 19b656a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50402/

@SparkQA
Copy link

SparkQA commented Dec 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50402/

@SparkQA
Copy link

SparkQA commented Dec 5, 2021

Test build #145926 has finished for PR 34807 at commit b065e18.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao
Copy link
Contributor Author

@cloud-fan

.filter("col_name = 'Location'")
.select("data_type").head.getString(0)
assert(location === "file:/tmp/foo")
assert(location1 === "file:/tmp/foo")
spark.sql(s"REPLACE TABLE $identifier USING foo LOCATION '/tmp/foo' " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a new test case? In the future I believe we will have separate test suites for CREATE and REPLACE TABLE and we need separate test cases anyway.

@@ -165,9 +165,10 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat

case CreateTable(ResolvedDBObjectName(catalog, ident), schema, partitioning,
tableSpec, ifNotExists) =>
val qualifiedLocation = tableSpec.location.map(makeQualifiedDBObjectPath(_))
val tableSpecWithQualifiedLocation = tableSpec.copy(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we add a new method qualifyLocInTableSpec(tableSpec): TableSpec to simplify the code here and other places?

@SparkQA
Copy link

SparkQA commented Dec 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50419/

@SparkQA
Copy link

SparkQA commented Dec 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50419/

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in bde47c8 Dec 6, 2021
@SparkQA
Copy link

SparkQA commented Dec 6, 2021

Test build #145944 has finished for PR 34807 at commit e1104a0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 6, 2021

Test build #145943 has finished for PR 34807 at commit 5ff0c26.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao
Copy link
Contributor Author

Thanks! @cloud-fan @imback82

@huaxingao huaxingao deleted the location branch December 6, 2021 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants