[SPARK-36552][SQL] Fix different behavior for writing char/varchar to hive and datasource table#33798

Closed
yaooqinn wants to merge 2 commits into apache:master from yaooqinn:SPARK-36552

@yaooqinn
Member

What changes were proposed in this pull request?

For Hive tables, the actual write path and the schema handling are inconsistent when spark.sql.legacy.charVarcharAsString is true.

This causes problems such as the one described in SPARK-36552.

In this PR we respect spark.sql.legacy.charVarcharAsString when generating the Hive table schema from Spark data types.
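
As a rough illustration of the intended rule, here is a toy Python model. It is not the Spark code touched by this PR: `Column`, `raw_type`, and `hive_type_string` are hypothetical names, and the only thing taken from the description above is that schema generation should follow the legacy flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a column handed to the Hive schema generator."""
    name: str
    data_type: str                  # type used on the write path, e.g. "string"
    raw_type: Optional[str] = None  # originally declared type, e.g. "char(5)"


def hive_type_string(col: Column, char_varchar_as_string: bool) -> str:
    """Pick the type string to record in the Hive table schema (toy model)."""
    if char_varchar_as_string:
        # Legacy mode: the write path treats char/varchar as plain string,
        # so the generated schema should agree with it.
        return col.data_type
    # Default mode: keep the declared char/varchar type when there is one.
    return col.raw_type or col.data_type
```

In this model, a column declared as `char(5)` is recorded as `string` when the flag is true and as `char(5)` otherwise, so the schema and the write path no longer disagree.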

Why are the changes needed?

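Bug fix: with the legacy flag enabled, Hive tables and datasource tables behave differently when writing char/varchar.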

Does this PR introduce any user-facing change?

Yes. When spark.sql.legacy.charVarcharAsString is true, Hive tables with char/varchar columns follow string behavior.

How was this patch tested?

Newly added test.

@github-actions github-actions bot added the SQL label Aug 20, 2021
@yaooqinn
Member Author

cc @cloud-fan @HyukjinKwon @maropu thanks

@SparkQA

SparkQA commented Aug 20, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47172/

@SparkQA

SparkQA commented Aug 20, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47172/

@SparkQA

SparkQA commented Aug 20, 2021

Test build #142670 has finished for PR 33798 at commit f33ca39.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 21, 2021

Test build #142680 has finished for PR 33798 at commit c4d25aa.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47182/

@SparkQA

SparkQA commented Aug 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47182/

@yaooqinn
Member Author

retest this please

@SparkQA

SparkQA commented Aug 21, 2021

Test build #142681 has finished for PR 33798 at commit c4d25aa.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47183/

@SparkQA

SparkQA commented Aug 21, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47183/

@HyukjinKwon
Member

Merged to master and branch-3.2.

HyukjinKwon pushed a commit that referenced this pull request Aug 22, 2021
… hive and datasource table

### What changes were proposed in this pull request?

For Hive tables, the actual write path and the schema handling are inconsistent when `spark.sql.legacy.charVarcharAsString` is true.

This causes problems such as the one described in SPARK-36552.

In this PR we respect `spark.sql.legacy.charVarcharAsString` when generating the Hive table schema from Spark data types.

### Why are the changes needed?

bugfix

### Does this PR introduce _any_ user-facing change?

Yes. When `spark.sql.legacy.charVarcharAsString` is true, Hive tables with char/varchar columns follow string behavior.

### How was this patch tested?

newly added test

Closes #33798 from yaooqinn/SPARK-36552.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit f918c12)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
@yaooqinn yaooqinn deleted the SPARK-36552 branch March 27, 2026 11:44