[SPARK-32685][SQL] When specify serde, default filed.delim is '\t' #30942
Add UT
FYI @maropu @cloud-fan
@@ -505,7 +505,12 @@ class SparkSqlAstBuilder extends AstBuilder {
} else {
None
}
(Seq.empty, Option(name), props.toSeq, recordHandler)
val finalProps = if (!props.contains("field.delim")) {
props.toSeq ++ Seq("field.delim" -> "\t")
what's the behavior now? which delimiter do we use by default before this PR?
Before this PR there is no default here, so we use '\u0001'.
@@ -505,7 +505,12 @@ class SparkSqlAstBuilder extends AstBuilder {
} else {
None
}
(Seq.empty, Option(name), props.toSeq, recordHandler)
val finalProps = if (!props.contains("field.delim")) {
is this case sensitive?
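On the question above: assuming props is an ordinary Scala Map[String, String] (an assumption, since the hunk does not show its type), contains is an exact, case-sensitive key match, so a key written as FIELD.DELIM would not count as present. A minimal sketch with a hypothetical helper, withDefaultDelim, that mirrors the check in the diff:

```scala
object FieldDelimCaseCheck {
  // Hypothetical helper mirroring the check in the diff; not Spark's actual code.
  // Assumes props is a plain immutable Map[String, String].
  def withDefaultDelim(props: Map[String, String]): Seq[(String, String)] =
    if (!props.contains("field.delim")) {
      // Key absent (exact, case-sensitive match): append the tab default.
      props.toSeq ++ Seq("field.delim" -> "\t")
    } else {
      // Key present: keep the user's value untouched.
      props.toSeq
    }

  def main(args: Array[String]): Unit = {
    // Exact key present: nothing is appended.
    println(withDefaultDelim(Map("field.delim" -> ",")))
    // Upper-case key is a different key to Map.contains, so the default is still appended.
    println(withDefaultDelim(Map("FIELD.DELIM" -> ",")))
  }
}
```

If case-insensitive matching were wanted, the keys would have to be normalized (for example lower-cased) before the check; whether Spark does that here is not shown in this thread.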
Test build #133421 has finished for PR 30942.
The code has been modified to make it easier to add other default parameters when a Hive serde is specified.
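One way to make further defaults easy to add, sketched under assumptions: keep the defaults in a single map and let user-supplied properties override them. The names serdeDefaults and applyDefaults are hypothetical, and this is not necessarily how the merged change is structured; only the field.delim -> '\t' default comes from this PR.

```scala
object SerdeDefaults {
  // Hypothetical table of defaults; only field.delim -> "\t" comes from this PR.
  val serdeDefaults: Map[String, String] = Map("field.delim" -> "\t")

  // With ++, the right-hand map wins on duplicate keys,
  // so user-supplied values override the defaults.
  def applyDefaults(props: Map[String, String]): Map[String, String] =
    serdeDefaults ++ props

  def main(args: Array[String]): Unit = {
    // No field.delim given: the tab default is filled in.
    println(applyDefaults(Map("serialization.last.column.takes.rest" -> "true")))
    // field.delim given: the user's value is kept.
    println(applyDefaults(Map("field.delim" -> ",")))
  }
}
```

With this shape, adding another default is one more entry in serdeDefaults rather than another if/else branch.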
thanks, merging to master!
@AngersZhuuuu Please update the migration guide
Yea
…ault filed.delim to '\t' when user specifies serde

### What changes were proposed in this pull request?
Update migration guide according to #30942 (comment)

### Why are the changes needed?
update migration guide.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Not need

Closes #31051 from AngersZhuuuu/SPARK-32685-FOLLOW-UP.

Authored-by: angerszhu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
In Hive script transform, when a serde is specified, the default field.delim is '\t'. After changing to another serde and explaining the query plan, field.delim is the same.
In the current Spark code the result differs from Hive.
We should keep the same behavior as Hive.
Notice: the difference in the result's NULL value is a separate issue, tracked in https://issues.apache.org/jira/browse/SPARK-32684
Why are the changes needed?
To keep the behavior consistent with Hive serde.
Does this PR introduce any user-facing change?
In script transform, when a serde is specified without field.delim, field.delim now defaults to '\t', the same as Hive.
How was this patch tested?
UT added