[SPARK-36861][SQL] Use yyyy-MM-dd as the date pattern in partition discovery#34700
MaxGekk wants to merge 3 commits into apache:master
    .collect()

    assert(Set(result: _*) === Set(
      Row("29.5.a_b_EGDP022204.jpg", "kittens", Date.valueOf("2018-01-01")),
I reverted the changes made in https://github.com/apache/spark/pull/33709/files#r688851936. Now the test looks the same as in branch-3.2.
thanks, merging to master!
What changes were proposed in this pull request?
In the PR, I propose to explicitly set the date pattern to yyyy-MM-dd while inferring types of partition values.
Why are the changes needed?
The existing date partition parser is much more tolerant of its input and can skip some parts of date strings; for an example, see SPARK-36861. As a consequence, it can lose some of the user's information (pieces of partition values).
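To illustrate the difference between tolerant and strict date parsing, here is a minimal sketch that uses only `java.time`. It is not Spark's partition-discovery code: the object name `PartitionDateSketch`, the helpers `tolerantParse` and `strictParse`, and the sample value `2021-01-01T00` are all hypothetical and chosen only for illustration.

```scala
import java.text.ParsePosition
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical illustration only; this is NOT Spark's internal parser.
object PartitionDateSketch {
  private val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd")

  // Tolerant: parses a leading date and silently ignores any trailing text.
  def tolerantParse(raw: String): Option[LocalDate] =
    Try(LocalDate.from(fmt.parse(raw, new ParsePosition(0)))).toOption

  // Strict: the whole string must match the yyyy-MM-dd pattern.
  def strictParse(raw: String): Option[LocalDate] =
    Try(LocalDate.parse(raw, fmt)).toOption
}

object PartitionDateSketchDemo extends App {
  import PartitionDateSketch._

  val raw = "2021-01-01T00" // assumed partition value with an extra suffix
  println(s"tolerant: ${tolerantParse(raw)}") // the "T00" suffix is dropped
  println(s"strict:   ${strictParse(raw)}")   // rejected, so the value can stay a string
}
```

With the tolerant helper the suffix after the date is discarded, which is the kind of information loss described above. With the strict helper the value is not recognised as a date at all, so a caller could fall back to treating it as a plain string.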
Does this PR introduce any user-facing change?
No. The new behaviour introduced by #33709 hasn't been released yet.
How was this patch tested?
By running the modified test suite: