[WIP][SPARK-29155][SQL] Support special date/timestamp values in the PostgreSQL dialect only #25834
Conversation
@HyukjinKwon This is the draft PR for #25716 (review). I added the config from #25697. As soon as the PR #25708 for dates is merged, I will put the special date values under the config as well.
Test build #110929 has finished for PR 25834 at commit
…ues-under-config
# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
#	sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala
jenkins, retest this, please
cc @maropu and @cloud-fan
Hi, @MaxGekk, thanks for the work! One question: don't we need some summary documentation for this PgSQL-specific behaviour? If we add more and more features under this flag in the future, it might be difficult for users to follow them. cc: @gatorsmile @dongjoon-hyun
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
We need a documentation page to explain the pgsql dialect. There will be a lot of things and it's too much to put in the
Any ideas where I can create such a page about the supported features of the PostgreSQL dialect?
jenkins, retest this, please
In hindsight, I think this feature (special datetime strings) doesn't conflict with Spark and seems very intuitive. Do we really need to control it by a flag?
I would keep the feature without any flags. To me, it makes sense to hide it only if the feature brings a performance penalty, but it should not (I believe so, but we can recheck this) because the condition for branching should be cheap.
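The cheap-branch argument can be sketched in plain Scala. This is a hypothetical illustration, not Spark's actual `Cast.scala` code: the object, method names and the `postgreSqlDialect` parameter are mine, and `None` stands in for Spark's null-on-cast-failure semantics. The point is that the common path pays only one boolean check before falling through to regular parsing:

```scala
import java.time.LocalDate

// Hypothetical sketch of dialect-gated special-value parsing (not Spark's real code).
object SpecialDates {
  // Special values are thunks because "today"/"yesterday" depend on the current date.
  private val special: Map[String, () => LocalDate] = Map(
    "epoch"     -> (() => LocalDate.ofEpochDay(0)),
    "today"     -> (() => LocalDate.now()),
    "tomorrow"  -> (() => LocalDate.now().plusDays(1)),
    "yesterday" -> (() => LocalDate.now().minusDays(1))
  )

  def parseDate(s: String, postgreSqlDialect: Boolean): Option[LocalDate] = {
    val trimmed = s.trim.toLowerCase
    // The only extra cost outside the PostgreSQL dialect is this boolean check.
    if (postgreSqlDialect && special.contains(trimmed)) {
      Some(special(trimmed)())
    } else {
      // Regular ISO-8601 parsing; None models Spark returning null on a failed cast.
      try Some(LocalDate.parse(trimmed)) catch { case _: Exception => None }
    }
  }
}
```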
Even if pgsql didn't have this feature, I would accept it as an improvement to Spark. Shall we simply add a migration guide entry instead of protecting it via a flag? cc @gatorsmile @gengliangwang
This feature doesn't conflict with the existing behavior of Spark. I think we don't need the dialect flag here.
I think we need a consistent rule for that... for example, does case 2 in #25697 conflict with the existing behaviour?
I think so. Previously the result of
Please let me know what I should do next: 1. rebase this PR, or 2. open another one to update the SQL migration guide.
I prefer to do 2. This is a nice feature to have in Spark.
Please, take a look at #25948
What changes were proposed in this pull request?
In the PR, I propose to support the special timestamp and date values introduced by #25716 and #25708 only when the SQL config `spark.sql.dialect` is set to `PostgreSQL`.

Why are the changes needed?
The special values are a PostgreSQL-specific feature. They should be supported under a config to avoid performance penalties and potential impact on user apps.
Does this PR introduce any user-facing change?
Yes, the special date/timestamp values `epoch`, `now`, `today`, `tomorrow` and `yesterday` are supported only in the PostgreSQL dialect.

In the Spark dialect (the default dialect):
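The example that originally followed here appears to have been lost in extraction. A hedged illustration of the default behaviour (in the Spark dialect, a string that cannot be parsed as a datetime casts to NULL):

```sql
-- Spark dialect (default): 'now' and 'today' are not special values,
-- so the casts fail and return NULL
SELECT CAST('now' AS TIMESTAMP), CAST('today' AS DATE);
-- NULL    NULL
```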
In the PostgreSQL dialect:
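The original example here also seems to have been lost; a hedged sketch of the gated behaviour (the exact output depends on the session time zone and the current date):

```sql
SET spark.sql.dialect=PostgreSQL;
-- With the PostgreSQL dialect, the special values are resolved to real datetimes
SELECT CAST('epoch' AS TIMESTAMP);  -- 1970-01-01 00:00:00
SELECT CAST('yesterday' AS DATE);   -- the current date minus one day
```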
How was this patch tested?
Updated the existing test suites `TimestampFormatterSuite`, `DateTimeUtilsSuite` and `Csv`/`JsonFunctionsSuite`.