-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-28435][SQL] Support accepting the interval keyword in the schema string #25189
Conversation
Test build #107826 has finished for PR 25189 at commit
|
retest this please |
Test build #107835 has finished for PR 25189 at commit
|
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/DataTypeParserSuite.scala
Show resolved
Hide resolved
-- cast string to interval and interval to string | ||
SELECT CAST('interval 3 month 1 hour' AS interval); | ||
SELECT CAST(interval 3 month 1 hour AS string); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Moves these tests in the end of this file?
Test build #107892 has finished for PR 25189 at commit
|
Test build #107896 has finished for PR 25189 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine to me cc: @dongjoon-hyun @gatorsmile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @wangyum . Your contribution is good, but I'd give -1 for this PR (AS-IS) with the following reasons.
- This PR silently rename
calendarinterval
tointerval
and that causes most of the changes in this PR (is it inevitable?) - However, the PR title and description doesn't mentioned it at all.
- This situation is repeated several times in your PRs .
The PR should be consist in three parts; title/description/code.
In addition, I'm wondering if this is the only way to support this casting. This is just a design choice. You can double down your direction, but could you please try to find a minimal way before we go down this way?
cc @gatorsmile
How about split it into 2 PRs:
|
Otherwise, you can update this PR title and description according to your contribution. If you suggest that, of course, it's possible and those PR look more clear as a single PR. For (1), the PR should describe the context in the PR description instead of pointing this PR. |
Created a new PR(#25225) to make it more clear. |
#25225 is merged now. Could you rebase this PR? |
Test build #108033 has finished for PR 25189 at commit
|
retest this please |
Test build #108041 has finished for PR 25189 at commit
|
Weird message log? Is this not related to this pr, right?
Anyway, could you update the title/description? (It seems the current title doesn't explain about the pr directly?) |
Thank you @maropu Updated the title and description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little hesitant to add interval relate features and make interval type more visible to users. AFAIK there are ongoing discussions about if we should revisit the interval type and make it follow SQL standard. cc @MaxGekk @mgaido91 PRs like this will add more backward compatibility concerns if we change the interval type in the future. |
@cloud-fan thanks for pinging me. May you please point out which discussion you're referring to? I am probably missing it... |
@mgaido91 I really can't recall it, maybe in some PRs or JIRA tickets or dev list emails. I was pinging some people who might be related :P Also cc @HyukjinKwon |
Hi, @cloud-fan . This is not that-level work because this is only SQL layer. I guess you mean this. There was @rxin and @cloud-fan 's positive comment there.
Shall we talk on that PR? |
I think we should support 2 separate data types for intervals as it is defined by SQL standard. I opened 2 JIRA tickets for that: SPARK-27790 (SPARK-27791 & SPARK-27793) but both types must have the same keyword for literals - |
@MaxGekk I think one thing we can do right now is to add a restrict on the interval literal: we can't allow an interval with both year-month fields and day-time fields. |
Yea, it's really rather about #25022. But I have to say there are still multiple tasks to fully support interval. For instance, we should map relevant language-native types into internal in Python and R as well. See SPARK-28493 and SPARK-28492. Also, we should consider about mapping R <> Arrow, Pandas <> Arrow too. I think we should officially document those interval related features are all experimental and highly unstable to alleviate the concern. |
I agree with @MaxGekk when he says:
Hence, I honestly share @cloud-fan 's doubts on making this visible to users. Making visible something which we are going to change (hopefully in the same 3.0 timeframe) seems not a great choice and may introduce more concerns on backward compatibility when we want to get SQL standard compliance and/or we may then break user's applications. Another thing we may do is to transform the |
I'm OK to expose some parts of the interval that are not going to change, e.g. the interval literal syntax is fine to me: About this interval cast syntax, does Postgres/Presto/SQL Server/Oracle have it? |
Yep. Of course, this is a supported syntax, @cloud-fan . |
|
yea I know the interval literal syntax is standard, I was asking about the cast syntax. |
Oh, I got it. |
Yea, pg can explicitly cast a text to interval;
|
What he meant is
|
btw, mysql/presto doesn't have interval as an exposed type for casts. |
Yep. It's different and also PostgreSQL has additional variants like #25225 (comment) . |
Hi, this pr seems to expose interval type to table schema for creating and altering table, e.g. in 2.4 or earlier, |
What changes were proposed in this pull request?
#7355 add support casting between IntervalType and StringType for scala interface:
But SQL interface does not support it:
This PR add supports accepting the
interval
keyword in the schema string. So that SQL interface can support this feature.How was this patch tested?
unit tests