[SPARK-29387][SQL] Support * and / operators for intervals#26132
[SPARK-29387][SQL] Support * and / operators for intervals#26132MaxGekk wants to merge 36 commits intoapache:masterfrom
* and / operators for intervals#26132Conversation
* and \ operators for intervals* and / operators for intervals
|
Test build #112124 has finished for PR 26132 at commit
|
|
Test build #112182 has finished for PR 26132 at commit
|
|
Test build #112187 has finished for PR 26132 at commit
|
|
@wangyum @cloud-fan @srowen Please, review this PR. |
srowen
left a comment
There was a problem hiding this comment.
It's a niche feature but looks plausible, if it's for PostgreSQL. Just want to check whether the behavior in corner cases matches.
| } | ||
|
|
||
| public CalendarInterval multiply(double num) { | ||
| int months = Math.toIntExact((long)(num * this.months)); |
There was a problem hiding this comment.
What happens when dividing an interval of 1 month by, say, 2? You'd end up with an interval of 0 time. I suppose the right answer is whatever PostgreSQL does.
There was a problem hiding this comment.
I will check what it does.
There was a problem hiding this comment.
PosgreSQL does this differently. I have re-implemented the operations.
common/unsafe/src/test/java/org/apache/spark/unsafe/types/CalendarIntervalSuite.java
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| private static CalendarInterval fromDoubles(double months, double microseconds) { | ||
| long roundedMonths = (long)(months); |
There was a problem hiding this comment.
'truncatedMonths` or something? it's not exactly rounded
common/unsafe/src/main/java/org/apache/spark/unsafe/types/CalendarInterval.java
Outdated
Show resolved
Hide resolved
|
Test build #112230 has finished for PR 26132 at commit
|
|
Test build #112231 has finished for PR 26132 at commit
|
|
Test build #113100 has finished for PR 26132 at commit
|
|
Test build #113107 has finished for PR 26132 at commit
|
|
jenkins, retest this, please |
|
Test build #113109 has finished for PR 26132 at commit
|
@srowen @cloud-fan @dongjoon-hyun @HyukjinKwon Since #26134 has been already merged, please, take a look at this PR. |
| double microsWithFraction) { | ||
| int truncatedMonths = Math.toIntExact((long)(monthsWithFraction)); | ||
| // Using 30 days per month as PostgreSQL does. | ||
| double days = daysWithFraction + 30 * (monthsWithFraction - truncatedMonths); |
There was a problem hiding this comment.
is this consistent with pgsql? i.e. we convert the truncated months to days.
There was a problem hiding this comment.
maxim=# select interval '1 month' * 1.1;
?column?
--------------
1 mon 3 days
(1 row)| return new CalendarInterval(truncatedMonths, truncatedDays, truncatedMicros); | ||
| } | ||
|
|
||
| public CalendarInterval multiply(double num) { |
There was a problem hiding this comment.
shall we use decimal instead? double is approximate value and we may truncate.
There was a problem hiding this comment.
- In that case, our implementation will deviate from postgesql which uses double internally. At the moment, we return the same results as postgresql (or I haven't found yet the case when the results are different).
- most likely, it will be slower
If it is ok, I will switch to decimals.
There was a problem hiding this comment.
OK let's use double if it's what pgsql uses.
Can we move the add, subtract, multiply and divide to IntervalUtils? In case we want to change them in the future.
There was a problem hiding this comment.
Can we move the add, subtract ... to IntervalUtils?
Do you want to move + and - in this PR? I just want to double check this because those methods are not related to this PR directly.
There was a problem hiding this comment.
both are fine. We can move them tother, or have a followup PR to move +/-
There was a problem hiding this comment.
Double is not big enough to support the average aggregate for interval #26347, I prefer decimal personally
There was a problem hiding this comment.
Double is not big enough to support the average aggregate ...
@yaooqinn Could you explain what do you mean? I could image that double is not precise enough but not big though ...
…l-div # Conflicts: # sql/core/src/test/resources/sql-tests/results/datetime.sql.out
|
Test build #113202 has finished for PR 26132 at commit
|
| MultiplyInterval(l, r) | ||
| case Multiply(l @ NumericType(), r @ CalendarIntervalType()) => | ||
| MultiplyInterval(r, l) | ||
| case Divide(l @ CalendarIntervalType(), r @ NumericType()) => |
There was a problem hiding this comment.
postgres=# select interval '1 year' / '365';
?column?
---------------
23:40:16.4064
(1 row)could this be supported?
There was a problem hiding this comment.
Taking into account the discussion in #26165, I am not sure. @cloud-fan Should I support this?
There was a problem hiding this comment.
We can, but this should only apply to literals, not string columns.
There was a problem hiding this comment.
If we want to support it, please open another PR.
| override def prettyName: String = operationName + "_interval" | ||
| } | ||
|
|
||
| case class MultiplyInterval(interval: Expression, num: Expression) |
There was a problem hiding this comment.
https://github.com/apache/spark/pull/26345/files#diff-b83497f7bc11578a0b63a814a2a30f48R2198-R2207
How about adding these doc and test cases here
There was a problem hiding this comment.
This should be added for only expressions registered as functions.
| (i: CalendarInterval, n: Double) => multiply(i, n), | ||
| "multiply") | ||
|
|
||
| case class DivideInterval(interval: Expression, num: Expression) |
There was a problem hiding this comment.
|
Test build #113255 has finished for PR 26132 at commit
|
|
Test build #113265 has finished for PR 26132 at commit
|
|
thanks, merging to master! |
What changes were proposed in this pull request?
Added new expressions
MultiplyIntervalandDivideIntervalto multiply/divide an interval by a numeric. UpdatedTypeCoercion.DateTimeOperationsto turn theMultiply/Divideexpressions ofCalendarIntervalTypeandNumericTypetoMultiplyInterval/DivideInterval.To support new operations, added new methods
multiply()anddivide()toCalendarInterval.Why are the changes needed?
numeric * interval,interval * numericandinterval / numeric. See 4.5.3 Operations involving datetimes and intervals.Does this PR introduce any user-facing change?
Yes, previously the following query fails with the error:
After:
How was this patch tested?
multiply()anddivide()methods toCalendarIntervalSuite.javaIntervalExpressionsSuiteMultiply->MultiplyIntervalandDivide->DivideIntervalinTypeCoercionSuitedatetime.sql