[SPARK-34850][SQL] Support multiply a day-time interval by a numeric #31951
MaxGekk wants to merge 8 commits into apache:master from
Conversation
Kubernetes integration test starting
@yaooqinn @cloud-fan Could you review this PR, please?
Kubernetes integration test status failure
| s"((new Decimal()).set($m).$$times($n)).toJavaBigDecimal()" + | ||
| ".setScale(0, java.math.RoundingMode.HALF_UP).longValueExact()") | ||
| case _: FractionalType => | ||
| defineCodeGen(ctx, ev, (m, n) => s"java.lang.Math.round($m * (double)$n)") |
It seems that we should fail here too, not just round it to the max long value.
Does the SQL standard require such behavior? From my point of view, we can map Double.PositiveInfinity to Long.MaxValue since multiple double values can map to the same long value. We should just document such behavior and borrow the text from java.lang.Math.round():
<li>If the argument is negative infinity or any value less than or
equal to the value of {@code Long.MIN_VALUE}, the result is
equal to the value of {@code Long.MIN_VALUE}.
<li>If the argument is positive infinity or any value greater than or
equal to the value of {@code Long.MAX_VALUE}, the result is
equal to the value of {@code Long.MAX_VALUE}.</ul>
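For reference, a small Scala sketch (not part of this PR) showing the clamping behavior of java.lang.Math.round that the quoted javadoc describes:

// java.lang.Math.round clamps non-finite doubles instead of failing.
assert(Math.round(Double.PositiveInfinity) == Long.MaxValue)
assert(Math.round(Double.NegativeInfinity) == Long.MinValue)
assert(Math.round(Double.NaN) == 0L) // NaN rounds to 0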
@srielau @cloud-fan WDYT?
I don't have a strong opinion here. Shall we turn float/double to Decimal and do the calculation?
IMO, the interval value expression's return type is an interval, and it defines an interval itself. The expression's behavior should respect the interval term rather than the multiplier or divisor. According to the standard, we should define and raise an interval field overflow, rather than a numeric overflow, when the number of significant digits exceeds the implementation-defined maximum number of significant digits.
Shall we turn float/double to Decimal and do the calculation?
Decimal (and Java BigDecimal) doesn't have a representation for NaN, PositiveInfinity, and NegativeInfinity, see:
scala> import org.apache.spark.sql.types.Decimal
import org.apache.spark.sql.types.Decimal
scala> Decimal(Double.NaN)
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:497)
scala> Decimal(Float.PositiveInfinity)
java.lang.NumberFormatException
at java.math.BigDecimal.<init>(BigDecimal.java:497)
scala> Decimal(Float.MinValue)
res2: org.apache.spark.sql.types.Decimal = -340282346638528860000000000000000000000
I have found that roundToLong() from Guava satisfies our needs completely. I am going to invoke it.
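For context, a minimal sketch (not from this PR) of how Guava's DoubleMath.roundToLong behaves, assuming Guava is on the classpath:

import java.math.RoundingMode
import com.google.common.math.DoubleMath

// Finite values are rounded according to the given rounding mode:
assert(DoubleMath.roundToLong(1.5d, RoundingMode.HALF_UP) == 2L)
// Non-finite inputs are rejected instead of being clamped as Math.round does:
// DoubleMath.roundToLong(Double.NaN, RoundingMode.HALF_UP) throws ArithmeticException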
Test build #136447 has finished for PR 31951 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #136481 has finished for PR 31951 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test starting
GA passed. Merging to master.
Kubernetes integration test status failure
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #136495 has finished for PR 31951 at commit
Test build #136500 has finished for PR 31951 at commit
Test build #136502 has finished for PR 31951 at commit
What changes were proposed in this pull request?
This PR adds:
- A new expression MultiplyDTInterval which multiplies a DayTimeIntervalType expression by a NumericType expression including ByteType, ShortType, IntegerType, LongType, FloatType, DoubleType, DecimalType.
- Support for both numeric * day-time interval and day-time interval * numeric.
- Rounding of the double/float case similar to DoubleMath.roundToInt in double/float * year-month interval.
Why are the changes needed?
To conform to the ANSI SQL standard, which requires such an operation over day-time intervals:

Does this PR introduce any user-facing change?
No
How was this patch tested?
By running new tests: