Skip to content

Commit

Permalink
[SPARK-32133][SQL] Forbid time field steps for date start/end in Sequ…
Browse files Browse the repository at this point in the history
…ence

### What changes were proposed in this pull request?
1.Add time field steps check for date start/end in Sequence at `org.apache.spark.sql.catalyst.expressions.Sequence.TemporalSequenceImpl`
2.Add a UT:`SPARK-32133: Sequence step must be a day interval if start and end values are dates` at `org.apache.spark.sql.catalyst.expressions.CollectionExpressionsSuite`

### Why are the changes needed?
**Sequence time field steps for date start/end looks strange in spark as follows:**
```
scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-03-02' as date), interval 1 hour))").head(3)
res0: Array[org.apache.spark.sql.Row] = _Array([2011-03-01], [2011-03-01], [2011-03-01])_ **<- strange result.**

scala> sql("select explode(sequence(cast('2011-03-01' as date), cast('2011-03-02' as date), interval 1 day))").head(3)
res1: Array[org.apache.spark.sql.Row] = Array([2011-03-01], [2011-03-02])
```

**While this behavior in Prosto make sense:**
```
presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' hour);
Query 20200624_122744_00002_pehix failed: sequence step must be a day interval if start and end values are dates
presto> select sequence(date('2011-03-01'),date('2011-03-02'),interval '1' day);
_col0
[2011-03-01, 2011-03-02]
```

### Does this PR introduce _any_ user-facing change?
Yes, after this patch, users will get informed `sequence step must be a day interval if start and end values are dates` when
use time field steps for date start/end in Sequence.

### How was this patch tested?
Unit test.

Closes #28926 from TJX2014/master-SPARK-31982-sequence-cross-dst-follow-presto.

Authored-by: TJX2014 <xiaoxingstack@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
  • Loading branch information
TJX2014 authored and dongjoon-hyun committed Jul 10, 2020
1 parent 560fe1f commit 500877e
Show file tree
Hide file tree
Showing 2 changed files with 27 additions and 0 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -2612,10 +2612,17 @@ object Sequence {
val stepDays = step.days
val stepMicros = step.microseconds

if (scale == MICROS_PER_DAY && stepMonths == 0 && stepDays == 0) {
throw new IllegalArgumentException(
"sequence step must be a day interval if start and end values are dates")
}

if (stepMonths == 0 && stepMicros == 0 && scale == MICROS_PER_DAY) {
// Adding pure days to date start/end
backedSequenceImpl.eval(start, stop, fromLong(stepDays))

} else if (stepMonths == 0 && stepDays == 0 && scale == 1) {
// Adding pure microseconds to timestamp start/end
backedSequenceImpl.eval(start, stop, fromLong(stepMicros))

} else {
Expand Down Expand Up @@ -2674,11 +2681,24 @@ object Sequence {
|${genSequenceLengthCode(ctx, startMicros, stopMicros, intervalInMicros, arrLength)}
""".stripMargin

val check = if (scale == MICROS_PER_DAY) {
s"""
|if ($stepMonths == 0 && $stepDays == 0) {
| throw new IllegalArgumentException(
| "sequence step must be a day interval if start and end values are dates");
|}
""".stripMargin
} else {
""
}

s"""
|final int $stepMonths = $step.months;
|final int $stepDays = $step.days;
|final long $stepMicros = $step.microseconds;
|
|$check
|
|if ($stepMonths == 0 && $stepMicros == 0 && ${scale}L == ${MICROS_PER_DAY}L) {
| ${backedSequenceImpl.genCode(ctx, start, stop, stepDays, arr, elemType)};
|
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -933,6 +933,13 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
Literal(negateExact(stringToInterval("interval 1 month")))),
EmptyRow,
s"sequence boundaries: 0 to 2678400000000 by -${28 * MICROS_PER_DAY}")

// SPARK-32133: Sequence step must be a day interval if start and end values are dates
checkExceptionInExpression[IllegalArgumentException](Sequence(
Cast(Literal("2011-03-01"), DateType),
Cast(Literal("2011-04-01"), DateType),
Option(Literal(stringToInterval("interval 1 hour")))), null,
"sequence step must be a day interval if start and end values are dates")
}
}

Expand Down

0 comments on commit 500877e

Please sign in to comment.