[SPARK-35051][SQL] Support add/subtract of a day-time interval to/from a date

### What changes were proposed in this pull request?
Support `date +/- day-time interval`. In this PR, I propose to update the binary arithmetic rules so that they cast an input date to a timestamp at the session time zone and then add the day-time interval to it.
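
The intended result can be modelled with plain `java.time`, in the same way the new test computes its expected values. This is a minimal sketch and not Spark's implementation: `DateIntervalSketch`, `addDayTime`, and `addDayTimeAt` are hypothetical names, and the arithmetic is done on the local timeline before the result is resolved in a time zone.

```scala
import java.time.{Duration, Instant, LocalDate, LocalDateTime, ZoneId}

// Hypothetical helper names; this models the expected results, not Spark's internals.
object DateIntervalSketch {
  // Treat the date as midnight of that day, then add the day-time interval.
  def addDayTime(date: LocalDate, interval: Duration): LocalDateTime =
    date.atStartOfDay().plus(interval)

  // Resolve the local result in a given (session) time zone, as the test does.
  def addDayTimeAt(date: LocalDate, interval: Duration, zone: ZoneId): Instant =
    addDayTime(date, interval).atZone(zone).toInstant

  def main(args: Array[String]): Unit = {
    val date = LocalDate.of(2021, 4, 14)
    val interval = Duration.ofDays(1).plusHours(18).plusMinutes(15).plusSeconds(30)
    println(addDayTime(date, interval)) // 2021-04-15T18:15:30
    println(addDayTimeAt(date, interval, ZoneId.of("UTC")))
  }
}
```

The point of the sketch is the result type: adding a day-time interval to a date yields a timestamp rather than a date, because the interval may carry a time-of-day component.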

### Why are the changes needed?
1. To conform to the ANSI SQL standard, which requires support for such operations on dates and intervals:
<img width="811" alt="Screenshot 2021-03-12 at 11 36 14" src="https://user-images.githubusercontent.com/1580697/111081674-865d4900-8515-11eb-86c8-3538ecaf4804.png">
2. To fix the regression compared to the recent Spark 3.1 release with default settings.

Before the changes:
```sql
spark-sql> select date'now' + (timestamp'now' - timestamp'yesterday');
Error in query: cannot resolve 'DATE '2021-04-14' + subtracttimestamps(TIMESTAMP '2021-04-14 18:14:56.497', TIMESTAMP '2021-04-13 00:00:00')' due to data type mismatch: argument 1 requires timestamp type, however, 'DATE '2021-04-14'' is of date type.; line 1 pos 7;
'Project [unresolvedalias(cast(2021-04-14 + subtracttimestamps(2021-04-14 18:14:56.497, 2021-04-13 00:00:00, false, Some(Europe/Moscow)) as date), None)]
+- OneRowRelation
```

Spark 3.1:
```sql
spark-sql> select date'now' + (timestamp'now' - timestamp'yesterday');
2021-04-15
```

Hive:
```sql
0: jdbc:hive2://localhost:10000/default> select date'2021-04-14' + (timestamp'2020-04-14 18:15:30' - timestamp'2020-04-13 00:00:00');
+------------------------+
|          _c0           |
+------------------------+
| 2021-04-15 18:15:30.0  |
+------------------------+
```

### Does this PR introduce _any_ user-facing change?
It should not, since the new interval types have not been released yet.

After the changes:
```sql
spark-sql> select date'now' + (timestamp'now' - timestamp'yesterday');
2021-04-15 18:13:16.555
```

### How was this patch tested?
By running new tests:
```
$ build/sbt "test:testOnly *ColumnExpressionSuite"
```

Closes #32170 from MaxGekk/date-add-day-time-interval.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
MaxGekk committed Apr 14, 2021
1 parent 3e218ad commit de9e8b6
Showing 2 changed files with 52 additions and 0 deletions.
@@ -345,6 +345,8 @@ class Analyzer(override val catalogManager: CatalogManager)
override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
  case p: LogicalPlan => p.transformExpressionsUp {
    case a @ Add(l, r, f) if a.childrenResolved => (l.dataType, r.dataType) match {
      case (DateType, DayTimeIntervalType) => TimeAdd(Cast(l, TimestampType), r)
      case (DayTimeIntervalType, DateType) => TimeAdd(Cast(r, TimestampType), l)
      case (DateType, YearMonthIntervalType) => DateAddYMInterval(l, r)
      case (YearMonthIntervalType, DateType) => DateAddYMInterval(r, l)
      case (TimestampType, YearMonthIntervalType) => TimestampAddYMInterval(l, r)
@@ -360,6 +362,8 @@ class Analyzer(override val catalogManager: CatalogManager)
      case _ => a
    }
    case s @ Subtract(l, r, f) if s.childrenResolved => (l.dataType, r.dataType) match {
      case (DateType, DayTimeIntervalType) =>
        DatetimeSub(l, r, TimeAdd(Cast(l, TimestampType), UnaryMinus(r, f)))
      case (DateType, YearMonthIntervalType) =>
        DatetimeSub(l, r, DateAddYMInterval(l, UnaryMinus(r, f)))
      case (TimestampType, YearMonthIntervalType) =>
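
The shape of the rule in the hunks above can be illustrated with a toy model. Everything in this sketch is hypothetical and deliberately simplified: the types below only mirror the names used in the diff and are not Spark's classes, and the real `Add` carries a third argument that is omitted here.

```scala
// Toy stand-ins for the data types; NOT Spark's classes.
sealed trait DataType
case object DateType extends DataType
case object TimestampType extends DataType
case object DayTimeIntervalType extends DataType

// Toy expression tree with just enough nodes to show the rewrite.
sealed trait Expr { def dataType: DataType }
case class Col(name: String, dataType: DataType) extends Expr
case class Cast(child: Expr, dataType: DataType) extends Expr
case class Add(left: Expr, right: Expr) extends Expr {
  def dataType: DataType = left.dataType
}
case class TimeAdd(start: Expr, interval: Expr) extends Expr {
  val dataType: DataType = TimestampType
}

object RewriteSketch {
  // Dispatch on the pair of operand types, in either operand order,
  // and replace the generic Add with a timestamp addition over a cast date.
  def rewrite(e: Expr): Expr = e match {
    case a @ Add(l, r) => (l.dataType, r.dataType) match {
      case (DateType, DayTimeIntervalType) => TimeAdd(Cast(l, TimestampType), r)
      case (DayTimeIntervalType, DateType) => TimeAdd(Cast(r, TimestampType), l)
      case _ => a
    }
    case other => other
  }
}
```

In this model, `RewriteSketch.rewrite(Add(Col("d", DateType), Col("i", DayTimeIntervalType)))` returns a `TimeAdd` whose first child is the date column cast to `TimestampType`, which is why the result of the expression is a timestamp.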
@@ -2775,4 +2775,52 @@ class ColumnExpressionSuite extends QueryTest with SharedSparkSession {
      }
    }
  }

  test("SPARK-35051: add/subtract a day-time interval to/from a date") {
    withSQLConf(SQLConf.DATETIME_JAVA8API_ENABLED.key -> "true") {
      outstandingZoneIds.foreach { zid =>
        withSQLConf(SQLConf.SESSION_LOCAL_TIMEZONE.key -> zid.getId) {
          Seq(
            (LocalDate.of(1, 1, 1), Duration.ofDays(31)) -> LocalDateTime.of(1, 2, 1, 0, 0, 0),
            (LocalDate.of(1582, 9, 15), Duration.ofDays(30).plus(1, ChronoUnit.MICROS)) ->
              LocalDateTime.of(1582, 10, 15, 0, 0, 0, 1000),
            (LocalDate.of(1900, 1, 1), Duration.ofDays(0).plusHours(1)) ->
              LocalDateTime.of(1900, 1, 1, 1, 0, 0),
            (LocalDate.of(1970, 1, 1), Duration.ofDays(-1).minusMinutes(1)) ->
              LocalDateTime.of(1969, 12, 30, 23, 59, 0),
            (LocalDate.of(2021, 3, 14), Duration.ofDays(1)) ->
              LocalDateTime.of(2021, 3, 15, 0, 0, 0),
            (LocalDate.of(2020, 12, 31), Duration.ofDays(4 * 30).plusMinutes(30)) ->
              LocalDateTime.of(2021, 4, 30, 0, 30, 0),
            (LocalDate.of(2020, 2, 29), Duration.ofDays(365).plusSeconds(59)) ->
              LocalDateTime.of(2021, 2, 28, 0, 0, 59),
            (LocalDate.of(10000, 1, 1), Duration.ofDays(-2)) ->
              LocalDateTime.of(9999, 12, 30, 0, 0, 0)
          ).foreach { case ((date, duration), expected) =>
            val result = expected.atZone(zid).toInstant
            val ts = date.atStartOfDay(zid).toInstant
            val df = Seq((date, duration, result)).toDF("date", "interval", "result")
            checkAnswer(
              df.select($"date" + $"interval", $"interval" + $"date", $"result" - $"interval",
                $"result" - $"date"),
              Row(result, result, ts, duration))
          }
        }
      }

      Seq(
        "2021-04-14" -> "date + i",
        "1900-04-14" -> "date - i").foreach { case (date, op) =>
        val e = intercept[SparkException] {
          Seq(
            (LocalDate.parse(date), Duration.of(Long.MaxValue, ChronoUnit.MICROS)))
            .toDF("date", "i")
            .selectExpr(op)
            .collect()
        }.getCause
        assert(e.isInstanceOf[ArithmeticException])
        assert(e.getMessage.contains("long overflow"))
      }
    }
  }
}
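
The overflow cases at the end of the test can be reproduced outside Spark. The sketch below assumes a timestamp is held as a `Long` count of microseconds since the epoch and that the addition is done with exact arithmetic; `OverflowSketch`, `dateToMicros`, and `addMicros` are hypothetical names, and UTC midnight stands in for the session time zone.

```scala
import java.time.LocalDate
import scala.util.Try

// Hypothetical model: a date becomes microseconds since the epoch at UTC midnight.
object OverflowSketch {
  private val MicrosPerDay = 86400000000L

  def dateToMicros(date: LocalDate): Long =
    java.lang.Math.multiplyExact(date.toEpochDay, MicrosPerDay)

  // Exact addition throws ArithmeticException("long overflow") instead of wrapping.
  def addMicros(date: LocalDate, intervalMicros: Long): Long =
    java.lang.Math.addExact(dateToMicros(date), intervalMicros)

  def main(args: Array[String]): Unit = {
    // A date after the epoch plus the maximum interval exceeds Long.MaxValue.
    println(Try(addMicros(LocalDate.parse("2021-04-14"), Long.MaxValue)))
    // A date before the epoch minus the maximum interval falls below Long.MinValue.
    println(Try(addMicros(LocalDate.parse("1900-04-14"), -Long.MaxValue)))
    // A date after the epoch minus the maximum interval still fits in a Long.
    println(Try(addMicros(LocalDate.parse("2021-04-14"), -Long.MaxValue)))
  }
}
```

Under these assumptions, subtraction only overflows when the date itself is before the epoch, which would explain why the test pairs `date - i` with a date in 1900 and `date + i` with a date in 2021.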
