[SPARK-37279][PYTHON][SQL] Support DayTimeIntervalType in createDataFrame, collect and Python UDF

### What changes were proposed in this pull request?

This PR implements `DayTimeIntervalType` in PySpark's `DataFrame.collect()`, `SparkSession.createDataFrame()` and `functions.udf`. This type is mapped to [`datetime.timedelta`](https://docs.python.org/3/library/datetime.html#timedelta-objects).

The Arrow code path will be implemented separately in SPARK-37277, and Py4J support will be added in SPARK-37281.

### Why are the changes needed?

- To support `datetime.timedelta` out of the box in PySpark.
- To seamlessly support ANSI standard types.

Semantically, [`datetime.timedelta`](https://docs.python.org/3/library/datetime.html#timedelta-objects) maps to `DayTimeIntervalType`, because Python's `timedelta` does not support months, years, etc.

### Does this PR introduce _any_ user-facing change?

Yes. Users will be able to use `datetime.timedelta` in PySpark with `DayTimeIntervalType` in `DataFrame.collect()`, `SparkSession.createDataFrame()` and `functions.udf`:

```python
>>> import datetime
>>> df = spark.createDataFrame([(datetime.timedelta(days=1),)])
>>> df.collect()
[Row(_1=datetime.timedelta(days=1))]
```

```python
>>> from pyspark.sql.functions import udf
>>> df.select(udf(lambda x: x, "interval day to second")("_1")).show()
+--------------------+
|        <lambda>(_1)|
+--------------------+
|INTERVAL '1 00:00...|
+--------------------+
```

### How was this patch tested?

Unit tests were added.

Closes #34614 from HyukjinKwon/SPARK-37277.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
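The `timedelta` mapping described in the commit message can be illustrated without a Spark session. The following is a minimal sketch, not the code from this patch: it assumes a day-time interval is carried as a signed count of microseconds, and the helper names `timedelta_to_micros` and `micros_to_timedelta` are hypothetical.

```python
import datetime


def timedelta_to_micros(td: datetime.timedelta) -> int:
    """Convert a timedelta to a signed microsecond count (hypothetical helper).

    Integer arithmetic on the normalized fields avoids the float rounding
    that total_seconds() can introduce for very large intervals.
    """
    return (td.days * 86_400 + td.seconds) * 1_000_000 + td.microseconds


def micros_to_timedelta(micros: int) -> datetime.timedelta:
    """Rebuild a timedelta from a signed microsecond count (hypothetical helper)."""
    return datetime.timedelta(microseconds=micros)


# Round trip for the value used in the commit message's example.
one_day = datetime.timedelta(days=1)
one_day_micros = timedelta_to_micros(one_day)
assert micros_to_timedelta(one_day_micros) == one_day

# Negative intervals survive the round trip as well, because timedelta
# normalizes them to a negative day count plus non-negative remainders.
minus_one_micro = datetime.timedelta(microseconds=-1)
assert micros_to_timedelta(timedelta_to_micros(minus_one_micro)) == minus_one_micro
```

Because `timedelta` has only day, second and microsecond fields, this round trip is lossless, which fits the commit message's point that year and month intervals fall outside this mapping.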
1 parent 9553ed7, commit e2e1e42
Showing 5 changed files with 178 additions and 10 deletions.
@@ -302,6 +302,7 @@ Data Types
 StructType
 TimestampNTZType
 TimestampType
+DayTimeIntervalType


 Observation