[SPARK-8199][SPARK-8184][SPARK-8183][SPARK-8182][SPARK-8181][SPARK-8180][SPARK-8179][SPARK-8177][SPARK-8178][SPARK-9115][SQL] date functions #6981

Closed
Changes from 8 commits
54 commits
d0e2f99
date functions
tarekbecker Jun 24, 2015
5ebb235
resolved naming conflict
tarekbecker Jun 24, 2015
4d8049b
fixed tests and added type check
tarekbecker Jun 24, 2015
638596f
improved codegen
tarekbecker Jun 24, 2015
849fb41
fixed stupid test
tarekbecker Jun 24, 2015
c739788
added support for quarter SPARK-8178
tarekbecker Jun 24, 2015
b680db6
added codegeneration to all functions
tarekbecker Jun 24, 2015
a5ea120
added python api; changed test to be more meaningful
tarekbecker Jun 24, 2015
02efc5d
removed doubled code
tarekbecker Jun 26, 2015
356df78
rely on cast mechanism of Spark. Simplified implementation
tarekbecker Jun 29, 2015
3bfac90
fixed style
tarekbecker Jun 29, 2015
5fe74e1
fixed python style
tarekbecker Jun 29, 2015
a8edebd
use Calendar instead of SimpleDateFormat
tarekbecker Jun 29, 2015
f120415
improved runtime
tarekbecker Jun 30, 2015
eb6760d
Merge branch 'master' into SPARK-8199
tarekbecker Jul 4, 2015
5a105d9
[SPARK-8199] rebase after #6985 got merged
tarekbecker Jul 4, 2015
7bc9d93
Merge branch 'master' into SPARK-8199
tarekbecker Jul 9, 2015
d9f8ac3
[SPARK-8199] implement fast track
tarekbecker Jul 9, 2015
6f5d95c
[SPARK-8199] fixed year interval
tarekbecker Jul 9, 2015
f3e7a9f
[SPARK-8199] revert change in DataFrameFunctionsSuite
tarekbecker Jul 9, 2015
7d9f0eb
[SPARK-8199] git renaming issue
tarekbecker Jul 9, 2015
10e4ad1
Merge branch 'master' into date-functions-fast
tarekbecker Jul 9, 2015
ccb723c
[SPARK-8199] style and fixed merge issues
tarekbecker Jul 9, 2015
c42b444
Removed merge conflict file
tarekbecker Jul 9, 2015
ad17e96
improved implementation
tarekbecker Jul 10, 2015
f775f39
fixed return type
tarekbecker Jul 10, 2015
1a436c9
wip
tarekbecker Jul 13, 2015
4fb66da
WIP: date functions on calculation only
tarekbecker Jul 13, 2015
740af0e
implement date function using a calculation based on days
tarekbecker Jul 13, 2015
1358cdc
Merge remote-tracking branch 'origin/master' into SPARK-8199
tarekbecker Jul 16, 2015
ec87c69
[SPARK-8119] bug fixing and refactoring
tarekbecker Jul 16, 2015
0852655
[SPARK-8119] changed from ExpectsInputTypes to implicit casts
tarekbecker Jul 16, 2015
1b2e540
[SPARK-8119] style fix
tarekbecker Jul 16, 2015
b382267
[SPARK-8199] fixed bug in day calculation; removed set TimeZone in Hi…
tarekbecker Jul 17, 2015
d6aa14e
[SPARK-8199] fixed Hive compatibility
tarekbecker Jul 17, 2015
e223bc0
[SPARK-8199] refactoring
tarekbecker Jul 17, 2015
56c4a92
[SPARK-8199] update python docu
tarekbecker Jul 17, 2015
d01b977
[SPARK-8199] python underscore
tarekbecker Jul 17, 2015
2259299
[SPARK-8199] day_of_month alias
tarekbecker Jul 17, 2015
523542d
[SPARK-8199] address comments
tarekbecker Jul 17, 2015
0ad6db8
[SPARK-8199] minor fix
tarekbecker Jul 17, 2015
746b80a
[SPARK-8199] build fix
tarekbecker Jul 17, 2015
cdfae27
[SPARK-8199] cleanup & python docstring fix
tarekbecker Jul 17, 2015
fb98ba0
[SPARK-8199] python docstring fix
tarekbecker Jul 17, 2015
3c6ae2e
[SPARK-8199] removed binary search
tarekbecker Jul 18, 2015
70238e0
Merge branch 'master' into SPARK-8199
tarekbecker Jul 18, 2015
ea6c110
[SPARK-8199] fix after merging master
tarekbecker Jul 18, 2015
4afc09c
[SPARK-8199] concise leap year handling
tarekbecker Jul 18, 2015
6e0c78f
[SPARK-8199] removed setTimeZone in tests, according to cloud-fans co…
tarekbecker Jul 18, 2015
5983dcc
[SPARK-8199] whitespace fix
tarekbecker Jul 18, 2015
256c357
[SPARK-8199] code cleanup
tarekbecker Jul 18, 2015
3e095ba
[SPARK-8199] style and timezone fix
tarekbecker Jul 18, 2015
bb567b6
[SPARK-8199] fixed test
tarekbecker Jul 18, 2015
f7b4c8c
[SPARK-8199] fixed bug in tests
tarekbecker Jul 19, 2015
90 changes: 90 additions & 0 deletions python/pyspark/sql/functions.py
@@ -486,6 +486,96 @@ def ntile(n):
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.ntile(int(n)))

@since(1.5)
Contributor

@ignore_unicode_prefix

def dateFormat(dateCol, formatCol):
Contributor Author

@rxin camel case or underscore?

Contributor

underscore

    """
    Convert the given date into the format specified by the second argument. Return type is always string.
    >>> sqlContext.createDataFrame([('2015-04-08',)], ['a']).select(dateFormat('a', 'MM/dd/yyy').alias('date')).collect()
    [Row(date=u'04/08/2015')]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.dateFormat(dateCol, formatCol))

@since(1.5)
def year(col):
    """
    Extract the year of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08',)], ['a']).select(year('a').alias('year')).collect()
    [Row(year=2015)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.year(col))

@since(1.5)
def quarter(col):
    """
    Extract the quarter of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08',)], ['a']).select(quarter('a').alias('quarter')).collect()
    [Row(quarter=2)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.quarter(col))

@since(1.5)
def month(col):
    """
    Extract the month of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08',)], ['a']).select(month('a').alias('month')).collect()
    [Row(month=4)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.month(col))

@since(1.5)
def day(col):
    """
    Extract the day of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08',)], ['a']).select(day('a').alias('day')).collect()
    [Row(day=8)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.day(col))

@since(1.5)
def hour(col):
    """
    Extract the hours of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08 13:08:15',)], ['a']).select(hour('a').alias('hour')).collect()
    [Row(hour=13)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.hour(col))

@since(1.5)
def minute(col):
    """
    Extract the minutes of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08 13:08:15',)], ['a']).select(minute('a').alias('minute')).collect()
    [Row(minute=8)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.minute(col))

@since(1.5)
def second(col):
    """
    Extract the seconds of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08 13:08:15',)], ['a']).select(second('a').alias('second')).collect()
    [Row(second=15)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.second(col))

@since(1.5)
def weekOfYear(col):
    """
    Extract the week number of a given date as integer.
    >>> sqlContext.createDataFrame([('2015-04-08',)], ['a']).select(weekOfYear('a').alias('week')).collect()
    [Row(week=15)]
    """
    sc = SparkContext._active_spark_context
    return Column(sc._jvm.functions.weekOfYear(col))
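
For readers without a Spark session at hand, the values in the docstrings above can be cross-checked with the Python standard library alone. The following is a minimal sketch, not the PR's implementation: `date_fields` is a hypothetical helper, the quarter formula is the usual three-months-per-quarter arithmetic, and ISO 8601 week numbering is an assumption that happens to agree with the `week=15` docstring example.

```python
from datetime import datetime


def date_fields(text):
    """Compute, with the standard library only, the fields the new functions extract.

    Hypothetical helper for illustration; it does not call Spark.
    """
    # Accept either a plain date or a date with a time part, as in the docstrings.
    fmt = "%Y-%m-%d %H:%M:%S" if " " in text else "%Y-%m-%d"
    ts = datetime.strptime(text, fmt)
    return {
        "year": ts.year,
        # Months 1-3 map to quarter 1, 4-6 to quarter 2, and so on.
        "quarter": (ts.month - 1) // 3 + 1,
        "month": ts.month,
        "day": ts.day,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
        # Assumption: ISO 8601 weeks (Monday first, week 1 contains the first Thursday).
        "week_of_year": ts.isocalendar()[1],
        # Rough strftime counterpart of the 'MM/dd/yyy' pattern in the dateFormat docstring.
        "formatted": ts.strftime("%m/%d/%Y"),
    }


fields = date_fields("2015-04-08 13:08:15")
print(fields)
```

Calling `date_fields("2015-04-08")` leaves the time fields at zero, which mirrors the docstrings using a date-only string for the calendar functions and a full timestamp string for `hour`, `minute` and `second`.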


class UserDefinedFunction(object):
"""
@@ -152,7 +152,18 @@ object FunctionRegistry {
    expression[Substring]("substr"),
    expression[Substring]("substring"),
    expression[Upper]("ucase"),
    expression[Upper]("upper")
    expression[Upper]("upper"),

    // datetime functions
    expression[DateFormatClass]("dateformat"),
    expression[Year]("year"),
    expression[Quarter]("quarter"),
    expression[Month]("month"),
    expression[Day]("day"),
Contributor Author

@rxin In Jira you mentioned there should be an alias. Can I just add expression[Day]("day_of_month")?

Contributor

yes

Contributor

btw please sort the expressions alphabetically

    expression[Hour]("hour"),
    expression[Minute]("minute"),
    expression[Second]("second"),
    expression[WeekOfYear]("weekofyear")
  )

  val builtin: FunctionRegistry = {
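
The alias question in the review thread (registering `Day` a second time under `day_of_month`) and the request to keep entries sorted can be illustrated with a toy registry. This is only a sketch in Python, not Spark's `FunctionRegistry`: the `ToyFunctionRegistry` class, its methods and the string-based `day` helper are all hypothetical, and lower-casing names on registration is an assumption of the toy.

```python
class ToyFunctionRegistry:
    """Toy name-to-function registry for illustration; not Spark's implementation."""

    def __init__(self):
        self._functions = {}

    def register(self, name, func):
        # Assumption of this toy: names are stored lower-cased.
        self._functions[name.lower()] = func

    def lookup(self, name):
        return self._functions[name.lower()]

    def names(self):
        # Return names alphabetically, in the spirit of the sorting request.
        return sorted(self._functions)


def day(date_text):
    """Hypothetical stand-in for the Day expression: day of month from 'yyyy-MM-dd'."""
    return int(date_text.split("-")[2])


registry = ToyFunctionRegistry()
registry.register("day", day)
# An alias is simply a second name bound to the same implementation.
registry.register("day_of_month", day)

print(registry.names())
print(registry.lookup("day_of_month")("2015-04-08"))
```

Because both names point at the same callable, a fix to the implementation applies to the alias automatically, which is the appeal of adding one more `expression[Day](...)` entry rather than a separate class.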