-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interval support #1243
Interval support #1243
Conversation
ibis/expr/tests/test_datatypes.py
Outdated
| @@ -300,6 +300,10 @@ def test_timestamp_with_timezone_parser_invalid_timezone(): | |||
| assert str(ts) == "timestamp('US/Ea')" | |||
|
|
|||
|
|
|||
| def test_interval(): | |||
| dt.validate_type("interval('Y')") | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should assert that this returns the same thing as Interval('Y').
ibis/expr/datatypes.py
Outdated
| self.unit = unit | ||
|
|
||
| def valid_literal(self, value): | ||
| return isinstance(value, six.string_types + (datetime.timedelta,)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this list the specific characters that are valid?
ibis/expr/datatypes.py
Outdated
|
|
||
| __slots__ = 'unit', | ||
|
|
||
| _valid_units = set([ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this frozenset.
ibis/expr/tests/test_temporal.py
Outdated
| @@ -201,21 +201,10 @@ def test_interval(literal): | |||
|
|
|||
|
|
|||
| def test_interval_repr(): | |||
| repr(api.interval(weeks=3)).splitlines()[0] == 'Literal[interval]' | |||
| assert repr(api.interval(weeks=3)) == 'Literal[interval]\n 3' | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should put the unit in the repr as well.
ibis/expr/tests/test_datatypes.py
Outdated
| ]) | ||
| def test_interval(unit): | ||
| definition = "interval('{}')".format(unit) | ||
| dt.Interval(unit) == dt.validate_type(definition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs an assert here
ibis/expr/rules.py
Outdated
| @@ -709,6 +710,10 @@ def time(**arg_kwds): | |||
| return ValueTyped(ir.TimeValue, 'not time', **arg_kwds) | |||
|
|
|||
|
|
|||
| def interval(**arg_kwds): | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need a way to specify exactly unit rules as well? Something like:
interval_with_unit('s')
ibis/expr/rules.py
Outdated
| def interval(**arg_kwds): | ||
| return ValueTyped(ir.IntervalValue, 'not an interval', **arg_kwds) | ||
|
|
||
|
|
||
| def timedelta(**arg_kwds): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this can be removed (or just aliased to interval for backwards compat)
ibis/expr/rules.py
Outdated
| @@ -452,22 +452,23 @@ def output_type(self): | |||
| return output_type | |||
|
|
|||
|
|
|||
| def numeric_highest_promote(i): | |||
| # TODO: UNUSED | |||
| # def numeric_highest_promote(i): | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If things are unused, you can delete them. We can always resurrect them later with git if needed.
ibis/expr/operations.py
Outdated
| class IntervalAdd(ValueOp): # __radd__ | ||
|
|
||
| input_type = [rules.temporal, rules.interval] | ||
| output_type = rules.shape_like_arg(0, 'timestamp') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the output type here is a bit more complicated since date + interval doesn't always return a timestamp. You can either perform the checking in the output_type method or you can make three different add classes that directly express the output type. The latter is a bit easier to reason about IMO.
ibis/expr/types.py
Outdated
| print('can_implicit_cast') | ||
| op = arg.op() | ||
| if isinstance(op, Literal): | ||
| if isinstance(op.value, six.integer_types): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arbitrary integers shouldn't be implicitly castable to IntervalValues since integers do not have units. Only other interval values should be implicitly castable to interval values.
ibis/expr/api.py
Outdated
| if len(defined_units) > 1: | ||
| raise ValueError('Only one arg can be specified') | ||
| elif len(defined_units) < 1: | ||
| raise ValueError('At least one of the arguments must be specified') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably just len(defined_units) != 1, with a message saying Exactly one argument is required.
Is there any reason why we need to enforce this? Postgres, for example, allows multiple values to the interval constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postgres is more flexible than others
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, just wondering.
ibis/expr/api.py
Outdated
| else: | ||
| unit, value = defined_units[0] | ||
| type = dt.Interval(unit) | ||
| return ir.IntervalScalar(ir.literal(value, type=type).op()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be ir.literal(value, type=type).op().to_expr()?
ibis/expr/datatypes.py
Outdated
| self.unit = unit | ||
|
|
||
| def valid_literal(self, value): | ||
| return isinstance(value, six.integer_types + (datetime.timedelta,)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think only timedelta instances are valid literals, otherwise that would mean we're giving a default unit to integers.
| @@ -778,6 +808,7 @@ def valid_literal(self, value): | |||
| date = Date() | |||
| time = Time() | |||
| timestamp = Timestamp() | |||
| interval = Interval() | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want aliases for each of the interval values like year, month, day, etc.?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine if not, just a thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just following the patterns in datatypes.py
None of the parametric types has specified aliases, so I guess not.
|
@cpcloud Would You please review before I finalize this PR? |
ibis/sql/tests/test_compiler.py
Outdated
| WHERE `timestamp_col` < months_add('2010-01-01 00:00:00', 3) AND | ||
| `timestamp_col` < days_add(now(), 10)""" | ||
| assert result == expected | ||
| # def test_where_analyze_scalar_op(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to keep this test passing if we alias the day, month, etc functions to be calls to the top level interval function.
ibis/clickhouse/operations.py
Outdated
| @@ -367,19 +386,14 @@ def _timestamp_from_unix(translator, expr): | |||
| return _call(translator, 'toDateTime', arg) | |||
|
|
|||
|
|
|||
| def _timestamp_delta(translator, expr): | |||
| def _timestamp_add(translator, expr): | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this being used? It looks like you're mapping ops.TimestampAdd to binary_infix_op('+') and not this. Is that intentional?
| _interval_floordiv = _binop_expr('__floordiv__', _ops.IntervalFloorDivide) | ||
|
|
||
| _interval_value_methods = dict( | ||
| to_unit=_to_unit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be done with cast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do You mean interval(hours=3).cast("interval('s')")?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. If this is for backward compat, then we can leave it.
ibis/expr/rules.py
Outdated
| """Infers the output type of the binary interval operation | ||
|
|
||
| This is a slightly modified version of BinaryPromoter, it converts | ||
| back and forth between the interval and its innner value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small typo here, innner has one too many n's
ibis/clickhouse/operations.py
Outdated
| @@ -261,6 +261,23 @@ def _value_list(translator, expr): | |||
| return '({0})'.format(', '.join(values_)) | |||
|
|
|||
|
|
|||
| def _interval_format(translator, expr): | |||
| units = { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use Interval._valid_units here so we don't have to repeat this? Maybe we can use Interval._valid_units everywhere we would hard code unit values/labels.
| return False | ||
|
|
||
| def valid_literal(self, value): | ||
| return isinstance(value, six.integer_types + (datetime.timedelta,)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the semantics here such that:
interval(days=2) + 1 == interval(days=3)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's required to pass https://github.com/ibis-project/ibis/blob/master/ibis/expr/types.py#L1587
For a type e.g. interval<int32>('s') a python integer should be a valid literal.
The operation fails:
In [1]: import ibis
In [2]: ibis.interval(days=2) + 1 == ibis.interval(days=3)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-3abf187ff76a> in <module>()
----> 1 ibis.interval(days=2) + 1 == ibis.interval(days=3)
TypeError: unsupported operand type(s) for +: 'IntervalScalar' and 'int'There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah right. Carry on then, this looks good.
ibis/expr/datatypes.py
Outdated
| 'ns' # Nanosecond | ||
| ]) | ||
|
|
||
| def __init__(self, value_type=None, unit='s', nullable=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably want the unit first here, so that it's easy to construct this type without using a keyword:
# current
Interval(unit='s')
# a little easier to read IMO
Interval('s')|
@kszucs Looking great. Everything ready to go on your side? |
|
@cpcloud Yes! Thank You! |

IntervalValue/IntervalScalar/IntervalColumnresolves #1233, #1228