BUG: Fix pandas backends to have inclusive preceding window bound #2009

Merged
merged 6 commits into from
Nov 1, 2019

Conversation

@hjoo hjoo commented Oct 24, 2019

Fix for issue #2000.

The preceding argument to ibis.window() and ibis.trailing_window() is an inclusive preceding bound. The window argument to pandas.DataFrame.rolling() is a window size.

Currently, pandas backends pass the preceding bound directly as the window size and do not adjust for the inclusive window bounds in ibis.

This PR implements the correct expected behavior and updates the pandas backend tests accordingly.
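To illustrate the off-by-one described above, here is a minimal pandas-only sketch. It is not the code from this PR; the helper name `rolling_sum_inclusive` is hypothetical and only shows how an inclusive preceding bound could map to a pandas window size.

```python
import pandas as pd


def rolling_sum_inclusive(series: pd.Series, preceding: int) -> pd.Series:
    """Trailing sum over the current row plus `preceding` earlier rows.

    Hypothetical helper for illustration only, not part of ibis.
    """
    # pandas expects a window *size*, so an inclusive preceding bound of n
    # corresponds to a window of n + 1 rows (n earlier rows + current row).
    return series.rolling(window=preceding + 1, min_periods=1).sum()


s = pd.Series([1.0, 2.0, 3.0, 4.0])

# Passing preceding=1 straight through as the window size covers only
# the current row.
naive = s.rolling(window=1, min_periods=1).sum()

# Adjusting for the inclusive bound covers the current row and one before it.
fixed = rolling_sum_inclusive(s, preceding=1)

print(naive.tolist())
print(fixed.tolist())
```

With `preceding=1`, the unadjusted call simply returns the input values, while the adjusted call sums each row with its predecessor.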

@@ -306,11 +310,6 @@ def compute_window_spec_interval(_, expr):
    return pd.tseries.frequencies.to_offset(value)


@compute_window_spec.register(dt.DataType)
def compute_window_spec_expr(_, expr):
Contributor Author

I don't think this function should ever be called, since Ibis throws an input exception if preceding is not an integer or an Ibis interval.

hjoo commented Oct 24, 2019

cc: @jreback @toryhaavik

@icexelloss icexelloss left a comment

LGTM

@icexelloss

@hjoo Can you change the title to include BUG?

jreback commented Oct 24, 2019

will look

@hjoo hjoo changed the title Fix pandas backends to have inclusive preceding window bound [BUG] Fix pandas backends to have inclusive preceding window bound Oct 24, 2019
@xmnlab xmnlab changed the title [BUG] Fix pandas backends to have inclusive preceding window bound BUG: Fix pandas backends to have inclusive preceding window bound Oct 25, 2019

xmnlab commented Oct 25, 2019

@hjoo for the PR title conventions, you can find some info here: #1606 (comment)

jreback commented Oct 27, 2019

@hjoo it would be reasonable to add a closed='left|right|both|neither' parameter (this is what pandas calls it; it could also be named 'inclusive').

I think by default we should actually match pandas behavior, e.g. closed='left'. I think both would be a breaking change, yes?

cc @toryhaavik

hjoo commented Oct 28, 2019

@jreback not quite sure I follow. I thought closed in pandas is only implemented for datetimelike and time-offset-based windows, not row-based windows. This is the error I get when using closed='left' for a rolling window given an integer window size:

df.a.rolling(3, min_periods=1, closed='left')...

E           ValueError: closed only implemented for datetimelike and offset based windows

In Ibis, the left row bound is already expected to be a closed bound.
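For comparison, here is a small sketch of how `closed` behaves on an offset-based pandas window, where it is supported. The data and variable names are made up for illustration. Whether `closed` is accepted for integer windows depends on the pandas version, so this example sticks to a time-offset window.

```python
import pandas as pd

# Four values, one second apart (illustrative data only).
idx = pd.date_range("2019-10-24", periods=4, freq="s")
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# closed="right" is the default for offset windows: the window is (t - 2s, t],
# so the point exactly 2 seconds back is excluded.
right = s.rolling("2s", closed="right").sum()

# closed="both" makes the left edge inclusive as well: [t - 2s, t].
both = s.rolling("2s", closed="both").sum()

print(right.tolist())
print(both.tolist())
```

With `closed="both"`, the third and fourth rows pick up one extra observation compared with the default, which is the kind of inclusive left bound discussed in this thread.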

@jreback jreback added bug Incorrect behavior inside of ibis pandas The pandas backend labels Nov 1, 2019
@jreback jreback added this to the Next Bugfix Release milestone Nov 1, 2019

@jreback jreback left a comment

LGTM. I left comments asking whether this affects other window methods that take preceding. Ping on green.

@@ -7,6 +7,8 @@ Release Notes
These release notes are for versions of ibis **1.0 and later**. Release
notes for pre-1.0 versions of ibis can be found at :doc:`/release-pre-1.0`

* :release:`1.2.1 <pending>`
* :bug:`2009` Change window bound behavior in pandas backend to match other backends
Contributor

can you be a bit more specific here; e.g. when specifying preceding= it did x now does y

@@ -420,7 +420,8 @@ def trailing_window(preceding, group_by=None, order_by=None):
preceding : int, float or expression of intervals, i.e.
ibis.interval(days=1) + ibis.interval(hours=5)
Int indicates number of trailing rows to include;
0 includes only the current row.
0 includes only the current row, 1 includes the current row and one
Contributor

need to update for trailing_range_window as well?

Contributor Author

trailing_range_window is a time range window and not a row-based window so this doesn't apply.
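The distinction between row-based and time-range windows can be sketched with plain pandas. This is only an analogy using `Series.rolling`, not the ibis implementation, and the data is invented for illustration.

```python
import pandas as pd

# Irregularly spaced timestamps: the last row is 4 seconds after the second.
idx = pd.to_datetime(
    ["2019-11-01 00:00:00", "2019-11-01 00:00:01", "2019-11-01 00:00:05"]
)
s = pd.Series([1.0, 2.0, 3.0], index=idx)

# Row-based window: always the current row plus one preceding row,
# regardless of how far apart the timestamps are.
row_based = s.rolling(2, min_periods=1).sum()

# Time-range window: only rows whose timestamps fall within the last
# 2 seconds, so the number of rows in the window varies.
time_based = s.rolling("2s").sum()

print(row_based.tolist())
print(time_based.tolist())
```

Because a time-range window is defined by timestamps rather than a row count, the "preceding rows plus one" adjustment for row-based windows has no counterpart there.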

jreback commented Nov 1, 2019

Can you also create an issue to discuss adding a closed= parameter to control the bounds' inclusivity?

hjoo commented Nov 1, 2019

Created issue #2019 to further discuss closed= API.

@jreback jreback merged commit 72ece31 into ibis-project:master Nov 1, 2019

jreback commented Nov 1, 2019

thanks @hjoo
