Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix pandas backends to have inclusive preceding window bound #2009

Merged
merged 6 commits into from Nov 1, 2019

Conversation

@hjoo
Copy link
Contributor

hjoo commented Oct 24, 2019

Fix for issue #2000.

The preceding argument to ibis.window() and ibis.trailing_window() is an inclusive preceding bound. The window argument to pandas.DataFrame.rolling() is a window size.

Currently, pandas backends pass the preceding bound directly as the window size and do not adjust for the inclusive window bounds in ibis.

This PR implements the correct expected behavior and updates the pandas backend tests accordingly.

@@ -306,11 +310,6 @@ def compute_window_spec_interval(_, expr):
return pd.tseries.frequencies.to_offset(value)


@compute_window_spec.register(dt.DataType)
def compute_window_spec_expr(_, expr):

This comment has been minimized.

Copy link
@hjoo

hjoo Oct 24, 2019

Author Contributor

I don't think this function should ever be called as Ibis throws an input exception if preceding is not an integer or an Ibis interval.

@hjoo

This comment has been minimized.

Copy link
Contributor Author

hjoo commented Oct 24, 2019

Copy link
Collaborator

icexelloss left a comment

LGTM

@icexelloss

This comment has been minimized.

Copy link
Collaborator

icexelloss commented Oct 24, 2019

@hjoo Can you change the title to include BUG?

@jreback

This comment has been minimized.

Copy link
Contributor

jreback commented Oct 24, 2019

will look

@hjoo hjoo changed the title Fix pandas backends to have inclusive preceding window bound [BUG] Fix pandas backends to have inclusive preceding window bound Oct 24, 2019
@xmnlab xmnlab changed the title [BUG] Fix pandas backends to have inclusive preceding window bound BUG: Fix pandas backends to have inclusive preceding window bound Oct 25, 2019
@xmnlab

This comment has been minimized.

Copy link
Collaborator

xmnlab commented Oct 25, 2019

@hjoo about the PR titles you can check here some info #1606 (comment)

@jreback

This comment has been minimized.

Copy link
Contributor

jreback commented Oct 27, 2019

@hjoo reasonable to add an closed='left|right|both|neither' (this is what pandas calls this, could also be 'inclusive').

I think by default we should actually match pandas behavior, e.g. closed='left'. I think both would be a breaking change, yes?

cc @toryhaavik

@hjoo

This comment has been minimized.

Copy link
Contributor Author

hjoo commented Oct 28, 2019

@jreback not quite sure I follow. I thought closed in pandas is only implemented for datetimelike and time offset based windows and not row-based windows. Error message I get when using closed=left for a rolling window given a window size:

df.a.rolling(3, min_periods=1, closed='left')...

E           ValueError: closed only implemented for datetimelike and offset based windows

In Ibis, the left row bound is already expected to be a closed bound.

Copy link
Contributor

jreback left a comment

lgtm. comments about does this affect other window methods that take preceding. ping on green.

@@ -7,6 +7,8 @@ Release Notes
These release notes are for versions of ibis **1.0 and later**. Release
notes for pre-1.0 versions of ibis can be found at :doc:`/release-pre-1.0`

* :release:`1.2.1 <pending>`
* :bug:`2009` Change window bound behavior in pandas backend to match other backends

This comment has been minimized.

Copy link
@jreback

jreback Nov 1, 2019

Contributor

can you be a bit more specific here; e.g. when specifying preceding= it did x now does y

@@ -420,7 +420,8 @@ def trailing_window(preceding, group_by=None, order_by=None):
preceding : int, float or expression of intervals, i.e.
ibis.interval(days=1) + ibis.interval(hours=5)
Int indicates number of trailing rows to include;
0 includes only the current row.
0 includes only the current row, 1 includes the current row and one

This comment has been minimized.

Copy link
@jreback

jreback Nov 1, 2019

Contributor

need to update for trailing_range_window as well?

This comment has been minimized.

Copy link
@hjoo

hjoo Nov 1, 2019

Author Contributor

trailing_range_window is a time range window and not a row-based window so this doesn't apply.

@jreback

This comment has been minimized.

Copy link
Contributor

jreback commented Nov 1, 2019

if you can also create an issue for discussion about adding a closed= paramter to control the bounds inclusivety.

@hjoo

This comment has been minimized.

Copy link
Contributor Author

hjoo commented Nov 1, 2019

Created issue #2019 to further discuss closed= API.

@jreback
jreback approved these changes Nov 1, 2019
@jreback jreback merged commit 72ece31 into ibis-project:master Nov 1, 2019
10 checks passed
10 checks passed
ibis-project.ibis Build #20191101.1 succeeded
Details
ibis-project.ibis (LinuxBenchmark) LinuxBenchmark succeeded
Details
ibis-project.ibis (LinuxBuildConda) LinuxBuildConda succeeded
Details
ibis-project.ibis (LinuxBuildDocs) LinuxBuildDocs succeeded
Details
ibis-project.ibis (LinuxTest py35) LinuxTest py35 succeeded
Details
ibis-project.ibis (LinuxTest py36) LinuxTest py36 succeeded
Details
ibis-project.ibis (LinuxTest py37) LinuxTest py37 succeeded
Details
ibis-project.ibis (WindowsTest py35) WindowsTest py35 succeeded
Details
ibis-project.ibis (WindowsTest py36) WindowsTest py36 succeeded
Details
ibis-project.ibis (WindowsTest py37) WindowsTest py37 succeeded
Details
@jreback

This comment has been minimized.

Copy link
Contributor

jreback commented Nov 1, 2019

thanks @hjoo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.