Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Rolling window with step size (GH-15354) #45735

Closed
wants to merge 3 commits into from

Conversation

rtpsw
Copy link
Contributor

@rtpsw rtpsw commented Jan 31, 2022

@rtpsw
Copy link
Contributor Author

rtpsw commented Jan 31, 2022

This is quite a large PR because the introduction of a step argument requires changes in multiple places where rolling window functionality is implemented or exposed, which in turn requires adding a bunch of tests to cover these changes. The general rule is that adding step=step to a (possibly group-by) rolling window is equivalent to adding [::step] to the original (possibly per-group) result. The step-argument-related code is designed to optimize the computation, i.e., to avoid computing an output that would immediately be filtered out by [::step].

While the code for this PR took some time to figure out, almost all of the changes seem reasonable to me in hindsight. Nevertheless, one thing I'm worried about is the interaction of this PR with #42915. Specifically, the test_rolling_corr_cov_other_diff_size_as_groups test case ran into trouble and I could only get it to work with the following changes:

  1. Decapsulating the per-group step-filtering
  2. Adding a pairwise-result adjustment function that is implemented differently between rolling-group-by window and other (expanding- or emw-) windows. I tried multiple ways to encapsulate the step-filtering, but they all ran into issues (e.g., reindexing issues related to prep_binary in flex_binary_moments). I gave up on getting it to work without decapsulation after a few tries and decided to put in this PR for review. Because I was able to get the similar test_rolling_corr_cov_other_same_size_as_groups test case (let alone the many other test cases in this PR) to work out without resorting to decapsulation, I suspect something weird is going on with this interaction, specifically something that has to do with the #42915 rolling-window branching logic, but I couldn't as yet point my finger at what exactly. Please review this part carefully.

@jreback jreback added Enhancement Window rolling, ewma, expanding labels Jan 31, 2022
Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this PR is so large, it would be easier to review in chunks. Might help to divide as:

  1. All internal changes needed to support step without step being exposed to the user
  2. Adding step API, tests, docs

Also some thoughts about part 2)

  • I recommend having a dedicated file to test the steps API (test_step.py)
  • For cov, corr, ewm.<method>, expanding.<method> and all groupby variants, I recommend just raising a NotImplementedError

@rtpsw
Copy link
Contributor Author

rtpsw commented Jan 31, 2022

@mroeschke: I'd like to try to clean up the errors in this PR first. After that I don't mind canceling it and just using it as a reference as needed.

What do you have in mind for test_step.py? As I was working on testing this code, I found that adding the step parameter to existing tests was catching up unexpected errors left and right, so I'd feel confident about adding but not replacing existing tests in this PR.

IIRC, I did not add a step parameter to ewm.method and expanding.method. In any case, I do not have a strong opinion about raising NotImplementedError for them. As for cov and corr, I believe what passes the tests, as modified with the step argument and [::step] pattern, is most likely the behavior users would expect; so, if there is no desire to release them as standard, perhaps only as experimental?

@mroeschke
Copy link
Member

What do you have in mind for test_step.py? As I was working on testing this code, I found that adding the step parameter to existing tests was catching up unexpected errors left and right, so I'd feel confident about adding but not replacing existing tests in this PR.

Ah okay sure then this is a good argument to keep step in the existing tests.

perhaps only as experimental

I slightly prefer having dedicated behavior than experimental per-se i.e. I would rather raise a NotImplementedError, flesh out the intended behavior in an issue, and then add.

But if through your investigation you are confident about the intended behavior with corr/cov, then I'll defer to you.

@rtpsw
Copy link
Contributor Author

rtpsw commented Jan 31, 2022

While I'm reasonably confident about the expected behavior of most rolling window step-operations as discussed in #15354, I'd suggest posting there about the [::step] pattern and see if anyone objects.

As for corr and cov, I'm not sure what behavior users would expect with a step-parameter even after reading through #42915. Here I'd vote for NotImplementedError and open a discussion about this.

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 1, 2022

AFAICS, the build failure is benign and this PR is cleaned up of errors. I'll work on starting a separate PR that advances in manageable chunks.

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 1, 2022

I'm closing this PR and will open a replacement one soon.

@rtpsw
Copy link
Contributor Author

rtpsw commented Feb 1, 2022

Replaced by #45765.

@rtpsw rtpsw deleted the GH-15354 branch March 1, 2022 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Window rolling, ewma, expanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rolling window with step size
3 participants