Performance increase rolling min max #19549

Merged: 22 commits, Feb 14, 2018

@hexgnu hexgnu commented Feb 6, 2018

In my testing the performance of

import pandas as pd
import timeit

df = pd.DataFrame({"a": 0}, index=pd.date_range('2017-01-01', '2019-01-01', freq='1T'))

timeit.timeit(lambda: df.rolling('1d').max(), number=1)

went from 1.8 sec to 0.3 sec on my machine (a Lenovo laptop).

pep8speaks commented Feb 6, 2018

Hello @hexgnu! Thanks for updating the PR.

Line 25:1: E302 expected 2 blank lines, found 1

Comment last updated on February 12, 2018 at 04:38 Hours UTC

@jreback jreback added the Performance (Memory or execution speed performance) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Feb 6, 2018

jreback commented Feb 6, 2018

Please add a whatsnew entry. I think we have asv benchmarks for this, so run and post those as well (the tests need to pass too, of course :>).

approach looks good.

codecov bot commented Feb 7, 2018

Codecov Report

Merging #19549 into master will increase coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #19549      +/-   ##
==========================================
+ Coverage   91.58%   91.59%   +<.01%     
==========================================
  Files         150      150              
  Lines       48807    48807              
==========================================
+ Hits        44702    44704       +2     
+ Misses       4105     4103       -2

Flag         Coverage Δ
#multiple    89.96% <ø> (ø) ⬆️
#single      41.73% <ø> (ø) ⬆️

Impacted Files            Coverage Δ
pandas/util/testing.py    83.85% <0%> (+0.2%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 7a5634e...65c0dbe.

@@ -1,11 +0,0 @@
#ifndef _PANDAS_MATH_H_
Contributor

I wouldn't remove this; otherwise Windows builds will fail.

Contributor

I see you are trying to fix this using the cpp library below. Not sure how this will work out. Windows is a bit funky about its includes.

Contributor Author

Yeah, I think I'm going to revert it back to math.h. Even though it's a tad wonky, it was actually working quite well that way.

Contributor Author

Oh ugh, I just realized I didn't check in src/headers/cmath, which is why this whole thing is failing.

hexgnu commented Feb 8, 2018

OK, I'm pretty sure this will pass this time, since I went through the trouble of setting up Windows and testing it on MSVC 9.0.

@jreback jreback added this to the 0.23.0 milestone Feb 8, 2018

@jreback jreback left a comment

lgtm, just needs a few docs. Ping on green.

@@ -1242,32 +1244,43 @@ cdef _roll_min_max(ndarray[numeric] input, int64_t win, int64_t minp,

output = np.empty(N, dtype=input.dtype)

Contributor

Can you add a description of the algorithm and a link (if available)?
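
For readers following along: one common way to make rolling min/max cheap for large windows is to keep a deque of candidate indices whose values are monotonically decreasing, so each element is pushed and popped at most once. Whether that matches this patch line for line is an assumption, since the full diff is not shown here. A minimal pure-Python sketch, where the name `rolling_max` and the half-open `start`/`end` bounds are illustrative and NaN handling and `minp` are left out:

```python
from collections import deque


def rolling_max(values, start, end):
    """Rolling max over half-open windows [start[i], end[i]).

    Assumes start and end are non-decreasing, as they would be for a
    sorted time index. Empty windows produce NaN.
    """
    n = len(values)
    out = [float("nan")] * n
    candidates = deque()  # indices; their values are strictly decreasing
    pos = 0  # next input index not yet offered to the deque
    for i in range(n):
        # Admit new elements, dropping older candidates they dominate.
        while pos < end[i]:
            while candidates and values[candidates[-1]] <= values[pos]:
                candidates.pop()
            candidates.append(pos)
            pos += 1
        # Expire candidates that fell off the left edge of the window.
        while candidates and candidates[0] < start[i]:
            candidates.popleft()
        if candidates:
            out[i] = values[candidates[0]]
    return out


values = [1.0, 3.0, 2.0, 5.0, 4.0]
end = [1, 2, 3, 4, 5]    # window i ends just after element i
start = [0, 0, 0, 1, 2]  # and holds at most three elements
print(rolling_max(values, start, end))
```

Rolling min is the same loop with the comparison flipped. The total work is linear in the input length regardless of window size, which would be consistent with the largest gains showing up on the widest windows.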

@@ -0,0 +1,13 @@
#ifndef _PANDAS_MATH_H_
#define _PANDAS_MATH_H_

Contributor

Can you add comments here on why we have this file (so the next person doesn't go through the same thing as you :>)?

chris-b1 commented Feb 8, 2018

I'm not sure it matters in practice with the toolchains that actually get used, but historically we haven't used C++ at all inside pandas. Maybe adding it has implications?

#if defined(_MSC_VER) && (_MSC_VER < 1800)
#include <cmath>
namespace std {
__inline int signbit(double num) { return _copysign(1.0, num) < 0; }

Contributor

I would revert this to use math.h. Technically, extending namespace std is undefined behavior, so you'd need to do it a different way.

Contributor Author

I don't think that's the right direction to go; cmath is what you use when you write C++ code. math.h doesn't work when compiling with clang either. This works in all three compilers, even though it's not ideal (MSVC doesn't define this in older versions, which is annoying).
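
As an aside on what the shim under discussion computes: `signbit` is expressed as `copysign(1.0, num) < 0`, which reports the sign bit even for negative zero, where a plain `< 0` comparison does not. The same expression can be tried with Python's `math.copysign`; this only illustrates the idea and is not part of the patch:

```python
import math


def signbit(x: float) -> bool:
    """Return True when the sign bit of x is set, including for -0.0."""
    return math.copysign(1.0, x) < 0


# A plain comparison cannot tell the two zeros apart, but copysign can.
print(-0.0 < 0, signbit(-0.0))
print(signbit(0.0), signbit(-2.5), signbit(float("inf")))
```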

hexgnu commented Feb 9, 2018

Here are the ASVs I ran. I'm noticing an increase in time_pairwise, though, and I don't know if there's a variable window rolling benchmark, so I'll look into that.

       before           after         ratio
     [36f90528]       [42f8fdfd]
+      5.86±0.2ms       7.29±0.6ms     1.24  rolling.Pairwise.time_pairwise(10, 'corr', False)
-     2.30±0.08ms      2.08±0.03ms     0.91  rolling.Methods.time_rolling('Series', 1000, 'float', 'mean')
-        38.7±2ms       33.4±0.6ms     0.86  rolling.Methods.time_rolling('DataFrame', 10, 'float', 'median')
-      3.56±0.3ms      2.88±0.03ms     0.81  rolling.Quantile.time_quantile('Series', 10, 'float', 1)
-      3.46±0.2ms       2.80±0.2ms     0.81  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0)
-      3.79±0.2ms      2.95±0.07ms     0.78  rolling.Methods.time_rolling('Series', 10, 'float', 'min')
-      2.61±0.2ms      1.98±0.03ms     0.76  rolling.Methods.time_rolling('Series', 10, 'int', 'sum')
-        73.3±5ms         55.7±1ms     0.76  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'median')
-      3.96±0.2ms      2.77±0.07ms     0.70  rolling.Quantile.time_quantile('Series', 1000, 'float', 0)
-      4.69±0.6ms       3.07±0.1ms     0.65  rolling.Methods.time_rolling('Series', 10, 'float', 'std')
-      9.06±0.3ms       5.49±0.2ms     0.61  rolling.Methods.time_rolling('Series', 10, 'int', 'count')
-      3.63±0.2ms      2.13±0.08ms     0.59  rolling.Methods.time_rolling('Series', 10, 'float', 'mean')
-        9.37±1ms       5.42±0.4ms     0.58  rolling.Methods.time_rolling('Series', 10, 'float', 'count')
-      3.96±0.4ms      2.27±0.08ms     0.57  rolling.Methods.time_rolling('Series', 10, 'int', 'kurt')
-     3.55±0.08ms      2.00±0.03ms     0.56  rolling.Methods.time_rolling('Series', 10, 'float', 'sum')
-      4.22±0.2ms       2.17±0.3ms     0.51  rolling.Methods.time_rolling('Series', 10, 'int', 'skew')
-      5.75±0.4ms      2.89±0.03ms     0.50  rolling.Methods.time_rolling('Series', 10, 'float', 'max')
-        4.27±1ms      2.11±0.04ms     0.49  rolling.Methods.time_rolling('Series', 1000, 'int', 'min')

jreback commented Feb 9, 2018

We have other C++ code, so this is not a big deal.

hexgnu commented Feb 9, 2018

@chris-b1 also, there is C++ code in the repo; look at msgpack.

hexgnu commented Feb 9, 2018

So in reading this, most of the benchmarks for variable rolling windows have gone down, but some haven't. How accurate is ASV? Or was something else I was doing impacting it? I was doing some work in a Jupyter notebook at the time.

      before           after         ratio
     [36f90528]       [060dfb77]
+      4.78±0.1ms       10.5±0.8ms     2.19  rolling.VariableWindowMethods.time_rolling('Series', '50s', 'float', 'min')
+      3.03±0.2ms       5.99±0.7ms     1.98  rolling.Methods.time_rolling('DataFrame', 1000, 'int', 'min')
+      3.66±0.1ms       7.09±0.6ms     1.94  rolling.Quantile.time_quantile('Series', 10, 'float', 1)
+      2.26±0.1ms       4.34±0.7ms     1.92  rolling.Methods.time_rolling('Series', 10, 'int', 'sum')
+         129±3ms          159±5ms     1.24  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'float', 'median')
+     4.74±0.07ms       5.61±0.3ms     1.18  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'int', 'kurt')
-        395±50ms          340±4ms     0.86  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0.5)
-      5.76±0.1ms       4.90±0.3ms     0.85  rolling.VariableWindowMethods.time_rolling('DataFrame', '1h', 'float', 'std')
-      6.17±0.2ms      5.10±0.06ms     0.83  rolling.VariableWindowMethods.time_rolling('Series', '50s', 'int', 'skew')
-      4.20±0.2ms      3.47±0.07ms     0.83  rolling.VariableWindowMethods.time_rolling('DataFrame', '50s', 'int', 'sum')
-      23.9±0.9ms       19.2±0.8ms     0.80  rolling.Pairwise.time_pairwise(1000, 'cov', True)
-      4.08±0.4ms      3.20±0.05ms     0.78  rolling.Quantile.time_quantile('Series', 1000, 'float', 1)
-        6.22±1ms       4.72±0.1ms     0.76  rolling.VariableWindowMethods.time_rolling('DataFrame', '50s', 'int', 'kurt')
-        4.46±1ms      3.21±0.08ms     0.72  rolling.Methods.time_rolling('Series', 1000, 'float', 'kurt')
-      4.12±0.3ms       2.91±0.2ms     0.71  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 1)
-        223±20ms          152±9ms     0.68  rolling.Quantile.time_quantile('Series', 10, 'float', 0.5)
-      6.47±0.7ms       4.42±0.2ms     0.68  rolling.VariableWindowMethods.time_rolling('DataFrame', '1h', 'float', 'skew')
-        4.90±1ms       3.28±0.2ms     0.67  rolling.Methods.time_rolling('DataFrame', 10, 'int', 'count')
-        11.2±1ms         7.40±1ms     0.66  rolling.Pairwise.time_pairwise(None, 'corr', False)
-        7.08±2ms       4.47±0.2ms     0.63  rolling.VariableWindowMethods.time_rolling('DataFrame', '50s', 'float', 'skew')
-      5.83±0.7ms       3.66±0.3ms     0.63  rolling.VariableWindowMethods.time_rolling('DataFrame', '50s', 'float', 'mean')
-      4.41±0.4ms       2.71±0.3ms     0.61  rolling.Quantile.time_quantile('Series', 10, 'int', 0)
-        5.22±1ms      3.16±0.07ms     0.60  rolling.Methods.time_rolling('Series', 1000, 'float', 'skew')
-        8.02±1ms       4.57±0.2ms     0.57  rolling.Pairwise.time_pairwise(1000, 'cov', False)
-      5.74±0.5ms      3.13±0.07ms     0.55  rolling.Methods.time_rolling('Series', 1000, 'float', 'max')
-      3.73±0.4ms       1.92±0.2ms     0.52  rolling.Methods.time_rolling('DataFrame', 1000, 'float', 'sum')
-      3.76±0.3ms       1.88±0.2ms     0.50  rolling.Quantile.time_quantile('DataFrame', 1000, 'int', 1)
-      4.55±0.5ms      2.25±0.06ms     0.49  rolling.Quantile.time_quantile('Series', 1000, 'int', 1)
-      4.62±0.6ms      2.27±0.01ms     0.49  rolling.Methods.time_rolling('Series', 1000, 'float', 'mean')
-      7.25±0.5ms       3.48±0.2ms     0.48  rolling.VariableWindowMethods.time_rolling('DataFrame', '50s', 'float', 'count')
-      6.37±0.3ms       2.84±0.2ms     0.45  rolling.Quantile.time_quantile('DataFrame', 1000, 'float', 0)
-        325±20ms         138±10ms     0.43  rolling.VariableWindowMethods.time_rolling('Series', '1d', 'float', 'median')
-        821±20ms          330±5ms     0.40  rolling.Quantile.time_quantile('Series', 1000, 'float', 0.5)
-        98.9±5ms         6.45±1ms     0.07  rolling.VariableWindowMethods.time_rolling('Series', '1h', 'int', 'max')
-         102±3ms       6.56±0.4ms     0.06  rolling.VariableWindowMethods.time_rolling('Series', '1h', 'int', 'min')
-         107±5ms       5.99±0.4ms     0.06  rolling.VariableWindowMethods.time_rolling('Series', '1h', 'float', 'max')
-         117±2ms       6.44±0.6ms     0.06  rolling.VariableWindowMethods.time_rolling('Series', '1h', 'float', 'min')
-         104±6ms       5.11±0.2ms     0.05  rolling.VariableWindowMethods.time_rolling('DataFrame', '1h', 'int', 'max')
-         106±2ms       5.09±0.1ms     0.05  rolling.VariableWindowMethods.time_rolling('DataFrame', '1h', 'float', 'max')
-       117±0.6ms       5.04±0.4ms     0.04  rolling.VariableWindowMethods.time_rolling('DataFrame', '1h', 'float', 'min')
-         126±5ms       5.35±0.1ms     0.04  rolling.VariableWindowMethods.time_rolling('DataFrame', '1h', 'int', 'min')
-           2.26s       7.99±0.8ms     0.00  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'int', 'max')
-           2.29s         6.95±1ms     0.00  rolling.VariableWindowMethods.time_rolling('Series', '1d', 'float', 'max')
-           2.90s         8.39±2ms     0.00  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'int', 'min')
-           2.09s       5.58±0.3ms     0.00  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'float', 'min')
-           2.52s         6.63±1ms     0.00  rolling.VariableWindowMethods.time_rolling('Series', '1d', 'int', 'max')
-           2.41s         6.09±1ms     0.00  rolling.VariableWindowMethods.time_rolling('Series', '1d', 'int', 'min')
-           2.45s       5.60±0.2ms     0.00  rolling.VariableWindowMethods.time_rolling('Series', '1d', 'float', 'min')
-           2.61s       5.63±0.1ms     0.00  rolling.VariableWindowMethods.time_rolling('DataFrame', '1d', 'float', 'max')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
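
For reading the table, the ratio column is the "after" time divided by the "before" time, so 0.00 on the '1d' rows means a drop from seconds to milliseconds rather than literally zero. A small sketch that recomputes a few rows; the helper names and the 1.1 noise threshold are assumptions for illustration, not asv's actual implementation:

```python
def ratio(before_ms: float, after_ms: float) -> float:
    """Ratio column: time after the change divided by time before it."""
    return after_ms / before_ms


def classify(before_ms: float, after_ms: float, factor: float = 1.1) -> str:
    """Label a benchmark; `factor` is an assumed noise threshold."""
    r = ratio(before_ms, after_ms)
    if r > factor:
        return "slower"
    if r < 1 / factor:
        return "faster"
    return "unchanged"


# Values copied from rows of the table above, converted to milliseconds.
rows = {
    "Series '1h' int max": (98.9, 6.45),
    "DataFrame '1d' int max": (2260.0, 7.99),
    "DataFrame '1d' float median": (129.0, 159.0),
}
for name, (before, after) in rows.items():
    print(f"{name}: ratio={ratio(before, after):.2f} ({classify(before, after)})")
```

Recomputed ratios can differ from the table in the last digit, because the printed times are themselves rounded.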

jreback commented Feb 10, 2018

Can you add a whatsnew note in the performance section? lgtm otherwise.

I suspect that a very short time window has a small perf regression here, but we win really big for bigger windows. So this is a nice tradeoff (in theory we could use the original method for short windows but that adds a lot to code complexity). You could add this in the code as a comment.

jreback commented Feb 12, 2018

i restarted. ping on green.

@jreback jreback merged commit 39e7b69 into pandas-dev:master Feb 14, 2018
jreback commented Feb 14, 2018

thanks @hexgnu nice patch!

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
changhiskhan added a commit to changhiskhan/pandas that referenced this pull request Jul 11, 2018
…21704)

User reported that `df.rolling(to_offset('3D'), closed='left').max()`
segfaults when df has a datetime index. The bug was in PR pandas-dev#19549. In
that PR, in https://github.com/pandas-dev/pandas/blame/master/pandas/_libs/window.pyx#L1268
`i` is initialized to `endi[0]`, which is 0 when `closed=left`.
So in the next line when it tries to set `output[i-1]` it goes out of bounds.
In addition, there are 2 more bugs in the `roll_min_max` code.
The second bug is that for variable size windows, the `nobs` is never updated
when elements leave the window. The third bug is at the end of the fixed
window where all output elements up to `minp` are initialized to 0 if
the input is not float.

This PR fixes all three of the aforementioned bugs, at the cost of casting the
output array to floating point even if the input is integer. This is less
than ideal if the output has no NaNs, but is still consistent with roll_sum
behavior.
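
The first bug in the message above can be pictured with a simplified model of window bounds. This is a sketch under stated assumptions: `window_bounds` is a hypothetical helper, not the pandas indexer, and it uses plain numbers instead of timestamps. With `closed='left'` the first window is empty, so its end index is 0 and an expression like `output[end[0] - 1]` would point before the start of the array:

```python
from bisect import bisect_left, bisect_right


def window_bounds(times, width, closed="right"):
    """Half-open index ranges [start[i], end[i]) for a sorted `times` list.

    closed="right" covers (t - width, t]; closed="left" covers [t - width, t).
    """
    start, end = [], []
    for t in times:
        if closed == "right":
            start.append(bisect_right(times, t - width))
            end.append(bisect_right(times, t))
        elif closed == "left":
            start.append(bisect_left(times, t - width))
            end.append(bisect_left(times, t))
        else:
            raise ValueError(f"unsupported closed={closed!r}")
    return start, end


times = [0, 1, 2, 3, 4]  # e.g. days since the first observation
for closed in ("right", "left"):
    start, end = window_bounds(times, width=3, closed=closed)
    print(closed, start, end)
```

In Python a negative index silently wraps around, but in compiled Cython code with bounds checking disabled the same access can read or write outside the buffer, which is consistent with the reported segfault.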
changhiskhan added a commit to changhiskhan/pandas that referenced this pull request Jul 12, 2018
changhiskhan added a commit to changhiskhan/pandas that referenced this pull request Jul 13, 2018
…21704)

User reported that `df.rolling(to_offset('3D'), closed='left').max()`
segfaults when df has a datetime index. The bug was in PR pandas-dev#19549. In
that PR, in https://github.com/pandas-dev/pandas/blame/master/pandas/_libs/window.pyx#L1268
`i` is initialized to `endi[0]`, which is 0 when `closed=left`.
So in the next line when it tries to set `output[i-1]` it goes out of bounds.
In addition, there are 2 more bugs in the `roll_min_max` code.
The second bug is that for variable size windows, the `nobs` is never updated
when elements leave the window. The third bug is at the end of the fixed
window where all output elements up to `minp` are initialized to 0 if
the input is not float.

This PR fixes these three bugs, at the cost of casting the output array to
floating point even if the input is integer. This is less than ideal if the
output has no NaNs, but is consistent with roll_sum behavior.
Successfully merging this pull request may close these issues.

PERF: improve perf of variable window rolling_min/max
4 participants