Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: O(n) speedup in any/all by re-enabling short-circuiting for bool case #25070

Merged
merged 1 commit into from Apr 30, 2019

Conversation

@qwhelan
Copy link
Contributor

commented Feb 1, 2019

At present, the .all() and .any() functions scale as O(n), more or less regardless of their exit condition being fulfilled:

$ asv run -b Any -b All upstream/master
[  0.01%] ··· series_methods.All.time_all                                                                                                                                                                 ok
[  0.01%] ··· ========= ============ ============
              --                   case          
              --------- -------------------------
                  N         fast         slow    
              ========= ============ ============
                 1000     115±3μs      123±5μs   
               1000000   11.1±0.2ms   13.6±0.1ms 
              ========= ============ ============

[  0.01%] ··· series_methods.Any.time_any                                                                                                                                                                 ok
[  0.01%] ··· ========= ============ ============
              --                   case          
              --------- -------------------------
                  N         fast         slow    
              ========= ============ ============
                 1000     118±2μs      117±3μs   
               1000000   14.0±0.2ms   11.6±0.4ms 
              ========= ============ ============

The root cause of this was identified by asv find to be #8550, which was released as part of v0.15.2. Specifically, we call isna() on the entire input prior to dispatching to np.all/np.any; the latter does short circuit but is insignificant compared to isna().

My proposal here is to exempt np.bool from the isna() call on the basis that it is not capable of storing NaN values; I'm hesitant to make this case much broader than necessary to accommodate calls like pd.isnull(s).all(). The hasattr() check is necessary due to a Panel test case directly invoking nanall(Panel).

The good news is that this approach is very effective:

$ asv run -b Any -b All HEAD
[  0.01%] ··· series_methods.All.time_all                                                                                                                                                                 ok
[  0.01%] ··· ========= ============ ============
              --                   case          
              --------- -------------------------
                  N         fast         slow    
              ========= ============ ============
                 1000     59.8±1μs    60.9±0.7μs 
               1000000   73.7±0.4μs    138±4μs   
              ========= ============ ============

[  0.01%] ··· series_methods.Any.time_any                                                                                                                                                                 ok
[  0.01%] ··· ========= ========== ============
              --                  case         
              --------- -----------------------
                  N        fast        slow    
              ========= ========== ============
                 1000    59.7±1μs   59.7±0.8μs 
               1000000   75.7±1μs    133±3μs   
              ========= ========== ============

Additionally, this call is not too widely used within the pandas codebase as most invoke the numpy equivalent. That being said, we do see some broad speedups when running the entire suite:

$ asv compare ea013a25 37c2e7a2 -s --sort ratio --only-changed
       before           after         ratio
     [ea013a25]       [37c2e7a2]
     <master~1>       <master>  
-      10.4±0.2ms       9.41±0.1ms     0.90  timeseries.AsOf.time_asof('DataFrame')
-     1.54±0.03ms      1.39±0.01ms     0.90  groupby.GroupByMethods.time_dtype_as_group('float', 'rank', 'transformation')
-      38.3±0.8ms       34.3±0.6ms     0.90  stat_ops.FrameOps.time_op('median', 'float', 1, True)
-     1.18±0.03ms      1.05±0.01ms     0.89  groupby.GroupByMethods.time_dtype_as_group('int', 'cummax', 'direct')
-      23.2±0.3ms       20.8±0.2ms     0.89  groupby.MultiColumn.time_cython_sum
-     1.19±0.05ms      1.06±0.01ms     0.89  groupby.GroupByMethods.time_dtype_as_group('datetime', 'cummin', 'direct')
-         128±1ms          114±2ms     0.89  indexing.CategoricalIndexIndexing.time_get_indexer_list('non_monotonic')
-     4.12±0.08ms      3.64±0.08ms     0.88  indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.Int64Index'>, 'unique_monotonic_inc')
-     1.23±0.02ms      1.09±0.01ms     0.88  groupby.GroupByMethods.time_dtype_as_group('float', 'cummax', 'transformation')
-     1.20±0.04ms      1.06±0.02ms     0.88  groupby.GroupByMethods.time_dtype_as_group('float', 'cummin', 'direct')
-     1.20±0.02ms      1.06±0.01ms     0.88  groupby.GroupByMethods.time_dtype_as_field('int', 'cumsum', 'direct')
-     10.5±0.08ms       9.12±0.3ms     0.87  timeseries.AsOf.time_asof_nan('DataFrame')
-      6.30±0.3ms       5.46±0.2ms     0.87  indexing.NumericSeriesIndexing.time_loc_array(<class 'pandas.core.indexes.numeric.Int64Index'>, 'unique_monotonic_inc')
-     1.55±0.04ms      1.34±0.02ms     0.86  groupby.GroupByMethods.time_dtype_as_group('datetime', 'rank', 'transformation')
-         130±1ms        112±0.8ms     0.86  indexing.CategoricalIndexIndexing.time_get_indexer_list('monotonic_decr')
-      28.1±0.7ms       22.9±0.3ms     0.82  reshape.Cut.time_cut_float(10)
-      26.4±0.4ms         21.0±1ms     0.80  reshape.Cut.time_cut_float(4)
-        12.7±2ms       9.06±0.2ms     0.71  inference.DateInferOps.time_subtract_datetimes
-     4.33±0.05ms      3.03±0.04ms     0.70  timeseries.AsOf.time_asof_single('DataFrame')
-      1.04±0.2ms          681±7μs     0.66  groupby.GroupByMethods.time_dtype_as_group('int', 'median', 'transformation')
-     4.51±0.05ms      2.69±0.05ms     0.60  timeseries.AsOf.time_asof_nan_single('DataFrame')
-         112±2μs       59.5±0.8μs     0.53  series_methods.All.time_all(1000, 'fast')
-         115±2μs         60.7±1μs     0.53  series_methods.Any.time_any(1000, 'fast')
-         117±7μs         61.3±2μs     0.53  series_methods.All.time_all(1000, 'slow')
-         115±1μs         59.8±1μs     0.52  series_methods.Any.time_any(1000, 'slow')
-      2.39±0.6ms      1.20±0.01ms     0.50  stat_ops.SeriesOps.time_op('kurt', 'int', False)
-      2.45±0.6ms      1.21±0.01ms     0.49  stat_ops.SeriesOps.time_op('skew', 'int', False)
-      2.43±0.6ms      1.20±0.01ms     0.49  stat_ops.SeriesOps.time_op('kurt', 'int', True)
-      11.1±0.2ms        132±0.5μs     0.01  series_methods.Any.time_any(1000000, 'slow')
-      13.5±0.2ms          133±2μs     0.01  series_methods.All.time_all(1000000, 'slow')
-      10.9±0.3ms         74.9±1μs     0.01  series_methods.All.time_all(1000000, 'fast')
-      13.4±0.3ms         75.3±2μs     0.01  series_methods.Any.time_any(1000000, 'fast')

For the largest n tested, we see a speedup of ~170x in the fast case.

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
@pep8speaks

This comment has been minimized.

Copy link

commented Feb 1, 2019

Hello @qwhelan! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-04-29 05:29:16 UTC
@codecov

This comment has been minimized.

Copy link

commented Feb 1, 2019

Codecov Report

Merging #25070 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25070      +/-   ##
==========================================
- Coverage   92.37%   92.36%   -0.01%     
==========================================
  Files         166      166              
  Lines       52403    52409       +6     
==========================================
+ Hits        48405    48410       +5     
- Misses       3998     3999       +1
Flag Coverage Δ
#multiple 90.79% <100%> (ø) ⬆️
#single 42.88% <100%> (+0.01%) ⬆️
Impacted Files Coverage Δ
pandas/core/nanops.py 93.9% <100%> (+0.06%) ⬆️
pandas/util/testing.py 88.04% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ea013a2...3acc6b6. Read the comment docs.

@codecov

This comment has been minimized.

Copy link

commented Feb 1, 2019

Codecov Report

Merging #25070 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25070      +/-   ##
==========================================
- Coverage   91.97%   91.96%   -0.01%     
==========================================
  Files         175      175              
  Lines       52368    52378      +10     
==========================================
+ Hits        48164    48171       +7     
- Misses       4204     4207       +3
Flag Coverage Δ
#multiple 90.52% <100%> (ø) ⬆️
#single 40.69% <40.62%> (-0.15%) ⬇️
Impacted Files Coverage Δ
pandas/core/nanops.py 94.11% <100%> (+0.28%) ⬆️
pandas/io/gbq.py 78.94% <0%> (-10.53%) ⬇️
pandas/core/frame.py 96.9% <0%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9feb3ad...0387154. Read the comment docs.

@qwhelan qwhelan force-pushed the qwhelan:any_all_fix branch from 3acc6b6 to 362e702 Feb 1, 2019

@@ -201,11 +201,13 @@ def _get_fill_value(dtype, fill_value=None, fill_value_typ=None):


def _get_values(values, skipna, fill_value=None, fill_value_typ=None,
isfinite=False, copy=True, mask=None):
isfinite=False, copy=True, mask=None, compute_mask=True):

This comment has been minimized.

Copy link
@jreback

jreback Feb 1, 2019

Contributor

so rather than adding another argument, we already have the mask, can you simply have this accept mask=True in this case?

This comment has been minimized.

Copy link
@qwhelan

qwhelan Mar 22, 2019

Author Contributor

I ended up taking this approach, but instead making mask a positional argument and taking mask = None to be signifying that a mask has been determined to be unnecessary. If another sigil value would be preferred, I think that would be easy to do.

@@ -362,8 +364,12 @@ def nanany(values, axis=None, skipna=True, mask=None):
>>> nanops.nanany(s)
False
"""
values, mask, dtype, _, _ = _get_values(values, skipna, False, copy=skipna,
mask=mask)
if (hasattr(values, 'dtype') and is_bool_dtype(values.dtype) and

This comment has been minimized.

Copy link
@jreback

jreback Feb 1, 2019

Contributor

can you do this logic inside _get_values (maybe update the return signature to return skipna as well)

This comment has been minimized.

Copy link
@qwhelan

qwhelan Feb 1, 2019

Author Contributor

I'll investigate - this might require modifying some of the other utility functions to play nice.

This comment has been minimized.

Copy link
@qwhelan

qwhelan Mar 22, 2019

Author Contributor

This ended up being tricky to do inside _get_values() as mask is sometimes used to compute array dimensions within certain nanops. My solution was to move the logic outside of _get_values, make mask a positional argument, and remove it from the result tuple. This allows for functions that need it for dimension calculations to always compute it while others can do so optionally.

@jreback jreback added the Performance label Feb 1, 2019

@jbrockmendel

This comment has been minimized.

Copy link
Member

commented Feb 1, 2019

Does the implementation become any simpler after Panel is removed?

@jreback

This comment has been minimized.

Copy link
Contributor

commented Mar 20, 2019

can you merge master

@qwhelan qwhelan force-pushed the qwhelan:any_all_fix branch 5 times, most recently from 913dd30 to 94fbe52 Mar 21, 2019

@qwhelan

This comment has been minimized.

Copy link
Contributor Author

commented Mar 22, 2019

Apologies for the delay - interviewing processes have gotten in the way here.

@jbrockmendel Not especially.

@jreback Rebased on master. I've kept my additions in response to your comments as a separate commit for now - will squash if it seems they're going in the right general direction.

My PR to fix the underlying performance bug is here numpy/numpy#12988 and pending merge; hopefully it will make it into numpy 1.17 and to a large extent obviate the need for this change. It may make sense to expand this PR to include all integer dtypes, as they also cannot store NaN.

Additionally, np.any() and np.all() attempt to short-circuit but actually exhibit O(n) behavior due to processing of reduction ufuncs in NPY_BUFSIZE (ie, 8192 byte) chunks. I'm investigating ways to remove or increase the buffer size there, which should make pd.all() and pd.any() considerably faster. It's worth noting that there's no attempt to short-circuit for any non-bool dtypes.

isfinite=False, copy=True, mask=None):
def _maybe_get_mask(values, skipna, mask):
""" This function will return a mask iff it is necessary. Otherwise, return
None when a mask is not needed.

This comment has been minimized.

Copy link
@jreback

jreback Mar 22, 2019

Contributor

can you add parameters / returns in doc-string

""" This function will return a mask iff it is necessary. Otherwise, return
None when a mask is not needed.
"""
if (hasattr(values, 'dtype') and is_bool_dtype(values.dtype)):

This comment has been minimized.

Copy link
@jreback

jreback Mar 22, 2019

Contributor

is this first test needed?

This comment has been minimized.

Copy link
@qwhelan

qwhelan Mar 24, 2019

Author Contributor

The first test is needed due to a nanall(Panel) test case

This comment has been minimized.

Copy link
@jreback

jreback Mar 24, 2019

Contributor

Panel tests should be gone now



def _get_mask(values, mask, skipna=True):
if mask is None and skipna:

This comment has been minimized.

Copy link
@jreback

jreback Mar 22, 2019

Contributor

doc-string



def _get_values(values, skipna, mask, fill_value=None, fill_value_typ=None,
copy=True):
""" utility to get the values view, mask, dtype

This comment has been minimized.

Copy link
@jreback

jreback Mar 22, 2019

Contributor

can you update the doc-string here

# Boolean data cannot contain nulls, so signal via mask being None
return None

mask = _get_mask(values, mask, skipna=skipna)

This comment has been minimized.

Copy link
@jreback

jreback Mar 22, 2019

Contributor

why can't you do this inside _get_mask itself? (the test)

This comment has been minimized.

Copy link
@qwhelan

qwhelan Mar 24, 2019

Author Contributor

The _get_counts() function relies on mask providing basic shape information even when skipna=False. Unfortunately, creating an empty mask (eg, mask = np.zeros(values.shape)) to propagate this shape info for nanmean/etc negates the speed benefit to nanall()/nanany(), so two separate approaches are necessary.

This comment has been minimized.

Copy link
@jreback

jreback Mar 24, 2019

Contributor

can we pass mask=None in those cases? I find this having to call 2 functions much more overhead than before.

""" This function will return a mask iff it is necessary. Otherwise, return
None when a mask is not needed.
"""
if (hasattr(values, 'dtype') and is_bool_dtype(values.dtype)):

This comment has been minimized.

Copy link
@jreback

jreback Mar 24, 2019

Contributor

Panel tests should be gone now

# Boolean data cannot contain nulls, so signal via mask being None
return None

mask = _get_mask(values, mask, skipna=skipna)

This comment has been minimized.

Copy link
@jreback

jreback Mar 24, 2019

Contributor

can we pass mask=None in those cases? I find this having to call 2 functions much more overhead than before.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Apr 5, 2019

can you merge master and update to comments

@jreback
Copy link
Contributor

left a comment

this is way more complicated and seemingly error prone than the original, can you see if you can simplify

def setup(self, N, dtype):
self.s = Series([1] * N, dtype=dtype)

def time_var(self, N, dtype):

This comment has been minimized.

Copy link
@jreback

jreback Apr 16, 2019

Contributor

could make the function another param

This comment has been minimized.

Copy link
@qwhelan

qwhelan Apr 16, 2019

Author Contributor

I'll probably remove these prior to merge as they duplicate existing benchmarks.

This comment has been minimized.

Copy link
@jreback

jreback Apr 28, 2019

Contributor

can you update this

@qwhelan qwhelan force-pushed the qwhelan:any_all_fix branch 7 times, most recently from 3573c21 to d96b086 Apr 16, 2019

@qwhelan

This comment has been minimized.

Copy link
Contributor Author

commented Apr 17, 2019

I'm going to see if I can simplify further - I was able to remove the "copy=" parameter from _get_values().

That being said, allowing mask=None requires some more involved modifications that also allow for some decent performance gains. The most recent commit extends the speedups across all nanops and to all integer dtypes. In particular, we're now getting a 7-8x speedup in nansum() and nanmean() for int64 data. The nanprod(), nanvar(), and nanstd() functions see a ~3x speedup:

$ asv compare upstream/master HEAD -s --sort ratio --only-changed
       before           after         ratio
     [68b1da7f]       [d96b086c]
     <any_all_fix~3>       <any_all_fix>
-     2.06±0.01ms         1.85±0ms     0.89  indexing.NumericSeriesIndexing.time_loc_list_like(<class 'pandas.core.indexes.numeric.Float64Index'>, 'nonunique_monotonic_inc')
-      52.3±0.5ms       46.5±0.7ms     0.89  reshape.Cut.time_cut_int(1000)
-         162±3ms        141±0.8ms     0.87  groupby.GroupByMethods.time_dtype_as_field('int', 'skew', 'transformation')
-         238±2ms          208±2ms     0.87  groupby.GroupByMethods.time_dtype_as_group('int', 'skew', 'direct')
-         240±3ms          208±2ms     0.87  groupby.GroupByMethods.time_dtype_as_group('int', 'skew', 'transformation')
-         374±2ms          324±3ms     0.87  groupby.GroupByMethods.time_dtype_as_group('float', 'skew', 'transformation')
-         373±3ms          321±2ms     0.86  groupby.GroupByMethods.time_dtype_as_group('float', 'skew', 'direct')
-      24.6±0.4ms       21.1±0.2ms     0.86  groupby.Nth.time_groupby_nth_all('float64')
-     4.67±0.07ms      3.98±0.08ms     0.85  reshape.SimpleReshape.time_stack
-      25.2±0.4ms       21.4±0.4ms     0.85  groupby.Nth.time_groupby_nth_all('datetime')
-        971±10μs         826±10μs     0.85  stat_ops.SeriesOps.time_op('median', 'int', False)
-         511±7μs         433±10μs     0.85  groupby.GroupByMethods.time_dtype_as_field('object', 'any', 'transformation')
-         166±4ms          140±2ms     0.85  groupby.GroupByMethods.time_dtype_as_field('int', 'skew', 'direct')
-        983±10μs         821±30μs     0.84  stat_ops.SeriesOps.time_op('median', 'int', True)
-     6.33±0.02ms      5.27±0.04ms     0.83  timeseries.AsOf.time_asof('DataFrame')
-     16.9±0.07ms       14.1±0.1ms     0.83  indexing.NumericSeriesIndexing.time_getitem_scalar(<class 'pandas.core.indexes.numeric.UInt64Index'>, 'nonunique_monotonic_inc')
-         115±7μs         91.2±6μs     0.80  series_methods.NanOps.time_std(1000, 'int64')
-      4.23±0.1ms       3.36±0.1ms     0.79  indexing.NumericSeriesIndexing.time_ix_array(<class 'pandas.core.indexes.numeric.Int64Index'>, 'unique_monotonic_inc')
-     6.42±0.06ms      5.03±0.03ms     0.78  timeseries.AsOf.time_asof_nan('DataFrame')
-        218±10μs          170±4μs     0.78  series_methods.NanOps.time_sem(1000, 'int64')
-      18.5±0.4ms       14.0±0.3ms     0.76  algorithms.Hashing.time_frame
-        70.4±2μs         53.2±4μs     0.76  series_methods.NanOps.time_sum(1000, 'int64')
-       39.3±10ms       29.7±0.5ms     0.75  series_methods.NanOps.time_sem(1000000, 'int64')
-     2.79±0.05ms      2.05±0.05ms     0.73  timeseries.AsOf.time_asof_single('DataFrame')
-      26.1±0.2ms         19.1±2ms     0.73  stat_ops.FrameOps.time_op('sem', 'int', 1, False)
-      26.2±0.2ms         19.2±2ms     0.73  stat_ops.FrameOps.time_op('sem', 'int', 1, True)
-        21.7±2ms         15.6±2ms     0.72  stat_ops.FrameOps.time_op('sem', 'int', 0, False)
-        21.8±2ms         15.6±2ms     0.72  stat_ops.FrameOps.time_op('sem', 'int', 0, True)
-         258±4μs          184±4μs     0.71  indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.Int16Engine'>, <class 'numpy.int16'>), 'monotonic_incr')
-         248±3μs          176±2μs     0.71  indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.Int8Engine'>, <class 'numpy.int8'>), 'monotonic_incr')
-        91.8±3μs         64.7±6μs     0.71  series_methods.NanOps.time_var(1000, 'int64')
-         129±3ms       90.5±0.4ms     0.70  frame_methods.Describe.time_dataframe_describe
-         312±3μs          217±1μs     0.69  indexing_engines.NumericEngineIndexing.time_get_loc((<class 'pandas._libs.index.UInt32Engine'>, <class 'numpy.uint32'>), 'monotonic_incr')
-        78.1±4μs         53.8±3μs     0.69  series_methods.NanOps.time_mean(1000, 'int64')
-      41.0±0.2ms       28.2±0.2ms     0.69  frame_methods.Describe.time_series_describe
-        89.0±6μs         61.0±4μs     0.69  series_methods.NanOps.time_skew(1000, 'int64')
-        92.7±4μs         62.0±2μs     0.67  series_methods.NanOps.time_kurt(1000, 'int64')
-        35.9±4ms         23.6±4ms     0.66  series_methods.NanOps.time_skew(1000000, 'int64')
-     2.92±0.02ms      1.91±0.03ms     0.66  timeseries.AsOf.time_asof_nan_single('DataFrame')
-      27.3±0.8ms       17.8±0.3ms     0.65  stat_ops.FrameOps.time_op('skew', 'int', 1, True)
-        59.2±2μs         38.6±3μs     0.65  series_methods.NanOps.time_argmax(1000, 'int64')
-     1.55±0.01ms      1.01±0.06ms     0.65  stat_ops.SeriesOps.time_op('sem', 'int', False)
-      9.85±0.3ms       6.36±0.1ms     0.65  groupby.TransformBools.time_transform_mean
-        36.1±4ms         23.2±4ms     0.64  series_methods.NanOps.time_kurt(1000000, 'int64')
-        51.2±1μs         32.7±3μs     0.64  series_methods.NanOps.time_prod(1000, 'int64')
-     1.58±0.04ms         999±60μs     0.63  stat_ops.SeriesOps.time_op('sem', 'int', True)
-         208±1ms        128±0.6ms     0.62  frame_methods.Dropna.time_dropna('all', 1)
-         207±1ms        125±0.6ms     0.60  frame_methods.Dropna.time_dropna('all', 0)
-        6.96±4ms      4.18±0.02ms     0.60  algorithms.Duplicated.time_duplicated(False, 'int')
-      24.9±0.1ms       14.9±0.2ms     0.60  stat_ops.FrameOps.time_op('kurt', 'int', 1, False)
-         253±2ms          151±4ms     0.60  frame_methods.Dropna.time_dropna_axis_mixed_dtypes('all', 1)
-        53.3±2μs         31.6±3μs     0.59  series_methods.All.time_all(1000, 'slow')
-      25.2±0.9ms       14.9±0.2ms     0.59  stat_ops.FrameOps.time_op('kurt', 'int', 1, True)
-        62.9±3μs         36.4±3μs     0.58  series_methods.NanOps.time_max(1000, 'int64')
-     5.05±0.04ms      2.93±0.01ms     0.58  stat_ops.FrameOps.time_op('median', 'int', 0, False)
-     5.07±0.03ms      2.93±0.03ms     0.58  stat_ops.FrameOps.time_op('median', 'int', 0, True)
-      16.2±0.1ms         9.29±2ms     0.57  stat_ops.FrameOps.time_op('skew', 'int', 0, False)
-      16.1±0.6ms         9.22±1ms     0.57  stat_ops.FrameOps.time_op('kurt', 'int', 0, True)
-      16.1±0.2ms         9.21±2ms     0.57  stat_ops.FrameOps.time_op('kurt', 'int', 0, False)
-        16.3±1ms         9.28±2ms     0.57  stat_ops.FrameOps.time_op('skew', 'int', 0, True)
-        63.9±4μs         35.4±1μs     0.55  series_methods.NanOps.time_min(1000, 'int64')
-        54.6±3μs         30.0±2μs     0.55  series_methods.All.time_all(1000, 'fast')
-        54.7±3μs         28.7±2μs     0.53  series_methods.Any.time_any(1000, 'fast')
-         264±1ms        136±0.3ms     0.51  frame_methods.Dropna.time_dropna_axis_mixed_dtypes('all', 0)
-        56.8±4μs         28.3±1μs     0.50  series_methods.Any.time_any(1000, 'slow')
-        852±20μs         388±10μs     0.46  stat_ops.SeriesOps.time_op('skew', 'int', False)
-        852±10μs          376±6μs     0.44  stat_ops.SeriesOps.time_op('skew', 'int', True)
-         852±9μs          376±6μs     0.44  stat_ops.SeriesOps.time_op('kurt', 'int', True)
-         292±3μs        123±0.9μs     0.42  stat_ops.SeriesOps.time_op('prod', 'int', True)
-         288±5μs          120±3μs     0.42  stat_ops.SeriesOps.time_op('prod', 'int', False)
-     1.23±0.01ms          490±4μs     0.40  stat_ops.FrameOps.time_op('prod', 'int', 0, True)
-     1.24±0.02ms          490±5μs     0.40  stat_ops.FrameOps.time_op('prod', 'int', 0, False)
-       931±100μs          369±6μs     0.40  stat_ops.SeriesOps.time_op('kurt', 'int', False)
-       158±0.6ms       62.5±0.4ms     0.40  frame_methods.Dropna.time_dropna_axis_mixed_dtypes('any', 1)
-         322±7μs          123±4μs     0.38  stat_ops.SeriesOps.time_op('mean', 'int', True)
-        708±10μs          261±8μs     0.37  stat_ops.SeriesOps.time_op('std', 'int', False)
-         328±3μs          120±2μs     0.37  stat_ops.SeriesOps.time_op('mean', 'int', False)
-         681±9μs         249±10μs     0.37  stat_ops.SeriesOps.time_op('var', 'int', False)
-      3.03±0.9ms      1.10±0.01ms     0.36  series_methods.NanOps.time_argmax(1000000, 'int64')
-        701±10μs          251±4μs     0.36  stat_ops.SeriesOps.time_op('std', 'int', True)
-     2.70±0.01ms          966±9μs     0.36  series_methods.NanOps.time_prod(1000000, 'int64')
-         690±6μs          243±3μs     0.35  stat_ops.SeriesOps.time_op('var', 'int', True)
-         305±7μs          103±3μs     0.34  stat_ops.SeriesOps.time_op('sum', 'int', False)
-        311±10μs         99.1±3μs     0.32  stat_ops.SeriesOps.time_op('sum', 'int', True)
-         121±1ms         38.4±1ms     0.32  frame_methods.Dropna.time_dropna('any', 0)
-       114±0.3ms       35.5±0.5ms     0.31  frame_methods.Dropna.time_dropna('any', 1)
-      8.80±0.2ms      2.64±0.07ms     0.30  stat_ops.FrameOps.time_op('std', 'int', 1, True)
-        17.5±4ms         5.20±4ms     0.30  series_methods.NanOps.time_std(1000000, 'int64')
-      8.83±0.2ms       2.61±0.2ms     0.30  stat_ops.FrameOps.time_op('std', 'int', 1, False)
-        8.61±1ms       2.41±0.3ms     0.28  stat_ops.FrameOps.time_op('var', 'int', 1, False)
-     1.37±0.01ms         382±10μs     0.28  stat_ops.FrameOps.time_op('prod', 'int', 1, True)
-     8.59±0.06ms      2.38±0.08ms     0.28  stat_ops.FrameOps.time_op('var', 'int', 1, True)
-     1.40±0.03ms         385±10μs     0.27  stat_ops.FrameOps.time_op('prod', 'int', 1, False)
-        17.8±4ms         4.75±4ms     0.27  series_methods.NanOps.time_var(1000000, 'int64')
-        3.70±1ms         963±10μs     0.26  series_methods.NanOps.time_mean(1000000, 'int64')
-      7.49±0.1ms      1.92±0.07ms     0.26  stat_ops.FrameOps.time_op('var', 'int', 0, True)
-      7.48±0.1ms      1.89±0.04ms     0.25  stat_ops.FrameOps.time_op('std', 'int', 0, False)
-      7.42±0.1ms      1.86±0.06ms     0.25  stat_ops.FrameOps.time_op('var', 'int', 0, False)
-      7.47±0.1ms      1.85±0.09ms     0.25  stat_ops.FrameOps.time_op('std', 'int', 0, True)
-         169±1ms       41.0±0.4ms     0.24  frame_methods.Dropna.time_dropna_axis_mixed_dtypes('any', 0)
-        3.39±1ms         723±10μs     0.21  series_methods.NanOps.time_min(1000000, 'int64')
-        3.67±1ms         722±10μs     0.20  series_methods.NanOps.time_max(1000000, 'int64')
-        3.40±1ms          649±9μs     0.19  series_methods.NanOps.time_sum(1000000, 'int64')
-     3.93±0.04ms          646±9μs     0.16  stat_ops.FrameOps.time_op('mean', 'int', 1, True)
-     3.91±0.03ms         641±10μs     0.16  stat_ops.FrameOps.time_op('mean', 'int', 1, False)
-     3.28±0.03ms         493±10μs     0.15  stat_ops.FrameOps.time_op('mean', 'int', 0, False)
-     3.27±0.03ms          488±8μs     0.15  stat_ops.FrameOps.time_op('mean', 'int', 0, True)
-     2.69±0.04ms         381±10μs     0.14  stat_ops.FrameOps.time_op('sum', 'int', 0, False)
-     2.70±0.02ms          375±5μs     0.14  stat_ops.FrameOps.time_op('sum', 'int', 0, True)
-     2.95±0.01ms         400±10μs     0.14  stat_ops.FrameOps.time_op('sum', 'int', 1, True)
-     2.94±0.02ms          383±8μs     0.13  stat_ops.FrameOps.time_op('sum', 'int', 1, False)
-      6.29±0.1ms       70.3±0.8μs     0.01  series_methods.Any.time_any(1000000, 'slow')
-      6.48±0.2ms         68.0±4μs     0.01  series_methods.All.time_all(1000000, 'slow')
-      6.50±0.2ms       37.2±0.9μs     0.01  series_methods.Any.time_any(1000000, 'fast')
-     6.37±0.07ms         35.5±1μs     0.01  series_methods.All.time_all(1000000, 'fast')
----------
values : ndarray
skipna : bool
mask : Optional[ndarray[bool]]

This comment has been minimized.

Copy link
@jreback

jreback Apr 17, 2019

Contributor

so rather than typing here, can you type the args themselves

This comment has been minimized.

Copy link
@qwhelan

qwhelan Apr 21, 2019

Author Contributor

Done, and also typed the return value. It appears the convention is to duplicate types in the return signature and docstring block; let me know if you'd prefer to keep only one.

This comment has been minimized.

Copy link
@jreback

jreback Apr 21, 2019

Contributor

great. cc @WillAyd @jorisvandenbossche need to think about this
but prob ok for now

@@ -200,12 +200,84 @@ def _get_fill_value(dtype, fill_value=None, fill_value_typ=None):
return tslibs.iNaT


def _maybe_get_mask(values, skipna, mask, invert=False):

This comment has been minimized.

Copy link
@jreback

jreback Apr 17, 2019

Contributor

do we really need invert? this is just confusing

This comment has been minimized.

Copy link
@jreback

jreback Apr 20, 2019

Contributor

pls type annotate the parameters

This comment has been minimized.

Copy link
@qwhelan

qwhelan Apr 21, 2019

Author Contributor

invert was necessary in nanmedian(), but the median calculation is so slow there's minimal impact, so reverting that bit back and removing invert

skipna : bool
fill_value : Any, optional
value to fill NaNs with
fill_value_typ : str, optional

This comment has been minimized.

Copy link
@jreback

jreback Apr 17, 2019

Contributor

same, can you type above

This comment has been minimized.

Copy link
@qwhelan

qwhelan Apr 21, 2019

Author Contributor

Done

if skipna and not is_any_int_dtype(values):
mask = _maybe_get_mask(values, skipna, mask)

if skipna and not is_any_int_dtype(values) and mask is not None:

This comment has been minimized.

Copy link
@jreback

jreback Apr 17, 2019

Contributor

do you need the int check here? (as the mask will by definition be None if we have ints)

This comment has been minimized.

Copy link
@qwhelan

qwhelan Apr 21, 2019

Author Contributor

No. The odd case of Series(dtype='bool').prod() will be slightly faster as a result of it being removed.

if mask is not None:
null_mask = mask.size - mask.sum()
else:
null_mask = functools.reduce(operator.mul, shape)

This comment has been minimized.

Copy link
@jreback

jreback Apr 17, 2019

Contributor

is this line covered in tests? isn't this just np.prod(shape)

This comment has been minimized.

Copy link
@qwhelan

qwhelan Apr 21, 2019

Author Contributor

Yeah, this should be getting hit by Series(dtype=int).sum() and the like. I'll confirm once codecov finishes.

@@ -200,12 +200,84 @@ def _get_fill_value(dtype, fill_value=None, fill_value_typ=None):
return tslibs.iNaT


def _maybe_get_mask(values, skipna, mask, invert=False):

This comment has been minimized.

Copy link
@jreback

jreback Apr 20, 2019

Contributor

pls type annotate the parameters

@qwhelan qwhelan force-pushed the qwhelan:any_all_fix branch 6 times, most recently from 4f9c8eb to 6ac741b Apr 21, 2019

@qwhelan

This comment has been minimized.

Copy link
Contributor Author

commented Apr 22, 2019

Updated to reflect comments - the single failing test appears to be unrelated.

@jreback
Copy link
Contributor

left a comment

can you add a whatsnew note in performance improvements
can you add some signature typing where indicated (as much as you can)
merge master; ping on green.

def setup(self, N, dtype):
self.s = Series([1] * N, dtype=dtype)

def time_var(self, N, dtype):

This comment has been minimized.

Copy link
@jreback

jreback Apr 28, 2019

Contributor

can you update this

Show resolved Hide resolved pandas/core/nanops.py Outdated
Show resolved Hide resolved pandas/core/nanops.py Outdated
Show resolved Hide resolved pandas/core/nanops.py Outdated

@jreback jreback added this to the 0.25.0 milestone Apr 28, 2019

@qwhelan qwhelan force-pushed the qwhelan:any_all_fix branch from 6ac741b to d4290a4 Apr 29, 2019

@@ -569,7 +664,7 @@ def _get_counts_nanvar(mask, axis, ddof, dtype=float):
count = np.nan
d = np.nan
else:
mask2 = count <= ddof
mask2 = count <= ddof # type: np.ndarray

This comment has been minimized.

Copy link
@qwhelan

qwhelan Apr 29, 2019

Author Contributor

mypy analysis needed an extra hint for the following line. It deduced that mask2: Union[bool, np.ndarray] and throws an error as bool does not have an any() function.

@qwhelan qwhelan force-pushed the qwhelan:any_all_fix branch from d4290a4 to 0387154 Apr 29, 2019

@qwhelan

This comment has been minimized.

Copy link
Contributor Author

commented Apr 29, 2019

ping

@jreback jreback merged commit b6324be into pandas-dev:master Apr 30, 2019

11 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
pandas-dev.pandas Build #20190429.5 succeeded
Details
pandas-dev.pandas (Checks_and_doc) Checks_and_doc succeeded
Details
pandas-dev.pandas (Linux py35_compat) Linux py35_compat succeeded
Details
pandas-dev.pandas (Linux py36_locale_slow) Linux py36_locale_slow succeeded
Details
pandas-dev.pandas (Linux py36_locale_slow_old_np) Linux py36_locale_slow_old_np succeeded
Details
pandas-dev.pandas (Linux py37_locale) Linux py37_locale succeeded
Details
pandas-dev.pandas (Linux py37_np_dev) Linux py37_np_dev succeeded
Details
pandas-dev.pandas (Windows py36_np15) Windows py36_np15 succeeded
Details
pandas-dev.pandas (Windows py37_np141) Windows py37_np141 succeeded
Details
pandas-dev.pandas (macOS py35_macos) macOS py35_macos succeeded
Details
@jreback

This comment has been minimized.

Copy link
Contributor

commented Apr 30, 2019

thanks @qwhelan very nice!

@qwhelan qwhelan deleted the qwhelan:any_all_fix branch Aug 14, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.