PERF: Very slow clip performance #15400

Closed
wesm opened this Issue Feb 14, 2017 · 2 comments

wesm commented Feb 14, 2017

Code Sample, a copy-pastable example if possible

In [38]: s = pd.Series(np.random.randn(30))

In [39]: timeit s.clip(0, 1)
100 loops, best of 3: 2.02 ms per loop

Problem description

There is a more than 1000x performance difference between Series.clip and numpy.clip:

In [43]: timeit np.clip(arr, 0, 1)
1000000 loops, best of 3: 1.06 µs per loop
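For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch using the standard-library timeit module. The transcript above never shows how arr is defined; the sketch assumes it is the Series' underlying array (s.values), which is how a later comment in this thread defines it. The repeat count n is an arbitrary choice, and absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(30))
arr = s.values  # assumption: `arr` in the report is the Series' NumPy array

n = 1000  # arbitrary number of repetitions per measurement

# Average time per call for the pandas method and the NumPy function.
t_series = timeit.timeit(lambda: s.clip(0, 1), number=n) / n
t_numpy = timeit.timeit(lambda: np.clip(arr, 0, 1), number=n) / n

print(f"Series.clip: {t_series * 1e6:.1f} µs per call")
print(f"np.clip:     {t_numpy * 1e6:.1f} µs per call")
print(f"ratio:       {t_series / t_numpy:.0f}x")
```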

Output of pd.show_versions()

pandas 0.19.2

wesm added the Bug label Feb 14, 2017

wesm added this to the 0.20.0 milestone Feb 14, 2017

I wondered where this huge difference came from. I'm not saying the difference isn't a problem, but it seems to be a consequence of the implementation and of several slower functions that are used under the hood.
The clip is done in two separate steps, clip_upper and clip_lower. Each of those does a comparison to create a mask and then calls where; inside where an align is done, and so on:
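To make the two-step structure concrete, here is a rough sketch of clipping built from a comparison mask plus where. It is an illustration of the approach described in this comment, not the actual pandas source; the function name clip_two_step is made up for the example, and keeping NaN values untouched via isna() is an assumption about the intended behaviour.

```python
import numpy as np
import pandas as pd


def clip_two_step(s, lower, upper):
    """Hypothetical illustration: clip via two mask + where passes."""
    # Lower step: keep values that are already >= lower (and leave NaN alone),
    # replace everything else with the lower bound.
    keep = (s >= lower) | s.isna()
    out = s.where(keep, lower)
    # Upper step: same pattern against the upper bound.
    keep = (out <= upper) | out.isna()
    return out.where(keep, upper)


s = pd.Series([-1.5, 0.25, np.nan, 2.0])
clipped = clip_two_step(s, 0, 1)
print(clipped)
```

Each pass builds a new boolean Series and then goes through where, so even this small example constructs several intermediate pandas objects.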

In [89]: %timeit s.clip(0, 1)
100 loops, best of 3: 1.91 ms per loop

In [91]: %timeit s.clip_lower(0)
1000 loops, best of 3: 958 µs per loop

In [92]: %timeit s < 0
10000 loops, best of 3: 118 µs per loop

In [93]: mask = s < 0

In [94]: %timeit s.where(mask, 0)
1000 loops, best of 3: 395 µs per loop

In [100]: %timeit s.align(mask)
10000 loops, best of 3: 98.6 µs per loop

So it seems that several individual steps in the current implementation (creating the mask, the alignment, ...) each already take far longer than the actual clip in numpy. Each of those steps can probably be optimized, but I don't think that alone will give a big speed-up. To get a big speed-up in pandas' clip, we would probably need a more low-level implementation.
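As a rough idea of what "more low-level" could mean for scalar bounds, the sketch below clips the underlying NumPy array directly and rewraps it once. This is only an illustration under stated assumptions: the helper name clip_on_values is hypothetical, it handles only scalar bounds on a numeric Series, and it is not necessarily what the eventual fix does.

```python
import numpy as np
import pandas as pd


def clip_on_values(s, lower, upper):
    """Hypothetical helper: clip the raw array, build one Series at the end."""
    # np.clip works on the ndarray, so no masks or alignment are needed.
    clipped = np.clip(s.values, lower, upper)
    # Rewrap once, reusing the original index and name.
    return pd.Series(clipped, index=s.index, name=s.name)


s = pd.Series(np.random.randn(30), name="x")
result = clip_on_values(s, 0, 1)
print(result.head())
```

With this shape there is a single Series construction per call, instead of one per intermediate step.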

When you look at a larger series, the difference is not that huge anymore:

In [32]: s = pd.Series(np.random.randn(100000))

In [33]: %timeit s.clip(0,1)
The slowest run took 8.48 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 4.27 ms per loop

In [34]: arr = s.values

In [35]: %timeit np.clip(arr,0,1)
The slowest run took 4.41 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 558 µs per loop
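The ratios implied by the timings quoted in this thread can be worked out directly. The snippet below is plain arithmetic on the reported numbers (the 30-element figures come from the original report, the 100000-element figures from the runs just above); nothing here is a new measurement.

```python
# Timings reported in this thread, in seconds.
small_series_clip = 2.02e-3   # s.clip(0, 1), 30 elements
small_numpy_clip = 1.06e-6    # np.clip(arr, 0, 1), 30 elements
large_series_clip = 4.27e-3   # s.clip(0, 1), 100000 elements
large_numpy_clip = 558e-6     # np.clip(arr, 0, 1), 100000 elements

small_ratio = small_series_clip / small_numpy_clip
large_ratio = large_series_clip / large_numpy_clip

print(f"30 elements:     {small_ratio:.0f}x slower")
print(f"100000 elements: {large_ratio:.1f}x slower")
```

That is roughly 1900x for the small Series against under 8x for the large one, which fits the picture of a mostly fixed per-call overhead in pandas that dominates when the data is tiny.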

wesm commented Feb 20, 2017

Profile results of 100 runs

         301103 function calls (300903 primitive calls) in 0.220 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.220    0.220 {built-in method builtins.exec}
        1    0.000    0.000    0.220    0.220 <string>:1(<module>)
      100    0.001    0.000    0.220    0.002 generic.py:3825(clip)
      100    0.001    0.000    0.109    0.001 generic.py:3913(clip_lower)
      100    0.001    0.000    0.109    0.001 generic.py:3889(clip_upper)
      200    0.001    0.000    0.092    0.000 generic.py:4806(where)
      200    0.002    0.000    0.092    0.000 generic.py:4547(_where)
2000/1800    0.013    0.000    0.074    0.000 internals.py:2978(apply)
     2800    0.012    0.000    0.065    0.000 series.py:135(__init__)
      200    0.001    0.000    0.062    0.000 ops.py:903(wrapper)
      400    0.001    0.000    0.050    0.000 ops.py:907(<lambda>)
      200    0.001    0.000    0.047    0.000 ops.py:1039(flex_wrapper)
      600    0.002    0.000    0.040    0.000 series.py:2364(fillna)
      600    0.005    0.000    0.039    0.000 generic.py:3200(fillna)
      200    0.002    0.000    0.036    0.000 ops.py:803(wrapper)
      200    0.001    0.000    0.034    0.000 internals.py:3158(where)
      600    0.002    0.000    0.032    0.000 generic.py:3007(astype)
      600    0.002    0.000    0.031    0.000 generic.py:3057(copy)
      200    0.000    0.000    0.027    0.000 series.py:2342(align)
      200    0.001    0.000    0.027    0.000 generic.py:4379(align)
      200    0.001    0.000    0.026    0.000 generic.py:4470(_align_series)
      400    0.001    0.000    0.022    0.000 series.py:2360(reindex)
      400    0.002    0.000    0.022    0.000 generic.py:2224(reindex)
      400    0.000    0.000    0.021    0.000 series.py:2326(_reindex_inde

A single call to clip calls the Series constructor 28 times (2800 calls to series.py:135(__init__) over 100 runs). Not good. I will try to look more deeply into fixing this if no one beats me to it.
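A profile like the one above can be collected with the standard-library cProfile and pstats modules. This is a minimal sketch: the helper run_clip and the choice to print the top 25 entries are assumptions for illustration, and the call counts will differ across pandas versions.

```python
import cProfile
import io
import pstats

import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(30))


def run_clip(series, n=100):
    """Hypothetical helper: call clip n times so per-call counts are easy to read."""
    for _ in range(n):
        series.clip(0, 1)


profiler = cProfile.Profile()
profiler.enable()
run_clip(s)
profiler.disable()

# Sort by cumulative time, matching the "Ordered by" line in the output above.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(25)
report = buf.getvalue()
print(report)
```

Dividing the ncalls column by the number of runs (100 here) gives the number of calls per single clip.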

jreback modified the milestone: 0.20.2, Interesting Issues May 16, 2017

jreback added a commit to jreback/pandas that referenced this issue May 16, 2017

jreback PERF: improved clip performance
closes #15400
daff5ea

jreback added a commit to jreback/pandas that referenced this issue May 16, 2017

jreback PERF: improved clip performance
closes #15400
6efa1c8

jreback added a commit to jreback/pandas that referenced this issue May 16, 2017

jreback PERF: improved clip performance
closes #15400
62843f8

jreback closed this in #16364 May 16, 2017

jreback added a commit that referenced this issue May 16, 2017

jreback PERF: improved clip performance (#16364)
closes #15400
42e2a87

pcluo added a commit to pcluo/pandas that referenced this issue May 22, 2017

jreback + pcluo PERF: improved clip performance (#16364)
closes #15400
a4730d5

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue May 29, 2017

jreback + TomAugspurger PERF: improved clip performance (#16364)
closes #15400
(cherry picked from commit 42e2a87)
41d90dc

TomAugspurger added a commit that referenced this issue May 30, 2017

jreback + TomAugspurger PERF: improved clip performance (#16364)
closes #15400
(cherry picked from commit 42e2a87)
f16141f

stangirala added a commit to stangirala/pandas that referenced this issue Jun 11, 2017

jreback + stangirala PERF: improved clip performance (#16364)
closes #15400
4c6b1c9