Perf regression in 0.13+ for df.apply #6013

8one6 · 2014-01-20T16:59:54Z

On this page:
http://pandas.pydata.org/pandas-docs/stable/enhancingperf.html#enhancingperf

Right after In[11] displays its timing results, the text says we have seen a 10x speedup compared with the original code. But the original code runs in 336 ms/loop, and the code executed in In[11] runs in 105 ms/loop.

So only 3x speedup, right?

(Edit) Later on, after In[14], the text claims a 3x speedup, but that run takes execution from 105 ms/loop down to 2.5 ms/loop, which is about a 40x speedup.
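For reference, the two ratios can be checked with a few lines of Python. This is only a sketch of the arithmetic using the timings quoted above; the variable and function names are made up for illustration and do not come from the docs page.

```python
# Timings quoted in this comment, in milliseconds per loop.
original_ms = 336.0  # original code
in11_ms = 105.0      # timing displayed after In[11]
in14_ms = 2.5        # timing displayed after In[14]


def speedup(before_ms, after_ms):
    """Return how many times faster the 'after' timing is than the 'before' timing."""
    return before_ms / after_ms


first_step = speedup(original_ms, in11_ms)
second_step = speedup(in11_ms, in14_ms)

print(f"In[11] vs original: {first_step:.1f}x")  # prints 3.2x
print(f"In[14] vs In[11]:   {second_step:.1f}x")  # prints 42.0x
```

So the docs text appears to have the two figures roughly swapped relative to what the timings show.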

ghost · 2014-01-20T17:17:24Z

Good catch, this might be another case of a known regression in the perf of apply in 0.13.

In [4]: %load_ext cythonmagic

In [10]: %%cython
    ...: cdef double f_typed(double x) except? -2:
    ...:     return x * (x - 1)
    ...: cpdef double integrate_f_typed(double a, double b, int N):
    ...:     cdef int i
    ...:     cdef double s, dx
    ...:     s = 0
    ...:     dx = (b - a) / N
    ...:     for i in range(N):
    ...:         s += f_typed(a + i * dx)
    ...:     return s * dx
    ...:

In 0.13.0-246-g1e1907c:

In [7]: df = DataFrame({'a': randn(1000), 'b': randn(1000), 'N': randint(100, 1000, (1000)), 'x': 'x'})
In [8]: %timeit df.apply(lambda x: integrate_f_typed(x['a'], x['b'], x['N']), axis=1)
10 loops, best of 3: 116 ms per loop

But in 0.12:

In [5]: df = DataFrame({'a': randn(1000), 'b': randn(1000), 'N': randint(100, 1000, (1000)), 'x': 'x'})
In [6]: %timeit df.apply(lambda x: integrate_f_typed(x['a'], x['b'], x['N']), axis=1)
10 loops, best of 3: 19.6 ms per loop

@jreback, is that apply perf hit here to stay?

Related: #5654, #5656?
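For anyone who wants to compare pandas versions without compiling the Cython cell, here is a rough sketch of the same row-wise apply benchmark. It assumes a pure-Python stand-in (`integrate_f`, a hypothetical name) for `integrate_f_typed` and uses `timeit` from the standard library instead of `%timeit`, so the absolute numbers will not match the ones quoted above; only the relative difference between versions is meaningful.

```python
import timeit

import numpy as np
import pandas as pd


def f(x):
    return x * (x - 1)


def integrate_f(a, b, N):
    """Pure-Python stand-in (assumption) for the Cython integrate_f_typed."""
    s = 0.0
    dx = (b - a) / N
    for i in range(N):
        s += f(a + i * dx)
    return s * dx


# Same shape of frame as in the transcript: two float columns, one int
# column, and a constant string column that makes each row object-dtype.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.standard_normal(1000),
    "b": rng.standard_normal(1000),
    "N": rng.integers(100, 1000, size=1000),
    "x": "x",
})


def run_apply():
    """Row-wise apply, mirroring the timed expression in the transcript."""
    return df.apply(
        lambda row: integrate_f(row["a"], row["b"], row["N"]), axis=1
    )


# Best of 3 single runs, reported in milliseconds.
best = min(timeit.repeat(run_apply, number=1, repeat=3))
print(f"pandas {pd.__version__}: best of 3 = {best * 1e3:.1f} ms per loop")
```

Running the same script under two installed pandas versions and comparing the printed timings should show whether the apply overhead differs between them.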

jreback mentioned this issue Jan 21, 2014

PERF: apply perf enhancements #6024

Merged

jreback closed this as completed in #6024 Jan 21, 2014

jreback mentioned this issue Jan 28, 2014

performance of DataFrame.corrwith vs raw apply #5671

Closed

rosnfeld mentioned this issue May 18, 2014

Feedback on v0.14.0 RC1 #7146

Closed
