import pandas as pd
import numpy as np
df = pd.DataFrame({'group': np.repeat(np.arange(1000), 10),
'B': np.nan,
'C': np.nan})
df.loc[4::10, 'B':'C'] = 5 # the row at position 4 within each group of 10 is non-null (.ix was removed in later pandas releases)
df.groupby('group').transform('first')
This ends up iterating over the groups. The last time I can see this was changed is: here. My recollection is that the iteration was ONLY supposed to be hit in a special case, and that the general case is simply a repeat based on the indices.
It seems to be hit in all cases, which makes transform super SLOW again.
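To make the "repeat based on the indices" idea concrete, here is a minimal sketch of what such a fast path might look like from user code. It is not the pandas internals: it aggregates once per group, then broadcasts the aggregated rows back to the original row positions using integer group codes. The variable names (`agg`, `codes`, `broadcast`) are made up for illustration, and the sketch assumes the default `sort=True` grouping so that the factorized codes line up with the rows of the aggregated frame.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the report: 1000 groups of 10 rows each.
df = pd.DataFrame({'group': np.repeat(np.arange(1000), 10),
                   'B': np.nan,
                   'C': np.nan})
df.loc[df.index % 10 == 4, ['B', 'C']] = 5

# One aggregated row per group (first non-null value in each column).
agg = df.groupby('group').first()

# Integer code of the group each original row belongs to; sort=True keeps the
# codes aligned with the sorted group index of `agg` (an assumption of this sketch).
codes, _ = pd.factorize(df['group'], sort=True)

# Repeat the aggregated rows by position, then restore the original index.
broadcast = agg.take(codes)
broadcast.index = df.index
```

On this data `broadcast` should hold the same values as `df.groupby('group').transform('first')`, without a Python-level loop over the groups.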
So we have a fast path for Series transforms, but not for DataFrame transforms.
In [17]: result1 = g.transform('first')
In [18]: result2 = pd.concat([g.B.transform('first'), g.C.transform('first')], keys=['B','C'], axis=1)
In [19]: result1.equals(result2)
Out[19]: True
In [20]: %timeit g.transform('first')
10 loops, best of 3: 170 ms per loop
In [21]: %timeit pd.concat([g.B.transform('first'), g.C.transform('first')], keys=['B','C'], axis=1)
1000 loops, best of 3: 2 ms per loop
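Until the DataFrame path is fast, the comparison above suggests a workaround: transform each column through the Series fast path and concatenate the results. The sketch below wraps that in a small helper. `transform_columnwise` is a hypothetical name, not a pandas function, and it assumes a single grouping column and a string function name such as 'first'.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the report: 1000 groups of 10 rows each.
df = pd.DataFrame({'group': np.repeat(np.arange(1000), 10),
                   'B': np.nan,
                   'C': np.nan})
df.loc[df.index % 10 == 4, ['B', 'C']] = 5


def transform_columnwise(frame, key, func):
    """Hypothetical helper: run the Series transform per column, then concat."""
    g = frame.groupby(key)
    # Every column except the grouping key gets transformed.
    cols = [c for c in frame.columns if c != key]
    pieces = [g[c].transform(func) for c in cols]
    return pd.concat(pieces, keys=cols, axis=1)


fast = transform_columnwise(df, 'group', 'first')
slow = df.groupby('group').transform('first')
```

Both `fast` and `slow` should contain the same values here; the timings quoted above are from the original report and will differ by pandas version.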
jreback changed the title from "PERF: regression in transform perf" to "PERF: DataFrame groupby with fast transform" on Mar 29, 2016
from SO