GroupBy transform() is surprisingly slow

I came across a strange slowness in GroupBy transform() function. I put together a simple function to avoid using apply() because it can be REALLY slow:

```
def apply_by_group(grouped, f):
    """
    Applies a function to each DataFrame in a DataFrameGroupBy object, concatenates the results
    and returns the resulting DataFrame.

    Parameters
    ----------
    grouped: DataFrameGroupBy
        The grouped DataFrame that contains column(s) to be ranked and, potentially, a column with weights.
    f: callable
        Function to apply to each DataFrame.

    Returns
    -------
    DataFrame that results from applying the function to each DataFrame in the DataFrameGroupBy object and
    concatenating the results.

    """
    assert isinstance(grouped, DataFrameGroupBy)
    assert hasattr(f, '__call__')

    data_frames = []
    for key, data_frame in grouped:
        data_frames.append(f(data_frame))
    return pd.concat(data_frames)
```

Now I observe the following for the two equivalent ways of doing the same thing:

```
%timeit data.groupby(level=field_security_id).transform(lambda x: x.fillna())
```

> 1 loops, best of 3: 24.3 s per loop

```
%timeit apply_by_group(data.groupby(level=field_security_id), lambda x: x.fillna())
```

> 1 loops, best of 3: 2.72 s per loop

That was unexpected. Am I doing something wrong in using transform()?


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

GroupBy transform() is surprisingly slow #2121

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

GroupBy transform() is surprisingly slow #2121

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions