# Pandas + Numba = 🚀


* Pandas 1.0 added support for numba-jitted functions in `.rolling().apply()`.
* Pandas 1.1 extended this to `groupby().aggregate()` and `groupby().transform()`.
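As a rough sketch of the groupby support: with `engine="numba"`, pandas documents that the UDF must take the group's values and index as its first two arguments. The frame, the column names, and `group_mean` below are made up for illustration, and the fallback branch exists only so the snippet still runs when numba is not installed.

```python
import numpy as np
import pandas as pd


def group_mean(values, index):
    # With engine="numba", pandas passes each group's values and index
    # as NumPy arrays (hypothetical UDF, for illustration only).
    return np.mean(values)


df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1.0, 2.0, 3.0, 5.0]})
grouped = df.groupby("key")["val"]

try:
    import numba  # noqa: F401  (optional dependency)

    result = grouped.aggregate(group_mean, engine="numba")
except ImportError:
    # Fallback without numba: call the same UDF through the default engine.
    result = grouped.aggregate(
        lambda s: group_mean(s.to_numpy(), s.index.to_numpy())
    )

print(result)
```

`groupby().transform()` accepts the same `engine="numba"` argument, with the UDF returning one value per row of the group instead of a single scalar.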

Thanks to Matthew Roeschke (https://github.com/mroeschke) for leading this effort and to Two Sigma for sponsoring!

A rolling ("moving window") operation: 

In [None]:
import numpy as np
import pandas as pd

data = pd.Series(np.random.randn(1_000_000))

In [None]:
%timeit data.rolling(10).mean()
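To make the window semantics concrete, here is a tiny illustration on a five-element series (the name `small` and the window size of 3 are chosen only for this example):

```python
import numpy as np
import pandas as pd

small = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Each output value is the mean of the current element and the two before it.
# The first two positions have fewer than 3 observations, so they are NaN.
rolled = small.rolling(3).mean()
print(rolled.tolist())  # [nan, nan, 2.0, 3.0, 4.0]
```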

When a built-in function is not sufficient, you can "apply" your own:

In [None]:
def udf(x):
    # my User Defined Function (UDF) with custom logic
    # ...
    return np.mean(x)

In [None]:
%time data.rolling(10).apply(udf, raw=True)

This is *much* slower than the optimized, built-in method.

But we can now use the numba JIT compiler to speed up user-defined functions:

In [None]:
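# First call: the timing includes numba's JIT compilation of `udf`
%time data.rolling(10).apply(udf, raw=True, engine="numba")

In [None]:
# Second call: the compiled function is cached, so this is much faster
%time data.rolling(10).apply(udf, raw=True, engine="numba");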

In [None]:
%timeit data.rolling(10).apply(udf, raw=True, engine="numba");
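The same call is timed several times above because the first call with `engine="numba"` pays a one-off compilation cost, while later calls reuse the cached compiled function. A minimal sketch of that measurement outside IPython follows. The helper `timed_apply` is hypothetical, the `engine_kwargs` shown are simply the documented defaults spelled out, and the fallback to the default engine is only there so the snippet runs without numba.

```python
import time

import numpy as np
import pandas as pd


def udf(x):
    return np.mean(x)


data = pd.Series(np.random.randn(100_000))

# Documented defaults for the numba engine, written out explicitly.
numba_kwargs = {
    "engine": "numba",
    "engine_kwargs": {"nopython": True, "nogil": False, "parallel": False},
}


def timed_apply(**kwargs):
    # Hypothetical helper: run the rolling apply once and time it.
    start = time.perf_counter()
    out = data.rolling(10).apply(udf, raw=True, **kwargs)
    return out, time.perf_counter() - start


try:
    first, first_call = timed_apply(**numba_kwargs)
    second, second_call = timed_apply(**numba_kwargs)
except ImportError:
    # numba is not available: fall back to the default engine.
    first, first_call = timed_apply()
    second, second_call = timed_apply()

print(f"first call:  {first_call:.3f} s (includes JIT compilation with numba)")
print(f"second call: {second_call:.3f} s")
```

Actual timings depend on the machine and on the pandas and numba versions, so none are quoted here.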

## Optionally disallow duplicate labels

In [None]:
s = pd.Series([1, 2], index=['a', 'b'])
# s = s.set_flags(allows_duplicate_labels=False)

In [None]:
s.reindex(['a', 'b', 'a'])
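A small sketch of what the commented-out `set_flags` line changes, assuming pandas 1.2 or later (where `set_flags` is available). The variable names `dup`, `strict`, and `raised` are only for illustration.

```python
import pandas as pd

s = pd.Series([1, 2], index=["a", "b"])

# Default behavior: reindexing with a repeated label is allowed.
dup = s.reindex(["a", "b", "a"])
print(dup.index.is_unique)  # False

# With the flag set, the same reindex raises instead of returning duplicates.
strict = s.set_flags(allows_duplicate_labels=False)
try:
    strict.reindex(["a", "b", "a"])
    raised = False
except pd.errors.DuplicateLabelError:
    raised = True
print(raised)
```

`set_flags` returns a new object, so the original `s` still allows duplicate labels.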

In [None]:
%%html
<style>
.jp-Cell.jp-mod-selected ~ .jp-Cell {
    display: none;
}
</style>