Handling of NaN #12

kmuehlbauer · 2018-05-16T08:52:18Z

Short question, is it somehow possible to extend this to handle NaN, like numpy nanmedian?


ajcr · 2018-05-17T19:07:46Z

Hi @kmuehlbauer, that's a good idea. I'll have to give some thought to how this could be implemented for each rolling iterator without affecting complexity.

For now, it should be straightforward to do this for some of the functions, just by using a generator with an appropriate fill-value. For example, Sum, filling NaN with 0:

>>> import math
>>> import rolling
>>> array = [1, 2, math.nan, 7, math.nan, 3, 2]
>>> array_fill_nan = (0 if math.isnan(x) else x for x in array)  # generator that replaces NaN with 0
>>> list(rolling.Sum(array_fill_nan, 3))
[3, 9, 7, 10, 5]

This approach doesn't work for Median, however, as the fill value required at each step is not necessarily constant. I'll see whether adding support for missing values is feasible here. FWIW I think pandas just considers the whole window to be NaN if it contains at least one NaN value.
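To illustrate why a constant fill value breaks down for Median, here is a minimal sketch using only the standard library (it does not use `rolling`). The helper name `nan_median` is hypothetical and exists only for this illustration.

```python
import math
from statistics import median


def nan_median(window):
    """Median of the non-NaN values in `window` (hypothetical helper).

    Returns NaN if every value in the window is NaN.
    """
    values = [x for x in window if not math.isnan(x)]
    return median(values) if values else math.nan


window = [5, math.nan, 7]

# Filling NaN with 0 gives the median of [0, 5, 7].
filled_with_zero = median(0 if math.isnan(x) else x for x in window)

# Ignoring NaN gives the median of [5, 7] instead.
ignoring_nan = nan_median(window)

print(filled_with_zero)  # 5
print(ignoring_nan)      # 6.0
```

The two results differ, and the fill value that would reproduce the NaN-ignoring median here (6) depends on the other values in the window, so no single constant works for every window.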

If your window size is small, rolling.Apply(array, window_size, operation=np.nanmedian) should still be quite fast.
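For readers without `rolling` or NumPy at hand, the same idea can be sketched in pure Python. This is a stand-in rather than the library's implementation: `rolling_nanmedian` is a hypothetical name, and it assumes only full windows are emitted, as in the Sum example above.

```python
import math
from collections import deque
from statistics import median


def rolling_nanmedian(iterable, window_size):
    """Yield the median of each full window, ignoring NaN values.

    Hypothetical stand-in for applying a NaN-aware median to every
    window. A window containing only NaN values yields NaN.
    """
    window = deque(maxlen=window_size)  # oldest value drops off automatically
    for value in iterable:
        window.append(value)
        if len(window) == window_size:
            valid = [x for x in window if not math.isnan(x)]
            yield median(valid) if valid else math.nan


array = [1, 2, math.nan, 7, math.nan, 3, 2]
result = list(rolling_nanmedian(array, 3))
print(result)  # [1.5, 4.5, 7, 5.0, 2.5]
```

Each window is filtered and sorted from scratch, so the cost per step grows with the window size, which is why this kind of approach is best suited to small windows.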

kmuehlbauer · 2018-05-18T07:25:27Z

@ajcr Thanks for looking into this. I'll definitely try your suggestion using rolling.Apply(array, window_size, operation=np.nanmedian).

ajcr added the enhancement and investigate labels on May 18, 2018
