# Windows in Pandas

In [40]:
import pandas as pd
import numpy as np 

Windows in pandas are like powerful slices over a pd.Series or pd.DataFrame.
###### https://pandas.pydata.org/docs/user_guide/window.html#overview
Pandas supports 4 types of windowing operations:
1. Rolling window: Generic fixed or variable sliding window over the values.

2. Weighted window: Weighted, non-rectangular window supplied by the scipy.signal library.

3. Expanding window: Accumulating window over the values.

4. Exponentially Weighted window: Accumulating and exponentially weighted window over the values.

Notes: 
* Only rolling windows accept a time offset (for example '2D') as the window size.
* Windowing operations currently only support numeric data (integer and float) and will always return float64 values.
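
As a rough illustration of the window types listed above, here is a minimal sketch on a small integer Series. The variable names are made up for this example. The weighted window is left as a comment because it needs SciPy installed.

```python
import pandas as pd

s_demo = pd.Series([1, 2, 3, 4, 5])  # hypothetical sample data

# 1. Rolling window: sum over a sliding window of 2 rows
rolling_sum = s_demo.rolling(window=2).sum()

# 2. Weighted window (needs SciPy, so it is not executed here):
# s_demo.rolling(window=3, win_type="triang").mean()

# 3. Expanding window: sum over everything seen so far
expanding_sum = s_demo.expanding().sum()

# 4. Exponentially weighted window: recent values weigh more
ewm_mean = s_demo.ewm(alpha=0.5).mean()

# The input is int64, but the results come back as float64
print(rolling_sum.dtype, expanding_sum.dtype, ewm_mean.dtype)
```

The first value of `rolling_sum` is NaN because a 2-row window is not yet full at the first row.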

In [27]:
# Build a Series of integers 0 1 2 3 4 indexed by consecutive days
s = pd.Series(range(5), index=pd.date_range('2020-01-01', periods=5, freq='1D'))
print(s)

# Roll over two-day windows from the beginning and sum: 0 1 3 5 7
s.rolling(window='2D').sum()


2020-01-01    0
2020-01-02    1
2020-01-03    2
2020-01-04    3
2020-01-05    4
Freq: D, dtype: int64


2020-01-01    0.0
2020-01-02    1.0
2020-01-03    3.0
2020-01-04    5.0
2020-01-05    7.0
Freq: D, dtype: float64

As noted above, only rolling windows support specifying the window as a *time offset*:

In [28]:
# Build a Series indexed by consecutive days with the integers 0 1 2 3 4
s = pd.Series(range(5), index=pd.date_range('2020-01-01', periods=5, freq='1D'))


In [29]:

# NOTE: with a two-DAY window, each result sums the VALUES whose DATES fall within the last two days.
s.rolling(window='2D').sum()

2020-01-01    0.0
2020-01-02    1.0
2020-01-03    3.0
2020-01-04    5.0
2020-01-05    7.0
Freq: D, dtype: float64

In [30]:
# NOTE: with a two-ROW window, each result sums the VALUES of two consecutive rows; the first row has no pair, so it is NaN.
s.rolling(window=2).sum()

2020-01-01    NaN
2020-01-02    1.0
2020-01-03    3.0
2020-01-04    5.0
2020-01-05    7.0
Freq: D, dtype: float64
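
On a daily index the two window styles differ only in the first row. The difference is easier to see when the dates have gaps. The sketch below uses a made-up Series (`s_gap`) with a missing stretch between 2020-01-02 and 2020-01-05.

```python
import pandas as pd

# Hypothetical data: note the gap between January 2 and January 5
idx = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-05", "2020-01-06"])
s_gap = pd.Series([0, 1, 2, 3], index=idx)

# Time-based window: only rows whose dates fall in the last two days count
by_time = s_gap.rolling(window="2D").sum()

# Row-based window: always the current row plus the previous row
by_rows = s_gap.rolling(window=2).sum()

print(pd.DataFrame({"by_time": by_time, "by_rows": by_rows}))
```

On 2020-01-05 the time-based sum is 2.0, because no other row falls inside the two-day window. The row-based sum is 3.0, because it still pairs that row with the one from 2020-01-02.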

Note: some methods support chaining a groupby operation with a windowing operation, which first groups the data by the specified keys and then performs the windowing operation per group.

In [31]:
# Build a DataFrame with two columns (letters, consecutive numbers)
df = pd.DataFrame({'A': ['a', 'b', 'a', 'b', 'a'], 'B': range(5)})
df

   A  B
0  a  0
1  b  1
2  a  2
3  b  3
4  a  4


In [32]:
df.groupby('A').expanding().sum()

       B
A       
a 0  0.0
  2  2.0
  4  6.0
b 1  1.0
  3  4.0
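
The same chaining works with a rolling window in place of an expanding one. This is a minimal sketch on the same data, rebuilt here under the hypothetical name `df_g` so the block runs on its own.

```python
import pandas as pd

df_g = pd.DataFrame({"A": ["a", "b", "a", "b", "a"], "B": range(5)})

# Group by column A, then sum B over a 2-row rolling window inside each group
rolled = df_g.groupby("A")["B"].rolling(window=2).sum()
print(rolled)

# The result is indexed by (group key, original row label)
print(rolled.loc[("a", 4)])  # rows 2 and 4 of group 'a': 2 + 4
```

The first row of each group is NaN, because the window never reaches across group boundaries.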


## Rolling Apply

The apply() function takes an extra func argument and performs generic rolling computations. 
* The func argument should be a **single function that produces a single value** from each window, which arrives as an **ndarray input** when raw=True.
* raw specifies whether the windows are cast as Series objects (raw=False) or ndarray objects (raw=True).

In [43]:
# Series 0 1 2 3 4 5 6 7 8 9
s = pd.Series(range(10))

def mad(x):
    # Mean absolute deviation: reduces one window (an ndarray) to a single value
    return np.fabs(x - x.mean()).mean()

s.rolling(window=4).apply(mad, raw=True)


0    NaN
1    NaN
2    NaN
3    1.0
4    1.0
5    1.0
6    1.0
7    1.0
8    1.0
9    1.0
dtype: float64
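
To make the effect of raw concrete, the sketch below runs the same `mad` function both ways. It then applies a function that only works with raw=False, because it reads the window's index. `last_label` is a hypothetical helper written for this example.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(10))

def mad(x):
    # Works for both ndarray and Series input
    return np.fabs(x - x.mean()).mean()

raw_result = s.rolling(window=4).apply(mad, raw=True)      # x is an ndarray
series_result = s.rolling(window=4).apply(mad, raw=False)  # x is a Series

def last_label(window):
    # Needs a Series: an ndarray has no .index attribute
    return window.index[-1]

labels = s.rolling(window=4).apply(last_label, raw=False)
print(labels)
```

Both calls to `mad` give the same numbers here. raw=True is usually the cheaper option when the function does not need the index.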

In [45]:
df = pd.DataFrame(range(10), index=pd.date_range("2020", periods=10))
df

Unnamed: 0,0
2020-01-01,0
2020-01-02,1
2020-01-03,2
2020-01-04,3
2020-01-05,4
2020-01-06,5
2020-01-07,6
2020-01-08,7
2020-01-09,8
2020-01-10,9


In [52]:
dfr = df.rolling(window=2)
# A Rolling object describes the windows, it does not hold data, so positional indexing fails
dfr.iloc[0]

AttributeError: 'Rolling' object has no attribute 'iloc'
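
A Rolling object cannot be indexed, but its windows can still be inspected by iterating over it. The sketch below assumes pandas 1.1 or newer, where Rolling objects became iterable, and rebuilds the frame under the hypothetical name `df_w`.

```python
import pandas as pd

df_w = pd.DataFrame(range(10), index=pd.date_range("2020", periods=10))
dfr_w = df_w.rolling(window=2)

# Iterating yields one DataFrame per row, including the partial first window
windows = list(dfr_w)

print(len(windows))  # one window per row of df_w
print(windows[0])    # partial window: only the first row
print(windows[1])    # first full window: rows 0 and 1
```

Iteration returns the partial first window even though aggregations such as sum() report NaN for it. The min_periods rule applies only when a result is computed.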

In [None]:
dfr