## Rolling Data

Sometimes we need to compute statistics over a sliding window of data rather than over the whole dataset. The [rolling](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html) function does this.


In [None]:
DataFrame.rolling(window, # required: sets the size of the window; see the documentation for examples
                  min_periods=None, 
                  freq=None, # deprecated; removed from newer pandas releases
                  center=False, 
                  win_type=None, 
                  on=None, 
                  axis=0, 
                  closed=None)
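
As a rough illustration of two of the optional parameters, the sketch below compares the default behavior with `min_periods=1` and `center=True`. The window size of 3 and the variable names are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({'value': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})

# Default: min_periods equals the window size, so rows without a
# full window of 3 observations are NaN.
default = df.rolling(3).mean()

# min_periods=1 emits a result as soon as one observation is available.
partial = df.rolling(3, min_periods=1).mean()

# center=True labels each window at its middle row instead of its last row.
centered = df.rolling(3, center=True).mean()

print(default['value'].tolist()[:4])
print(partial['value'].tolist()[:4])
print(centered['value'].tolist()[:4])
```

With `center=True` the NaN rows move from the start of the result to both ends, because the first and last rows each lack one neighbor.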

### Aggregation functions to know before using rolling

| Method                                   | Description                              |
| ---------------------------------------- | ---------------------------------------- |
| [`count()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.count.html#pandas.core.window.Rolling.count) | Number of non-null observations          |
| [`sum()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.sum.html#pandas.core.window.Rolling.sum) | Sum of values                            |
| [`mean()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.mean.html#pandas.core.window.Rolling.mean) | Mean of values                           |
| [`median()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.median.html#pandas.core.window.Rolling.median) | Arithmetic median of values              |
| [`min()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.min.html#pandas.core.window.Rolling.min) | Minimum                                  |
| [`max()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.max.html#pandas.core.window.Rolling.max) | Maximum                                  |
| [`std()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.std.html#pandas.core.window.Rolling.std) | Bessel-corrected sample standard deviation |
| [`var()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.var.html#pandas.core.window.Rolling.var) | Unbiased variance                        |
| [`skew()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.skew.html#pandas.core.window.Rolling.skew) | Sample skewness (3rd moment)             |
| [`kurt()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.kurt.html#pandas.core.window.Rolling.kurt) | Sample kurtosis (4th moment)             |
| [`quantile()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.quantile.html#pandas.core.window.Rolling.quantile) | Sample quantile (value at %)             |
| [`apply()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.apply.html#pandas.core.window.Rolling.apply) | Generic apply                            |
| [`cov()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.cov.html#pandas.core.window.Rolling.cov) | Unbiased covariance (binary)             |
| [`corr()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.window.Rolling.corr.html#pandas.core.window.Rolling.corr) | Correlation (binary)                     |

In [1]:
import pandas as pd

bars = {'value': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
df = pd.DataFrame(bars)


In [3]:
print(df.count())
print(df['value'].count())

value    10
dtype: int64
10


In [4]:
print(df.sum())
print(df['value'].sum())

value    45
dtype: int64
45


In [5]:
print(df.mean())
print(df['value'].mean())

value    4.5
dtype: float64
4.5


In [6]:
print(df.median())
print(df['value'].median())

value    4.5
dtype: float64
4.5


In [7]:
print(df.std()) # sample standard deviation (divides by n - 1)
print(df['value'].std())

value    3.02765
dtype: float64
3.0276503541


In [8]:
print(df.var()) # unbiased (sample) variance, i.e. the square of std()
print(df['value'].var())

value    9.166667
dtype: float64
9.16666666667


In [10]:
print(df.cov()) # unbiased covariance (binary); with a single column this is just its variance

          value
value  9.166667
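
Because `cov()` and `corr()` are binary statistics, they are easier to read with more than one column. The sketch below adds a hypothetical second column, `other`, which is simply twice `value`; it is not part of the original data.

```python
import pandas as pd

# 'other' is a hypothetical column added only for illustration.
df2 = pd.DataFrame({
    'value': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    'other': [0, 2, 4, 6, 8, 10, 12, 14, 16, 18],
})

cov = df2.cov()    # pairwise unbiased covariance matrix
corr = df2.corr()  # pairwise Pearson correlation matrix

print(cov)
print(corr)
```

Since `other` is an exact multiple of `value`, the off-diagonal covariance is twice the variance of `value` and the correlation is 1.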


## The rolling function

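The window size is the key parameter of rolling: it sets how many consecutive rows each statistic covers. By default the window slides in the direction of increasing index and each result is labeled at the last row of its window, so the first `window - 1` rows have no complete window and come out as NaN.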
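
To make the sliding behavior concrete, here is a minimal sketch that rebuilds a rolling mean by hand and checks it against pandas. The window size of 3 and the names `rolled` and `manual` are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'value': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})
window = 3

rolled = df['value'].rolling(window).mean()

# Manual equivalent: for row i, average rows i - window + 1 .. i (inclusive).
manual = []
for i in range(len(df)):
    if i + 1 < window:
        manual.append(float('nan'))  # not enough rows for a full window yet
    else:
        manual.append(df['value'].iloc[i - window + 1 : i + 1].mean())
manual = pd.Series(manual, index=df.index)

# Raises an AssertionError if the two results differ.
pd.testing.assert_series_equal(rolled, manual, check_names=False)
print(rolled.tolist())
```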

In [12]:
bars = {'value': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}
df = pd.DataFrame(bars)

In [15]:
print(df.rolling(2).std())

      value
0       NaN
1  0.707107
2  0.707107
3  0.707107
4  0.707107
5  0.707107
6  0.707107
7  0.707107
8  0.707107
9  0.707107
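
Every row shows the same value because each window holds two consecutive integers. The sketch below works through one such pair by hand, assuming the default sample standard deviation (dividing by n - 1); the pair `[3, 4]` is an arbitrary choice.

```python
import math
import pandas as pd

pair = pd.Series([3, 4])                    # any two consecutive integers
mean = pair.mean()                          # midpoint of the pair
squared_dev = ((pair - mean) ** 2).sum()    # 0.25 + 0.25 = 0.5
sample_std = math.sqrt(squared_dev / (len(pair) - 1))  # divide by n - 1 = 1

print(sample_std)   # square root of 0.5
print(pair.std())   # pandas result for the same pair
```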


In [18]:
print(df.rolling(2).mean())

   value
0    NaN
1    0.5
2    1.5
3    2.5
4    3.5
5    4.5
6    5.5
7    6.5
8    7.5
9    8.5


In [19]:
print(df.rolling(10).mean())
print(df.rolling(10).std())

   value
0    NaN
1    NaN
2    NaN
3    NaN
4    NaN
5    NaN
6    NaN
7    NaN
8    NaN
9    4.5
     value
0      NaN
1      NaN
2      NaN
3      NaN
4      NaN
5      NaN
6      NaN
7      NaN
8      NaN
9  3.02765
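
With a window as long as the data, only the last row has a full window, and its value matches the whole-column statistic. The sketch below checks that, and also shows that `min_periods=1` turns this particular case into a running statistic equivalent to `expanding()`. The variable names are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'value': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]})

# The only complete window of size 10 ends at the last row.
last_mean = df.rolling(10).mean()['value'].iloc[-1]
last_std = df.rolling(10).std()['value'].iloc[-1]

print(last_mean, df['value'].mean())
print(last_std, df['value'].std())

# min_periods=1 fills the earlier rows with statistics over the rows seen so far.
running = df.rolling(10, min_periods=1).mean()
expanding = df.expanding().mean()

# Raises an AssertionError if the two frames differ.
pd.testing.assert_frame_equal(running, expanding)
print(running['value'].tolist())
```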
