# Datetime Aggregates in Pandas Tutorial

This notebook explains how to aggregate time series data in `pandas`, using gold and silver price data from `rdatasets`.

### Packages

The documentation for each package used in this tutorial is linked below:
* [pandas](https://pandas.pydata.org/docs/)
* [statsmodels](https://www.statsmodels.org/stable/index.html)
    * [statsmodels.api](https://www.statsmodels.org/stable/api.html#statsmodels-api)

In [1]:
import statsmodels.api as sm
import pandas as pd

## Create initial dataset

The data comes from `rdatasets` and is loaded through the Python package `statsmodels`.

In [2]:
# Load the GoldSilver data; the dates arrive as strings in the `rownames` column (see the output below)
df = sm.datasets.get_rdataset('GoldSilver', 'AER').data
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9132 entries, 0 to 9131
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   rownames  9132 non-null   object 
 1   gold      9132 non-null   float64
 2   silver    9132 non-null   float64
dtypes: float64(2), object(1)
memory usage: 214.2+ KB


In [3]:
df['date'] = pd.to_datetime(df['rownames'])  # parse the date strings held in `rownames`
df = df.drop(columns='rownames')  # keep only the parsed `date` column alongside the prices

## Time series aggregation

The `pandas` method `rolling` creates aggregations over windows of a specified length. Here, the daily gold and silver prices are aggregated over the prior week.

First, a datetime index needs to be created from the **date** column.

In [None]:
df.set_index('date', inplace=True)
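
A time-based window (an offset such as `'3D'` for three days) needs timestamps to measure against, which is why the dates are moved into the index. As a minimal sketch on made-up data (the `toy` frame and its prices are hypothetical, not the GoldSilver data), the same rolling mean can be computed either from a datetime index or by naming a datetime column through the `on` parameter of `rolling`:

```python
import pandas as pd

# Hypothetical toy data: five consecutive days of made-up prices
toy = pd.DataFrame({
    'date': pd.to_datetime(['2021-01-04', '2021-01-05', '2021-01-06',
                            '2021-01-07', '2021-01-08']),
    'price': [10.0, 11.0, 12.0, 13.0, 14.0],
})

# Option 1: move the dates into the index, then roll over a 3-day window
by_index = toy.set_index('date')['price'].rolling('3D').mean()

# Option 2: keep the default integer index and name the datetime column with `on`
by_column = toy.rolling('3D', on='date').mean()['price']

print(by_index.tolist())
print(by_column.tolist())
```

Both options should print the same values. Setting the index, as this notebook does, keeps the dates attached to the aggregated result.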

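The window passed to `rolling` below is the offset **'7D'**, which represents 7 days. If an integer is passed instead of an offset, the window covers that many of the most recent observations (including the current one) rather than a span of time.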

In [None]:
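# Rolling 7-day window ending at each date (requires the datetime index set above)
weekly_rolling = df.rolling('7D')
aggregated_df = weekly_rolling.agg(['min', 'mean', 'max', 'std'])
# Flatten the (column, statistic) MultiIndex into names such as gold_min_week
aggregated_df.columns = ['_'.join(col) + '_week' for col in aggregated_df.columns]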
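
To make the difference between an offset window and an integer window concrete, here is a small sketch on made-up data (the series `s` and its values are hypothetical). Because the dates skip a weekend, a `'3D'` window and a window of `3` observations pick up different rows:

```python
import pandas as pd

# Hypothetical prices on business days that straddle a weekend
s = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.to_datetime(['2021-01-07', '2021-01-08', '2021-01-11', '2021-01-12']),
)

# Offset window: every observation in the 3 calendar days ending at each date
by_time = s.rolling('3D').sum()

# Integer window: exactly the last 3 observations, whatever the gaps between them
by_count = s.rolling(3).sum()

print(pd.DataFrame({'3D': by_time, '3 obs': by_count}))
```

The integer window returns `NaN` until it has seen three observations, because `min_periods` defaults to the window size for integer windows and to 1 for offset windows.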

In [None]:
aggregated_df.head(20)

| date | gold_min_week | gold_mean_week | gold_max_week | gold_std_week | silver_min_week | silver_mean_week | silver_max_week | silver_std_week |
|---|---|---|---|---|---|---|---|---|
| 1977-12-30 | 100.0 | 100.0 | 100.0 | NaN | 223.42 | 223.42 | 223.42 | NaN |
| 1978-01-02 | 100.0 | 100.0 | 100.0 | 0.0 | 223.42 | 223.42 | 223.42 | 0.0 |
| 1978-01-03 | 100.0 | 100.0 | 100.0 | 0.0 | 223.42 | 225.56 | 229.84 | 3.706589 |
| 1978-01-04 | 100.0 | 100.0 | 100.0 | 0.0 | 223.42 | 225.315 | 229.84 | 3.065828 |
| 1978-01-05 | 100.0 | 100.0 | 100.0 | 0.0 | 223.42 | 225.85 | 229.84 | 2.912147 |
| 1978-01-06 | 100.0 | 100.0 | 100.0 | 0.0 | 223.42 | 226.604 | 229.84 | 2.596657 |
| 1978-01-09 | 100.0 | 100.246 | 101.23 | 0.550073 | 224.58 | 227.844 | 229.84 | 2.13547 |
| 1978-01-10 | 100.0 | 100.436 | 101.23 | 0.605169 | 224.58 | 227.67 | 229.62 | 1.960446 |
| 1978-01-11 | 100.0 | 100.886 | 102.25 | 0.94246 | 227.19 | 228.998 | 231.22 | 1.54999 |
| 1978-01-12 | 100.0 | 101.062 | 102.25 | 0.808251 | 227.19 | 228.978 | 231.22 | 1.566802 |
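
As a sanity check on the output above, the silver figures for 1978-01-03 can be reproduced by hand. The three prices below are inferred from the min, mean and max columns of the first three rows (223.42, 223.42 and 229.84), and the variable names are illustrative. This is a sketch of how one window is formed, assuming the default right-closed window:

```python
import pandas as pd

# First three silver prices, inferred from the aggregated output above
silver = pd.Series(
    [223.42, 223.42, 229.84],
    index=pd.to_datetime(['1977-12-30', '1978-01-02', '1978-01-03']),
)

# The 7-day window ending on 1978-01-03 is the interval (1977-12-27, 1978-01-03]
end = pd.Timestamp('1978-01-03')
start = end - pd.Timedelta('7D')
window = silver[(silver.index > start) & (silver.index <= end)]

# Manual aggregates for that single window
manual_mean = window.mean()
manual_std = window.std()  # sample standard deviation (ddof=1), the rolling default

# The same statistics computed by rolling, read off at the window's end date
rolled = silver.rolling('7D').agg(['mean', 'std']).loc[end]

print(manual_mean, manual_std)
print(rolled['mean'], rolled['std'])
```

The manual and rolling values should agree with the `silver_mean_week` and `silver_std_week` entries for 1978-01-03.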
