**Group a Time Series with Pandas**

https://chrisalbon.com/python/data_wrangling/pandas_group_by_time/

**Preliminaries**

In [1]:
import pandas as pd
import numpy as np

**Create a DataFrame**

In [2]:
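df = pd.DataFrame()

# Two columns of 100 random troop counts drawn from [20000, 30000)
df['german_army'] = np.random.randint(low=20000, high=30000, size=100)
df['allied_army'] = np.random.randint(low=20000, high=30000, size=100)

# Index the rows by 100 consecutive hours starting 2014-01-01;
# lowercase 'h' is the hourly alias ('H' is deprecated since pandas 2.2)
df.index = pd.date_range('1/1/2014', periods=100, freq='h')
df.head()

,german_army,allied_army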
2014-01-01 00:00:00,25478,28745
2014-01-01 01:00:00,29766,27702
2014-01-01 02:00:00,26285,22260
2014-01-01 03:00:00,28898,22922
2014-01-01 04:00:00,27168,29223
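
Because `np.random.randint` draws fresh values on every run, the numbers in these outputs will not reproduce exactly. A minimal sketch of a repeatable variant, assuming NumPy's `default_rng` generator and an arbitrary seed of 0 (neither appears in the original, and `demo` is a hypothetical name):

```python
import numpy as np
import pandas as pd

# Assumption: the seed value 0 is arbitrary; any fixed seed gives repeatable draws.
rng = np.random.default_rng(0)

demo = pd.DataFrame(
    {
        "german_army": rng.integers(low=20000, high=30000, size=100),
        "allied_army": rng.integers(low=20000, high=30000, size=100),
    },
    # 100 consecutive hourly timestamps starting at midnight on 2014-01-01
    index=pd.date_range("2014-01-01", periods=100, freq="h"),
)

print(demo.shape)      # (100, 2)
print(demo.index[-1])  # 2014-01-05 03:00:00
```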


**Truncate the DataFrame to the rows between two dates**

In [3]:
df.truncate(before='1/2/2014', after='1/3/2014')

,german_army,allied_army
2014-01-02 00:00:00,22747,21609
2014-01-02 01:00:00,24353,23548
2014-01-02 02:00:00,23773,29922
2014-01-02 03:00:00,24410,22179
2014-01-02 04:00:00,25758,22055
2014-01-02 05:00:00,27217,26780
2014-01-02 06:00:00,24864,21712
2014-01-02 07:00:00,23520,28567
2014-01-02 08:00:00,26937,22073
2014-01-02 09:00:00,21862,21028
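
`truncate` treats both bounds as exact timestamps, so `after='1/3/2014'` keeps rows up to and including midnight on January 3 and nothing later. String slicing with `.loc` behaves differently: a day-level string covers the whole day. A small sketch of the difference, using a stand-in frame (`demo` and its `value` column are assumptions for illustration):

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2014-01-01", periods=100, freq="h")
demo = pd.DataFrame({"value": np.arange(100)}, index=idx)

# Bounds become exact timestamps: 2014-01-02 00:00 through 2014-01-03 00:00
truncated = demo.truncate(before="1/2/2014", after="1/3/2014")

# Day-level strings cover whole days: 2014-01-02 00:00 through 2014-01-03 23:00
sliced = demo.loc["2014-01-02":"2014-01-03"]

print(len(truncated))  # 25
print(len(sliced))     # 48
```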


**Shift the DataFrame's index forward by 4 months and 5 days**

In [4]:
df.index = df.index + pd.DateOffset(months=4, days=5)

In [10]:
# View the DataFrame
df

,german_army,allied_army
2014-05-06 00:00:00,25478,28745
2014-05-06 01:00:00,29766,27702
2014-05-06 02:00:00,26285,22260
2014-05-06 03:00:00,28898,22922
2014-05-06 04:00:00,27168,29223
2014-05-06 05:00:00,25513,26977
2014-05-06 06:00:00,27091,22197
2014-05-06 07:00:00,28608,26701
2014-05-06 08:00:00,25409,20879
2014-05-06 09:00:00,24053,23480
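
`pd.DateOffset` is calendar-aware: it adds whole months and days rather than a fixed number of hours. A short sketch on stand-in values (the three-row index and the January 31 timestamp are assumptions chosen for illustration):

```python
import pandas as pd

idx = pd.date_range("2014-01-01", periods=3, freq="h")

# January 1 plus 4 months is May 1; plus 5 days is May 6
shifted = idx + pd.DateOffset(months=4, days=5)
print(shifted[0])  # 2014-05-06 00:00:00

# Month arithmetic clips to the last valid day of the target month
clipped = pd.Timestamp("2014-01-31") + pd.DateOffset(months=1)
print(clipped)  # 2014-02-28 00:00:00
```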


**Lag a variable 1 hour**

In [7]:
# A positive shift moves values forward in time, so each row holds the previous hour's value
df.shift(1).head()

,german_army,allied_army
2014-05-06 00:00:00,,
2014-05-06 01:00:00,25478.0,28745.0
2014-05-06 02:00:00,29766.0,27702.0
2014-05-06 03:00:00,26285.0,22260.0
2014-05-06 04:00:00,28898.0,22922.0


**Lead a variable 1 hour**

In [9]:
# A negative shift moves values backward in time, so each row holds the next hour's value
df.shift(-1).tail()

,german_army,allied_army
2014-05-09 23:00:00,28157.0,24356.0
2014-05-10 00:00:00,27698.0,28427.0
2014-05-10 01:00:00,21297.0,22008.0
2014-05-10 02:00:00,28808.0,22798.0
2014-05-10 03:00:00,,
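
The direction of `shift` is easy to mix up, so here is a tiny deterministic sketch. A positive shift gives each row the previous observation (a lag); a negative shift gives each row the next one (a lead). Passing `freq` moves the index instead of the values. The series `s` and its values are assumptions for illustration:

```python
import pandas as pd

s = pd.Series(
    [10, 20, 30],
    index=pd.date_range("2014-05-06", periods=3, freq="h"),
)

lagged = s.shift(1)           # NaN, 10, 20: each row sees the previous hour
led = s.shift(-1)             # 20, 30, NaN: each row sees the next hour
moved = s.shift(1, freq="h")  # values unchanged, timestamps moved one hour later

print(lagged.tolist())
print(led.tolist())
print(moved.index[0])  # 2014-05-06 01:00:00
```

Shifting the values introduces a missing value at one end, which is why the integer columns in the outputs are displayed as floats.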


**Aggregate into days by summing the hourly observations**

In [11]:
df.resample('D').sum()

,german_army,allied_army
2014-05-06,612957,597673
2014-05-07,597124,591586
2014-05-08,607264,595437
2014-05-09,606337,600693
2014-05-10,105960,97589
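
The total for 2014-05-10 is far smaller than the others because the index ends at 03:00 that day, so that bin holds only four hourly rows. One way to spot such partial bins, sketched on a stand-in frame (`demo`, a column of ones, is an assumption for illustration), is to count the rows per day:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2014-05-06", periods=100, freq="h")
demo = pd.DataFrame({"value": np.ones(100, dtype=int)}, index=idx)

# Number of hourly rows that fall into each daily bin
counts = demo.resample("D").count()
print(counts["value"].tolist())  # [24, 24, 24, 24, 4]

# Keep only the sums for days with a full 24 observations
full_days = demo.resample("D").sum()[counts["value"] == 24]
print(len(full_days))  # 4
```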


**Aggregate into days by averaging the hourly observations**

In [12]:
df.resample('D').mean()

,german_army,allied_army
2014-05-06,25539.875,24903.041667
2014-05-07,24880.166667,24649.416667
2014-05-08,25302.666667,24809.875
2014-05-09,25264.041667,25028.875
2014-05-10,26490.0,24397.25


**Aggregate into days by taking the minimum of each day’s hourly observations**

In [13]:
df.resample('D').min()

,german_army,allied_army
2014-05-06,20788,20034
2014-05-07,20617,20761
2014-05-08,20083,20196
2014-05-09,20748,20182
2014-05-10,21297,22008


**Aggregate into days by taking the median of each day’s hourly observations**

In [14]:
df.resample('D').median()

,german_army,allied_army
2014-05-06,25495.5,25480.0
2014-05-07,24818.0,24239.5
2014-05-08,25696.0,25061.0
2014-05-09,25549.0,25640.5
2014-05-10,27927.5,23577.0


**Aggregate into days by taking the first of each day’s hourly observations**

In [15]:
df.resample('D').first()

,german_army,allied_army
2014-05-06,25478,28745
2014-05-07,22747,21609
2014-05-08,29732,27427
2014-05-09,27313,20182
2014-05-10,28157,24356


**Aggregate into days by taking the last of each day’s hourly observations**

In [16]:
df.resample('D').last()

,german_army,allied_army
2014-05-06,23496,20034
2014-05-07,20617,26338
2014-05-08,28664,26282
2014-05-09,26097,20898
2014-05-10,28808,22798
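
The separate calls to `sum`, `mean`, `min`, `median`, `first`, and `last` can also be combined into a single `agg` call, which returns one column per statistic. A minimal sketch on a stand-in series (the `value` column holding 0 through 47 is an assumption chosen so the results are easy to check by hand):

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2014-05-06", periods=48, freq="h")
demo = pd.DataFrame({"value": np.arange(48)}, index=idx)

# One row per day, one column per statistic
daily = demo["value"].resample("D").agg(
    ["sum", "mean", "min", "median", "first", "last"]
)

print(daily["sum"].tolist())   # [276, 852]
print(daily["last"].tolist())  # [23, 47]
```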


**Aggregate into days by taking the first (open), highest, lowest, and last (close) value of each day’s hourly observations**

In [17]:
df.resample('D').ohlc()

,german_army,german_army,german_army,german_army,allied_army,allied_army,allied_army,allied_army
,open,high,low,close,open,high,low,close
2014-05-06,25478,29766,20788,23496,28745,29897,20034,20034
2014-05-07,22747,29394,20617,20617,21609,29922,20761,26338
2014-05-08,29732,29732,20083,28664,27427,29754,20196,26282
2014-05-09,27313,29691,20748,26097,20182,29843,20182,20898
2014-05-10,28157,28808,21297,28808,24356,28427,22008,22798
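
As a sanity check on what `ohlc` computes, the sketch below compares it against `first`, `max`, `min`, and `last` taken separately. The six-value series `s` is an assumption chosen so that three observations fall on each of two days:

```python
import pandas as pd

# Hours 21:00 to 23:00 fall on 2014-05-06; 00:00 to 02:00 fall on 2014-05-07
s = pd.Series(
    [3, 1, 4, 1, 5, 9],
    index=pd.date_range("2014-05-06 21:00", periods=6, freq="h"),
)

bars = s.resample("D").ohlc()

# Rebuild the same table from the individual aggregations
manual = s.resample("D").agg(["first", "max", "min", "last"])
manual.columns = ["open", "high", "low", "close"]

print(bars.values.tolist())                  # [[3, 4, 1, 4], [1, 9, 1, 9]]
print((bars.values == manual.values).all())  # True
```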
