# Time Series 

In [27]:
# Load data set

import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/retomarek/edap/main/edap/sampleData/flatTempHum.csv",
                 sep = ";")
df['time'] = pd.to_datetime(df['time'], format='%Y-%m-%d %H:%M:%S')

df.head()

,time,FlatA_Hum,FlatA_Temp,FlatB_Hum,FlatB_Temp,FlatC_Hum,FlatC_Temp,FlatD_Hum,FlatD_Temp
0,2018-10-03 00:00:00,53.0,24.43,38.8,22.4,44.0,24.5,49.0,24.43
1,2018-10-03 01:00:00,53.0,24.4,38.8,22.4,44.0,24.5,49.0,24.4
2,2018-10-03 02:00:00,53.0,24.4,39.3,22.4,44.7,24.5,48.3,24.38
3,2018-10-03 03:00:00,53.0,24.4,40.3,22.4,45.0,24.5,48.0,24.33
4,2018-10-03 04:00:00,53.3,24.4,41.0,22.37,45.2,24.5,47.7,24.3


## Datetime index

In [10]:
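# use the time column as the index (drop=True removes it from the regular columns)
df = df.set_index("time", drop=True)

# drop rows with duplicated timestamps, keeping the first occurrence
df = df[~df.index.duplicated(keep='first')]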

df.head()

time,FlatA_Hum,FlatA_Temp,FlatB_Hum,FlatB_Temp,FlatC_Hum,FlatC_Temp,FlatD_Hum,FlatD_Temp
2018-10-03 00:00:00,53.0,24.43,38.8,22.4,44.0,24.5,49.0,24.43
2018-10-03 01:00:00,53.0,24.4,38.8,22.4,44.0,24.5,49.0,24.4
2018-10-03 02:00:00,53.0,24.4,39.3,22.4,44.7,24.5,48.3,24.38
2018-10-03 03:00:00,53.0,24.4,40.3,22.4,45.0,24.5,48.0,24.33
2018-10-03 04:00:00,53.3,24.4,41.0,22.37,45.2,24.5,47.7,24.3


```{note}
The default integer index (0, 1, 2, ...) is gone and the `time` column is now the index.
```
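The two steps above can be tried on a small frame without downloading anything. This is a minimal sketch: the `toy` data (including the deliberately duplicated 01:00 row) is made up for illustration and is not part of the sample data set.

```python
import pandas as pd

# Hypothetical toy data with one duplicated timestamp (01:00 appears twice)
toy = pd.DataFrame(
    {
        "time": [
            "2018-10-03 00:00:00",
            "2018-10-03 01:00:00",
            "2018-10-03 01:00:00",
            "2018-10-03 02:00:00",
        ],
        "FlatA_Temp": [24.43, 24.40, 99.0, 24.40],
    }
)
toy["time"] = pd.to_datetime(toy["time"], format="%Y-%m-%d %H:%M:%S")
toy = toy.set_index("time", drop=True)

# duplicated(keep="first") marks every repeat after the first occurrence
dup_mask = toy.index.duplicated(keep="first")
toy_clean = toy[~dup_mask]

# With a DatetimeIndex, rows can be selected with date strings
first_two_hours = toy_clean.loc["2018-10-03 00:00":"2018-10-03 01:00"]
print(first_two_hours)
```

Unlike positional slicing, a label slice on a datetime index includes both end points, so `first_two_hours` contains the 00:00 and the 01:00 row.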

## Upsampling
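Upsampling increases the sampling frequency, for example from hourly to 15-minute values. The new timestamps have no measurements, so they are filled here by linear interpolation between the neighbouring hourly values.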

In [21]:
df15min = df.resample("15min").interpolate(method="linear")
df15min.head()

time,FlatA_Hum,FlatA_Temp,FlatB_Hum,FlatB_Temp,FlatC_Hum,FlatC_Temp,FlatD_Hum,FlatD_Temp
2018-10-03 00:00:00,53.0,24.43,38.8,22.4,44.0,24.5,49.0,24.43
2018-10-03 00:15:00,53.0,24.4225,38.8,22.4,44.0,24.5,49.0,24.4225
2018-10-03 00:30:00,53.0,24.415,38.8,22.4,44.0,24.5,49.0,24.415
2018-10-03 00:45:00,53.0,24.4075,38.8,22.4,44.0,24.5,49.0,24.4075
2018-10-03 01:00:00,53.0,24.4,38.8,22.4,44.0,24.5,49.0,24.4


```{note}
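Other upsampling methods are
 - `.interpolate(method="linear")`: straight line between the known values
 - `.interpolate(method="spline", order=2)`: gives a smoother, more natural-looking curve (requires SciPy)
 - `.bfill()[:15]`: backward fill, each gap takes the next known value (the `[:15]` only limits the output to the first 15 rows)
 - `.ffill()[:15]`: forward fill, each gap takes the last known value (replaces the deprecated `.pad()`)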
 ```
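To see how the fill methods differ, they can be compared side by side on two hourly values. A minimal sketch, assuming a made-up series `hourly` that is not part of the sample data:

```python
import pandas as pd

# Hypothetical toy series: two hourly temperature readings
hourly = pd.Series(
    [20.0, 24.0],
    index=pd.to_datetime(["2018-10-03 00:00", "2018-10-03 01:00"]),
    name="temp",
)

# Upsample to 15 minutes with three different fill strategies
comparison = pd.DataFrame(
    {
        "linear": hourly.resample("15min").interpolate(method="linear"),
        "ffill": hourly.resample("15min").ffill(),
        "bfill": hourly.resample("15min").bfill(),
    }
)
print(comparison)
```

Linear interpolation ramps evenly from 20 to 24, forward fill holds 20 until the next reading arrives, and backward fill jumps to 24 right after the first reading.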

```{note}
Other Frequencies

![resampleOptions](/images/pythonBasics_resampleOptions.png)
```
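The rule passed to `resample` is a pandas offset alias, optionally prefixed with a multiplier. The sketch below counts the resulting rows for a few common aliases. The series `toy` and the chosen rules are illustrative assumptions, not taken from the sample data.

```python
import pandas as pd

# Hypothetical toy data: 48 hourly values starting on Monday, 2018-10-01
index = pd.date_range("2018-10-01", periods=48, freq="60min")
toy = pd.Series(range(48), index=index, dtype="float64")

# "30min" = 30 minutes, "60min" = hourly, "D" = calendar day, "W" = week
rows_per_rule = {
    rule: len(toy.resample(rule).mean())
    for rule in ["30min", "60min", "D", "W"]
}
print(rows_per_rule)
```

Rules finer than the original spacing produce more rows (with gaps to fill), while coarser rules produce fewer.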

## Downsampling
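Downsampling decreases the sampling frequency, for example from hourly to daily values. All samples that fall into the same day are aggregated into a single value, here the mean.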

In [25]:
dfDaily = df.resample("D").mean()
dfDaily.head()

time,FlatA_Hum,FlatA_Temp,FlatB_Hum,FlatB_Temp,FlatC_Hum,FlatC_Temp,FlatD_Hum,FlatD_Temp
2018-10-03,50.547826,24.200435,43.379167,22.627917,46.070833,24.6325,50.652381,24.458571
2018-10-04,54.033333,24.232083,47.366667,22.7975,47.8125,24.655417,48.078261,24.489565
2018-10-05,52.682609,24.210435,47.858333,23.095417,47.9625,24.7475,50.533333,24.62125
2018-10-06,52.708696,24.180435,49.645833,23.3225,51.183333,24.6,52.054167,24.57375
2018-10-07,56.058333,24.296667,51.6625,23.587083,53.541667,24.565417,50.973913,24.384783


```{note}
Common aggregation methods for downsampling are
 - `.min()`
 - `.max()`
 - `.median()`
 - `.mean()`
 - `.sum()`
 - etc.
 ```
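Several aggregations can also be computed in one pass with `.agg()`. A minimal sketch, assuming a made-up hourly frame `toy` whose temperature repeats the same 24-hour ramp each day:

```python
import pandas as pd

# Hypothetical toy data: two days of hourly values, ramping from 20 to 43 each day
index = pd.date_range("2018-10-03", periods=48, freq="60min")
toy = pd.DataFrame(
    {"FlatA_Temp": [20.0 + (i % 24) for i in range(48)]},
    index=index,
)

# One row per day, one column per aggregation
daily = toy["FlatA_Temp"].resample("D").agg(["min", "max", "mean"])
print(daily)
```

This keeps the daily minimum, maximum and mean together in one table instead of resampling three times.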