# Working with Time Series in Pandas

###### In this notebook, we'll look at some tools and strategies for further preparing our Pandas DataFrames for use, focusing on selecting data by time of day and by date.

In [2]:
# We begin by importing the modules we will be using

import pandas as pd

from pandas_formatting import pandas_formatting  # pandas_formatting.py must be in the working directory

In [3]:
df_306 = pandas_formatting()

In [3]:
df_306

time,Temp_Exterieur
2019-02-18 19:57:44.659000+01:00,8.804548
2019-02-18 19:59:44.656000+01:00,8.728136
2019-02-18 20:01:44.652000+01:00,8.670820
2019-02-18 20:03:44.649000+01:00,8.623060
2019-02-18 20:05:44.645000+01:00,8.594408
2019-02-18 20:07:44.642000+01:00,8.546648
2019-02-18 20:09:44.639000+01:00,8.508439
2019-02-18 20:11:44.635000+01:00,8.460680
2019-02-18 20:13:44.721000+01:00,8.403369
2019-02-18 20:15:46.507000+01:00,8.346058


###### Suppose we want to split our data into different periods of the day, such as night and day, across the entire range of data. We can create two new DataFrames with the 'between_time' method, which takes the start and end times as arguments. When the start time is later than the end time, as in the night case, the selection wraps around midnight.

In [4]:
df_306_night = df_306.between_time('19:00:00', '08:00:00')
df_306_day = df_306.between_time('08:00:00', '19:00:00')

###### df_306_day = df_306.between_time(start_time='08:00', end_time='19:00', inclusive='both') is another way to write this. The older include_start and include_end arguments were deprecated in pandas 1.4 and removed in 2.0 in favor of inclusive.
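To make the endpoint handling and the wrap around midnight concrete, here is a minimal sketch on synthetic hourly data. The `demo` frame and its values are made up for illustration and are not the notebook's data, and the `inclusive` argument assumes pandas 1.4 or later.

```python
import pandas as pd

# Hypothetical stand-in for df_306: two days of hourly readings.
idx = pd.date_range("2019-02-18", periods=48, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame({"Temp_Exterieur": [float(i) for i in range(48)]}, index=idx)

# Start later than end: the selection wraps around midnight.
night = demo.between_time("19:00", "08:00")
day = demo.between_time("08:00", "19:00")

# Both endpoints are inclusive by default, so rows stamped exactly
# 08:00 or 19:00 appear in both frames.
overlap = night.index.intersection(day.index)

# Assumption: pandas >= 1.4, where inclusive="left" keeps the start
# time and drops the end time, so the two frames no longer overlap.
night_only = demo.between_time("19:00", "08:00", inclusive="left")
day_only = demo.between_time("08:00", "19:00", inclusive="left")

print(len(night), len(day), len(overlap))
print(len(night_only) + len(day_only) == len(demo))
```

With irregular timestamps like those in df_306, a reading rarely falls exactly on 08:00:00 or 19:00:00, so the overlap is probably empty in practice, but it is worth checking before combining the two frames.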

###### Below we can see that our new DataFrames correspond to the times we've chosen.

###### Here we have our nighttime DataFrame:

In [5]:
df_306_night.head(), df_306_night.tail()

(                                  Temp_Exterieur
 time                                            
 2019-02-18 19:57:44.659000+01:00        8.804548
 2019-02-18 19:59:44.656000+01:00        8.728136
 2019-02-18 20:01:44.652000+01:00        8.670820
 2019-02-18 20:03:44.649000+01:00        8.623060
 2019-02-18 20:05:44.645000+01:00        8.594408,
                                   Temp_Exterieur
 time                                            
 2019-07-11 07:31:50.109000+02:00       17.171202
 2019-07-11 07:37:40.096000+02:00       17.188635
 2019-07-11 07:43:30.186000+02:00       17.223492
 2019-07-11 07:49:20.176000+02:00       17.249640
 2019-07-11 07:55:10.169000+02:00       17.336788)

###### Here we have our daytime DataFrame:

In [6]:
df_306_day.head(), df_306_day.tail()

(                                  Temp_Exterieur
 time                                            
 2019-02-19 08:01:44.624000+01:00        5.394508
 2019-02-19 08:03:44.720000+01:00        5.413615
 2019-02-19 08:05:44.718000+01:00        5.432716
 2019-02-19 08:07:44.714000+01:00        5.461374
 2019-02-19 08:09:44.712000+01:00        5.480476,
                                   Temp_Exterieur
 time                                            
 2019-07-11 15:26:40.130000+02:00       28.884310
 2019-07-11 15:32:30.118000+02:00       28.866880
 2019-07-11 15:38:20.113000+02:00       28.771011
 2019-07-11 15:44:10.099000+02:00       28.727436
 2019-07-11 15:50:00.193000+02:00       28.762297)

### If we want to look at the data for just one day, we can pass a date string to the 'loc' indexer. Writing the date in the same format that appears in the index (YYYY-MM-DD) keeps it unambiguous:

In [7]:
df_306_one_day = df_306.loc['2019-04-29']

In [8]:
df_306_one_day

time,Temp_Exterieur
2019-04-29 00:02:09.023000+02:00,7.801596
2019-04-29 00:07:59.024000+02:00,7.725179
2019-04-29 00:13:49.003000+02:00,7.648766
2019-04-29 00:19:38.994000+02:00,7.553247
2019-04-29 00:25:28.984000+02:00,7.457727
2019-04-29 00:31:19.074000+02:00,7.333551
2019-04-29 00:37:09.065000+02:00,7.228480
2019-04-29 00:42:59.054000+02:00,7.123410
2019-04-29 00:48:49.044000+02:00,7.018340
2019-04-29 00:54:39.035000+02:00,6.961025


#### We can use the same approach to get the data for one month by giving only the year and month:

In [9]:
df_306_one_month = df_306.loc['2019-04']

In [10]:
df_306_one_month

time,Temp_Exterieur
2019-04-01 00:03:19.068000+02:00,12.630631
2019-04-01 00:09:09.055000+02:00,12.534765
2019-04-01 00:14:59.089000+02:00,12.456327
2019-04-01 00:20:49.038000+02:00,12.377892
2019-04-01 00:26:39.031000+02:00,12.273312
2019-04-01 00:32:29.019000+02:00,12.203590
2019-04-01 00:38:19.008000+02:00,12.133871
2019-04-01 00:44:08.999000+02:00,12.064148
2019-04-01 00:49:58.989000+02:00,11.994431
2019-04-01 00:55:48.980000+02:00,11.959570
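The day and month selections above rely on what pandas calls partial string indexing: a string that names only a day or a month selects every row inside that period. Here is a minimal sketch on synthetic data; the `demo` frame is hypothetical, and the last lookup assumes pandas' date parser reads `4/29/2019` as month/day/year.

```python
import pandas as pd

# Hypothetical stand-in for df_306: one reading every 6 hours.
idx = pd.date_range("2019-03-30", "2019-05-02", freq=pd.Timedelta(hours=6))
demo = pd.DataFrame({"Temp_Exterieur": list(range(len(idx)))}, index=idx)

one_day = demo.loc["2019-04-29"]    # every row stamped on 29 April
one_month = demo.loc["2019-04"]     # every row stamped in April

# Assumption: this spelling parses to the same date, so it selects
# the same rows as the ISO form.
same_day = demo.loc["4/29/2019"]

print(len(one_day), len(one_month), one_day.equals(same_day))
```

Sticking to the ISO form used in the index avoids any day/month ambiguity, which is likely the safest habit even though other spellings can work.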


#### We can also get a slice of several consecutive days with the same indexer, using a start and an end date separated by a colon. Both endpoints are included, so the whole of the end date is part of the result:

In [11]:
df_306_days_slice = df_306.loc['2019-05-22':'2019-05-29']

In [12]:
df_306_days_slice

time,Temp_Exterieur
2019-05-22 00:01:28.987000+02:00,12.081581
2019-05-22 00:07:18.979000+02:00,11.985712
2019-05-22 00:13:09.076000+02:00,11.915993
2019-05-22 00:18:59.062000+02:00,11.837557
2019-05-22 00:24:49.051000+02:00,11.759121
2019-05-22 00:30:39.044000+02:00,11.680683
2019-05-22 00:36:29.034000+02:00,11.602247
2019-05-22 00:42:19.023000+02:00,11.523815
2019-05-22 00:48:09.012000+02:00,11.454090
2019-05-22 00:53:59.005000+02:00,11.358227
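Unlike ordinary Python slices, a label slice with date strings keeps the end label, and a day-level string covers that whole day. The sketch below checks this on a hypothetical `demo` frame with two readings per day; it assumes the index is sorted, as a time series index normally is.

```python
import pandas as pd

# Hypothetical stand-in for df_306: readings at 00:00 and 12:00 each day.
idx = pd.date_range("2019-05-20", "2019-05-31", freq=pd.Timedelta(hours=12))
demo = pd.DataFrame({"Temp_Exterieur": list(range(len(idx)))}, index=idx)

days = demo.loc["2019-05-22":"2019-05-29"]

# The last row is the noon reading on the 29th, not midnight on the 29th.
print(days.index.min(), days.index.max(), len(days))
```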


#### We can even use this approach to get a slice between two times (here just one hour), as shown below. Again, writing the timestamps in the same format as the index keeps them unambiguous.

In [13]:
df_306_hour_slice = df_306.loc['2019-05-25 12:00:00':'2019-05-25 13:00:00']

In [14]:
df_306_hour_slice

time,Temp_Exterieur
2019-05-25 12:04:58.992000+02:00,22.66172
2019-05-25 12:10:48.983000+02:00,22.722729
2019-05-25 12:15:28.977000+02:00,23.11491
2019-05-25 12:21:19.067000+02:00,22.731443
2019-05-25 12:24:49.063000+02:00,22.32183
2019-05-25 12:27:09.059000+02:00,21.947084
2019-05-25 12:32:59.047000+02:00,22.051664
2019-05-25 12:37:39.042000+02:00,22.374125
2019-05-25 12:43:29.030000+02:00,22.66172
2019-05-25 12:49:19.024000+02:00,23.01904
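Because the index here is timezone-aware, it helps to know how strings without a UTC offset are read: pandas interprets them as local time in the index's timezone. The sketch below, on a hypothetical `demo` frame with an assumed `Europe/Paris` timezone, compares the string slice with explicit `pd.Timestamp` bounds and shows one way to exclude the end time.

```python
import pandas as pd

tz = "Europe/Paris"  # assumed timezone, chosen to match the +02:00 offset shown

# Hypothetical stand-in for df_306: one reading every 10 minutes.
idx = pd.date_range("2019-05-25 11:00", "2019-05-25 14:00",
                    freq=pd.Timedelta(minutes=10), tz=tz)
demo = pd.DataFrame({"Temp_Exterieur": list(range(len(idx)))}, index=idx)

# String bounds without an offset are read in the index's timezone.
by_string = demo.loc["2019-05-25 12:00:00":"2019-05-25 13:00:00"]

# Equivalent selection with explicit timezone-aware timestamps.
start = pd.Timestamp("2019-05-25 12:00", tz=tz)
end = pd.Timestamp("2019-05-25 13:00", tz=tz)
by_timestamp = demo.loc[start:end]

# A boolean mask gives a half-open interval that drops the end time.
half_open = demo.loc[(demo.index >= start) & (demo.index < end)]

print(len(by_string), by_string.equals(by_timestamp), len(half_open))
```

The half-open form is useful when consecutive hourly slices are processed one after another, since no row is then counted twice.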
