# Time Series

<b>Definition</b> from <a href='https://onlinecourses.science.psu.edu/stat510/node/47'>here</a>: <i>a sequence of measurements of the same variable collected over time</i>. Examples: stock prices, demand, housing prices.

In pandas, a time series is a `Series` whose index is made of timestamps (a `DatetimeIndex`).
Online reference: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html

### Create time series

`pd.date_range()` creates a sequence of dates (a `DatetimeIndex`).

In [1]:
import pandas as pd
dates = pd.date_range('2015-01-01', '2018-01-01')
dates

DatetimeIndex(['2015-01-01', '2015-01-02', '2015-01-03', '2015-01-04',
               '2015-01-05', '2015-01-06', '2015-01-07', '2015-01-08',
               '2015-01-09', '2015-01-10',
               ...
               '2017-12-23', '2017-12-24', '2017-12-25', '2017-12-26',
               '2017-12-27', '2017-12-28', '2017-12-29', '2017-12-30',
               '2017-12-31', '2018-01-01'],
              dtype='datetime64[ns]', length=1097, freq='D')

A time series is a pandas `Series` with a `DatetimeIndex` as its index.

In [2]:
import numpy as np
ts = pd.Series(np.random.random(len(dates)), index=dates)  # assign random values to the time series

In [3]:
ts.head()

2015-01-01    0.921144
2015-01-02    0.123330
2015-01-03    0.621593
2015-01-04    0.288880
2015-01-05    0.018699
Freq: D, dtype: float64

More about generating date ranges

In [4]:
# 20 days starting from 4/1/2012
pd.date_range(start='4/1/2012', periods = 20)

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20'],
              dtype='datetime64[ns]', freq='D')

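Modify the spacing by altering the `freq` argument.

Some pandas frequency (offset) codes:
- D: calendar day
- W: weekly
- M: month end
- Q: quarter end
- A: year end
- H: hourly
- B: business day

Newer pandas releases (2.2 and later) deprecate `M`, `Q`, `A` and `H` in favor of `ME`, `QE`, `YE` and `h`; the outputs shown here come from an older version.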
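
As a small illustration of how the frequency code changes the generated dates, the sketch below uses `B` and `W`. The variable names are only for illustration. Since 2012-04-01 falls on a Sunday, the business-day range should start on the following Monday, while the weekly range (anchored on Sundays by default) should include the start date itself.

```python
import pandas as pd

# 2012-04-01 is a Sunday, so the first business day is Monday 2012-04-02
business_days = pd.date_range(start='2012-04-01', periods=5, freq='B')

# 'W' is anchored on Sundays by default, so the start date is kept
weekly = pd.date_range(start='2012-04-01', periods=3, freq='W')

print(business_days)
print(weekly)
```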

The last calendar day of each of five consecutive months, starting from 4/1/2012

In [56]:
pd.date_range(start='4/1/2012', periods = 5, freq='M')

DatetimeIndex(['2012-04-30', '2012-05-31', '2012-06-30', '2012-07-31',
               '2012-08-31'],
              dtype='datetime64[ns]', freq='M')

### Indexing, Selection, Subsetting
A time series is an ordinary `Series`, so indexing and label-based selection behave the same way.

You can pass a string that is interpretable as a date

In [6]:
ts['2015-01-01']

0.9211439873578869

You can pass a year or a year-month string to get a slice of the data

In [7]:
ts['2015-01']

2015-01-01    0.921144
2015-01-02    0.123330
2015-01-03    0.621593
2015-01-04    0.288880
2015-01-05    0.018699
2015-01-06    0.511649
2015-01-07    0.568564
2015-01-08    0.290390
2015-01-09    0.104577
2015-01-10    0.055022
2015-01-11    0.338301
2015-01-12    0.313413
2015-01-13    0.802317
2015-01-14    0.192736
2015-01-15    0.735822
2015-01-16    0.515849
2015-01-17    0.180308
2015-01-18    0.129479
2015-01-19    0.596362
2015-01-20    0.481404
2015-01-21    0.075469
2015-01-22    0.328249
2015-01-23    0.065641
2015-01-24    0.276494
2015-01-25    0.415805
2015-01-26    0.330468
2015-01-27    0.684715
2015-01-28    0.014834
2015-01-29    0.860575
2015-01-30    0.343180
2015-01-31    0.934729
Freq: D, dtype: float64

You can slice with a range

In [8]:
ts['2015-01':'2015-02']

2015-01-01    0.921144
2015-01-02    0.123330
2015-01-03    0.621593
2015-01-04    0.288880
2015-01-05    0.018699
2015-01-06    0.511649
2015-01-07    0.568564
2015-01-08    0.290390
2015-01-09    0.104577
2015-01-10    0.055022
2015-01-11    0.338301
2015-01-12    0.313413
2015-01-13    0.802317
2015-01-14    0.192736
2015-01-15    0.735822
2015-01-16    0.515849
2015-01-17    0.180308
2015-01-18    0.129479
2015-01-19    0.596362
2015-01-20    0.481404
2015-01-21    0.075469
2015-01-22    0.328249
2015-01-23    0.065641
2015-01-24    0.276494
2015-01-25    0.415805
2015-01-26    0.330468
2015-01-27    0.684715
2015-01-28    0.014834
2015-01-29    0.860575
2015-01-30    0.343180
2015-01-31    0.934729
2015-02-01    0.707837
2015-02-02    0.055182
2015-02-03    0.891621
2015-02-04    0.667732
2015-02-05    0.145804
2015-02-06    0.238311
2015-02-07    0.550555
2015-02-08    0.301878
2015-02-09    0.341593
2015-02-10    0.178913
2015-02-11    0.753034
2015-02-12    0.956015
2015-02-13 
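
One detail worth checking is that partial-string slices include both endpoints, so `'2015-01':'2015-02'` covers all of January and all of February. The sketch below is a minimal check on a separate, deterministic series (`demo` is a hypothetical name, not part of the notebook above) and uses `.loc` to make the label-based selection explicit.

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: one integer per day for the first quarter of 2015
dates = pd.date_range('2015-01-01', '2015-03-31')
demo = pd.Series(np.arange(len(dates)), index=dates)

# Both months are included in full: 31 days of January + 28 days of February
jan_feb = demo.loc['2015-01':'2015-02']
print(jan_feb.index[0], jan_feb.index[-1], len(jan_feb))
```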

### Change frequency of time series: `asfreq()`

Reduce the daily time series to monthly frequency

In [66]:
monthly = ts.asfreq('M')
monthly

2015-01-31    0.934729
2015-02-28    0.939305
2015-03-31    0.558268
2015-04-30    0.648754
2015-05-31    0.776106
2015-06-30    0.395701
2015-07-31    0.769036
2015-08-31    0.924355
2015-09-30    0.637785
2015-10-31    0.122663
2015-11-30    0.494895
2015-12-31    0.137508
2016-01-31    0.665083
2016-02-29    0.050465
2016-03-31    0.405590
2016-04-30    0.611712
2016-05-31    0.396878
2016-06-30    0.969825
2016-07-31    0.000111
2016-08-31    0.006545
2016-09-30    0.524212
2016-10-31    0.787631
2016-11-30    0.678436
2016-12-31    0.535701
2017-01-31    0.241974
2017-02-28    0.112726
2017-03-31    0.893223
2017-04-30    0.507364
2017-05-31    0.909694
2017-06-30    0.020791
2017-07-31    0.710918
2017-08-31    0.058906
2017-09-30    0.277880
2017-10-31    0.767969
2017-11-30    0.292577
2017-12-31    0.453909
Freq: M, dtype: float64

`asfreq()` optionally accepts a fill method for the missing values created when upsampling:
- bfill: backward fill, use the next valid observation to fill the gap
- ffill: forward fill, propagate the last valid observation forward until the next valid one
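
The two fill methods can be compared side by side on a tiny series. This is a minimal sketch with made-up values; `sparse` and the other names are hypothetical and independent of `ts`.

```python
import pandas as pd

# Hypothetical series observed every second day: Jan 1, Jan 3, Jan 5
sparse = pd.Series([1.0, 2.0, 3.0],
                   index=pd.date_range('2015-01-01', periods=3, freq='2D'))

no_fill = sparse.asfreq('D')                    # new days are NaN
forward = sparse.asfreq('D', method='ffill')    # gaps take the previous value
backward = sparse.asfreq('D', method='bfill')   # gaps take the next value

print(pd.DataFrame({'none': no_fill, 'ffill': forward, 'bfill': backward}))
```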

Upsample the monthly time series to a daily time series by backfilling

In [68]:
monthly.asfreq('D', method='bfill')

2015-01-31    0.934729
2015-02-01    0.939305
2015-02-02    0.939305
2015-02-03    0.939305
2015-02-04    0.939305
2015-02-05    0.939305
2015-02-06    0.939305
2015-02-07    0.939305
2015-02-08    0.939305
2015-02-09    0.939305
2015-02-10    0.939305
2015-02-11    0.939305
2015-02-12    0.939305
2015-02-13    0.939305
2015-02-14    0.939305
2015-02-15    0.939305
2015-02-16    0.939305
2015-02-17    0.939305
2015-02-18    0.939305
2015-02-19    0.939305
2015-02-20    0.939305
2015-02-21    0.939305
2015-02-22    0.939305
2015-02-23    0.939305
2015-02-24    0.939305
2015-02-25    0.939305
2015-02-26    0.939305
2015-02-27    0.939305
2015-02-28    0.939305
2015-03-01    0.558268
2015-03-02    0.558268
2015-03-03    0.558268
2015-03-04    0.558268
2015-03-05    0.558268
2015-03-06    0.558268
2015-03-07    0.558268
2015-03-08    0.558268
2015-03-09    0.558268
2015-03-10    0.558268
2015-03-11    0.558268
2015-03-12    0.558268
2015-03-13    0.558268
2015-03-14    0.558268
2015-03-15 

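### Resampling time series to apply functions: `resample()`
Resampling converts a time series from one frequency to another by grouping the observations into time bins; an aggregation function is then applied to each bin.

Resample the time series to monthly data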

In [58]:
ts.resample('M')

DatetimeIndexResampler [freq=<MonthEnd>, axis=0, closed=right, label=right, convention=start, base=0]

Get the monthly totals

In [69]:
ts.resample('M').sum()

2015-01-31    12.119999
2015-02-28    13.988243
2015-03-31    15.833025
2015-04-30    17.063691
2015-05-31    18.274617
2015-06-30    15.727089
2015-07-31    15.525557
2015-08-31    16.325248
2015-09-30    16.996933
2015-10-31    15.670862
2015-11-30    14.252871
2015-12-31    16.029240
2016-01-31    15.699858
2016-02-29    15.321786
2016-03-31    13.564454
2016-04-30    15.783223
2016-05-31    18.173194
2016-06-30    18.066961
2016-07-31    13.652858
2016-08-31    14.260842
2016-09-30    16.374439
2016-10-31    15.541874
2016-11-30    14.411264
2016-12-31    15.161990
2017-01-31    11.976674
2017-02-28    14.391076
2017-03-31    16.579714
2017-04-30    16.929727
2017-05-31    17.922335
2017-06-30    14.356997
2017-07-31    13.311486
2017-08-31    14.567340
2017-09-30    17.052738
2017-10-31    15.484308
2017-11-30    11.495999
2017-12-31    16.218279
2018-01-31     0.008603
Freq: M, dtype: float64
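
To make the difference between `asfreq()` and `resample()` concrete, the sketch below applies both to the same two weeks of made-up data. It uses the weekly code `W` rather than `M` only to keep the example short; the names are hypothetical. `asfreq()` simply picks the value at each new timestamp, while `resample()` aggregates everything that falls into each bin.

```python
import numpy as np
import pandas as pd

# Hypothetical data: values 0..13 from Monday 2015-01-05 to Sunday 2015-01-18
days = pd.date_range('2015-01-05', periods=14)
daily = pd.Series(np.arange(14), index=days)

picked = daily.asfreq('W')          # value observed on each Sunday
summed = daily.resample('W').sum()  # sum over each week ending on Sunday

print(picked)
print(summed)
```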

Resample the time series to get the last observation of each year

In [70]:
ts.resample('Y').last()

2015-12-31    0.137508
2016-12-31    0.535701
2017-12-31    0.453909
2018-12-31    0.008603
Freq: A-DEC, dtype: float64

### Rolling windows: `rolling()`
Much of time series analysis involves rolling windows, for example moving averages of stock prices.

The object returned by `rolling()` makes a number of aggregation operations available by default

In [61]:
# rolling window of 5 days
x = ts.rolling(5)

In [62]:
x.mean()

2015-01-01         NaN
2015-01-02         NaN
2015-01-03         NaN
2015-01-04         NaN
2015-01-05    0.394729
2015-01-06    0.312830
2015-01-07    0.401877
2015-01-08    0.335636
2015-01-09    0.298776
2015-01-10    0.306040
2015-01-11    0.271371
2015-01-12    0.220341
2015-01-13    0.322726
2015-01-14    0.340358
2015-01-15    0.476518
2015-01-16    0.512027
2015-01-17    0.485407
2015-01-18    0.350839
2015-01-19    0.431564
2015-01-20    0.380681
2015-01-21    0.292604
2015-01-22    0.322193
2015-01-23    0.309425
2015-01-24    0.245451
2015-01-25    0.232332
2015-01-26    0.283332
2015-01-27    0.354625
2015-01-28    0.344463
2015-01-29    0.461279
2015-01-30    0.446754
2015-01-31    0.567606
2015-02-01    0.572231
2015-02-02    0.580301
2015-02-03    0.586510
2015-02-04    0.651420
2015-02-05    0.493635
2015-02-06    0.399730
2015-02-07    0.498805
2015-02-08    0.380856
2015-02-09    0.315628
2015-02-10    0.322250
2015-02-11    0.425194
2015-02-12    0.506287
2015-02-13 
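
The leading `NaN` values appear because a window needs 5 observations before it produces a result. The sketch below, on a short made-up series with a window of 3, shows the same effect, how `min_periods` relaxes it, and that other aggregations such as `max()` are available too. The names are hypothetical.

```python
import pandas as pd

# Hypothetical series with easy-to-check values
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0],
              index=pd.date_range('2015-01-01', periods=5))

strict = s.rolling(3).mean()                   # first two results are NaN
lenient = s.rolling(3, min_periods=1).mean()   # partial windows are allowed
rolling_max = s.rolling(3).max()               # another built-in aggregation

print(pd.DataFrame({'strict': strict, 'lenient': lenient, 'max': rolling_max}))
```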

### Shifting data: `shift()`
`shift()` moves the data backward or forward through time while the index stays in place

In [71]:
# shift data forward by 10 days
ts.shift(10) 

2015-01-01         NaN
2015-01-02         NaN
2015-01-03         NaN
2015-01-04         NaN
2015-01-05         NaN
2015-01-06         NaN
2015-01-07         NaN
2015-01-08         NaN
2015-01-09         NaN
2015-01-10         NaN
2015-01-11    0.921144
2015-01-12    0.123330
2015-01-13    0.621593
2015-01-14    0.288880
2015-01-15    0.018699
2015-01-16    0.511649
2015-01-17    0.568564
2015-01-18    0.290390
2015-01-19    0.104577
2015-01-20    0.055022
2015-01-21    0.338301
2015-01-22    0.313413
2015-01-23    0.802317
2015-01-24    0.192736
2015-01-25    0.735822
2015-01-26    0.515849
2015-01-27    0.180308
2015-01-28    0.129479
2015-01-29    0.596362
2015-01-30    0.481404
2015-01-31    0.075469
2015-02-01    0.328249
2015-02-02    0.065641
2015-02-03    0.276494
2015-02-04    0.415805
2015-02-05    0.330468
2015-02-06    0.684715
2015-02-07    0.014834
2015-02-08    0.860575
2015-02-09    0.343180
2015-02-10    0.934729
2015-02-11    0.707837
2015-02-12    0.055182
2015-02-13 
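
A few variations on `shift()` may help, shown here on a small made-up price series (all names are hypothetical). A negative count moves values earlier, passing `freq` moves the index instead of the values, and dividing a series by its shifted copy is one common way to compute period-over-period change.

```python
import pandas as pd

# Hypothetical prices for three consecutive days
prices = pd.Series([10.0, 11.0, 12.1],
                   index=pd.date_range('2015-01-01', periods=3))

lagged = prices.shift(1)                 # values move one day later, first is NaN
led = prices.shift(-1)                   # values move one day earlier, last is NaN
moved_index = prices.shift(1, freq='D')  # index moves by one day, no NaN introduced

# Day-over-day relative change; the first entry is NaN
change = prices / prices.shift(1) - 1

print(pd.DataFrame({'price': prices, 'lagged': lagged, 'led': led, 'change': change}))
print(moved_index)
```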