# Time Series (referred to as TS from now on)

## Agenda:
1- What makes Time Series Special?

2- Loading and Handling Time Series in Pandas

3- How to Check Stationarity of a Time Series?

4- How to make a Time Series Stationary?

5- Forecasting a Time Series

##### 1. What makes Time Series Special?   
  
    1. It is time dependent, so the basic assumption of a linear regression model, that the
    observations are independent, doesn’t hold in this case.

    2. Along with an increasing or decreasing trend, most TS have some form of seasonality,
    i.e. variations specific to a particular time frame. For example, if you look at the sales of
    a woolen jacket over time, you will invariably find higher sales in the winter season.
    
Because of the inherent properties of a TS, there are various steps involved in analyzing it.    
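As a rough illustration of the two properties above, the sketch below builds a synthetic monthly series (a made-up trend plus a 12-month cycle, not the passenger data) and compares its autocorrelation with that of pure noise. The names `ts_demo` and `noise` are hypothetical and used only for this example.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (illustrative only): linear trend + yearly cycle + noise
rng = np.random.default_rng(0)
index = pd.date_range("2000-01-01", periods=120, freq="MS")
t = np.arange(120)
values = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 120)
ts_demo = pd.Series(values, index=index)

# Independent observations for comparison
noise = pd.Series(rng.normal(0, 1, 120), index=index)

# Correlation of each series with a copy of itself shifted by 1 and by 12 months
ts_lag1 = ts_demo.autocorr(lag=1)
ts_lag12 = ts_demo.autocorr(lag=12)
noise_lag1 = noise.autocorr(lag=1)

print(f"TS lag-1 autocorrelation:     {ts_lag1:.2f}")
print(f"TS lag-12 autocorrelation:    {ts_lag12:.2f}")
print(f"noise lag-1 autocorrelation:  {noise_lag1:.2f}")
```

For the synthetic TS the autocorrelations should come out well above zero, while for independent noise the value should stay near zero, which is what the independence assumption of linear regression would require.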

##### 2. Loading and Handling Time Series in Pandas

Pandas has dedicated support for handling TS objects, in particular the datetime64[ns] data type, which stores time information and allows some operations to be performed very quickly.
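A minimal sketch of what a datetime index makes possible, assuming a small synthetic series (the values are just 0 to 23 and the name `demo` is hypothetical):

```python
import pandas as pd

# Synthetic series with one value per month over two years (illustrative only)
idx = pd.date_range("1949-01-01", periods=24, freq="MS")
demo = pd.Series(range(24), index=idx)

# The index holds datetime64 values rather than plain strings
print(demo.index.dtype)

# Partial-string indexing: select every observation in 1950
year_1950 = demo.loc["1950"]
print(year_1950)

# Slicing with date strings; both end points are included
spring_1949 = demo.loc["1949-03":"1949-05"]
print(spring_1949)

# Vectorized access to date components
months = demo.index.month
print(months[:5])
```

Selections such as `demo.loc["1950"]` work only because the index is a DatetimeIndex, which is why the date column is turned into the index when the data is loaded.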

In [2]:
import pandas as pd
import numpy as np
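import matplotlib.pyplot as plt  # pyplot is preferred over the discouraged pylab interface
%matplotlib inline
from matplotlib import rcParams
# Default figure size (width, height) in inches for all plots
rcParams['figure.figsize'] = 15, 6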

Now we can load the data set and look at the first few rows and the data types of the columns:

In [8]:
data = pd.read_csv('passenger.csv')
print(data.head())
print('\n Data Types:')
print(data.dtypes)

   1949-01  112
0  1949-02  118
1  1949-03  132
2  1949-04  129
3  1949-05  121
4  1949-06  135

 Data Types:
1949-01    object
112         int64
dtype: object


---
The data is still not read as a TS object, because the data types are ‘object’ and ‘int64’. To read the data as a time series, we have to pass special arguments to the read_csv command.

In [12]:
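from datetime import datetime

# pd.datetime was removed from pandas, so use the standard library's datetime instead
dateparse = lambda dates: datetime.strptime(dates, '%Y-%m')
# Note: date_parser is deprecated since pandas 2.0 in favor of date_format (e.g. date_format='%Y-%m')
data = pd.read_csv('passenger.csv', parse_dates=['Month'], index_col='Month', date_parser=dateparse)

print(data.head())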

            passenger_num
Month                    
1949-01-01            112
1949-02-01            118
1949-03-01            132
1949-04-01            129
1949-05-01            121


##### The new command arguments:
    1. parse_dates: This specifies the column which contains the date-time information. As we saw above,
    the column name is ‘Month’.

    2. index_col: A key idea behind using Pandas for TS data is that the index has to be the variable
    depicting date-time information. So this argument tells Pandas to use the ‘Month’ column as the index.

    3. date_parser: This specifies a function which converts an input string into a datetime variable.
    By default, Pandas reads data in the format ‘YYYY-MM-DD HH:MM:SS’. If the data is not in this format,
    the format has to be defined manually. Something similar to the dateparse function defined here
    can be used for this purpose.
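An alternative that avoids a custom parser function is to read the date column as text and convert it afterwards with an explicit format. The sketch below is only an illustration: it uses a small in-memory CSV (`csv_text`, a hypothetical stand-in built from the column names and first rows printed above) instead of the actual file.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file, reusing the column
# names and the first rows shown in the output above
csv_text = """Month,passenger_num
1949-01,112
1949-02,118
1949-03,132
"""

# Read the date column as plain strings first
df = pd.read_csv(io.StringIO(csv_text))

# Convert the strings with an explicit format, then use them as the index
df["Month"] = pd.to_datetime(df["Month"], format="%Y-%m")
df = df.set_index("Month")

print(df.index.dtype)
print(df.head())
```

This does the same job as parse_dates, index_col and date_parser together, in two explicit steps. In pandas 2.0 and later, read_csv also accepts a date_format argument that can replace date_parser when the format is a simple strptime pattern.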