# Problem Statement

To manage its business and staffing efficiently over the coming months, ABC Airlines wants to forecast the number of bookings expected for the next two quarters. You are given data for the past 19 months (26-08-2012 to 25-03-2014) and must forecast the values for the next two quarters (26-03-2014 to 25-09-2014).

_Note: a validation set is used here, so the actual values for the next two quarters are available for checking the forecasts._


**Data Description**
Both the data and validation files have two columns: 'Date' and 'Count' (the column appears as `count` in the file loaded below).
- Date: Stores the date on which the observation was taken
- Count: Holds the number of bookings for the given date

# Reading Time Series Data 

In [9]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [10]:
data = pd.read_csv("data/data_missing.csv")

In [11]:
data.shape

(578, 2)

In [12]:
data.isnull().sum()

Date     0
count    9
dtype: int64

In [13]:
data.loc[data['count'].isna()]

Unnamed: 0,Date,count
361,2013-08-21,
362,2013-08-22,
390,2013-09-19,
487,2013-12-25,
488,2013-12-26,
546,2014-02-22,
547,2014-02-23,
548,2014-02-24,
556,2014-03-04,


In [14]:
data[360:364]

Unnamed: 0,Date,count
360,2013-08-20,147.0
361,2013-08-21,
362,2013-08-22,
363,2013-08-23,102.0
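
Because the file holds a time series, it can help to parse `Date` as a datetime while reading. The following is a minimal sketch, not the notebook's own code: the inline CSV is an assumption that stands in for `data/data_missing.csv` and reuses only the four rows shown above, and the names `sample_csv`, `sample`, `missing_dates`, and `n_missing` are hypothetical.

```python
import io

import pandas as pd

# Four rows copied from the output above; this inline CSV is a stand-in
# for the real file, which is not reproduced here.
sample_csv = io.StringIO(
    "Date,count\n"
    "2013-08-20,147.0\n"
    "2013-08-21,\n"
    "2013-08-22,\n"
    "2013-08-23,102.0\n"
)

# parse_dates converts the Date column from strings to datetime64 values.
sample = pd.read_csv(sample_csv, parse_dates=["Date"])

# Dates on which the booking count is missing.
missing_dates = sample.loc[sample["count"].isna(), "Date"]
n_missing = int(sample["count"].isna().sum())

print(sample.dtypes)
print(missing_dates.dt.strftime("%Y-%m-%d").tolist())
```

With a datetime column, the missing observations can be listed by date rather than by row position.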


# Backward and Forward Fill

The values in `data` are not replaced here; the cells below only show what each fill would produce.

In [15]:
data.ffill()[360:364]  # fillna(method='ffill') is deprecated in recent pandas

Unnamed: 0,Date,count
360,2013-08-20,147.0
361,2013-08-21,147.0
362,2013-08-22,147.0
363,2013-08-23,102.0


In [16]:
data.bfill()[360:364]  # fillna(method='bfill') is deprecated in recent pandas

Unnamed: 0,Date,count
360,2013-08-20,147.0
361,2013-08-21,102.0
362,2013-08-22,102.0
363,2013-08-23,102.0
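
The two fills can be reproduced on a toy series built from the values at rows 360 to 363. This is a small sketch under that assumption; `toy`, `forward`, and `backward` are hypothetical names. It also shows that both calls return new objects and leave the original untouched.

```python
import numpy as np
import pandas as pd

# Values from rows 360-363 above, with the two gaps as NaN.
toy = pd.Series([147.0, np.nan, np.nan, 102.0])

# Forward fill carries the last valid value forward into the gap.
forward = toy.ffill()

# Backward fill pulls the next valid value back into the gap.
backward = toy.bfill()

print(forward.tolist())        # [147.0, 147.0, 147.0, 102.0]
print(backward.tolist())       # [147.0, 102.0, 102.0, 102.0]
print(int(toy.isna().sum()))   # 2, because the original series is unchanged
```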


# Interpolation

In [17]:
# 'order' applies only to the polynomial and spline methods, so it is omitted here
interpolated_linear = data.interpolate(method='linear')

In [18]:
interpolated_linear[360:364]

Unnamed: 0,Date,count
360,2013-08-20,147.0
361,2013-08-21,132.0
362,2013-08-22,117.0
363,2013-08-23,102.0
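
The linear result can be checked by hand: the gap runs from 147 at row 360 to 102 at row 363, which is three equal steps of (102 - 147) / 3 = -15. The sketch below redoes that arithmetic and cross-checks it with `np.interp`; the variable names are hypothetical, and it assumes equally spaced rows, which is how the `linear` method treats the data.

```python
import numpy as np

left, right = 147.0, 102.0   # known values at rows 360 and 363
gap_steps = 3                # number of steps from row 360 to row 363

# Each step changes the value by the same amount.
step = (right - left) / gap_steps

# Fill the two interior rows (361 and 362).
filled = [left + step * k for k in range(1, gap_steps)]
print(step)    # -15.0
print(filled)  # [132.0, 117.0]

# Cross-check with NumPy's piecewise-linear interpolation.
check = np.interp([361, 362], [360, 363], [left, right])
print(check)
```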


In [19]:
# method='polynomial' requires SciPy; order=2 requests a second-order (quadratic) fit
interpolated_poly = data.interpolate(method='polynomial', order=2)

In [20]:
interpolated_poly[360:364]

Unnamed: 0,Date,count
360,2013-08-20,147.0
361,2013-08-21,146.136961
362,2013-08-22,133.136702
363,2013-08-23,102.0


In [21]:
interpolated_poly.isnull().sum()

Date     0
count    0
dtype: int64