# Re-sampling Data 
## from a given Date Column to another Date Column using _Pandas_

This notebook shows how to map time series data from one date vector onto a different date vector, interpolating from the original time series to find the values of the parameter(s) of interest at the new dates.

__Author: Soheil Esmaeilzadeh__

All rights reserved

#### Create toy data as 10 rows of dates, every other day:

In [2]:
import pandas as pd

In [43]:
t    = pd.date_range('2010-01-01', periods=10, freq='2D')
data = pd.DataFrame(dict(A=list(range(1,21,2))), t)
data

Unnamed: 0,A
2010-01-01,1
2010-01-03,3
2010-01-05,5
2010-01-07,7
2010-01-09,9
2010-01-11,11
2010-01-13,13
2010-01-15,15
2010-01-17,17
2010-01-19,19


#### Now find the index for the days in between:

In [29]:
other_t = pd.date_range(t.min(), t.max()).difference(t)
other_t

DatetimeIndex(['2010-01-02', '2010-01-04', '2010-01-06', '2010-01-08',
               '2010-01-10', '2010-01-12', '2010-01-14', '2010-01-16',
               '2010-01-18'],
              dtype='datetime64[ns]', freq=None)

#### Now, create a new date vector that is the union of the old and the new date vectors:

In [30]:
union_t = other_t.union(data.index)
union_t

DatetimeIndex(['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
               '2010-01-05', '2010-01-06', '2010-01-07', '2010-01-08',
               '2010-01-09', '2010-01-10', '2010-01-11', '2010-01-12',
               '2010-01-13', '2010-01-14', '2010-01-15', '2010-01-16',
               '2010-01-17', '2010-01-18', '2010-01-19'],
              dtype='datetime64[ns]', freq='D')

#### Now, we reindex the data onto the union date vector:

In [24]:
data.reindex(union_t)

Unnamed: 0,A
2010-01-01,1.0
2010-01-02,
2010-01-03,3.0
2010-01-04,
2010-01-05,5.0
2010-01-06,
2010-01-07,7.0
2010-01-08,
2010-01-09,9.0
2010-01-10,


#### The added dates now have values of _NaN_. 

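#### Now we use interpolation:

We pass the argument method='index' so that the interpolation is weighted by the actual size of the gaps in the index, rather than treating the rows as equally spaced (which is what the default method='linear' does).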

In [25]:
data.reindex(union_t).interpolate('index')

Unnamed: 0,A
2010-01-01,1.0
2010-01-02,2.0
2010-01-03,3.0
2010-01-04,4.0
2010-01-05,5.0
2010-01-06,6.0
2010-01-07,7.0
2010-01-08,8.0
2010-01-09,9.0
2010-01-10,10.0
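
In the toy data every gap is exactly one day, so the default linear method would give the same numbers. The difference only shows up when the dates are unevenly spaced. The following is a minimal sketch of that case; the three dates and values are made up for illustration and are not part of the notebook's data.

```python
import pandas as pd

# Hypothetical, unevenly spaced observations (illustration only):
# a 1-day gap followed by a 3-day gap, with the middle value missing.
idx = pd.to_datetime(['2010-01-01', '2010-01-02', '2010-01-05'])
s = pd.Series([1.0, float('nan'), 5.0], index=idx)

# 'linear' ignores the index and treats the rows as equally spaced,
# so the missing value lands halfway between its neighbours.
linear = s.interpolate('linear')

# 'index' uses the actual time distance between the dates,
# so the missing value is weighted toward the closer neighbour.
by_index = s.interpolate('index')

print(linear)
print(by_index)
```

Here 2010-01-02 lies one quarter of the way from 2010-01-01 to 2010-01-05, so the index-based result for that date is about 2.0, while the linear result is 3.0.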


#### And now the missing data values (i.e. NaNs) are filled with interpolated values.

#### Finally, we reindex once more to keep only the interpolated values at the new dates:

In [26]:
data.reindex(union_t).interpolate('index').reindex(other_t)

Unnamed: 0,A
2010-01-02,2.0
2010-01-04,4.0
2010-01-06,6.0
2010-01-08,8.0
2010-01-10,10.0
2010-01-12,12.0
2010-01-14,14.0
2010-01-16,16.0
2010-01-18,18.0
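
The three steps above (union, interpolate, reindex) can be collected into one small helper. This is only a sketch: the function name `interpolate_onto` and the `target` date vector are hypothetical, and it assumes the target dates lie within the range of the original dates.

```python
import pandas as pd

def interpolate_onto(data, new_index):
    """Hypothetical helper: interpolate `data` onto the dates in `new_index`.

    Assumes both indexes are DatetimeIndex objects and that `new_index`
    lies within the date range covered by `data`.
    """
    # Step 1: combine the original and the target dates.
    union_index = data.index.union(new_index)
    # Step 2: reindex (new dates become NaN) and fill them using the
    #         index-aware interpolation.
    filled = data.reindex(union_index).interpolate('index')
    # Step 3: keep only the target dates.
    return filled.reindex(new_index)

# Same toy data as in the notebook.
t = pd.date_range('2010-01-01', periods=10, freq='2D')
data = pd.DataFrame({'A': list(range(1, 21, 2))}, index=t)

# A different target date vector (made up for illustration): every 4th day.
target = pd.date_range('2010-01-02', '2010-01-18', freq='4D')

result = interpolate_onto(data, target)
print(result)
```

Because the helper takes the union of both indexes, the target dates do not have to be the "days in between"; any dates inside the original range can be used.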
