# Time Series

- A time series is data indexed by a time sequence, e.g. a dataframe whose index holds dates
- Period vs timestamp
    - Period : Interval or duration
    - Timestamp : Exact point in time
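
To make the distinction concrete, here is a minimal sketch (variable names are illustrative, not from the original notes) showing that a `Period` spans an interval with a start and an end, while a `Timestamp` is a single instant that can fall inside it:

```python
import pandas as pd

# A Timestamp marks one exact instant
ts = pd.Timestamp('2017-01-01')

# A Period covers a whole span; here, the month of January 2017
month = pd.Period('2017-01', freq='M')

print(month.start_time)  # first instant covered by the period
print(month.end_time)    # last instant covered by the period

# The timestamp lies within the period's boundaries
inside = month.start_time <= ts <= month.end_time
print(inside)
```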

### Create pandas timestamp

In [3]:
import pandas as pd # assumed imported going forward
from datetime import datetime # To manually create dates
time_stamp1 = pd.Timestamp(datetime(2017, 1, 1))
time_stamp2 = pd.Timestamp('2017-01-01')
print(time_stamp1)
print(time_stamp2)

2017-01-01 00:00:00
2017-01-01 00:00:00


### Extract information from timestamp

In [4]:
# Extract Year
print(time_stamp1.year)

# Extract Day Name
print(time_stamp1.day_name())

2017
Sunday
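
A few more attributes can be read the same way. This is a short sketch rather than a complete list; the variable name `ts` is illustrative:

```python
import pandas as pd

ts = pd.Timestamp('2017-01-01')

print(ts.month)         # month number: 1
print(ts.quarter)       # quarter of the year: 1
print(ts.dayofweek)     # Monday=0 ... Sunday=6, so 6 here
print(ts.month_name())  # 'January'
```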


### Working with period

In [7]:
period = pd.Period('2017-01')
print(period) # frequency is inferred as monthly ('M')

daily_period = period.asfreq('D') # convert to daily; defaults to the last day of the month
period_to_timestamp = daily_period.to_timestamp()
print(period_to_timestamp)
print(type(period_to_timestamp))
timestamp_to_period = time_stamp1.to_period('M')
print(timestamp_to_period)
print(type(timestamp_to_period))

2017-01
2017-01-31 00:00:00
<class 'pandas._libs.tslibs.timestamps.Timestamp'>
2017-01
<class 'pandas._libs.tslibs.period.Period'>
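
The conversion above lands on the last day of the month because `Period.asfreq` uses `how='end'` by default. A minimal sketch of choosing the start instead (variable names are illustrative):

```python
import pandas as pd

period = pd.Period('2017-01', freq='M')

# how='start' picks the first sub-period, how='end' (the default) the last
first_day = period.asfreq('D', how='start')
last_day = period.asfreq('D', how='end')

print(first_day.to_timestamp())  # 2017-01-01 00:00:00
print(last_day.to_timestamp())   # 2017-01-31 00:00:00
```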


### Arithmetic

In [12]:
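new_period = period + 2 # integers move a Period by multiples of its frequency
# Recent pandas versions do not allow adding an integer to a Timestamp; use an offset instead:
# new_timestamp = pd.Timestamp('2017-01-31') + pd.offsets.MonthEnd(1)
print(new_period)
# print(new_timestamp)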

2017-03
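
For timestamps, arithmetic goes through `pd.Timedelta` or a date offset rather than a bare integer. A small sketch under that assumption, with illustrative variable names:

```python
import pandas as pd

# Periods: adding an integer moves by that many units of the frequency
period = pd.Period('2017-01', freq='M')
print(period + 2)  # 2017-03

# Timestamps: add a fixed duration or a calendar-aware offset
ts = pd.Timestamp('2017-01-31')
plus_one_day = ts + pd.Timedelta(days=1)          # fixed 24-hour step
plus_one_month_end = ts + pd.offsets.MonthEnd(1)  # jump to the next month end

print(plus_one_day)
print(plus_one_month_end)
```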


### Create a time sequence of timestamps

In [14]:
# 'M' is the month-end alias; pandas 2.2 and later spell it 'ME' for date ranges
timestamp_index = pd.date_range(start='2017-1-1', periods=12, freq='M')
print(timestamp_index)

DatetimeIndex(['2017-01-31', '2017-02-28', '2017-03-31', '2017-04-30',
               '2017-05-31', '2017-06-30', '2017-07-31', '2017-08-31',
               '2017-09-30', '2017-10-31', '2017-11-30', '2017-12-31'],
              dtype='datetime64[ns]', freq='M')


### Create a time sequence of periods


In [15]:
period_index = pd.date_range(start='2017-1-1', periods=12, freq='M').to_period()
print(period_index)


PeriodIndex(['2017-01', '2017-02', '2017-03', '2017-04', '2017-05', '2017-06',
             '2017-07', '2017-08', '2017-09', '2017-10', '2017-11', '2017-12'],
            dtype='period[M]')
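
As an alternative to converting a `DatetimeIndex`, the same index can be built directly with `pd.period_range`. This is a sketch of an equivalent route, not the approach used in the cell above:

```python
import pandas as pd

# Build twelve monthly periods starting at January 2017
period_index = pd.period_range(start='2017-01', periods=12, freq='M')
print(period_index)
```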


### Create a time series dataframe

In [19]:
import numpy as np
data = np.random.randint(5, size=(12, 2))
df = pd.DataFrame(data=data, index=timestamp_index)
print(df)

            0  1
2017-01-31  2  1
2017-02-28  2  0
2017-03-31  2  1
2017-04-30  1  0
2017-05-31  1  1
2017-06-30  2  1
2017-07-31  1  2
2017-08-31  1  4
2017-09-30  2  3
2017-10-31  3  1
2017-11-30  4  4
2017-12-31  0  3


# Time Series Transformation

- Make the date column the index
    - `df.set_index('date_col', inplace=True)`
- Parse string dates
    - `df['date_col'] = pd.to_datetime(df['date_col'])`
    - During data import: `df = pd.read_csv('file.csv', parse_dates=['date_col'], index_col='date_col')`
- Slice sub-periods
    - Partial date strings select a range, both ends inclusive: `df['2015-3':'2016-2']`
    - Use a full date with `.loc[]` to select a single value: `df.loc['2016-6-1', 'price']`
- Set or change the frequency of the index
    - Setting frequency : `df.asfreq('D')`
    - Upsampling : moving to a higher frequency, which creates new rows (filled with `NaN` unless a fill method is given)
        - e.g. daily frequency adds rows for weekend dates: `df.asfreq('D')`
    - Downsampling : moving to a lower frequency, which leaves fewer rows, so values are dropped or must be aggregated
        - e.g. business-day frequency removes weekend rows: `df.asfreq('B')`
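
The steps above can be sketched end to end on a tiny, made-up dataset. The column names `date_col` and `price` follow the placeholders used in the list; the values are hypothetical:

```python
import pandas as pd

# Hypothetical data: string dates, with the weekend of 4-5 June 2016 missing
raw = pd.DataFrame({
    'date_col': ['2016-06-01', '2016-06-02', '2016-06-03', '2016-06-06'],
    'price': [10.0, 11.0, 12.0, 13.0],
})

# Parse the strings, then make the date column the index
raw['date_col'] = pd.to_datetime(raw['date_col'])
df = raw.set_index('date_col')

# Slice a sub-period (both ends are included)
sub = df.loc['2016-06-02':'2016-06-03']

# Select a single value with a full date
price = df.loc['2016-06-01', 'price']

# Upsample to calendar days: the two weekend dates appear as NaN rows
daily = df.asfreq('D')

# Move to business days: the weekend rows are dropped again
business = daily.asfreq('B')

print(daily)
print(business)
```

`asfreq` only reindexes; it does not aggregate. When several values must be combined per period, an aggregating method such as `resample` would be the usual choice.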

# Plotting time series data

In [20]:
# import matplotlib.pyplot as plt
# df.some_col.plot(title='Time series for some_col')
# plt.tight_layout(); plt.show()

# Moving data back and forth

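- Consider a column in ascending time order
- Shifting forward (`periods=1`, the default)
    - Moves each value 1 period into the future
    - Pushes every value 1 row down the column
    - First value of the column becomes null
    - Last value of the column is lost
    - Each row now holds the previous period's value, i.e. a lag of the original column
    - `df['shifted'] = df["col"].shift(periods=1)`
- Shifting backward (`periods=-1`)
    - Moves each value 1 period into the past
    - Pulls every value 1 row up the column
    - Last value of the column becomes null
    - First value of the column is lost
    - Each row now holds the next period's value, i.e. a lead of the original column
    - `df['lead'] = df["col"].shift(periods=-1)`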
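
A small sketch of both shift directions on made-up values makes the row movement visible. The column names `shift_plus_1` and `shift_minus_1` are hypothetical and chosen only for this illustration:

```python
import pandas as pd

idx = pd.date_range('2017-01-01', periods=4, freq='D')
df = pd.DataFrame({'col': [10, 20, 30, 40]}, index=idx)

# periods=1: values move one row down, so each row shows the previous day's value
df['shift_plus_1'] = df['col'].shift(periods=1)

# periods=-1: values move one row up, so each row shows the next day's value
df['shift_minus_1'] = df['col'].shift(periods=-1)

print(df)
```

The shifted columns become floats because the vacated positions are filled with `NaN`.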

# Calculations

- Percent change: 
    - `df['pct_change'] = df["some_col"].pct_change().mul(100)`
- Difference in value for two adjacent periods
    - `df['diff'] = df["some_col"].diff()`
- Column-wise arithmetic operations:
    - Division : `df["col1"].div(df["col2"])`
    - Multiplication : `df["col1"].mul(df["col2"])`
    - Subtraction : `df["col1"].sub(df["col2"])`
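
These calculations can be tried on a three-row example. The values below are invented for illustration, and the `ratio` column name is hypothetical:

```python
import pandas as pd

idx = pd.date_range('2017-01-01', periods=3, freq='D')
df = pd.DataFrame(
    {'col1': [100.0, 110.0, 99.0], 'col2': [2.0, 4.0, 3.0]},
    index=idx,
)

# Percent change relative to the previous row (first row has no predecessor)
df['pct_change'] = df['col1'].pct_change().mul(100)

# Absolute difference between adjacent rows
df['diff'] = df['col1'].diff()

# Element-wise division of one column by another
df['ratio'] = df['col1'].div(df['col2'])

print(df)
```

Both `pct_change()` and `diff()` leave the first row as `NaN`, since there is no earlier period to compare against.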