In [8]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


### Time series data Forecasting techniques

#### The Naive Approach

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/naive_new-768x495.png)

- If we want to forecast the price for the next day, we can simply take the last day value and estimate the same value for the next day.
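
A minimal sketch of the naive approach, assuming the history is a plain sequence of prices. The function name `naive_forecast` and the sample prices are made up for illustration:

```python
import numpy as np

def naive_forecast(history, horizon=1):
    """Repeat the last observed value for every step of the horizon."""
    history = np.asarray(history, dtype=float)
    if history.size == 0:
        raise ValueError("history must contain at least one observation")
    return np.full(horizon, history[-1])

prices = [101.0, 103.5, 102.0, 104.2]  # made-up prices, illustration only
forecast = naive_forecast(prices, horizon=3)
print(forecast)
```

Every forecast step equals the last observed price, which is why the forecast plots as a flat line.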

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/01/Screen-Shot-2018-01-25-at-7.45.20-PM.png)

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/naive-768x519.png)

#### The Simple average

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/avg_orig_new1-768x510.png)

- We are often given a series that fluctuates by a small margin over time while its average stays roughly constant.
- In such a case we can forecast the next day's price as the average of all the past days.
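
A minimal sketch of the simple average forecast. The function name and the sample prices are illustrative assumptions, not part of the dataset used later:

```python
import numpy as np

def simple_average_forecast(history, horizon=1):
    """Forecast every future step as the mean of all past observations."""
    history = np.asarray(history, dtype=float)
    if history.size == 0:
        raise ValueError("history must contain at least one observation")
    return np.full(horizon, history.mean())

prices = [10.0, 12.0, 11.0, 9.0, 13.0]  # made-up prices, illustration only
forecast = simple_average_forecast(prices, horizon=2)
print(forecast)
```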

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/01/Screen-Shot-2018-01-25-at-7.45.10-PM-300x82.png)

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/avg-768x511.png)


#### The Moving average

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/mov_avg_new-768x531.png)

- With a simple average, prices from the earliest periods weigh on the forecast as heavily as the most recent ones, even when they are no longer representative.
- As an improvement over the simple average, we therefore average the prices of only the last few time periods.
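
A minimal sketch of the moving average forecast. The `window` parameter (how many recent periods to average) and the sample prices are illustrative assumptions:

```python
import numpy as np

def moving_average_forecast(history, window, horizon=1):
    """Forecast every future step as the mean of the last `window` observations."""
    history = np.asarray(history, dtype=float)
    if not 1 <= window <= history.size:
        raise ValueError("window must be between 1 and len(history)")
    return np.full(horizon, history[-window:].mean())

prices = [2.0, 4.0, 10.0, 12.0, 14.0]  # made-up prices with an upward drift
all_history = np.mean(prices)                       # simple average over everything
recent_only = moving_average_forecast(prices, window=3)[0]
print(all_history, recent_only)
```

On this drifting series the simple average is pulled down by the early low prices, while the three-period window tracks the recent level.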

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/01/Screen-Shot-2018-01-25-at-7.47.33-PM.png)

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/moving_avg-850x428.png)


#### Single Exponential smoothing 

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/01/Screen-Shot-2018-01-25-at-7.59.27-PM-768x90.png)

- It may be sensible to attach larger weights to more recent observations than to observations from the distant past.
- The technique that works on this principle is called simple exponential smoothing (also known as single exponential smoothing).
- Forecasts are weighted averages in which the weights decrease exponentially as observations come from further in the past, so the smallest weights are associated with the oldest observations:

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/SES-768x392.png)
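
A minimal sketch of simple exponential smoothing in its recursive form. Initialising the level with the first observation is an assumption made here for brevity (other initialisations exist), and `alpha` is the smoothing parameter between 0 and 1:

```python
import numpy as np

def ses_forecast(history, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    history = np.asarray(history, dtype=float)
    if history.size == 0:
        raise ValueError("history must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # assumed initialisation: start at the first observation
    for y in history[1:]:
        # New level = weighted average of the newest observation and the old level
        level = alpha * y + (1 - alpha) * level
    return level

prices = [10.0, 20.0, 30.0]  # made-up prices, illustration only
forecast = ses_forecast(prices, alpha=0.5)
print(forecast)
```

Unrolling the recursion for this example gives 0.5 * 30 + 0.25 * 20 + 0.25 * 10, so the weights halve with each step into the past.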

Doesn't this look similar to the discounted return concept from reinforcement learning? 

![alt text](https://image.slidesharecdn.com/reinforcementlearning-170329091514/95/reinforcement-learning-17-638.jpg?cb=1490778934)
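
To make the comparison concrete, here are the two sums side by side in standard textbook notation. The symbols $\alpha$ (smoothing parameter), $\gamma$ (discount factor), and $R$ (reward) are not defined elsewhere in this notebook and are assumed here. The smoothing sum is written for an idealised, infinitely long history:

$$
\hat{y}_{T+1} = \sum_{j=0}^{\infty} \alpha (1-\alpha)^{j}\, y_{T-j},
\qquad
G_t = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}
$$

In both cases the weight shrinks geometrically with distance from the present, with $(1-\alpha)$ playing the role of $\gamma$. Two differences are worth noting: smoothing looks backward at past observations while the return looks forward at future rewards, and the smoothing weights sum to 1 (so the forecast is an average) whereas the return is an unnormalised sum.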

#### Holt’s linear trend method

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/hl_new-768x408.png)

- Holt extended simple exponential smoothing to allow forecasting of data with a trend.
- It is nothing more than exponential smoothing applied to both the level (the average value in the series) and the trend.
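
A minimal sketch of Holt's linear trend method, assuming the usual level and trend recursions. The initialisation (level from the first observation, trend from the first difference) and the parameter names `alpha` and `beta` are assumptions made for this illustration:

```python
import numpy as np

def holt_linear_forecast(history, alpha, beta, horizon=1):
    """Forecast `horizon` steps ahead with smoothed level and trend."""
    y = np.asarray(history, dtype=float)
    if y.size < 2:
        raise ValueError("need at least two observations to estimate a trend")
    level, trend = y[0], y[1] - y[0]  # assumed initialisation
    for obs in y[1:]:
        prev_level = level
        # Smooth the level towards the new observation
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # Smooth the trend towards the latest change in level
        trend = beta * (level - prev_level) + (1 - beta) * trend
    steps = np.arange(1, horizon + 1)
    return level + steps * trend

prices = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up, perfectly linear series
forecast = holt_linear_forecast(prices, alpha=0.8, beta=0.2, horizon=3)
print(forecast)
```

On a perfectly linear series the level and trend never need correcting, so the forecast simply continues the line.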

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/eq-768x317.png)

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/HL-768x390.png)


#### Holt-Winters seasonal method

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/01/Picture1.jpg)

- The level equation shows a weighted average between the seasonally adjusted observation and the non-seasonal forecast for time t. 
- The trend equation is identical to Holt’s linear method. 
- The seasonal equation shows a weighted average between the current seasonal index, and the seasonal index of the same season last year (i.e., s time periods ago).
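
The three equations described above can be sketched as follows. This is the additive variant with a deliberately simple initialisation (first-season mean for the level, difference of the first two season means for the trend); both choices are assumptions for illustration, and `season_length` corresponds to s in the text:

```python
import numpy as np

def holt_winters_additive(history, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast for `horizon` steps ahead."""
    y = np.asarray(history, dtype=float)
    m = season_length
    if y.size < 2 * m:
        raise ValueError("need at least two full seasons of history")
    # Assumed initialisation; other schemes exist
    level = y[:m].mean()
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    seasonal = list(y[:m] - level)  # seasonal[i] is the index for season i

    for t in range(m, y.size):
        s_prev = seasonal[t % m]  # index of the same season, m periods ago
        prev_level, prev_trend = level, trend
        # Level: seasonally adjusted observation vs. non-seasonal forecast
        level = alpha * (y[t] - s_prev) + (1 - alpha) * (prev_level + prev_trend)
        # Trend: identical to Holt's linear method
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        # Seasonal: current seasonal index vs. last season's index
        seasonal[t % m] = gamma * (y[t] - prev_level - prev_trend) + (1 - gamma) * s_prev

    steps = np.arange(1, horizon + 1)
    season_idx = (y.size + steps - 1) % m
    return level + steps * trend + np.array(seasonal)[season_idx]

pattern = np.array([2.0, -1.0, -3.0, 2.0])  # made-up seasonal pattern
series = 10.0 + np.tile(pattern, 3)         # three identical seasons, no trend
forecast = holt_winters_additive(series, 4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
print(forecast)
```

Because the toy series repeats exactly, the forecast reproduces one more copy of the season.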

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/01/eq.png)

![alt text](https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/02/HW.png)



#### Multivariate Time Series

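- Similar to the univariate methods above, but the model also captures linear interdependencies between multiple variables.
- Each variable has a regression-like equation in which it is regressed against its own lagged values and the lagged values of the other variables.
- This structure is known as vector autoregression (VAR). Related models include VARMA, and ARIMAX, which adds exogenous regressors to a univariate ARIMA model.
- Alternatively, treat it as a supervised learning problem and use a sequence model such as an LSTM network.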
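
The regression described above, where each variable is regressed on the lagged values of all variables, is commonly called a vector autoregression (VAR). Below is a minimal sketch of a one-lag VAR fitted by ordinary least squares with NumPy. The function name `fit_var1`, the coefficient values, and the noise-free toy data are all assumptions for illustration:

```python
import numpy as np

def fit_var1(series):
    """Fit y_t = c + A @ y_{t-1} by least squares; series has shape (T, k)."""
    Y = np.asarray(series, dtype=float)
    # Design matrix: a column of ones for the intercept, then the lagged values
    X = np.hstack([np.ones((Y.shape[0] - 1, 1)), Y[:-1]])
    coef, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)
    intercept = coef[0]
    A = coef[1:].T  # transpose so that row i holds the equation for variable i
    return intercept, A

# Generate a noise-free toy series from known (made-up) coefficients
A_true = np.array([[0.5, 0.2], [0.1, 0.3]])
c_true = np.array([1.0, 2.0])
rows = [np.zeros(2)]
for _ in range(20):
    rows.append(c_true + A_true @ rows[-1])
series = np.array(rows)

c_hat, A_hat = fit_var1(series)
next_step = c_hat + A_hat @ series[-1]  # one-step-ahead forecast
print(A_hat)
```

With noise-free data the fit recovers the generating coefficients. Real data would include noise and usually more than one lag.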

#### Reinforcement Learning for time series data?

- In reinforcement learning, an agent acts on the outside world, observes the effects, and learns to improve its behaviour.
- In contrast, a time series forecasting setting has a passive observer that does not interact with the environment.
- Ideally, for reinforcement learning to apply, the environment 'reacts' to the actions that an 'agent' takes.

### Examples
- A game-world example is CartPole: when the agent pushes the cart in one direction, the pole balanced on it moves in response.
- Real-world examples include any system that adapts to changes made by an agent:
- The stock market, where buy and sell actions affect the price
- An electricity grid, where optimizing energy output affects interdependent variables like the cooling demand
- Sensor networks, public works, and all sorts of routing strategies inside an interconnected system

![alt text](http://www.turingfinance.com/wp-content/uploads/2014/04/Reinforcement-Learning.png)

### We need more simulated environments! Create simulated environments for businesses, great startup idea


In [0]:
#1 - List the dataset

In [0]:
#2 - Convert data to pandas dataframe

In [0]:
#3 - how big is our dataset?

In [0]:
#4 - Examine the dataset

In [0]:
#5 - how many labels and values do we have? 

In [0]:
#6 - How much missing data do we have?

In [0]:
#7 - Counts for each timestep in the data

In [0]:
#8 - Unique Assets
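
The cells above are left empty in this notebook. As a rough sketch of what steps 3 to 8 could look like, the helper below summarises any DataFrame that has `id` and `timestamp` columns (the column names used by the `Environment` class further down). The helper name `summarize`, the `feature_a` column, and the toy values are made up for illustration and are not the real dataset:

```python
import numpy as np
import pandas as pd

def summarize(df):
    """Collect basic facts about a panel of assets observed over time."""
    return {
        "shape": df.shape,                                     # how big is it?
        "columns": list(df.columns),                           # labels available
        "missing_per_column": df.isna().sum(),                 # missing data
        "rows_per_timestamp": df.groupby("timestamp").size(),  # counts per timestep
        "n_assets": df["id"].nunique(),                        # unique assets
    }

# Toy stand-in for the real data (values are invented)
toy = pd.DataFrame({
    "id": [1, 2, 1, 2, 3],
    "timestamp": [0, 0, 1, 1, 1],
    "feature_a": [0.1, np.nan, 0.3, 0.2, np.nan],
    "y": [0.01, -0.02, 0.00, 0.03, -0.01],
})
summary = summarize(toy)
print(toy.head())
print(summary["missing_per_column"])
```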

#### The Markov Decision Process

![alt text](https://slideplayer.com/slide/3007502/11/images/5/Markov+Decision+Processes+%28MDPs%29.jpg)

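- States (observations) - the features for the current timestamp, with 1 target variable (`y`) to predict
- Actions - submit predictions for the current timestamp, which advances the environment by 1 timestamp
- Rewards - R score (the signed square root of R², used as the evaluation metric)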

In [0]:
import os
import pandas as pd
import numpy as np
from sklearn.metrics import r2_score

# This is taken from Frans Slothouber's post on the contest discussion forum.
# https://www.kaggle.com/slothouber/two-sigma-financial-modeling/kagglegym-emulation


def r_score(y_true, y_pred, sample_weight=None, multioutput=None):
    r2 = r2_score(y_true, y_pred, sample_weight=sample_weight,
                  multioutput=multioutput)
    r = (np.sign(r2)*np.sqrt(np.abs(r2)))
    if r <= -1:
        return -1
    else:
        return r

In [0]:
class Observation(object):
    def __init__(self, train, target, features):
        self.train = train
        self.target = target
        self.features = features


class Environment(object):
    def __init__(self):
        with pd.HDFStore("/content/drive/My Drive/train.h5", "r") as hfdata:
            self.timestamp = 0
            fullset = hfdata.get("train")
            self.unique_timestamp = fullset["timestamp"].unique()
            # Get a list of unique timestamps
            # use the first half for training and
            # the second half for the test set
            n = len(self.unique_timestamp)
            i = int(n/2)
            timesplit = self.unique_timestamp[i]
            self.n = n
            self.unique_idx = i
            self.train = fullset[fullset.timestamp < timesplit]
            self.test = fullset[fullset.timestamp >= timesplit]

            # Needed to compute final score; copy so that adding a column
            # does not write into a slice of self.test
            self.full = self.test.loc[:, ['timestamp', 'y']].copy()
            self.full['y_hat'] = 0.0
            self.temp_test_y = None

    def reset(self):
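        # Rewind to the first test timestamp before reading it, so that
        # reset() also works after the environment has been stepped
        self.unique_idx = int(self.n / 2)
        timesplit = self.unique_timestamp[self.unique_idx]
        self.unique_idx += 1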
        subset = self.test[self.test.timestamp == timesplit]

        # reset index to conform to how kagglegym works
        target = subset.loc[:, ['id', 'y']].reset_index(drop=True)
        self.temp_test_y = target['y']

        target.loc[:, 'y'] = 0.0  # set the prediction column to zero

        # changed bounds to 0:110 from 1:111 to mimic the behavior
        # of api for feature
        features = subset.iloc[:, :110].reset_index(drop=True)

        observation = Observation(self.train, target, features)
        return observation

    def step(self, target):
        timesplit = self.unique_timestamp[self.unique_idx-1]
        # full and target have different indexes, so assign the raw
        # NumPy values to bypass pandas index alignment
        y_hat = target.loc[:, ['y']]
        self.full.loc[self.full.timestamp == timesplit, ['y_hat']] = y_hat.to_numpy()

        if self.unique_idx == self.n:
            done = True
            observation = None
            reward = r_score(self.temp_test_y, target.loc[:, 'y'])
            score = r_score(self.full['y'], self.full['y_hat'])
            info = {'public_score': -score}
        else:
            reward = r_score(self.temp_test_y, target.loc[:, 'y'])
            done = False
            info = {}
            timesplit = self.unique_timestamp[self.unique_idx]
            self.unique_idx += 1
            subset = self.test[self.test.timestamp == timesplit]

            # reset index to conform to how kagglegym works
            target = subset.loc[:, ['id', 'y']].reset_index(drop=True)
            self.temp_test_y = target['y']

            # set the prediction column to zero
            target.loc[:, 'y'] = 0

            # column bound change on the subset
            # reset index to conform to how kagglegym works
            features = subset.iloc[:, 0:110].reset_index(drop=True)

            observation = Observation(self.train, target, features)

        return observation, reward, done, info

    def __str__(self):
        return "Environment()"

In [0]:
#9 Agent Environment Loop

In [0]:
#10 test it! 

968
806298
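
A minimal sketch of what an agent-environment loop over this kind of interface could look like. Since the real `Environment` needs the `train.h5` file, the example uses `ToyEnv`, a hypothetical stand-in that only mimics the `reset()` / `step(target)` interface; its reward rule and data are invented for illustration:

```python
import pandas as pd

class ToyObservation:
    def __init__(self, target):
        self.target = target

class ToyEnv:
    """Hypothetical stand-in with the same reset()/step() interface."""
    def __init__(self, n_steps=3):
        self.n_steps = n_steps
        self.t = 0

    def _observe(self):
        # Two assets per timestamp, prediction column initialised to zero
        return ToyObservation(pd.DataFrame({"id": [1, 2], "y": [0.0, 0.0]}))

    def reset(self):
        self.t = 0
        return self._observe()

    def step(self, target):
        self.t += 1
        done = self.t >= self.n_steps
        # Invented reward: the true y is 0, so perfect predictions score 1.0
        reward = 1.0 - float(target["y"].abs().mean())
        observation = None if done else self._observe()
        return observation, reward, done, {}

def run_episode(env, predict):
    """Step through the environment once, filling in predictions each time."""
    observation = env.reset()
    rewards, done = [], False
    while not done:
        target = observation.target
        target["y"] = predict(observation)  # the agent's action: its predictions
        observation, reward, done, info = env.step(target)
        rewards.append(reward)
    return rewards, info

rewards, info = run_episode(ToyEnv(n_steps=3), predict=lambda obs: 0.0)
print(rewards)
```

The same `run_episode` function should work with any object exposing this interface, with `predict` replaced by a fitted model's prediction function.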


How could this RL framework be extended? Q-learning! But we need a real-time API or a simulated environment where actions affect the environment state.

![alt text](https://www.researchgate.net/profile/Kao-Shing_Hwang/publication/220776448/figure/fig1/AS:394068661161984@1470964698231/The-Q-Learning-Algorithm-6.png)
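
A minimal sketch of the tabular Q-learning update, assuming a small discrete set of states and actions. The forecasting environment above has continuous predictions, so this would not apply to it directly. The function name `q_update`, the learning rate `lr`, and the toy transition are assumptions for illustration:

```python
import numpy as np

def q_update(Q, state, action, reward, next_state, lr=0.5, gamma=0.9):
    """Apply one Q-learning update in place and return the TD error."""
    # Target: immediate reward plus discounted value of the best next action
    td_target = reward + gamma * Q[next_state].max()
    td_error = td_target - Q[state, action]
    Q[state, action] += lr * td_error
    return td_error

Q = np.zeros((2, 2))  # 2 states x 2 actions, all values start at zero
first_error = q_update(Q, state=0, action=1, reward=1.0, next_state=1)
second_error = q_update(Q, state=0, action=1, reward=1.0, next_state=1)
print(Q)
```

Each repeat of the same transition moves `Q[0, 1]` halfway towards the target of 1.0, so the TD error shrinks from 1.0 to 0.5.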

