## Model testing and selection 
> In this notebook we test several RNN models that predict a company's next-day closing stock price.
At the end we select one model to train on all of our stocks.

###  1) Import data_manager module and other libraries

> The `data_manager` module has helper classes and methods for working with our stock data.

> The module has the following classes that we will use in this notebook:

> ###### SimpleSequence -
> Sequence class that creates input (x) and target (y) for RNN training or prediction,
> based on given window size (x) and target (y) lengths.
> The sequence is created from end-of-day normalized adjusted close stock prices.

> ###### MultiSequence -
> Sequence class that creates input (x) and target (y) for RNN training or prediction,
> based on given window size (x) and target (y) lengths.
> The sequence is created from three features: i) end-of-day normalized adjusted close stock prices,
> ii) log normal returns and iii) the normalized Money Flow Index (MFI).

> We will also use a few other helper methods, such as `companies()` and `split_data()`, from the `data_manager` module.
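
The `SimpleSequence` and `MultiSequence` implementations live in `data_manager` and are not shown in this notebook. As a rough illustration of what "window size (x) and target (y) lengths" means, the sketch below slices a series into overlapping input windows and the values that follow them. The helper name `make_windows` is hypothetical, and taking the first feature column as the target is an assumption.

```python
import numpy as np

def make_windows(series, window_size, target_length=1):
    """Hypothetical helper: slice a (time, features) array into X/y pairs."""
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series.reshape(-1, 1)  # treat a 1-D series as one feature
    n_samples = len(series) - window_size - target_length + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window")
    # Each input sample is window_size consecutive time steps
    X = np.stack([series[i:i + window_size] for i in range(n_samples)])
    # Assumption: the target is the first feature column (the closing price)
    y = np.stack([series[i + window_size:i + window_size + target_length, 0]
                  for i in range(n_samples)])
    return X, y

prices = np.linspace(-1.0, 1.0, 12)  # stand-in for normalized prices
X, y = make_windows(prices, window_size=5, target_length=1)
print(X.shape, y.shape)  # (7, 5, 1) (7, 1)
```

The three-dimensional shape `(samples, window_size, features)` is what the LSTM layers later in the notebook read through `input_shape=(X.shape[1:])`.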

In [7]:
%load_ext autoreload
%aimport data_manager
%autoreload 1

from data_manager import *

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


###  2) Import company list
> Here we read a CSV file and import a list of company trade symbols.

In [8]:
#read list of companies from csv file
stocks = companies()
tickers = stocks.values.tolist()

#Select stock to perform tests
ticker = tickers[0]

print("Stock ticker selected for testing: {}".format(ticker))

Stock ticker selected for testing: ['Snap', 'SNAP', 'Social']


### 3) RNN Models
> In this step we define four RNN models that we will train and then evaluate on unseen data.

In [9]:
import numpy as np
import pandas as pd

from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout, Bidirectional
from keras.optimizers import RMSprop

def fixed_model(X,y, learn_rate):
    """
    RNN model with one LSTM layer (output of 5) and 1 fully connected output tanh layer    
    
    Parameter
    -----------
    X: numpy array
              input sequence data.
    
    y: numpy array
              target sequence data.
    
    learn_rate: float
            Neural network learning rate.
    """
    model = Sequential()
    model.add(LSTM(5,input_shape=(X.shape[1:])))
    model.add(Dense(y.shape[1], activation='tanh'))
      
    # compile the model
    optimizer = RMSprop(lr=learn_rate)
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model

def dynamic_model(X,y, learn_rate):
    """
    RNN model with one LSTM layer (output based on input sequence length) and 1 fully connected output tanh layer    
    
    Parameter
    -----------
    X: numpy array
              input sequence data.
    
    y: numpy array
              target sequence data.
    
    learn_rate: float
            Neural network learning rate.
    """
    model = Sequential()
    model.add(LSTM(X.shape[1],input_shape=(X.shape[1:])))
    model.add(Dense(y.shape[1], activation='tanh'))
      
    # compile the model
    optimizer = RMSprop(lr=learn_rate)
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model

def bidirectional_model(X,y, learn_rate):
    """
    Bidirectional RNN model with one LSTM layer (output based on input sequence length), 
    one fully connected layer (output based on input sequence length) 
    and 1 fully connected output tanh layer    
    
    Parameter
    -----------
    X: numpy array
              input sequence data.
    
    y: numpy array
              target sequence data.
    
    learn_rate: float
            Neural network learning rate.
    """
    model = Sequential()
    model.add(Bidirectional(LSTM(X.shape[1],return_sequences=False), input_shape=(X.shape[1:])))
    model.add(Dense(X.shape[1]))
    model.add(Dense(y.shape[1], activation='tanh'))
      
    # compile the model
    optimizer = RMSprop(lr=learn_rate)
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model

def stacked_model(X,y, learn_rate):
    """
    Stacked RNN model with two LSTM layers and 1 fully connected output tanh layer.
    First LSTM layer has output of 10 and the second has 5.
    
    Parameter
    -----------
    X: numpy array
              input sequence data.
    
    y: numpy array
              target sequence data.
    
    learn_rate: float
            Neural network learning rate.
    """
    model = Sequential()
    model.add(LSTM(10,return_sequences=True, input_shape=(X.shape[1:])))
    model.add(LSTM(5))
    model.add(Dense(y.shape[1], activation='tanh'))
      
    # compile the model
    optimizer = RMSprop(lr=learn_rate)
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model

#Create list of our models for use by the testing function.
models =[]
models.append(("Fixed",fixed_model))
models.append(("Dynamic",dynamic_model))
models.append(("Bidirectional",bidirectional_model))
models.append(("Stacked",stacked_model))

### 4) Testing function
> Here we define a `'test_model()'` function to evaluate each RNN model.
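
All four models are compiled with `loss='mean_squared_error'` and no extra metrics, so the training and testing errors reported by `model.evaluate()` are mean squared errors on the normalized targets. A minimal NumPy sketch of that quantity, using made-up values purely for illustration:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Illustrative values only, not results from this notebook
y_true = np.array([[0.10], [0.20], [0.30]])
y_pred = np.array([[0.10], [0.25], [0.20]])
mse = mean_squared_error(y_true, y_pred)
print(round(mse, 4))  # 0.0042
```

Because the error is computed on normalized data, it is only comparable between runs that use the same normalization.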

In [10]:
from collections import OrderedDict

def test_model(ticker,epochs,models,seq,window_sizes):
    """
    Function to test the performance of our RNN models.
    
    Parameters
    -----------
    ticker: str
            Company trade ticker.
    
    epochs: int
            Number of epochs to train each RNN.
    
    models: list of RNN model functions  
            Each item is a tuple where the 1st item is the string name of the model
            and the 2nd is a model function that accepts X, y and learn_rate parameters.
    
    seq:    tuple
            The 1st item is the string name of the sequence and the 2nd is the
            sequence class, which is instantiated with (ticker, window_size, 1).
    
    window_sizes: list
                  A list of different window sizes (sequence length of the X input) to test.  
    Returns
    ---------
    An ordered dictionary with the result of the model testing as seven lists:
    'Window Size', 'Sequence Name', 'Model Name', 'Ticker',
    'Training Error', 'Testing Error' and 'Param Count'.
    """
    #test result data
    sizes = []
    #seq_name = []
    model_name = []
    train_errors = []
    test_errors = []
    param_count = []
    
    for window_size in window_sizes:
        print("\nWindow size: {}".format(window_size))
        print('----------------')
        for model_item in models:
            seq_obj = seq[1](ticker,window_size,1)
            X_train,y_train,X_test,y_test = split_data(seq_obj)
            model = model_item[1](X_train,y_train,0.001)
            
            # fit model!
            model.fit(X_train, y_train, epochs=epochs, batch_size=50, verbose=0)

            # print out training and testing errors
            training_error = model.evaluate(X_train, y_train, verbose=0)
            testing_error = model.evaluate(X_test, y_test, verbose=0)
            msg = " > Model: {0:<15} Param count: {1:} \tTraining error: {2:.4f}\tTesting error: {3:.4f}"
            print(msg.format(model_item[0],model.count_params(),training_error,testing_error))

            #update result variables
            param_count.append(model.count_params())
            sizes.append(window_size)
            #seq_name.append(seq[0])
            model_name.append(model_item[0])
            train_errors.append(float("{0:.4f}".format(training_error)))
            test_errors.append(float("{0:.4f}".format( testing_error)))

    table= OrderedDict()
    table['Window Size'] = sizes
    table['Sequence Name'] =  [seq[0] for _ in range(len(sizes))]
    table['Model Name'] = model_name
    table['Ticker'] = [ticker for _ in range(len(sizes))]
    table['Training Error'] = train_errors
    table['Testing Error'] = test_errors
    table['Param Count'] = param_count
        
    return table


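def update_test_table(*argv):
    """Updates the model testing table stored in ./data/model_test.csv.

    Each argument is a result dictionary returned by test_model(). Existing rows
    with the same ticker and sequence name are replaced by the new results.
    """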
    file_path = "./data/model_test.csv"
    
    table = pd.read_csv(file_path)
    tickers = set( table['Ticker'].values.tolist())
    
    for item in argv:

        #first check if already exist 
        check = item['Ticker'][0]
        if check in tickers:
            #drop items
            idx = table[(table['Ticker']== check)  &  (table['Sequence Name']== item['Sequence Name'][0])].index
            table =  table.drop(idx)

        #append current test (DataFrame.append was removed in pandas 2.0)
        table = pd.concat([table, pd.DataFrame(item)])

    table = table.reset_index(drop=True)
    table.to_csv(file_path, index = False)

def get_test_table():
    """Returns the model testing table as a DataFrame.
    """
    file_path = "./data/model_test.csv"
    return pd.read_csv(file_path)


### 5) Perform model testing
> We test each model using a one-feature input sequence and a three-feature input sequence, each with several sequence lengths (window sizes).

> * The first test uses the `SimpleSequence()` class from `data_manager` to evaluate how well it performs with the four RNN
models.  `SimpleSequence()` is a one-feature sequence based on normalized stock prices.   


> * In the second test we use the `MultiSequence()` class from `data_manager`.  `MultiSequence()` is a sequence of three normalized features: closing stock prices, log normal daily returns and the Money Flow Index (MFI).


> * The goals of the testing are to 1) decide which input sequence is better, 2) select the best performing window size and 3) choose the RNN model that best captures the target variable.
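
How `data_manager` builds and normalizes these features is not shown here. The sketch below illustrates two of the three: closing prices and daily returns. It reads "log normal daily returns" as log returns and scales each feature to [-1, 1], which suits a tanh output; both choices are assumptions. The MFI feature is omitted, and both helper names are hypothetical.

```python
import numpy as np

def log_returns(prices):
    """Daily log returns: ln(p_t / p_{t-1}). One element shorter than prices."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

def scale_to_unit_range(values):
    """Min-max scale to [-1, 1] (assumed normalization, not from data_manager)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        return np.zeros_like(values)  # constant series carries no information
    return 2.0 * (values - lo) / (hi - lo) - 1.0

close = np.array([100.0, 102.0, 101.0, 105.0])  # illustrative prices
returns = log_returns(close)
scaled_close = scale_to_unit_range(close)
# Drop the first price so both feature columns cover the same days
features = np.column_stack([scaled_close[1:], scale_to_unit_range(returns)])
print(features.shape)  # (3, 2)
```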

In [11]:
seed = 7
np.random.seed(seed)

#Model testing variables
epochs =100
window_sizes =[5,7,10,20]

In [13]:
print("*** Simple Sequence Model Test for {} ***".format(ticker))
print("=" * 45)

seq_name = ('Simple',SimpleSequence)

test_1  = test_model(ticker,epochs,models,seq_name,window_sizes)
update_test_table(test_1)

*** Simple Sequence Model Test for ['Snap', 'SNAP', 'Social'] ***

Window size: 5
----------------
Unexpected error for symbol ['Snap', 'SNAP', 'Social']:<class 'FileNotFoundError'>


AttributeError: 'SimpleSequence' object has no attribute '_SequenceBase__data_normal'

In [None]:
print("*** Multi Sequence Model Test for {} ***".format(ticker))
print("=" * 45)

seq_name = ('Multi',MultiSequence)

test_2  = test_model(ticker,epochs,models,seq_name,window_sizes)
update_test_table(test_2)

### 6) Evaluate and summarize test results

In [None]:
#update and get model testing table
#table = update_test_table(test_1,test_2)

table = get_test_table()

#### Summarize model testing by sequence

In [None]:
pd.pivot_table(table, values=['Training Error','Testing Error'], index=['Sequence Name']
               ,aggfunc={'Training Error':np.mean, 'Testing Error':np.mean} )

#### Summarize model testing by Ticker symbol and window size

In [None]:
pd.pivot_table(table, values=['Training Error','Testing Error'], index=['Ticker','Window Size']
               ,aggfunc={'Training Error':np.mean, 'Testing Error':np.mean} )

#### Summarize model testing by sequence and window size

In [None]:
pd.pivot_table(table, values=['Training Error','Testing Error'], index=['Sequence Name','Window Size']
               ,aggfunc={'Training Error':np.mean, 'Testing Error':np.mean} )

#### Summarize model testing by RNN model

In [None]:
pd.pivot_table(table, values=['Training Error','Testing Error'], index=['Model Name']
               ,aggfunc={'Training Error':np.mean, 'Testing Error':np.mean} )

#### Summarize model testing by sequence and RNN model

In [None]:
pd.pivot_table(table, values=['Training Error','Testing Error'], index=['Sequence Name' ,'Model Name']
               ,aggfunc={'Training Error':np.mean, 'Testing Error':np.mean} )

#### Summarize model testing by model parameter count

In [None]:
pd.pivot_table(table, values='Param Count', index=['Sequence Name','Model Name'], columns=['Window Size'])
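
The summaries in this section all follow the same pattern: group the results table by one or more columns and average the errors. The toy example below uses made-up numbers (not results from this notebook) to show what such a pivot returns. It passes `aggfunc="mean"`, which behaves like the per-column `np.mean` mapping used above.

```python
import pandas as pd

# Made-up results table, for illustration only
toy = pd.DataFrame({
    "Sequence Name": ["Simple", "Simple", "Multi", "Multi"],
    "Model Name": ["Fixed", "Dynamic", "Fixed", "Dynamic"],
    "Training Error": [0.04, 0.02, 0.03, 0.01],
    "Testing Error": [0.06, 0.04, 0.05, 0.03],
})

# Average both error columns for each sequence type
summary = pd.pivot_table(
    toy,
    values=["Training Error", "Testing Error"],
    index=["Sequence Name"],
    aggfunc="mean",
)
print(summary)
```

Each row of `summary` is one group from `index`, and each column holds the mean of that error over all rows in the group.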

### Testing observations
* The multi sequence input performed better than the simple sequence input: both its training and testing errors are smaller.


* No single window size captured the target variable best.


* The dynamic and the bidirectional models performed the best, as they have the smallest training and testing errors.


* The difference in parameter count between the models is negligible, so we can perform our training on a CPU.


* All the models can probably be improved by adding a dropout layer, since the testing error was larger than the training error in every case.  Further testing is needed to check whether a higher epoch count can reduce the gap between training and testing error.  
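
The parameter counts mentioned above can be estimated by hand from the layer sizes. The sketch below uses the standard formulas for an LSTM layer (four gates, each with input weights, recurrent weights and a bias) and a dense layer, and assumes Keras defaults, including the bidirectional wrapper concatenating its two directions. The helper names are hypothetical; treat the result as a sanity check against `model.count_params()`, not a replacement for it.

```python
def lstm_params(input_dim, units):
    """Weights in one LSTM layer: 4 gates x (input + recurrent weights + bias)."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Weights in one fully connected layer: weight matrix plus bias."""
    return input_dim * units + units

def fixed_model_params(n_features, n_targets=1):
    """LSTM(5) followed by the output layer, as in fixed_model()."""
    return lstm_params(n_features, 5) + dense_params(5, n_targets)

def bidirectional_model_params(window_size, n_features, n_targets=1):
    """Bidirectional LSTM, hidden dense layer and output layer."""
    lstm = 2 * lstm_params(n_features, window_size)      # forward + backward
    hidden = dense_params(2 * window_size, window_size)  # concatenated outputs
    out = dense_params(window_size, n_targets)
    return lstm + hidden + out

print(fixed_model_params(n_features=1))                          # 146
print(bidirectional_model_params(window_size=10, n_features=3))  # 1341
```

Even the largest configuration here has only a few thousand weights, which is consistent with training on a CPU.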

### Conclusion and model selection 
Based on the model testing results we arrive at the following conclusions:
* We will use the multi sequence input since it better captures the target variable.


* Since no window size outperformed the others, we will pass a list of window sizes to our final model selection and return the best performing model.


* We choose the bidirectional model since it is the best performing model.


### 7) Live model testing
* In this section we define a live model, which is the bidirectional model with an added dropout layer.


* We test the live model with different dropout and learning rates to uncover the optimal rates.


* We use a window size of 10 at this point since we are only interested in finding the best learning and dropout rates.


* We also perform a test to gauge the optimal number of epochs.
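
For readers unfamiliar with the dropout layer tested below: during training it zeroes a random fraction of activations and scales the survivors by `1 / (1 - rate)`, and at prediction time it passes values through unchanged. The NumPy sketch below illustrates the idea; it is not the Keras implementation, and `dropout_forward` is a hypothetical name.

```python
import numpy as np

def dropout_forward(activations, rate, rng, training=True):
    """Inverted dropout: drop a fraction `rate` of units, rescale the rest."""
    activations = np.asarray(activations, dtype=float)
    if not training or rate == 0.0:
        return activations  # no-op at prediction time or with rate 0
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True = unit is kept
    return activations * mask / keep_prob

rng = np.random.default_rng(7)
hidden = np.ones((4, 10))  # stand-in for a dense layer's output
dropped = dropout_forward(hidden, rate=0.25, rng=rng)
print(np.unique(dropped))  # zeros and rescaled survivors
```

A dropout rate of 0.0 therefore reproduces the plain bidirectional model, which makes it a useful baseline in the rate comparison.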

In [None]:
def live_model(X,y, learn_rate,dropout):
    """
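     RNN model with the following layers:
        1) one bidirectional LSTM layer (output size based on X input sequence length)
        2) one fully connected layer (output size based on X input sequence length)
        3) Dropout (based on given dropout rate) 
        4) fully connected tanh output layer (output size based on y target length)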
    
    Parameter
    -----------
    X: numpy array
              input sequence data.
    
    y: numpy array
              target sequence data.
    
    learn_rate: float
            Neural network learning rate.
            
    dropout: float
            Dropout rate.
    """
    model = Sequential()
    model.add(Bidirectional(LSTM(X.shape[1],return_sequences=False), input_shape=(X.shape[1:])))
    model.add(Dense(X.shape[1]))
    model.add(Dropout(dropout))
    model.add(Dense(y.shape[1], activation='tanh'))
    
    # compile the model
    optimizer = RMSprop(lr=learn_rate)
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

window_size = 10
dropouts =  [0.0,0.25,0.4,0.50]
learn_rates = [0.01,0.001,0.0001]
batch_size = 50
epochs_live = 100

def test_live(X_train,y_train,X_test,y_test):
    best_model = None
    lowest_test_error = 2.0
    best_learn_rate = 0.0
    best_dropout_rate = 0.0
    for rate in learn_rates:
        print("\nLearn rate: {0:.4f}".format(rate))
        print('---------------------')
        lengend = []
        for dropout in dropouts:
            model = live_model(X_train,y_train,rate,dropout)
            history = model.fit(X_train, y_train, epochs=epochs_live, batch_size=batch_size, verbose=0)

            # print out training and testing errors
            training_error = model.evaluate(X_train, y_train, verbose=0)
            testing_error = model.evaluate(X_test, y_test, verbose=0)
            msg = " > Dropout: {0:.2f} Training error: {1:.4f}\tTesting error: {2:.4f}"
            print(msg.format(dropout, training_error,testing_error))
            
            #check if test error
            if lowest_test_error > testing_error:
                best_model = model
                lowest_test_error = testing_error
                best_learn_rate = rate
                best_dropout_rate = dropout
                
            #plot loss function
            plt.plot(history.history['loss'])
            lengend.append("Drop {0:.4f}".format(dropout)) 
    
        plt.title("Learn rate {0:.4f}".format(rate))
        plt.xlabel('epochs')
        plt.ylabel('loss')
        plt.legend(lengend,loc='center left', bbox_to_anchor=(1, 0.5))
        plt.show()
    
    return (best_model,lowest_test_error,best_learn_rate,best_dropout_rate)


seq_obj = MultiSequence(ticker,window_size,1)
dataset = seq_obj.original_data
X_train,y_train,X_test,y_test = split_data(seq_obj)

print("*** Live Model Testing ***")
print("=" * 40)        
results = test_live(X_train,y_train,X_test,y_test)


print("*** Best Live Model Summary***")
print("=" * 40) 
print("Testing error: {0:.4f}".format(results[1]))
print("Best learning rate: {}".format(results[2]))
print("Best dropout rate: {}".format(results[3]))

### Learn rate and dropout testing results
> * Looking at the testing results we can see that learning rates of 0.01 and 0.001 performed better than 0.0001.
> * The dropout rates of 0.0, 0.25 and 0.40 had the best results.

### Epoch testing
> We perform a test to try and find the optimal epoch count.

In [None]:
#get four tickers to perform our epoch test
ticker_epochs = [tickers[i][1] for i in range(4)]

window_size = 10
dropout_rate = 0.25
epochs_list = [50,100,200,500,1000]
batch_size = 50
learn_rate = 0.001

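def test_epochs():
    """
    Trains the live model for each symbol in ticker_epochs, once per epoch count
    in epochs_list, and prints the epoch count with the lowest testing error.
    """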
    for symbol in ticker_epochs:
        print("\nSymbol: {}".format(symbol))
        print('---------------------')
        seq_obj = MultiSequence(symbol,window_size,1)
        X_train,y_train,X_test,y_test = split_data(seq_obj)
        lowest_test_error = 2.0
        best_epoch = 0
        for epoch in epochs_list:
            model = live_model(X_train,y_train,learn_rate,dropout_rate)
            model.fit(X_train, y_train, epochs=epoch, batch_size=batch_size, verbose=0)

            # print out training and testing errors
            training_error = model.evaluate(X_train, y_train, verbose=0)
            testing_error = model.evaluate(X_test, y_test, verbose=0)
            msg = " > Epoch: {0:} \tTraining error: {1:.4f}\tTesting error: {2:.4f}"
            print(msg.format(epoch, training_error,testing_error))

            if lowest_test_error > testing_error:
                lowest_test_error = testing_error
                best_epoch = epoch
        
        #print best epoch for symbol
        print(" ==> Best epoch {0:} with testing error of {1:.4f}".format(best_epoch,lowest_test_error))

print("*** Epoch Model Testing ***")
print("=" * 40)        
test_epochs()

### Epoch testing conclusion
> Our epoch testing finds no single optimal epoch count, so we should try both 100 and 200 epochs and return the model that performs best. 
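
The test above retrains a model from scratch for every epoch count. One alternative, which this notebook does not use, is to train once with the largest count and record the test loss after every epoch. In Keras this could be done by passing `validation_data=(X_test, y_test)` to `fit` and reading `history.history['val_loss']`. The sketch below shows only the selection step, on made-up losses; `best_epoch` is a hypothetical helper.

```python
def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest loss; epochs are counted from 1."""
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return index + 1, val_losses[index]

# Made-up per-epoch test losses, for illustration only
val_losses = [0.050, 0.031, 0.024, 0.026, 0.029]
print(best_epoch(val_losses))  # (3, 0.024)
```

Note that choosing the epoch count on the test set makes the reported test error optimistic, which also applies to the retraining approach used above.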

### Best model selection

> * Here we put together everything we learned from our testing to select a model for a given ticker.


> * To select the best model for a ticker we define a function that accepts lists of window sizes, dropout rates, learning rates
and epoch counts.  


> * We graph the model performance versus the original dataset.
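
The selection described above is an exhaustive grid search: every combination of the listed values is scored and the one with the lowest testing error is kept. The sketch below shows that pattern in isolation using `itertools.product`. Here `grid_search` and `fake_test_error` are hypothetical stand-ins; in the notebook the score is the testing error of a freshly trained model.

```python
from itertools import product

def grid_search(score_fn, **grid):
    """Score every combination in `grid`; return the lowest-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def fake_test_error(window_size, learn_rate, dropout, epochs):
    """Stand-in for 'train a model and return its testing error'."""
    return (abs(window_size - 7) * 0.01 + abs(learn_rate - 0.001)
            + abs(dropout - 0.25) * 0.1 + abs(epochs - 200) * 1e-5)

best_params, best_score = grid_search(
    fake_test_error,
    window_size=[5, 7, 10],
    learn_rate=[0.01, 0.001],
    dropout=[0.0, 0.25, 0.4],
    epochs=[100, 200, 500],
)
print(best_params)
```

With the value lists used in this section the grid has 3 x 2 x 3 x 3 = 54 combinations, so one model is trained 54 times per ticker.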

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

#use the trade symbol (second item of the row), as in the epoch test
ticker = tickers[0][1]
window_sizes = [5,7,10]
dropouts =  [0.0,0.25,0.4]
learn_rates = [0.01,0.001]
epochs = [100,200,500]
batch_size = 50

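def best_model(ticker, window_sizes, learn_rates, dropouts, epochs, batch_size):
    """
    Trains the live model for every combination of window size, learning rate,
    dropout rate and epoch count, and returns a dictionary describing the
    combination with the lowest testing error (including the trained model).
    """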
    #our best model variables
    best_model = None
    lowest_test_error = 2.0
    best_training_error =0.0
    best_learn_rate = 0.0
    best_dropout_rate = 0.0
    best_epoch = 0
    best_window_size = 0
    
    counter = 1
    
    for window_size in window_sizes:
        print("\nWindow size: {}".format(window_size))
        print('---------------------')
        
        #prepare our sequence data
        seq_obj = MultiSequence(ticker,window_size,1)
        X_train,y_train,X_test,y_test = split_data(seq_obj)    
    
        for rate in learn_rates:
            for dropout in dropouts:
                for epoch in epochs:
                    model = live_model(X_train,y_train,rate,dropout)
                    model.fit(X_train, y_train, epochs=epoch, batch_size=batch_size, verbose=0)

                    # print out training and testing errors
                    training_error = model.evaluate(X_train, y_train, verbose=0)
                    testing_error = model.evaluate(X_test, y_test, verbose=0)
                    msg = " > Learn rate: {0:.4f} Dropout: {1:.2f}"
                    msg += " Epoch: {2:} Training error: {3:.4f} Testing error: {4:.4f}"
                    msg = str(counter) + "   " +msg.format(rate,dropout, epoch, training_error, testing_error)
                    print(msg)

                    #check if test error 
                    if lowest_test_error > testing_error:
                        best_model = model
                        lowest_test_error = testing_error
                        best_learn_rate = rate
                        best_dropout_rate = dropout
                        best_epoch = epoch
                        best_training_error = training_error 
                        best_window_size = window_size
                    
                    #increase our print counter
                    counter += 1
                        
    best_dict ={}
    best_dict["ticker"] = ticker
    best_dict["model"] = best_model
    best_dict["test_error"] =   "{0:.4f}".format(lowest_test_error) 
    best_dict["learn_rate"] = best_learn_rate
    best_dict["dropout"] = best_dropout_rate
    best_dict["epoch"] = best_epoch
    best_dict["train_error"] =  "{0:.4f}".format(best_training_error)  
    best_dict["window_size"] = best_window_size
    
    return best_dict


print("*** Best Model Selection for {} ***".format(ticker))
print("=" * 40)      
results = best_model(ticker, window_sizes, learn_rates, dropouts, epochs, batch_size)

In [None]:
print("*** Best Model Selected Summary for {} ***".format(results["ticker"]))
print("=" * 40) 

print("Window size: {}".format(results["window_size"]))
print("Train error: {}".format(results["train_error"]))
print("Testing error: {}".format(results["test_error"]))
print("Learning rate: {}".format(results["learn_rate"]))
print("Dropout rate: {}".format(results["dropout"]))
print("Epochs: {}".format(results["epoch"]))

seq_obj = MultiSequence(results["ticker"],results["window_size"],1)
dataset = seq_obj.original_data
X_train,y_train,X_test,y_test = split_data(seq_obj)

graph_prediction(results["model"], X_train,X_test,dataset,results["window_size"])