# Predicting Financial Crisis with Machine Learning Methodology

Accurately predicting financial crises is a critical task for economists, policymakers, and financial institutions, given the large economic and social costs of such crises (Hoggarth, Reis and Saporta (2002); Ollivaud and Turner (2015); Laeven and Valencia (2018)). Recent events such as the failures of Silicon Valley Bank (SVB) and First Republic Bank, the merger of Credit Suisse, and the lingering effects of the Global Financial Crisis (GFC) underscore the importance of identifying effective early warning indicators (EWIs) that can help avert such upheavals and mitigate their cost. This paper presents a novel approach to analyzing and categorizing these early warning indicators, aiming to offer a more refined understanding of their predictive utility in an increasingly dynamic and volatile banking landscape.

Previously, using Sparse-TVP modeling, our study classified a broad spectrum of explanatory variables into three distinct categories: stable EWIs, time-varying EWIs, and variables irrelevant to predicting financial crises. The forecasting power of EWIs has not been sufficiently analyzed or evaluated. Most existing research has focused on identifying which variables can be considered EWIs, such as the work of Aldasoro, Borio, and Drehmann in 2018 (Iñaki Aldasoro (2018)), Fahlenbrach, Prilmeier, and Stulz in 2012 (Rüdiger Fahlenbrach (2022)), or Sohn and Park in 2016 (Sohn and Park (2016)). However, these studies treat the importance of each EWI as static and do not consider how it varies over time.

We employed a sparse estimation technique from Bayesian inference based on shrinkage prior distributions. We imposed shrinkage priors on both the initial values and the standard deviations of the time-varying coefficient vector. This allows time-varying and stable parameters to be estimated simultaneously, while also distinguishing indicators that are useful for predicting financial crises from those that are not. Such inference methods have only recently been understood theoretically, and this paper is among the first to apply them to the identification of EWIs and the prediction of financial crises.
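
As a rough illustration only, a time-varying-parameter logit with this kind of shrinkage could be written as follows. The notation ($y_{it}$, $x_{it}$, $\beta_{j,t}$, $\omega_j$) is assumed here for exposition and is not taken from the paper's own specification.

```latex
\begin{aligned}
\Pr(y_{it} = 1 \mid x_{it}) &= \frac{1}{1 + \exp\!\left(-x_{it}^{\top}\beta_{t}\right)}, \\
\beta_{j,t} &= \beta_{j,t-1} + \omega_{j}\,\eta_{j,t}, \qquad \eta_{j,t} \sim \mathcal{N}(0, 1), \\
\beta_{j,0} &\sim \text{shrinkage prior centered at } 0, \qquad
\omega_{j} \sim \text{shrinkage prior centered at } 0 .
\end{aligned}
```

Under this reading, indicator $j$ is irrelevant when both $\beta_{j,0}$ and $\omega_j$ are shrunk to zero, stable when only $\omega_j$ is shrunk to zero, and time-varying when $\omega_j$ stays clearly away from zero.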

Our paper makes two significant contributions to the field. First, we identify a set of EWIs among various macroeconomic variables. Our analysis reveals that these EWIs include both indicators that have been consistently useful over a long period and indicators that have become effective only recently, mainly since the 1980s. The stable EWIs include the credit-to-GDP ratio and the non-core funding ratio of banks, in line with numerous previous studies that have attempted to identify EWIs. Variables that have gained importance in recent years include the global credit-to-GDP ratio, which suggests a growing need to consider global factors, not just domestic ones, in predicting and understanding financial crises.

Second, we confirm that our model outperforms alternative models, such as logistic regression and LASSO regression, in out-of-sample prediction accuracy. Our results suggest that accounting for the time variation of EWIs can enhance our understanding of past crises and, at the same time, offer valuable insights for predicting future financial crises.

This section evaluates the out-of-sample prediction performance of various machine learning models, including recurrent neural networks. Later in the paper, we compare the results with the Sparse-TVP model.

## Import Libraries

In [1]:

# Standard library
import time
from datetime import datetime

# Numerical and data handling
import numpy as np
import pandas as pd
from numpy.random import seed
from scipy.stats import norm
import matplotlib.pyplot as plt

# TensorFlow / Keras
import tensorflow as tf
from tensorflow import keras
from keras import backend as K
from keras import layers, optimizers
from keras.layers import (Activation, BatchNormalization, Dense, Dropout,
                          Embedding, GRU, LSTM, SimpleRNN)
from keras.models import Sequential, load_model
from keras.regularizers import l2 as l2_reg

print(tf.__version__)

# scikit-learn / statsmodels
from sklearn import linear_model
from sklearn.calibration import calibration_curve
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.preprocessing import OneHotEncoder
import statsmodels.api as sm



2.8.0


In [5]:
# Suppress verbose stdout output while training deep learning models
from contextlib import contextmanager
import sys, os

@contextmanager
def suppress_stdout():
    with open(os.devnull, "w") as devnull:
        old_stdout = sys.stdout
        sys.stdout = devnull
        try:  
            yield
        finally:
            sys.stdout = old_stdout

print("Now you see it")
with suppress_stdout():
    print("Now you don't")

Now you see it
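
The standard library also provides `contextlib.redirect_stdout`, which can serve the same purpose. The sketch below is an alternative under that assumption, not the notebook's own helper; `quiet` and `noisy_step` are hypothetical names used only for illustration.

```python
import io
import os
from contextlib import contextmanager, redirect_stdout


@contextmanager
def quiet():
    """Discard anything printed inside the with-block (hypothetical helper)."""
    with open(os.devnull, "w") as devnull, redirect_stdout(devnull):
        yield


def noisy_step():
    """Stand-in for a verbose training call."""
    print("epoch 1/1 ...")
    return 42


# The return value still comes through; only the printed text is discarded.
with quiet():
    result = noisy_step()

# Capturing instead of discarding works the same way with a StringIO buffer.
buffer = io.StringIO()
with redirect_stdout(buffer):
    noisy_step()
captured = buffer.getvalue()
```

Like the hand-written manager, this only redirects Python-level writes to `sys.stdout`.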


## Import Data

In [4]:
df_norm = pd.read_csv('df_normalized.csv')
df_norm.rename(columns={'crisis':'precrisis'},inplace=True)
df_norm['cid'] = df_norm.country.astype('category').cat.codes

pd.set_option('display.max_rows', 100)
display(df_norm.head(10))

Unnamed: 0,year,country,precrisis,Slope of Yield Curves,Slope of Yield Curves*,Public Debt to GDP,Credit to GDP Change,Credit to GDP Change*,Current Account Change,Investment to GDP Change,Exchange Rates Change,Capital Asset Ratio Change,Noncore Funding Ratio Change,Inflation Rate,Equity Growth,Consumption Growth,Real Houseprice Growth,cid
0,1874,Australia,0,-0.311148,1.542448,-0.847096,-0.334433,0.273561,-1.757765,0.28436,-0.036177,-1.184634,0.190768,-0.201102,0.487666,1.243702,0.409785,0
1,1875,Australia,0,-0.435673,0.476695,-0.744683,-0.049236,0.624199,0.510137,1.21684,-0.036451,-1.126137,-0.328148,0.079602,0.290156,1.077924,0.413669,0
2,1876,Australia,0,-0.406018,1.461477,-0.940277,0.078602,0.035903,-0.08523,0.842393,-0.036373,-2.00702,-0.219864,-0.22084,0.149136,0.195999,-1.289768,0
3,1877,Australia,0,-0.445026,1.295201,-0.894524,0.532653,0.202339,-0.654389,1.751451,-0.033969,-0.645746,-0.959008,-0.233001,0.19467,-1.232482,-1.281108,0
4,1878,Australia,0,-0.569581,0.600902,-0.902954,0.461513,1.07826,-0.907407,0.351858,-0.033873,0.669835,-0.355592,-0.494379,-0.392927,0.448139,0.392051,0
5,1879,Australia,0,-0.538929,0.239381,-0.809081,-0.628376,0.629703,0.332873,-2.039362,-0.03543,0.632769,0.142456,-0.994401,-0.250951,0.084916,1.100273,0
6,1880,Australia,0,-0.535863,0.101709,-0.774537,-1.294604,-0.334127,1.741075,-0.343232,-0.036364,-0.408292,0.043283,-1.125778,0.452409,-1.535352,-0.161211,0
7,1881,Australia,0,-0.853721,-0.768216,-0.748546,-0.188738,-0.021616,-0.662892,1.808328,-0.036312,-2.328051,0.15038,-1.011321,0.238567,0.013518,-0.44892,0
8,1882,Australia,0,-0.731929,-0.89486,-0.747641,1.313393,0.359391,-2.664316,-0.005106,-0.036366,-2.194431,-0.81377,-0.622161,0.023392,-0.021347,-0.108647,0
9,1883,Australia,0,-0.734599,-1.076544,-0.725891,0.58621,-0.060203,-0.237749,-1.139963,-0.036526,-0.264986,0.138138,0.465016,0.346647,-0.331488,-0.684385,0


In [6]:
# Columns
list(df_norm)

['year',
 'country',
 'precrisis',
 'Slope of Yield Curves',
 'Slope of Yield Curves*',
 'Public Debt to GDP',
 'Credit to GDP Change',
 'Credit to GDP Change*',
 'Current Account Change',
 'Investment to GDP Change',
 'Exchange Rates Change',
 'Capital Asset Ratio Change',
 'Noncore Funding Ratio Change',
 'Inflation Rate',
 'Equity Growth',
 'Consumption Growth',
 'Real Houseprice Growth',
 'cid']

## Implement the Prediction Evaluation


In [7]:
# Return fnr, fpr, the loss and the relative usefulness given [tp,fn,tn,fp] and theta.
def loss_use(tpfntnfp,theta):
  """
  Parameters

  tpfntnfp: Counts in the order [tp, fn, tn, fp], as returned by tpfntnfp_fun.
  theta: Preference parameter between 0 and 1. It represents how much more the policymaker fears missing a crisis than raising a false alarm.
  Example: if theta=0.8, 80% of the weight goes to catching crises and only 20% to avoiding false alarms.
  A higher theta makes the policymaker more cautious (missed crises cost more); with a lower theta, the alarm should sound only when a crisis looks very likely.

  """

  # True Positive
  tp=tpfntnfp[0]
  # False Negative
  fn=tpfntnfp[1]
  # True Negative
  tn=tpfntnfp[2]
  # False Positive
  fp=tpfntnfp[3]

  # False Negative Rates: Missed Crisis
  fnr = fn/(tp+fn)
  # False Positive Rates: False Alarm
  fpr = fp/(tn+fp)  
  loss = theta * fnr + (1-theta) * fpr
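  # Relative usefulness: share of the naive benchmark loss min(theta, 1-theta)
  # (always or never raising the alarm) that the model avoids; > 0 means it beats the naive rule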
  ru = (np.minimum(theta,1-theta)-loss)/np.minimum(theta,1-theta)
  return fnr,fpr,loss,ru  

# Return [tp,fn,tn,fp] for test data given threshold t
def tpfntnfp_fun(ypred,ytrue,t):
  """
  This function sorts the model's predictions into four categories based on whether each one was right or wrong.
  
  True Positive   : Predicted a crisis, and a crisis actually happened.
  False Negative  : Predicted "No Crisis," but a crisis happened.
  False Positive  : Predicted a crisis, but nothing happened.
  True Negative   : Predicted "No Crisis," and nothing happened.
  
  """
  tp = np.sum(np.logical_and(ypred>=t,ytrue==1))
  fn = np.sum(np.logical_and(ypred<t,ytrue==1))
  fp = np.sum(np.logical_and(ypred>=t,ytrue==0))
  tn = np.sum(np.logical_and(ypred<t,ytrue==0))
  return [tp,fn,tn,fp]

# Returns the optimal threshold for policymaker's loss function with parameter theta,
# given a set of training data for the policymaker
def optimize_threshold(ypred,ytrue,theta):
  """

  This function decides at what percentage we should actually sound the alarm. 
  This "cutoff point" is called the Threshold.

  """
  # Suffices to consider each ypred with '1' (actual crisis events) as thresholds
  T = np.sort(ypred[ytrue==1])
  t_opt = 0
  loss_opt = 1e12
  for t in T:
    #  Calculate fpr, tpr, fnr, tnr using threshold t
    # true positive = indicator correctly alerts crisis
    tp = np.sum(np.logical_and(ypred>=t,ytrue==1))
    fn = np.sum(np.logical_and(ypred<t,ytrue==1))
    fp = np.sum(np.logical_and(ypred>=t,ytrue==0))
    tn = np.sum(np.logical_and(ypred<t,ytrue==0))
    
    fnr = fn/(tp+fn) 
    fpr = fp/(tn+fp)
    loss = theta * fnr + (1-theta) * fpr
    if(loss<loss_opt):
      loss_opt = loss
      t_opt = t
  return t_opt
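
A small worked example may make the loss and the threshold search concrete. The sketch below re-implements the same formulas in plain Python on made-up predictions. The data and the helper names `rates`, `policy_loss`, and `best_threshold` are illustrative assumptions, not part of the notebook.

```python
def rates(ypred, ytrue, t):
    """Return (fnr, fpr) when an alarm is raised for every prediction >= t."""
    tp = sum(p >= t and y == 1 for p, y in zip(ypred, ytrue))
    fn = sum(p < t and y == 1 for p, y in zip(ypred, ytrue))
    fp = sum(p >= t and y == 0 for p, y in zip(ypred, ytrue))
    tn = sum(p < t and y == 0 for p, y in zip(ypred, ytrue))
    return fn / (tp + fn), fp / (tn + fp)


def policy_loss(fnr, fpr, theta):
    """Weighted loss: theta on missed crises, (1 - theta) on false alarms."""
    return theta * fnr + (1 - theta) * fpr


def best_threshold(ypred, ytrue, theta):
    """Try the prediction of each actual crisis as a candidate threshold."""
    candidates = sorted(p for p, y in zip(ypred, ytrue) if y == 1)
    return min(candidates,
               key=lambda t: policy_loss(*rates(ypred, ytrue, t), theta))


# Made-up predicted probabilities and outcomes (1 = pre-crisis observation).
ypred = [0.9, 0.6, 0.2, 0.7, 0.4, 0.3, 0.1, 0.05]
ytrue = [1, 1, 1, 0, 0, 0, 0, 0]
theta = 0.5

t_opt = best_threshold(ypred, ytrue, theta)
fnr, fpr = rates(ypred, ytrue, t_opt)
loss = policy_loss(fnr, fpr, theta)

# Relative usefulness compares the loss with the naive benchmark min(theta, 1 - theta).
benchmark = min(theta, 1 - theta)
ru = (benchmark - loss) / benchmark
```

With these numbers the search settles on a threshold of 0.6: one of three crises is missed (fnr = 1/3) and one of five calm periods triggers an alarm (fpr = 1/5), giving a loss of 4/15 and a relative usefulness of 7/15, or roughly 0.47.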

In [7]:
from sklearn.metrics import roc_auc_score
from keras.callbacks import Callback, EarlyStopping, ModelCheckpoint

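# Only on_epoch_end does any work; the other hooks are left as no-ops
class roc_callback(Callback):
    def __init__(self,training_data,validation_data):
        # Initialize the base Keras Callback before storing the data
        super().__init__()
        self.x = training_data[0]
        self.y = training_data[1]
        self.x_val = validation_data[0]
        self.y_val = validation_data[1]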


    def on_train_begin(self, logs={}):
        return

    def on_train_end(self, logs={}):
        return

    def on_epoch_begin(self, epoch, logs={}):
        return

    # Every 10 epochs, calculate the ROC AUC on the training and validation datasets
    def on_epoch_end(self, epoch, logs={}):

        if((epoch+1) % 10 == 0):
          y_pred = self.model.predict(self.x)
          roc = roc_auc_score(self.y, y_pred)
          
          y_pred_val = self.model.predict(self.x_val)
          roc_val = roc_auc_score(self.y_val, y_pred_val)
          # INAUC: How well the model knows the data it has already seen (In-sample AUC)
          # OUTAUC: How well the model performs on 'new' data it hasn't seen before (Out-of-sample AUC)
          print('REPSTAT',epoch+1,'INAUC %.3f' % roc,'OUTAUC %.3f' % roc_val)
        return

    def on_batch_begin(self, batch, logs={}):
        return

    def on_batch_end(self, batch, logs={}):
        return

In [8]:
import warnings

class EarlyStopping2(Callback):
    """Stop training when a monitored quantity has stopped improving.
    # Arguments
        monitor: quantity to be monitored.
        min_delta: minimum change in the monitored quantity
            to qualify as an improvement, i.e. an absolute
            change of less than min_delta, will count as no
            improvement.
        patience: number of epochs that produced the monitored
            quantity with no improvement after which training will
            be stopped.
            Validation quantities may not be produced for every
            epoch, if the validation frequency
            (`model.fit(validation_freq=5)`) is greater than one.
        verbose: verbosity mode.
        mode: one of {auto, min, max}. In `min` mode,
            training will stop when the quantity
            monitored has stopped decreasing; in `max`
            mode it will stop when the quantity
            monitored has stopped increasing; in `auto`
            mode, the direction is automatically inferred
            from the name of the monitored quantity.
        baseline: Baseline value for the monitored quantity to reach.
            Training will stop if the model doesn't show improvement
            over the baseline.
        restore_best_weights: whether to restore model weights from
            the epoch with the best value of the monitored quantity.
            If False, the model weights obtained at the last step of
            training are used.
    """

    def __init__(self,validation_data,training_data,
                 monitor='val_loss',
                 min_delta=0,
                 patience=0,
                 verbose=0,
                 mode='auto',
                 baseline=None,
                 restore_best_weights=False, message = ' '):
        # Inherits all the basic functions of a standard Keras Callback
        super(EarlyStopping2, self).__init__()
        self.x_val = validation_data[0]
        self.y_val = validation_data[1]
        self.x_train = training_data[0]
        self.y_train = training_data[1]
        
        self.monitor = monitor
        self.baseline = baseline
        self.patience = patience
        self.verbose = verbose
        self.min_delta = min_delta
        self.wait = 0
        self.stopped_epoch = 0
        self.restore_best_weights = restore_best_weights
        self.best_weights = None
        self.best_epoch = 0
        self.message = message

        if mode not in ['auto', 'min', 'max']:
            warnings.warn('EarlyStopping mode %s is unknown, '
                          'fallback to auto mode.' % mode,
                          RuntimeWarning)
            mode = 'auto'
        # Loss
        if mode == 'min':
            self.monitor_op = np.less
        # AUC
        elif mode == 'max':
            self.monitor_op = np.greater
        # auto
        else:
            # If monitors metrics contains 'acc', 'higher is better' and uses the "greater than" operator
            if 'acc' in self.monitor:
                self.monitor_op = np.greater
            # Otherwise, it assumes lower is better
            else:
                self.monitor_op = np.less

        # Directional Adjustment: 'Minimum change' value with the direction of improvement (positive for gains, negative for reductions)
        if self.monitor_op == np.greater:
            self.min_delta *= 1
        else:
            self.min_delta *= -1

    def on_train_begin(self, logs=None):
        # Reset
        self.wait = 0
        self.stopped_epoch = 0
        if self.baseline is not None:
            self.best = self.baseline
        else:
            # np.inf (lowercase) is the spelling supported across NumPy versions
            self.best = np.inf if self.monitor_op == np.less else -np.inf

    def on_epoch_end(self, epoch, logs=None):
        y_pred_val = self.model.predict(self.x_val)
        y_pred_train = self.model.predict(self.x_train)
        current = roc_auc_score(self.y_val, y_pred_val)
        current_train = roc_auc_score(self.y_train, y_pred_train)
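        print("Epoch " + str(epoch+1) + " AUC callback, validation: " + str(current) + " training: " + str(current_train) + self.message)
        # Warm-up: for the first 10 epochs the validation AUC is replaced by 0
        if(epoch<10):
          current = 0

        if current is None:
            return

        # The AUC is always compared with '>' here, regardless of monitor_op,
        # so the callback should be created with mode='max'
        # (in a 'min' setting self.best starts at +inf and no epoch counts as an improvement)
        if(current > self.best):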
            self.best = current
            self.best_epoch = epoch + 1
            # Reset
            self.wait = 0
            # If restore_best_weights = True, memorize the weight
            # If False as our default, ignore
            if self.restore_best_weights:
                self.best_weights = self.model.get_weights()
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                # emergency brake
                self.model.stop_training = True
                if self.restore_best_weights:
                    self.model.set_weights(self.best_weights)

    def on_train_end(self, logs=None):
        if self.stopped_epoch > 0 and self.verbose > 0:
            print('Epoch %05d: early stopping' % (self.stopped_epoch + 1))
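
The stopping rule can be traced without Keras. The following sketch replays the same bookkeeping (warm-up, best score, patience counter) on a hypothetical list of per-epoch validation AUCs, assuming the higher-is-better setting. `simulate_early_stopping` and all numbers are assumptions made for illustration.

```python
def simulate_early_stopping(val_aucs, patience, warmup=10):
    """Replay the callback's bookkeeping and return (best_epoch, stop_epoch).

    Epochs in the result are 1-based; stop_epoch is None when every epoch
    runs without the stop being triggered.
    """
    best = float("-inf")  # starting point when higher is better
    best_epoch = 0
    wait = 0
    for epoch, auc in enumerate(val_aucs):
        # During warm-up the score is replaced by 0.
        current = 0 if epoch < warmup else auc
        if current > best:
            best, best_epoch, wait = current, epoch + 1, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, epoch + 1
    return best_epoch, None


# Hypothetical validation AUCs: improving until epoch 4, then flat or worse.
aucs = [0.60, 0.65, 0.70, 0.72, 0.71, 0.72, 0.69, 0.68]
best_epoch, stop_epoch = simulate_early_stopping(aucs, patience=3, warmup=0)

# With a 10-epoch warm-up the counter keeps running while the score is held at 0.
short_patience = simulate_early_stopping([0.7] * 20, patience=3)
long_patience = simulate_early_stopping([0.7] * 20, patience=10)
```

In this sketch the first run records epoch 4 as the best and stops at epoch 7. The second pair shows an interaction worth keeping in mind: with the 10-epoch warm-up, a patience below 10 stops training at epoch 4, before any real AUC has been scored, whereas a patience of 10 survives the warm-up.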

### Calculate AUC

This code is a custom metric function designed to calculate the Area Under the ROC Curve (AUC) using older TensorFlow (specifically tf.contrib) features. It is more complex than a standard metric because it has to manage internal variables to keep a "running total" of the score during training.

In older versions of TensorFlow, calculating AUC wasn't as simple as a single number because AUC requires looking at the entire dataset at once to sort values and set thresholds. This function uses streaming_auc to update the score incrementally as each batch of data passes through the model.

* `tf.contrib.metrics.streaming_auc`: This is the engine. It tracks two things:

    * `value`: The current AUC score.
    * `update_op`: The operation that actually updates the "running count" of True Positives and False Positives.

* `metric_vars`: When you calculate a "streaming" metric, TensorFlow creates hidden "local variables" to store the counts. This line finds those specific variables so they don't get lost.

* `tf.add_to_collection`: This is a technical "fix." It moves those hidden local variables into the GLOBAL_VARIABLES group. This ensures that when the model starts a new session (starts running), these variables are properly initialized and ready to work.

* `tf.control_dependencies([update_op])`: This is the most important part for accuracy. It tells TensorFlow: "Do not return the AUC score until you have finished updating it with the latest data." This ensures the score you see on your screen is always up-to-date.

Under TF 2.x, `tf.contrib` is no longer available; use `tf.keras.metrics.AUC()` instead.
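
To see why a streaming AUC needs running counts, the sketch below accumulates true-positive and false-positive counts batch by batch at a fixed grid of thresholds, then compares the resulting area with the exact pairwise definition of AUC. It is plain Python written for illustration; `StreamingAUC` and `exact_auc` are hypothetical names, not TensorFlow APIs, and the data are made up.

```python
class StreamingAUC:
    """Accumulate confusion counts at fixed thresholds, one batch at a time."""

    def __init__(self, num_thresholds=101):
        eps = 1e-7
        inner = [i / (num_thresholds - 1) for i in range(1, num_thresholds - 1)]
        # End points sit just outside [0, 1] so the curve spans (1, 1) to (0, 0).
        self.thresholds = [-eps] + inner + [1 + eps]
        self.tp = [0] * num_thresholds
        self.fp = [0] * num_thresholds
        self.n_pos = 0
        self.n_neg = 0

    def update(self, y_true, y_pred):
        """Fold one batch into the running counts (the role of update_op)."""
        for y, p in zip(y_true, y_pred):
            counts = self.tp if y == 1 else self.fp
            if y == 1:
                self.n_pos += 1
            else:
                self.n_neg += 1
            for k, t in enumerate(self.thresholds):
                if p > t:
                    counts[k] += 1

    def result(self):
        """Trapezoidal area under the (FPR, TPR) points (the role of value)."""
        tpr = [c / self.n_pos for c in self.tp]
        fpr = [c / self.n_neg for c in self.fp]
        area = 0.0
        for k in range(len(self.thresholds) - 1):
            # FPR falls as the threshold rises, so the width is fpr[k] - fpr[k + 1].
            area += (fpr[k] - fpr[k + 1]) * (tpr[k] + tpr[k + 1]) / 2
        return area


def exact_auc(y_true, y_pred):
    """Share of (crisis, non-crisis) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_pred) if y == 1]
    neg = [p for y, p in zip(y_true, y_pred) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Two made-up batches of (labels, predicted probabilities).
batches = [
    ([1, 1, 0, 0], [0.9, 0.6, 0.7, 0.4]),
    ([1, 0, 0, 0], [0.2, 0.3, 0.1, 0.05]),
]
metric = StreamingAUC()
for y_batch, p_batch in batches:
    metric.update(y_batch, p_batch)
streamed = metric.result()

all_y = [y for ys, _ in batches for y in ys]
all_p = [p for _, ps in batches for p in ps]
exact = exact_auc(all_y, all_p)
```

Here both values equal 11/15 because every prediction is separated from the others by at least one threshold; with a coarser grid or closely spaced predictions the streamed value would only approximate the exact one.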

In [9]:
# Define the auc_roc metric (TF 1.x only, relies on tf.contrib), inspired by https://github.com/keras-team/keras/issues/6050#issuecomment-329996505
def auc_roc(y_true, y_pred):
    # any tensorflow metric
    value, update_op = tf.contrib.metrics.streaming_auc(y_pred, y_true,num_thresholds=1000)
    print("Stats AUC:" + str(value) + " Shape:" + str(y_pred.shape[0]))
    
    # find all variables created for this metric
    metric_vars = [i for i in tf.local_variables() if 'auc_roc' in i.name.split('/')[1]]

    # Add metric variables to GLOBAL_VARIABLES collection.
    # They will be initialized for new session.
    for v in metric_vars:
        tf.add_to_collection(tf.GraphKeys.GLOBAL_VARIABLES, v)

    # force to update metric values
    with tf.control_dependencies([update_op]):
        value = tf.identity(value)
        return value

In [11]:
# Returns the model and ts_mode.
# ts_mode = 0 for models that use lagged value of features
# ts_mode = 1 for RNN models that handle past features with timestep
import keras
from keras.layers import Input,Dense,Lambda
from keras.models import Model
def getModel(mm ,    # Choose the model
             units , # All NN models - unit for the layers
             Nf ,    # All NN models - number of feature types
             reg_weight , # All NN models - for L2 reg
             timestep ,   # All RNN models
             algo ,    # All NN models - 'adam', 'rmsprop', 'nadam','adagrad','adamax','adadelta','sgd'
             dropout , # All NN models - dropout weight
             batchnormalization , # Multilayer perceptron - true/false
             hiddenlayers ,       # Multilayer perceptron
             nlags,              # Multilayer perceptron and logit - lags for features (max. value = timestep)
             return_state,
             rnn_mode,
             learning_rate):
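  n_categories=1
  # Only 'adam' is handled explicitly; any other value of algo falls back to RMSprop.
  # The learning rate is set explicitly; other optimizer parameters keep their Keras defaults.
  
  if(algo=='adam'):
    algo = tf.keras.optimizers.Adam(learning_rate=learning_rate)
  else:
    algo = tf.keras.optimizers.RMSprop(learning_rate=learning_rate)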
  mod = Sequential()
  if mm == 1:
    # Simplest
    if(n_categories>2):
      mod.add(Dense(n_categories, activation='softmax', input_dim=Nf*nlags, kernel_regularizer=l2_reg(reg_weight[3])))
    else:
      mod.add(Dense(1, activation='sigmoid', input_dim=Nf*nlags, kernel_regularizer=l2_reg(reg_weight[3])))
    ts_mode = 0

  if mm == 3:
    if(return_state == False):

      if(hiddenlayers>1):
        mod.add(SimpleRNN(units, input_shape=(timestep, Nf), return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
        for i in range(2,hiddenlayers):
          mod.add(SimpleRNN(units, return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
        mod.add(SimpleRNN(units, return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
      else:
        mod.add(SimpleRNN(units, input_shape=(timestep, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
              
      if(n_categories>2):
        mod.add(Dense(n_categories, activation='softmax',  kernel_regularizer=l2_reg(reg_weight[3])))
      else:
        mod.add(Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3])))
      
      ts_mode = 1

    else:
      inputs = Input(shape=(timestep,Nf))
      
      if(rnn_mode==1):
        # Unpacking into five tensors assumes timestep == 5
        ip0,ip1,ip2,ip3,ip4 = Lambda(lambda x: tf.split(x,timestep,axis=1))(inputs)

        op1 = SimpleRNN(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True,dropout=dropout,recurrent_dropout=dropout)(inputs = ip0)
        op2 = SimpleRNN(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True,dropout=dropout,recurrent_dropout=dropout)(inputs = ip1, initial_state=op1[1:])
        op3 = SimpleRNN(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True,dropout=dropout,recurrent_dropout=dropout)(inputs = ip2, initial_state=op2[1:])
        op4 = SimpleRNN(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True,dropout=dropout,recurrent_dropout=dropout)(inputs = ip3, initial_state=op3[1:])
        op5 = SimpleRNN(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = False,dropout=dropout,recurrent_dropout=dropout)(inputs = ip4, initial_state=op4[1:])

      elif(rnn_mode==2):
        ip0,ip1 = Lambda(lambda x: tf.split(x,[timestep-1,1],axis=1))(inputs)
        op1 = SimpleRNN(units, input_shape=(timestep-1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True,dropout=dropout,recurrent_dropout=dropout)(inputs = ip0)
        op5 = SimpleRNN(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = False,dropout=dropout,recurrent_dropout=dropout)(inputs = ip1, initial_state=op1[1:])

      predictions = Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3]))(op5)  #, kernel_regularizer=l2_reg(reg_weight[0])
      mod = Model(inputs=inputs, outputs=predictions)
     
      ts_mode=1
      
 
  if mm == 4 and dropout==0.0:
    if(return_state == False):
      # LSTM
      if(hiddenlayers>1):
        mod.add(LSTM(units, input_shape=(timestep, Nf), return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))
        for i in range(2,hiddenlayers):
          mod.add(LSTM(units, return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))
        mod.add(LSTM(units, return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))
      else:
        mod.add(LSTM(units, input_shape=(timestep, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))

      if(n_categories>2):
        mod.add(Dense(n_categories, activation='softmax',  kernel_regularizer=l2_reg(reg_weight[3])))
      else:
        mod.add(Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3])))
      
      ts_mode = 1
    else:
      inputs = Input(shape=(timestep,Nf))
      
      if(rnn_mode==1):
        ip0,ip1,ip2,ip3,ip4 = Lambda(lambda x: tf.split(x,timestep,axis=1))(inputs)

        op1 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip0)
        op2 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip1, initial_state=op1[1:])
        op3 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip2, initial_state=op2[1:])
        op4 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip3, initial_state=op3[1:])
        op5 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = False)(inputs = ip4, initial_state=op4[1:])

      elif(rnn_mode==2):
        ip0,ip1 = Lambda(lambda x: tf.split(x,[timestep-1,1],axis=1))(inputs)
        op1 = LSTM(units, input_shape=(timestep-1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip0)
        op5 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = False)(inputs = ip1, initial_state=op1[1:])

      elif(rnn_mode==3):
        op1 = LSTM(units, input_shape=(timestep, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs)
        op5 = keras.layers.concatenate([op1[0],op1[2]], axis=-1)

      if(n_categories>2):
        predictions = Dense(n_categories, activation='softmax', kernel_regularizer=l2_reg(reg_weight[3]))(op5)
      else:
        predictions = Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3]))(op5)
      mod = Model(inputs=inputs, outputs=predictions)
    
      ts_mode=1      

  if mm == 4 and dropout>0.0:
    if(return_state == False):
      # LSTM
      if(hiddenlayers>1):
        mod.add(LSTM(units, input_shape=(timestep, Nf), return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
        for i in range(2,hiddenlayers):
          mod.add(LSTM(units, return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
        mod.add(LSTM(units, return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
      else:
        mod.add(LSTM(units, input_shape=(timestep, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
 
      if(n_categories>2):
        mod.add(Dense(n_categories, activation='softmax',  kernel_regularizer=l2_reg(reg_weight[3])))
      else:
        mod.add(Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3])))
      
      ts_mode = 1
    else:
      inputs = Input(shape=(timestep,Nf))
      
      if(rnn_mode==1):
        ip0,ip1,ip2,ip3,ip4 = Lambda(lambda x: tf.split(x,timestep,axis=1))(inputs)

        op1 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip0)
        op2 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip1, initial_state=op1[1:])
        op3 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip2, initial_state=op2[1:])
        op4 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip3, initial_state=op3[1:])
        op5 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = False)(inputs = ip4, initial_state=op4[1:])

      elif(rnn_mode==2):
        ip0,ip1 = Lambda(lambda x: tf.split(x,[timestep-1,1],axis=1))(inputs)
        op1 = LSTM(units, input_shape=(timestep-1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip0)
        op5 = LSTM(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = False)(inputs = ip1, initial_state=op1[1:])

      elif(rnn_mode==3):
        op1 = LSTM(units, input_shape=(timestep, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs)
        op5 = keras.layers.concatenate([op1[0],op1[2]], axis=-1)

      if(n_categories>2):
        predictions = Dense(n_categories, activation='softmax', kernel_regularizer=l2_reg(reg_weight[3]))(op5)
      else:
        predictions = Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3]))(op5)  #, kernel_regularizer=l2_reg(reg_weight[0])
      mod = Model(inputs=inputs, outputs=predictions)

      ts_mode=1      


  if mm == 5 and dropout==0:
    if(return_state == False):
      # GRU
      if(hiddenlayers>1):
        mod.add(GRU(units, input_shape=(timestep, Nf), return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))
        for i in range(2,hiddenlayers):
          mod.add(GRU(units, return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))
        mod.add(GRU(units, return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))
      else:
        mod.add(GRU(units, input_shape=(timestep, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2])))

      if(n_categories>2):
        mod.add(Dense(n_categories, activation='softmax',  kernel_regularizer=l2_reg(reg_weight[3])))
      else:
        mod.add(Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3])))    
      ts_mode = 1

    else:
      inputs = Input(shape=(timestep,Nf))
      
      if(rnn_mode==1):
        ip0,ip1,ip2,ip3,ip4 = Lambda(lambda x: tf.split(x,timestep,axis=1))(inputs)

        op1 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip0)
        op2 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip1, initial_state=op1[1:])
        op3 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip2, initial_state=op2[1:])
        op4 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip3, initial_state=op3[1:])
        op5 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = False)(inputs = ip4, initial_state=op4[1:])

      elif(rnn_mode==2):
        ip0,ip1 = Lambda(lambda x: tf.split(x,[timestep-1,1],axis=1))(inputs)
        op1 = GRU(units, input_shape=(timestep-1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = True)(inputs = ip0)
        op5 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),return_state = False)(inputs = ip1, initial_state=op1[1:])

      
      if(n_categories>2):
        predictions = Dense(n_categories, activation='softmax', kernel_regularizer=l2_reg(reg_weight[3]))(op5)
      else:
        predictions = Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3]))(op5)
      mod = Model(inputs=inputs, outputs=predictions)

      ts_mode=1
  if mm == 5 and dropout>0:
    if(return_state == False):
      # GRU
      if(hiddenlayers>1):
        mod.add(GRU(units, input_shape=(timestep, Nf), return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
        for i in range(2,hiddenlayers):
          mod.add(GRU(units, return_sequences = True, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
        mod.add(GRU(units, return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
      else:
        mod.add(GRU(units, input_shape=(timestep, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout))
      if(dropout>0.0):
        mod.add(Dropout(dropout))
      if(n_categories>2):
        mod.add(Dense(n_categories, activation='softmax',  kernel_regularizer=l2_reg(reg_weight[3])))
      else:
        mod.add(Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3])))     
      ts_mode = 1

    else:
      inputs = Input(shape=(timestep,Nf))
      
      if(rnn_mode==1):
        ip0,ip1,ip2,ip3,ip4 = Lambda(lambda x: tf.split(x,timestep,axis=1))(inputs)

        op1 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip0)
        op2 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip1, initial_state=op1[1:])
        op3 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip2, initial_state=op2[1:])
        op4 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip3, initial_state=op3[1:])
        op5 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = False)(inputs = ip4, initial_state=op4[1:])

      elif(rnn_mode==2):
        ip0,ip1 = Lambda(lambda x: tf.split(x,[timestep-1,1],axis=1))(inputs)
        op1 = GRU(units, input_shape=(timestep-1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = True)(inputs = ip0)
        op5 = GRU(units, input_shape=(1, Nf), return_sequences = False, kernel_regularizer=l2_reg(reg_weight[0]), recurrent_regularizer=l2_reg(reg_weight[1]),activity_regularizer=l2_reg(reg_weight[2]),dropout=dropout,recurrent_dropout=dropout,return_state = False)(inputs = ip1, initial_state=op1[1:])

      if(n_categories>2):
        predictions = Dense(n_categories, activation='softmax', kernel_regularizer=l2_reg(reg_weight[3]))(op5)
      else:
        predictions = Dense(1, activation='sigmoid', kernel_regularizer=l2_reg(reg_weight[3]))(op5)
      mod = Model(inputs=inputs, outputs=predictions)

      ts_mode=1
      

  if mm == 0:
    # Logistic (use statsmodels library's logistic regression)
    return 0,2
  if mm == 6:
    # Lasso (use sklearn lasso regression)
    mod = linear_model.Lasso(alpha=reg_weight[0])
    ts_mode = 3
    return mod, ts_mode
  if mm == 7:
    # Regularized logistic regression    
    mod = LogisticRegression( C = 1/reg_weight[0],penalty='l1', solver='liblinear' )
    ts_mode = 4
    return mod, ts_mode
  
  if(n_categories>2):
    mod.compile(loss='categorical_crossentropy', optimizer=algo, metrics=['accuracy'])
  else:
    mod.compile(loss='binary_crossentropy', optimizer=algo, metrics=['accuracy'])
  
  return mod,ts_mode

In [12]:
# Fits the model
def fitModel( XTRAIN, 
              YTRAIN, 
              XTEST, 
              YTEST, 
              mod, 
              ts_mode, 
              timestep, 
              nlags, 
              batch_size, 
              epochs, 
              print_data = True, 
              validate = False, 
              patience = 300, 
              message = ' ', 
              class_weight={0: 0.5, 1: 0.5},
              logit_reg_weight=0.1):
              
        history = 0;
        if ts_mode == 0:
          # Reshape the data to use the lagged values of features
          XTRAIN, XTEST = reshape_features(XTRAIN, XTEST, timestep, nlags)
          if(validate):
            my_callbacks = [EarlyStopping2(validation_data=(XTEST, YTEST),training_data=(XTRAIN, YTRAIN),monitor='auc_roc', patience=patience, verbose=1, mode='max',restore_best_weights=False, message = message)]
            history = mod.fit(XTRAIN, YTRAIN, batch_size=batch_size, epochs=epochs, shuffle=True,verbose = 0, callbacks=my_callbacks, class_weight=class_weight)
          else:
            with suppress_stdout():
              history = mod.fit(XTRAIN, YTRAIN, batch_size=batch_size, shuffle=True,epochs=epochs,verbose = 0, class_weight=class_weight)
          yp = mod.predict(XTEST)
          yp_train = mod.predict(XTRAIN)  # in-sample predictions, computed once and reused below
          t_opt = optimize_threshold(yp_train,YTRAIN[:,0],0.5)
          y_pred = yp[:,0]
          y_true = YTEST[:,0]
          AUC_train = roc_auc_score(YTRAIN[:,0], yp_train)
          if(print_data):
            # Keras stores accuracy under 'acc' (older releases) or 'accuracy' (newer ones)
            acc = history.history.get('acc', history.history.get('accuracy'))[-1]
            print('Training results, Accuracy %.3f' % acc, 'AUC  %.3f' % AUC_train)
        elif ts_mode == 1:
          if(validate):
            my_callbacks = [EarlyStopping2(validation_data=(XTEST, YTEST),training_data=(XTRAIN, YTRAIN),monitor='auc_roc', patience=patience, verbose=1, mode='max',restore_best_weights=False, message = message)] #, ModelCheckpoint(filepath="weights{epoch:03d}.hdf5")]
            history = mod.fit(XTRAIN, YTRAIN, batch_size=batch_size, epochs=epochs, shuffle=True,verbose = 0, callbacks=my_callbacks, class_weight=class_weight)
          else:
            with suppress_stdout():
              history = mod.fit(XTRAIN, YTRAIN, batch_size=batch_size, shuffle=True,epochs=epochs,verbose = 0, class_weight=class_weight)
          yp = mod.predict(XTEST)
          yp_train = mod.predict(XTRAIN)  # in-sample predictions, computed once and reused below
          t_opt = optimize_threshold(yp_train,YTRAIN[:,0],0.5)
          y_pred = yp[:,0]
          y_true = YTEST[:,0]
          AUC_train = roc_auc_score(YTRAIN[:,0], yp_train)
          if(print_data):
            # Keras stores accuracy under 'acc' (older releases) or 'accuracy' (newer ones)
            acc = history.history.get('acc', history.history.get('accuracy'))[-1]
            print('Training results, Accuracy %.3f' % acc, 'AUC  %.3f' % AUC_train)
        elif ts_mode == 2: # Logistic regression
          XTRAIN, XTEST = reshape_features(XTRAIN, XTEST, timestep, nlags)
          XTRAIN = sm.add_constant(XTRAIN)
          XTEST = sm.add_constant(XTEST)
          YTRAIN = YTRAIN[:,0]
          with suppress_stdout():
            if(logit_reg_weight>0):
                mod = sm.Logit(YTRAIN, XTRAIN).fit_regularized(alpha=logit_reg_weight,method='l1');
            else:
                mod = sm.Logit(YTRAIN, XTRAIN).fit(method='bfgs'); #_regularized(alpha=0.01,method='l1');
          y_pred = mod.predict(XTEST)
          y_true = YTEST[:,0]
          t_opt = optimize_threshold(mod.predict(XTRAIN),YTRAIN,0.5)
          AccRate = np.sum((mod.predict( XTRAIN ) > .5)==YTRAIN)/len(YTRAIN) 
          AUC_train = roc_auc_score( YTRAIN,mod.predict( XTRAIN ));
          if(print_data):
            print('Training results, Accuracy  %.3f' % AccRate, 'AUC  %.3f' % roc_auc_score( YTRAIN,mod.predict( XTRAIN ) ) )
        elif ts_mode == 3: # Lasso
          XTRAIN, XTEST = reshape_features(XTRAIN, XTEST, timestep, nlags)
          XTRAIN = sm.add_constant(XTRAIN)
          print('Size:' + str(XTRAIN.shape))
          XTEST = sm.add_constant(XTEST)
          YTRAIN = YTRAIN[:,0]
          with suppress_stdout():
            mod.fit(X = XTRAIN, y = YTRAIN)
          y_pred = mod.predict(XTEST)
          y_true = YTEST[:,0]
          t_opt = optimize_threshold(mod.predict(XTRAIN),YTRAIN,0.5)
          AccRate = np.sum((mod.predict( XTRAIN ) > .5)==YTRAIN)/len(YTRAIN) 
          AUC_train = roc_auc_score( YTRAIN,mod.predict( XTRAIN ));
          if(print_data):
            print('Training results, Accuracy  %.3f' % AccRate, 'AUC  %.3f' % roc_auc_score( YTRAIN,mod.predict( XTRAIN ) ) )
        elif ts_mode == 4: # L1-regularized logistic regression (sklearn)
          XTRAIN, XTEST = reshape_features(XTRAIN, XTEST, timestep, nlags)
          XTRAIN = sm.add_constant(XTRAIN)
          print('Size:' + str(XTRAIN.shape))
          XTEST = sm.add_constant(XTEST)
          YTRAIN = YTRAIN[:,0]
          with suppress_stdout():
            mod.fit(X = XTRAIN, y = YTRAIN)
          y_pred = mod.predict_proba(XTEST)[:,1]
          y_true = YTEST[:,0]
          t_opt = optimize_threshold(mod.predict_proba(XTRAIN)[:,1],YTRAIN,0.5)
          AccRate = np.sum((mod.predict_proba( XTRAIN )[:,1] > .5)==YTRAIN)/len(YTRAIN) 
          AUC_train = roc_auc_score( YTRAIN,mod.predict_proba( XTRAIN )[:,1] ) ;
          if(print_data):
            print('Training results, Accuracy  %.3f' % AccRate, 'AUC  %.3f' % roc_auc_score( YTRAIN,mod.predict_proba( XTRAIN )[:,1] ) )
        return mod, y_pred,y_true,t_opt, AUC_train; 

In [13]:
# Plot Keras training accuracy and loss by epoch
def plot_history(history):

  # Plot training accuracy values ('acc' in older Keras releases, 'accuracy' in newer ones)
  acc = history.history.get('acc', history.history.get('accuracy'))
  plt.plot(acc)
  plt.title('Model accuracy')
  plt.ylabel('Accuracy')
  plt.xlabel('Epoch')
  plt.legend(['Train'], loc='upper left')  # only the training series is plotted
  plt.show()

  # Plot training loss values
  plt.plot(history.history['loss'])
  plt.title('Model loss')
  plt.ylabel('Loss')
  plt.xlabel('Epoch')
  plt.legend(['Train'], loc='upper left')  # only the training series is plotted
  plt.show()

In [14]:
# Plot ROC curve given true classes and predictions
def plot_auc(yt,yp):
  fpr,tpr,thresholds = roc_curve(yt,yp)
  xgrid = np.linspace(0,1,100) 
  plt.plot(fpr,tpr,'blue')
  plt.plot(xgrid,xgrid,'red')
  plt.xlabel("False positive rate")
  plt.ylabel("True positive rate")
  plt.show()
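
The area under the curve drawn by `plot_auc` also has a ranking interpretation: it equals the share of (crisis, non-crisis) pairs in which the crisis observation receives the higher predicted probability, with ties counted as one half. The sketch below checks this on a toy example. The helper name `auc_by_pairs` and the toy scores are illustrative assumptions and are not part of the notebook.

```python
import numpy as np

def auc_by_pairs(y_true, y_score):
    """AUC as the fraction of correctly ordered (positive, negative) pairs."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]   # scores of the positive class
    neg = y_score[y_true == 0]   # scores of the negative class
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0)
    ties = np.sum(diff == 0)
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities.
auc = auc_by_pairs([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 3 of the 4 pairs are ordered correctly -> 0.75
```

The pairwise comparison is quadratic in the sample size, so it is meant only as an explanation of the statistic; `roc_auc_score`, as used elsewhere in this notebook, is the practical choice.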

In [15]:
# Reshapes 3D (obs, time step, features) array into 2D array (obs, time step * features),
# i.e. the lagged values of features are taken as features in the new 2D array.
def reshape_features(XTRAIN, XTEST, timestep, nlags):
  XTRAIN = XTRAIN[:,(timestep-nlags):timestep,:]
  XTEST = XTEST[:,(timestep-nlags):timestep,:]
  n1,n2,n3 = np.shape(XTRAIN)
  XTRAIN = np.reshape(XTRAIN,(n1,n2*n3),'F')
  n1,n2,n3 = np.shape(XTEST)
  XTEST = np.reshape(XTEST,(n1,n2*n3),'F')
  return XTRAIN, XTEST
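
To make the column layout produced by the Fortran-order reshape concrete, the following minimal sketch applies the same slicing and reshape to a toy array. The array sizes and the value encoding are illustrative assumptions, not data from the study.

```python
import numpy as np

# Toy input: 2 observations, 3 time steps, 2 features (sizes are illustrative).
# Each value encodes its position as obs*100 + step*10 + feature.
obs, timestep, n_feat = 2, 3, 2
X = np.array([[[o * 100 + t * 10 + f for f in range(n_feat)]
               for t in range(timestep)]
              for o in range(obs)])

nlags = 2
# Keep only the last `nlags` time steps, as reshape_features does.
X_lagged = X[:, (timestep - nlags):timestep, :]

# Fortran order varies the earliest axis fastest, so within each row
# the lag index changes fastest and the feature index slowest.
n1, n2, n3 = X_lagged.shape
X_flat = np.reshape(X_lagged, (n1, n2 * n3), order='F')
print(X_flat)
# [[ 10  20  11  21]
#  [110 120 111 121]]
```

Each row therefore lists all retained lags of feature 0 first, then all retained lags of feature 1, and so on.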

In [16]:
# Same as reshape_features, but for a single array: reshapes a 3D (obs, time step, features)
# array into a 2D array (obs, time step * features), keeping the last nlags time steps.
def reshape_features_ksi(XTEST, timestep, nlags):  
  XTEST = XTEST[:,(timestep-nlags):timestep,:]  
  n1,n2,n3 = np.shape(XTEST)
  XTEST = np.reshape(XTEST,(n1,n2*n3),'F')
  return XTEST

In [17]:
# Reshapes data into a three dimensional array such that:
# DIM
# 0 - Training observation index - dim n
# 1 - Timestep dimension = Lagged value of features - dim timestep
# 2 - Feature dimension - dim Nf
def reshape_data(df2,timestep,Nf,time_start,time_end):
  c_train = df2['country'].values
  year_train = df2['year'].values
  x_train = df2[all_predictors[0:Nf]].values
  y_train = df2['precrisis'].values
  cid_train = df2['cid'].values

  n,p = np.shape(x_train)
  Xt = np.empty(shape=(0,timestep,Nf))
  yt = np.empty(shape=(0,1),dtype=int)
  ct = list([])
  yeart = list([])
  cid = np.empty(shape=(0,),dtype=int)

  for ii in range(timestep-1,n): 
    if(c_train[ii-timestep+1]==c_train[ii] and year_train[ii-timestep+1]==year_train[ii]-timestep+1 and y_train[ii]!=-1 and year_train[ii]>=time_start and year_train[ii]<=time_end): # check that there is no break
      Xt = np.append(Xt,np.reshape(x_train[(ii-timestep+1):(ii+1),:],(1,timestep,Nf)),axis=0) # append the lagged features
      yt = np.append(yt,np.reshape(y_train[ii],(1,1)),axis=0) # append the target variable
      cid = np.append(cid,np.reshape(cid_train[ii],(1,)),axis=0) # append the country id
      ct.append(c_train[ii]) # append the country
      yeart.append(year_train[ii]) # append the year 
  return Xt,yt,ct,yeart,cid
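
The windowing logic of `reshape_data` can be illustrated on a toy panel. The sketch below is a simplified stand-in, not the notebook's function: the helper `make_windows` and the toy data are assumptions, the target and date filters are omitted, and windows are collected in a list and stacked once instead of being appended to an array one at a time. Like the original check, it compares only the endpoints of each window, which assumes rows are sorted by country and year without duplicates.

```python
import numpy as np

def make_windows(country, year, x, timestep):
    """Stack windows of `timestep` consecutive years within one country."""
    windows, end_years = [], []
    for i in range(timestep - 1, len(year)):
        start = i - timestep + 1
        same_country = country[start] == country[i]
        # Consecutive years imply the first year equals last year - timestep + 1.
        no_gap = year[start] == year[i] - timestep + 1
        if same_country and no_gap:
            windows.append(x[start:i + 1, :])
            end_years.append(int(year[i]))
    n_feat = x.shape[1]
    X = np.stack(windows) if windows else np.empty((0, timestep, n_feat))
    return X, end_years

# Toy panel: country "A" has a gap between 2001 and 2003; "B" is complete.
country = np.array(["A", "A", "A", "A", "B", "B", "B"])
year = np.array([2000, 2001, 2003, 2004, 2000, 2001, 2002])
x = np.arange(14, dtype=float).reshape(7, 2)

X, end_years = make_windows(country, year, x, timestep=2)
print(X.shape)     # (4, 2, 2)
print(end_years)   # [2001, 2004, 2001, 2002]
```

The window ending in 2003 for country "A" is dropped because of the missing year 2002, and no window spans the boundary between the two countries.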

In [18]:
def reshape_data_d2c(df2,timestep,Nf,time_start,time_end):
  c_train = df2['country'].values
  year_train = df2['year'].values
  x_train = df2[all_predictors[0:Nf]].values
  cid_train = df2['cid'].values
  dp_train = df2['disttoprevcris'].values
  dn_train = df2['disttonextcris'].values

  n,p = np.shape(x_train)
  Xt = np.empty(shape=(0,timestep,Nf))
  ct = list([])
  yeart = list([])
  cid = np.empty(shape=(0,),dtype=int)
  dp = list([])
  dn = list([])

  for ii in range(timestep-1,n): 
    if(c_train[ii-timestep+1]==c_train[ii] and year_train[ii-timestep+1]==year_train[ii]-timestep+1 and year_train[ii]>=time_start and year_train[ii]<=time_end): # check that there is no break
      Xt = np.append(Xt,np.reshape(x_train[(ii-timestep+1):(ii+1),:],(1,timestep,Nf)),axis=0) # append the lagged features
      cid = np.append(cid,np.reshape(cid_train[ii],(1,)),axis=0) # append the country id
      ct.append(c_train[ii]) # append the country
      dp.append(dp_train[ii])
      dn.append(-dn_train[ii])
      yeart.append(year_train[ii]) # append the year 
  return Xt,ct,yeart,cid,dp,dn

# Country by country cross validation


In [19]:
import random

#from tf.keras.models import clone_model
# Evaluate performance of a given model using country-by-country cross-validation.
def cross_validation2(mm ,  # must choose model class
                          df , # must choose input data
                          batch_size = 16, # Define training params for NN
                          epochs = 50,
                          # Rest of them are input for getModel:
                          units = 10, # All NN models - unit for the layers
                          Nf = 14,     # All NN models - number of feature types
                          reg_weight = [0.001,0,0,0.001], # All NN models - for L2 reg
                          timestep = 5,     # All RNN models
                          algo = 'adam',    # All NN models - 'adam', 'rmsprop', 'nadam','adagrad','adamax','adadelta','sgd'
                          dropout = 0.0,    # All NN models - dropout weight
                          batchnormalization = False, # Multilayer perceptron - true/false
                          print_epoch_stats = False,
                          hiddenlayers = 1,           # Multilayer perceptron
                          nlags = 1,                 # Multilayer perceptron and logit - lags for features
                          reps = 1, # How many times to train the neural network
                          patience = 2,
                          return_state = False,
                          rnn_mode = 1,
                          learning_rate=0.01,
                          fhandle=0,
                          fcast_horizon=1,
                          sub_epochs=1,
                          class_weight={0: 0.5, 1: 0.5},
                          plot_reliability=False,
                          validate=False,
                          time_start = 0,
                          time_end = 0, save_model=False,code=0): # output file handle
  
  name = modelname(mm=mm,rnn_mode=rnn_mode,return_state=return_state,lags=nlags);
  # First some helper variables
  ts_mode=0;
  if(mm>2):
    ts_mode=1
      
  if(ts_mode==1):
    data_mode='rnn';
  else:
    data_mode='time-window';
  if(mm==0):
    pred_mode='statsmodel';
  else:
    pred_mode='keras';
  
  df_train= df;  
  
  wreturn_state = 0;
  if(return_state==True):
    wreturn_state = 1;
  # Reshape the data and add timestep lags
  Xt,yt,ct,yeart,cid = reshape_data(df_train,timestep,Nf,time_start,time_end);

  # Reset random seed
  t_start = time.time()
  seed(2)
  random.seed(3)
  
  summary_not_shown = True;

  all_countries = df_train.country.unique()
  c=-1
  K.clear_session()
  models = [];
  best_weights = [];
  AUC_best=0;
  for epoch in range(0,epochs):
    
    # Allocate tables to store predictions, statistics etc.
    all_ytest = np.empty(shape =(0,));
    all_yp = np.empty(shape =(0,));
    
    c = -1;
    ave_AUC_train=0;    
    rc=-1
    for country in all_countries:
      r = -1
      c=c+1;
          
      # Specify the training and test data when one country is excluded
      train = [i for i, val in enumerate(ct) if(val!=country)];
      test = [i for i, val in enumerate(ct) if(val==country)];
          
      # Extract the training and test data from the full-sample.
      XTRAIN=Xt[train,:,:];
      YTRAIN=yt[train,:];
      XTEST=Xt[test,:,:];
      YTEST=yt[test,:];
        
        
      # Allocate tables for reps
      ypreds = np.zeros((YTEST.shape[0],reps),dtype=float);

      for rep in range(0,reps):
        rc=rc+1;
        r=r+1;
        if(epoch==0):
          mod, ts_mode = getModel(mm, units, Nf, reg_weight, timestep , algo, dropout, batchnormalization, hiddenlayers, nlags, return_state = return_state, rnn_mode = rnn_mode,learning_rate=learning_rate);
          models.append(mod);
          if(mm>0):
            best_weights.append(mod.get_weights());
          
        
        # Fit model
        if(validate):
          models[rc],ypreds[:,r],y_true,t_opt, AUC_train = fitModel(XTRAIN, YTRAIN, XTEST, YTEST, models[rc], ts_mode, timestep, nlags, batch_size, sub_epochs, print_data = False,validate=True,class_weight=class_weight,logit_reg_weight=reg_weight[0])    
        else:
          models[rc],ypreds[:,r],y_true,t_opt, AUC_train = fitModel(XTRAIN, YTRAIN, XTEST, YTEST, models[rc], ts_mode, timestep, nlags, batch_size, sub_epochs, print_data = False,class_weight=class_weight,logit_reg_weight=reg_weight[0])

        # Calculate simple average of all training AUCs
        ave_AUC_train = ave_AUC_train+AUC_train;
        
      
      # Amend the statistics and predictions etc.      
      all_ytest = np.append(all_ytest,y_true,axis=0);
      y_pred = np.mean(ypreds,axis=1);
      all_yp = np.append(all_yp,y_pred,axis=0);

      # END OF C LOOP
    cmax=c;
    AUC = roc_auc_score(all_ytest,all_yp);
    ave_AUC_train=ave_AUC_train/(rc+1);
    if(AUC>AUC_best):
        # New best pooled validation AUC: record it and snapshot the weights of every model
        AUC_best=AUC;
        AUC_best_ave_AUC_train = ave_AUC_train;
        best_epoch=epoch;
        rc=-1;
        for c in range(0,cmax+1):
          for r in range(0,reps):
            rc=rc+1;
            if(mm>0):
              best_weights[rc] = models[rc].get_weights();

    
    if(print_epoch_stats):
      #f.write( "%d;%d;%d;%d;%d;%d;%7.6f;%7.6f;%7.6f;%7.6f;%7.6f\n" % (mm,fcast_horizon,rnn_mode,wreturn_state,hiddenlayers,epoch,AUC,reg_weight[0],reg_weight[1],reg_weight[2],reg_weight[3]) );
      #f.flush();
      print("Stats - Epoch:",(epoch+1)*sub_epochs,"AUC-val %5.3f " %(AUC),"AUC-train %5.3f" %(ave_AUC_train))
        
    # END OF EPOCH LOOP
  
  # POST-PROCESSING PART I:
  # Retrieve the best weights and evaluate AUC for all forecast horizons
  
  df2=df;

  Xt2,yt2,ct2,_,_ = reshape_data(df2,timestep,Nf,time_start,time_end);
  rc=-1;
  
  all_ytest2 = np.empty(shape =(0,));
  all_yp2 = np.empty(shape =(0,));
  
  date_time = datetime.now().strftime("%Y_%m_%d-%H_%M_%S-")
  for country in all_countries:
    test2 = [i for i, val in enumerate(ct2) if(val==country)];
    
    XTEST2=Xt2[test2,:,:];
    y_true2=yt2[test2,0];
        
    if(data_mode=='time-window'):
      XTEST2 = reshape_features_ksi(XTEST2, timestep, nlags);
  
      if(pred_mode!='keras'):

        if(XTEST2.shape[0]>0):
            XTEST2 = sm.add_constant(XTEST2,has_constant='add');
        
    # Allocate tables for reps
    ypreds2 = np.zeros((y_true2.shape[0],reps),dtype=float);
 
    for r in range(0,reps):
      rc=rc+1;
      if(mm>0):
        models[rc].set_weights(best_weights[rc]);
        if(save_model):
          models[rc].save(name + '_' + str(rc) + "-" + date_time + '.h5');
      if(pred_mode=='keras'):
     
        if(XTEST2.shape[0]>0):
          ypreds2[:,r] = models[rc].predict(XTEST2)[:,0];
     
      else:            
     
        if(XTEST2.shape[0]>0):
          ypreds2[:,r] = models[rc].predict(XTEST2);
     
    # Amend the statistics and predictions etc.
    all_ytest2 = np.append(all_ytest2,y_true2,axis=0);
    y_pred2 = np.mean(ypreds2,axis=1);
    all_yp2 = np.append(all_yp2,y_pred2,axis=0);

  AUC2 = roc_auc_score(all_ytest2,all_yp2);

  np.savetxt(date_time + name + "-probs2.out", np.concatenate((np.reshape(all_ytest2,(-1,1)),np.reshape(all_yp2,(-1,1))),axis=1), delimiter=";",header="y;p");
  
  # Select the predictions used in the rest of the analysis
  if(fcast_horizon==2):
    all_ytest = all_ytest2;
    all_yp = all_yp2;
  
  # Calculate usefulness statistics
  t_opt=optimize_threshold(all_yp,all_ytest,0.5);
  tpfntnfp=tpfntnfp_fun(all_yp,all_ytest,t_opt);
  fnr,fpr,loss,ru=loss_use(tpfntnfp,0.5);
  
  # Plot the calibration curve
  if(plot_reliability):
    bin_prob_true,bin_prob_pred = calibration_curve(all_ytest, all_yp, normalize=False, n_bins=10, strategy='quantile');
    fig, ax = plt.subplots()
    ax.scatter(bin_prob_pred,bin_prob_true,color='black');
    ax.plot([0,max(max(bin_prob_pred),max(bin_prob_true))],[0,max(max(bin_prob_pred),max(bin_prob_true))],color='black', linestyle='dashed');
    ax.set(xlabel='Mean predicted value', ylabel='Fraction of positives',title=name)
    plt.show();
    fig.savefig(date_time + name + "-cali.png",dpi=300)
  
  print("Results",(epoch+1)*sub_epochs,"AUC-val %5.3f " %(AUC2),"AUC-train %5.3f" %(AUC_best_ave_AUC_train))

  crises = np.sum(all_ytest);
  obs = all_ytest.shape[0];

  statsout = "Cross-validation;" + name + ";" + str(fcast_horizon) + ";" + str(time_start) + ";" + str(time_end);
  statsout += ";" + str(epochs*sub_epochs) + ";" + str(best_epoch*sub_epochs) + ";" + str(units) + ";" + str(reg_weight[0]) + ";" + algo + ";";
  statsout += str(learning_rate) + ";" + str(batch_size) + ";" + str(reps) + ";"+ str(timestep) + ";%5.3f;;%5.3f" %(AUC2, AUC_best_ave_AUC_train);
  statsout += ";%5.3f;%5.3f;%5.3f;%5.3f;" %(fnr,fpr,loss,ru) + str(crises) + ";" + str(obs)
  
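  # Create an (observations x (Nf+4)) array and dump it to csv: column 0 holds y (true),
  # column 1 holds p (predicted), the last column holds the country id; the remaining columns are left at zero here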
  dataout = np.zeros((all_ytest.shape[0],Nf+4),dtype=float);
  dataout[:,0] = all_ytest;
  dataout[:,1] = all_yp;
  dataout[:,3+Nf] = cid;
  filename = date_time + name + "-probs.csv"  
  if(code!=0):
        filename = '/Users/choejunhoe/Desktop/ECON_SP/python_data_graph' + str(code) + '.csv';    
  np.savetxt(filename, dataout, delimiter=";",header="y;p;phi1;phi2;phi3;phi4;phi5");
    
  

  return statsout;
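
The leave-one-country-out scheme used above can be sketched without any model code. In the sketch below, the helper `leave_one_country_out`, the toy labels, and the stand-in predictor (the crisis frequency of the training fold) are all illustrative assumptions. The point is only to show how each country is held out once and how the out-of-sample predictions are pooled before a single score is computed.

```python
import numpy as np

def leave_one_country_out(countries):
    """Yield (held_out, train_idx, test_idx) for each distinct country."""
    countries = np.asarray(countries)
    # dict.fromkeys keeps first-appearance order while removing duplicates.
    for held_out in dict.fromkeys(countries.tolist()):
        test_idx = np.flatnonzero(countries == held_out)
        train_idx = np.flatnonzero(countries != held_out)
        yield held_out, train_idx, test_idx

# Hypothetical country labels and pre-crisis indicators, one per observation.
ct = ["FI", "FI", "SE", "SE", "SE", "NO"]
y = np.array([0, 1, 0, 0, 1, 0])

pooled_true, pooled_pred = [], []
for held_out, train_idx, test_idx in leave_one_country_out(ct):
    # Stand-in "model": predict the crisis frequency of the training fold.
    p = y[train_idx].mean()
    pooled_true.extend(y[test_idx].tolist())
    pooled_pred.extend([p] * len(test_idx))

print(pooled_true)
print(np.round(pooled_pred, 3))
```

In `cross_validation2` the pooled vectors correspond to `all_ytest` and `all_yp`, on which one AUC is computed per epoch, so no prediction is ever made by a model that saw the held-out country during training.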

In [20]:
def modelname(mm=0,rnn_mode=1,return_state=False,lags=1):
    if(mm==0):
        return "Logit(" + str(lags) + ")";
    if(mm==1):
        return "Logit LASSO(" + str(lags) + ")";
    if(mm==2):
        return "MLP(" + str(lags) + ")";
    if(mm>2):
        name = "Model" + str(mm);  # fallback so that name is defined for any other value of mm
        if(mm==3):
          name = "RNN";
        if(mm==4):
          name = "LSTM";
        if(mm==5):
          name = "GRU";
        if(return_state):
          if(rnn_mode==1):
            name = name + "_nps";
          if(rnn_mode==2):
            name = name + "_pps";
          if(rnn_mode==3):
            name = name + "-RS";
        return name;

# Sequential out-of-sample evaluation

In [21]:
#from tf.keras.models import clone_model
# Evaluate performance of a given model out of sample: train on the years
# train_start_year..train_end_year and test on test_start_year..test_end_year.
def sequential_evaluation(mm ,  # must choose model class
                          df , # must choose input data
                          batch_size = 16, # Define training params for NN
                          epochs = 100,
                          # Rest of them are input for getModel:
                          units = 10, # All NN models - unit for the layers
                          Nf = 5,     # All NN models - number of feature types
                          reg_weight = [0.001,0,0,0.001], # All NN models - for L2 reg
                          timestep = 5,     # All RNN models
                          algo = 'adam',    # All NN models - 'adam', 'rmsprop', 'nadam','adagrad','adamax','adadelta','sgd'
                          dropout = 0.0,    # All NN models - dropout weight
                          batchnormalization = False, # Multilayer perceptron - true/false
                          print_epoch_stats = False,
                          hiddenlayers = 1,           # Multilayer perceptron
                          nlags = 1,                 # Multilayer perceptron and logit - lags for features
                          reps = 10, # How many times to train the neural network
                          patience = 2,
                          return_state = False,
                          rnn_mode = 1,
                          learning_rate=0.01,
                          fhandle=0,
                          fcast_horizon=1,
                          sub_epochs=1,
                          class_weight={0: 0.5, 1: 0.5},
                          plot_reliability=False,
                          validate=False,save_model=False,test_start_year=2003, # Define test set
                          test_end_year=2016,train_start_year=1970, # Define training set
                          train_end_year=2002,code=0,d2cgraph=False):
  
  name = modelname(mm=mm,rnn_mode=rnn_mode,return_state=return_state,lags=nlags);
  date_time = datetime.now().strftime("%Y_%m_%d-%H_%M_%S-")
  # First some helper variables
  ts_mode=0;
  if(mm>2):
    ts_mode=1
      
  if(ts_mode==1):
    data_mode='rnn';
  else:
    data_mode='time-window';
  if(mm==0):
    pred_mode='statsmodel';
  else:
    pred_mode='keras';
  df_train=df;  # use the data frame passed by the caller

  wreturn_state = 0;
  if(return_state==True):
    wreturn_state = 1;
  # Reshape the data and add timestep lags
  Xt,yt,ct,yeart,cid = reshape_data(df_train,timestep,Nf,train_start_year,test_end_year);

  # Select training and test observations by the year of each window's last time step.
  train = [i for i, val in enumerate(yeart) if(val>=train_start_year and val<=train_end_year)]
  test = [i for i, val in enumerate(yeart) if(val>=test_start_year and val<=test_end_year)]
      
  # Extract the data from the full-sample.
  XTRAIN=Xt[train,:,:]
  YTRAIN=yt[train,:]
  XTEST=Xt[test,:,:]
  YTEST=yt[test,:]
  cid=cid[test];
  # Reset random seed
  t_start = time.time()
  seed(2)
  random.seed(3)
  print("Crises train:"+ str(np.sum(YTRAIN)))
  print("Crises test:"+ str(np.sum(YTEST)))

  summary_not_shown = True;

  c=-1
  K.clear_session()
  models = [];
  best_weights = [];
  AUC_best=0;
  ypreds = np.zeros((YTEST.shape[0],reps),dtype=float)
  for epoch in range(0,epochs):
      ave_AUC_train =0;
      for r in range(0,reps):

        if(epoch==0):
          mod, ts_mode = getModel(mm, units, Nf, reg_weight, timestep , algo, dropout, batchnormalization, hiddenlayers, nlags, return_state = return_state, rnn_mode = rnn_mode,learning_rate=learning_rate);
          models.append(mod);
          if(mm>0):
            best_weights.append(mod.get_weights());
        
        # Fit model
        if(validate):
          models[r],ypreds[:,r],y_true,t_opt, AUC_train = fitModel(XTRAIN, YTRAIN, XTEST, YTEST, models[r], ts_mode, timestep, nlags, batch_size, sub_epochs, print_data = False,validate=True,class_weight=class_weight,logit_reg_weight=reg_weight[0])    
        else:
          models[r],ypreds[:,r],y_true,t_opt, AUC_train = fitModel(XTRAIN, YTRAIN, XTEST, YTEST, models[r], ts_mode, timestep, nlags, batch_size, sub_epochs, print_data = False,class_weight=class_weight,logit_reg_weight=reg_weight[0])    

        # Calculate simple average of all training AUCs
        ave_AUC_train = ave_AUC_train+AUC_train;
        
      # Amend the statistics and predictions etc.      
      y_pred = np.mean(ypreds,axis=1);

      AUC = roc_auc_score(y_true,y_pred);
      ave_AUC_train=ave_AUC_train/(reps);
      if(AUC>AUC_best):
        # New best ensemble AUC: snapshot the weights of every repetition
        if(mm>0):
          for r in range(0,reps):
            best_weights[r] = models[r].get_weights();
        AUC_best=AUC;
        AUC_best_ave_AUC_train = ave_AUC_train;
        best_epoch=epoch;

      if(print_epoch_stats):
        f.write( "%d;%d;%d;%d;%d;%d;%7.6f;%7.6f;%7.6f;%7.6f;%7.6f\n" % (mm,fcast_horizon,rnn_mode,wreturn_state,hiddenlayers,epoch,AUC,reg_weight[0],reg_weight[1],reg_weight[2],reg_weight[3]) );
        f.flush();
      print("Stats - Epoch:",(epoch+1)*sub_epochs,"AUC-val %5.3f " %(AUC),"AUC-train %5.3f" %(ave_AUC_train))
        
    # END OF EPOCH LOOP
  
  # POST-PROCESSING PART I:
  # Retrieve the best weights and evaluate AUC for all forecast horizons

  df2=df;
  Xt2,yt2,ct2,yeart2,_ = reshape_data(df2,timestep,Nf,train_start_year,test_end_year);
  test2 = [i for i, val in enumerate(yeart2) if(val>=test_start_year and val<=test_end_year)]
  XTEST2=Xt2[test2,:,:]
  y_true2=yt2[test2,0]
        
  if(data_mode=='time-window'):
      XTEST2 = reshape_features_ksi(XTEST2, timestep, nlags);
      if(pred_mode!='keras'):
        XTEST2 = sm.add_constant(XTEST2,has_constant='add');

  # Allocate tables for reps
  ypreds2 = np.zeros((y_true2.shape[0],reps),dtype=float);

  for r in range(0,reps):
      if(mm>0):
        models[r].set_weights(best_weights[r]);
        if(save_model):
          models[r].save('seq_' + name + '_' + str(r) + "-" + date_time + '.h5');
      if(pred_mode=='keras'):

        if(XTEST2.shape[0]>0):
          ypreds2[:,r] = models[r].predict(XTEST2)[:,0];

      else:            

        if(XTEST2.shape[0]>0):
          ypreds2[:,r] = models[r].predict(XTEST2);

  y_pred2 = np.mean(ypreds2,axis=1);  
  AUC2 = roc_auc_score(y_true2,y_pred2);
  np.savetxt(date_time + name + "-probs2.out", np.concatenate((np.reshape(y_true2,(-1,1)),np.reshape(y_pred2,(-1,1))),axis=1), delimiter=";",header="y;p");
  
  # Pick the relevant thing for rest of the analysis
  if(fcast_horizon==2):
    y_true = y_true2;
    y_pred = y_pred2;
  
  # Calculate usefulness statistics
  t_opt=optimize_threshold(y_pred,y_true,0.5);
  tpfntnfp=tpfntnfp_fun(y_pred,y_true,t_opt);
  fnr,fpr,loss,ru=loss_use(tpfntnfp,0.5);
  
  print("Results",(epoch+1)*sub_epochs,"AUC-val %5.3f" %(AUC2),"AUC-train %5.3f" %(AUC_best_ave_AUC_train))

  crises = np.sum(y_true);
  crises_train = np.sum(YTRAIN);
  obs = y_true.shape[0];
  obs_train = YTRAIN.shape[0];

  statsout = "Sequential;" + name + ";" + str(fcast_horizon) + ";" + str(train_start_year) + ";" + str(train_end_year)  + ";" + str(test_start_year) + ";" + str(test_end_year);
  statsout += ";" + str(epochs*sub_epochs) + ";" + str(best_epoch*sub_epochs) + ";" + str(units) + ";" + str(reg_weight[0]) + ";" + algo + ";";
  statsout += str(learning_rate) + ";" + str(batch_size) + ";" + str(reps) + ";"+ str(timestep) + ";%5.3f;%5.3f" %(AUC2,AUC_best_ave_AUC_train);
  statsout += ";%5.3f;%5.3f;%5.3f;%5.3f;" %(fnr,fpr,loss,ru) + str(crises) + ";" + str(crises_train) + ";" + str(obs) + ";" + str(obs_train)

  # Create a samples x (Nf+4) array (column 0: y true, column 1: y pred, last column: country id) and dump it to csv
  dataout = np.zeros((y_true.shape[0],Nf+4),dtype=float);
  dataout[:,0] = y_true;
  dataout[:,1] = y_pred;
  dataout[:,3+Nf] = cid;
  filename = date_time + name + "-probs.csv"  
  if(code!=0):
        filename = '/Users/choejunhoe/Desktop/ECON_SP/python_data_graph' + str(code) + '.csv';    
  np.savetxt(filename, dataout, delimiter=";",header="y;p;phi1;phi2;phi3;phi4;phi5");

  return statsout;
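
The function above averages the predictions of the `reps` trained networks before computing the AUC. The following is a minimal standalone sketch of that ensemble step, written in plain Python rather than NumPy and scikit-learn so that it runs without dependencies. The helpers `rank_auc` and `ensemble_mean` and the toy predictions are illustrative assumptions and are not part of the notebook.

```python
def rank_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def ensemble_mean(pred_columns):
    """Average predictions over repetitions; pred_columns[r][i] is rep r, observation i."""
    n_reps = len(pred_columns)
    n_obs = len(pred_columns[0])
    return [sum(col[i] for col in pred_columns) / n_reps for i in range(n_obs)]


# Toy data (hypothetical): 5 observations, 3 repetitions
y_true = [0, 0, 1, 0, 1]
preds = [
    [0.1, 0.4, 0.6, 0.2, 0.9],
    [0.3, 0.2, 0.8, 0.4, 0.5],
    [0.2, 0.6, 0.5, 0.3, 0.7],  # this repetition misranks one pair
]

y_pred = ensemble_mean(preds)
auc_single = rank_auc(y_true, preds[2])
auc_ensemble = rank_auc(y_true, y_pred)
print("single rep AUC: %5.3f, ensemble AUC: %5.3f" % (auc_single, auc_ensemble))
```

In this toy case the third repetition ranks one crisis observation below a non-crisis observation, while the averaged prediction ranks all pairs correctly. Averaging is not guaranteed to help in general; the example only shows the mechanics.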

# Country by Country Validation

In [74]:
import os
all_predictors=['Slope of Yield Curves',
                        'Slope of Yield Curves*',
                        'Public Debt to GDP',
                        'Credit to GDP Change',
                        'Credit to GDP Change*',
                        'Current Account Change',
                        'Investment to GDP Change',
                        'Exchange Rates Change',
                        'Capital Asset Ratio Change',
                        'Noncore Funding Ratio Change',
                        'Inflation Rate',
                        'Equity Growth',
                        'Consumption Growth',
                        'Real Houseprice Growth'];    

df3=df_norm

# Cross validation
file_path = '/Users/choejunhoe/Desktop/ECON_SP/python_data_graphcross_fc1_reps50_1870_2017.csv';    
filename = file_path.split("/")[-1]

f=open(filename, "w")
reps=5;
epochs = 100;
for dates in [[1870,2017]]: 
        end_year=dates[1];
        start_year=dates[0];

        t = time.time();
        
        
      
        # logit(1)
        #f.write(cross_validation2(reps=1,mm=0,epochs=1,nlags=1,reg_weight=[0],df=df3,fcast_horizon=2,time_start=start_year,time_end=end_year));
        #f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        # logit(5)
        #f.write(cross_validation2(reps=1,mm=0,epochs=1,nlags=5,reg_weight=[0],df=df3,fcast_horizon=1,time_start=start_year,time_end=end_year));
        #f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        
        #RNN
        f.write(cross_validation2(reps=reps,mm=3,df=df3,fcast_horizon=1,epochs=epochs,time_start=start_year,time_end=end_year));
        f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        #LSTM
        f.write(cross_validation2(reps=reps,mm=4,df=df3,fcast_horizon=1,epochs=epochs,time_start=start_year,time_end=end_year));
        f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        #GRU
        f.write(cross_validation2(reps=reps,mm=5,df=df3,fcast_horizon=1,epochs=epochs,time_start=start_year,time_end=end_year));
        f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        
f.close()

  super(Adam, self).__init__(name, **kwargs)

Results 100 AUC-val 0.841  AUC-train 0.921
2438.5395929813385



Results 100 AUC-val 0.836  AUC-train 0.997
5423.960280179977



Results 100 AUC-val 0.852  AUC-train 0.939
8257.760063171387


# Sequential Evaluation

In [55]:
import os
    # Simulation params
file_path = '/Users/choejunhoe/Desktop/Special Project/data/model_data/seq_fc1_reps5.csv';    
filename = file_path.split("/")[-1]

f=open(filename, "w")
epochs = 100;
    
train_end_year=1980;
train_start_year=1870;
test_start_year=1981; # Define test set
test_end_year=2017;
reps=50
print(dates)
    
all_predictors=['Slope of Yield Curves',
                        'Slope of Yield Curves*',
                        'Public Debt to GDP',
                        'Credit to GDP Change',
                        'Credit to GDP Change*',
                        'Current Account Change',
                        'Investment to GDP Change',
                        'Exchange Rates Change',
                        'Capital Asset Ratio Change',
                        'Noncore Funding Ratio Change',
                        'Inflation Rate',
                        'Equity Growth',
                        'Consumption Growth',
                        'Real Houseprice Growth'];    
    
    
df3=df_norm;

#Logit    
f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=1,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
f.write("\n");f.flush();     
#Logit(5)
f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
f.write("\n");f.flush();
    
#RNN
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=3,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#LSTM
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=4,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#GRU
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
        
        
    
f.close()

[1870, 2017]
Crises train:18
Crises test:44
Stats - Epoch: 1 AUC-val 0.706  AUC-train 0.864
Results 1 AUC-val 0.706 AUC-train 0.864
Crises train:18
Crises test:44
Stats - Epoch: 1 AUC-val 0.474  AUC-train 0.920
Results 1 AUC-val 0.474 AUC-train 0.920




Crises train:18
Crises test:44



Stats - Epoch: 1 AUC-val 0.704  AUC-train 0.766
Stats - Epoch: 2 AUC-val 0.717  AUC-train 0.844
Stats - Epoch: 3 AUC-val 0.712  AUC-train 0.871
Stats - Epoch: 4 AUC-val 0.706  AUC-train 0.883
Stats - Epoch: 5 AUC-val 0.700  AUC-train 0.889
Stats - Epoch: 6 AUC-val 0.692  AUC-train 0.896
Stats - Epoch: 7 AUC-val 0.687  AUC-train 0.901
Stats - Epoch: 8 AUC-val 0.680  AUC-train 0.906
Stats - Epoch: 9 AUC-val 0.669  AUC-train 0.913
Stats - Epoch: 10 AUC-val 0.673  AUC-train 0.915
Stats - Epoch: 11 AUC-val 0.693  AUC-train 0.921
Stats - Epoch: 12 AUC-val 0.683  AUC-train 0.925
Stats - Epoch: 13 AUC-val 0.685  AUC-train 0.927
Stats - Epoch: 14 AUC-val 0.705  AUC-train 0.930
Stats - Epoch: 15 AUC-val 0.694  AUC-train 0.934
Stats - Epoch: 16 AUC-val 0.691  AUC-train 0.938
Stats - Epoch: 17 AUC-val 0.694  AUC-train 0.941
Stats - Epoch: 18 AUC-val 0.690  AUC-train 0.944
Stats - Epoch: 19 AUC-val 0.698  AUC-train 0.944
Stats - Epoch: 20 AUC-val 0.705  AUC-train 0.948
Stats - Epoch: 21 AUC-val 0.6


Stats - Epoch: 1 AUC-val 0.673  AUC-train 0.825
Stats - Epoch: 2 AUC-val 0.612  AUC-train 0.894
Stats - Epoch: 3 AUC-val 0.579  AUC-train 0.903
Stats - Epoch: 4 AUC-val 0.562  AUC-train 0.910
Stats - Epoch: 5 AUC-val 0.553  AUC-train 0.916
Stats - Epoch: 6 AUC-val 0.538  AUC-train 0.921
Stats - Epoch: 7 AUC-val 0.531  AUC-train 0.927
Stats - Epoch: 8 AUC-val 0.524  AUC-train 0.932
Stats - Epoch: 9 AUC-val 0.509  AUC-train 0.936
Stats - Epoch: 10 AUC-val 0.508  AUC-train 0.940
Stats - Epoch: 11 AUC-val 0.512  AUC-train 0.945
Stats - Epoch: 12 AUC-val 0.505  AUC-train 0.951
Stats - Epoch: 13 AUC-val 0.506  AUC-train 0.954
Stats - Epoch: 14 AUC-val 0.501  AUC-train 0.958
Stats - Epoch: 15 AUC-val 0.502  AUC-train 0.960
Stats - Epoch: 16 AUC-val 0.502  AUC-train 0.961
Stats - Epoch: 17 AUC-val 0.498  AUC-train 0.965
Stats - Epoch: 18 AUC-val 0.503  AUC-train 0.968
Stats - Epoch: 19 AUC-val 0.500  AUC-train 0.970
Stats - Epoch: 20 AUC-val 0.495  AUC-train 0.972
Stats - Epoch: 21 AUC-val 0.4


Stats - Epoch: 1 AUC-val 0.714  AUC-train 0.844
Stats - Epoch: 2 AUC-val 0.651  AUC-train 0.879
Stats - Epoch: 3 AUC-val 0.631  AUC-train 0.887
Stats - Epoch: 4 AUC-val 0.617  AUC-train 0.891
Stats - Epoch: 5 AUC-val 0.607  AUC-train 0.897
Stats - Epoch: 6 AUC-val 0.599  AUC-train 0.900
Stats - Epoch: 7 AUC-val 0.599  AUC-train 0.904
Stats - Epoch: 8 AUC-val 0.603  AUC-train 0.908
Stats - Epoch: 9 AUC-val 0.591  AUC-train 0.913
Stats - Epoch: 10 AUC-val 0.593  AUC-train 0.918
Stats - Epoch: 11 AUC-val 0.592  AUC-train 0.921
Stats - Epoch: 12 AUC-val 0.584  AUC-train 0.925
Stats - Epoch: 13 AUC-val 0.588  AUC-train 0.929
Stats - Epoch: 14 AUC-val 0.589  AUC-train 0.932
Stats - Epoch: 15 AUC-val 0.587  AUC-train 0.934
Stats - Epoch: 16 AUC-val 0.595  AUC-train 0.938
Stats - Epoch: 17 AUC-val 0.586  AUC-train 0.941
Stats - Epoch: 18 AUC-val 0.576  AUC-train 0.945
Stats - Epoch: 19 AUC-val 0.581  AUC-train 0.947
Stats - Epoch: 20 AUC-val 0.590  AUC-train 0.950
Stats - Epoch: 21 AUC-val 0.5
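
In the epoch logs of this section, the validation AUC tends to peak within the first few epochs while the training AUC keeps rising. Below is a small sketch, assuming the `Stats - Epoch: ...` line format printed by the notebook, of how such a log could be parsed to locate the epoch with the highest validation AUC. The names `parse_stats` and `best_epoch` are illustrative and not part of the notebook; the sample lines are copied from the first epochs of the RNN run above, plus its truncated last line.

```python
import re

# Matches complete lines such as:
# "Stats - Epoch: 2 AUC-val 0.717  AUC-train 0.844"
STATS_RE = re.compile(
    r"Stats - Epoch:\s*(\d+)\s+AUC-val\s+([0-9.]+)\s+AUC-train\s+([0-9.]+)"
)


def parse_stats(log_text):
    """Return a list of (epoch, auc_val, auc_train); incomplete lines are skipped."""
    rows = []
    for line in log_text.splitlines():
        m = STATS_RE.search(line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2)), float(m.group(3))))
    return rows


def best_epoch(rows):
    """Row with the highest validation AUC (the earliest one on ties)."""
    return max(rows, key=lambda r: r[1])


log = """\
Stats - Epoch: 1 AUC-val 0.704  AUC-train 0.766
Stats - Epoch: 2 AUC-val 0.717  AUC-train 0.844
Stats - Epoch: 3 AUC-val 0.712  AUC-train 0.871
Stats - Epoch: 4 AUC-val 0.706  AUC-train 0.883
Stats - Epoch: 5 AUC-val 0.700  AUC-train 0.889
Stats - Epoch: 6 AUC-val 0.692  AUC-train 0.896
Stats - Epoch: 21 AUC-val 0.6
"""

rows = parse_stats(log)
best = best_epoch(rows)
print("best epoch %d: AUC-val %5.3f, AUC-train %5.3f" % best)
```

The truncated final line has no training AUC and is therefore ignored by the pattern rather than parsed with a guessed value.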

# Analysis splitting at 1990

In [75]:
import os
    # Simulation params
file_path = '/Users/choejunhoe/Desktop/Special Project/data/model_data/seq_fc1_reps5.csv';    
filename = file_path.split("/")[-1]

f=open(filename, "w")
epochs = 100;
    
train_end_year=1990;
train_start_year=1870;
test_start_year=1991; # Define test set
test_end_year=2017;
reps=50
print(dates)
    
all_predictors=['Slope of Yield Curves',
                        'Slope of Yield Curves*',
                        'Public Debt to GDP',
                        'Credit to GDP Change',
                        'Credit to GDP Change*',
                        'Current Account Change',
                        'Investment to GDP Change',
                        'Exchange Rates Change',
                        'Capital Asset Ratio Change',
                        'Noncore Funding Ratio Change',
                        'Inflation Rate',
                        'Equity Growth',
                        'Consumption Growth',
                        'Real Houseprice Growth'];    
    
    
df3=df_norm;

#Logit    
f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=1,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
f.write("\n");f.flush();     
#Logit(5)
f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
f.write("\n");f.flush();
    
#RNN
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=3,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#LSTM
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=4,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#GRU
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
        
        
    
f.close()

[1870, 2017]
Crises train:36
Crises test:26




Stats - Epoch: 1 AUC-val 0.761  AUC-train 0.850
Results 1 AUC-val 0.761 AUC-train 0.850
Crises train:36
Crises test:26
Stats - Epoch: 1 AUC-val 0.665  AUC-train 0.873
Results 1 AUC-val 0.665 AUC-train 0.873
Crises train:36
Crises test:26



Stats - Epoch: 1 AUC-val 0.615  AUC-train 0.781
Stats - Epoch: 2 AUC-val 0.685  AUC-train 0.841
Stats - Epoch: 3 AUC-val 0.727  AUC-train 0.854
Stats - Epoch: 4 AUC-val 0.743  AUC-train 0.863
Stats - Epoch: 5 AUC-val 0.744  AUC-train 0.869
Stats - Epoch: 6 AUC-val 0.748  AUC-train 0.872
Stats - Epoch: 7 AUC-val 0.758  AUC-train 0.877
Stats - Epoch: 8 AUC-val 0.760  AUC-train 0.882
Stats - Epoch: 9 AUC-val 0.762  AUC-train 0.885
Stats - Epoch: 10 AUC-val 0.761  AUC-train 0.890
Stats - Epoch: 11 AUC-val 0.773  AUC-train 0.894
Stats - Epoch: 12 AUC-val 0.760  AUC-train 0.897
Stats - Epoch: 13 AUC-val 0.767  AUC-train 0.899
Stats - Epoch: 14 AUC-val 0.763  AUC-train 0.904
Stats - Epoch: 15 AUC-val 0.767  AUC-train 0.909
Stats - Epoch: 16 AUC-val 0.771  AUC-train 0.913
Stats - Epoch: 17 AUC-val 0.759  AUC-train 0.915
Stats - Epoch: 18 AUC-val 0.767  AUC-train 0.918
Stats - Epoch: 19 AUC-val 0.752  AUC-train 0.923
Stats - Epoch: 20 AUC-val 0.752  AUC-train 0.926
Stats - Epoch: 21 AUC-val 0.7


Stats - Epoch: 1 AUC-val 0.624  AUC-train 0.833
Stats - Epoch: 2 AUC-val 0.622  AUC-train 0.867
Stats - Epoch: 3 AUC-val 0.637  AUC-train 0.874
Stats - Epoch: 4 AUC-val 0.648  AUC-train 0.882
Stats - Epoch: 5 AUC-val 0.653  AUC-train 0.887
Stats - Epoch: 6 AUC-val 0.658  AUC-train 0.893
Stats - Epoch: 7 AUC-val 0.664  AUC-train 0.899
Stats - Epoch: 8 AUC-val 0.662  AUC-train 0.903
Stats - Epoch: 9 AUC-val 0.661  AUC-train 0.909
Stats - Epoch: 10 AUC-val 0.664  AUC-train 0.913
Stats - Epoch: 11 AUC-val 0.656  AUC-train 0.920
Stats - Epoch: 12 AUC-val 0.655  AUC-train 0.924
Stats - Epoch: 13 AUC-val 0.652  AUC-train 0.928
Stats - Epoch: 14 AUC-val 0.633  AUC-train 0.935
Stats - Epoch: 15 AUC-val 0.626  AUC-train 0.939
Stats - Epoch: 16 AUC-val 0.617  AUC-train 0.943
Stats - Epoch: 17 AUC-val 0.604  AUC-train 0.947
Stats - Epoch: 18 AUC-val 0.587  AUC-train 0.949
Stats - Epoch: 19 AUC-val 0.591  AUC-train 0.954
Stats - Epoch: 20 AUC-val 0.588  AUC-train 0.958
Stats - Epoch: 21 AUC-val 0.5


Stats - Epoch: 1 AUC-val 0.660  AUC-train 0.854
Stats - Epoch: 2 AUC-val 0.669  AUC-train 0.866
Stats - Epoch: 3 AUC-val 0.698  AUC-train 0.870
Stats - Epoch: 4 AUC-val 0.712  AUC-train 0.873
Stats - Epoch: 5 AUC-val 0.720  AUC-train 0.878
Stats - Epoch: 6 AUC-val 0.722  AUC-train 0.881
Stats - Epoch: 7 AUC-val 0.720  AUC-train 0.885
Stats - Epoch: 8 AUC-val 0.726  AUC-train 0.889
Stats - Epoch: 9 AUC-val 0.722  AUC-train 0.893
Stats - Epoch: 10 AUC-val 0.721  AUC-train 0.896
Stats - Epoch: 11 AUC-val 0.721  AUC-train 0.901
Stats - Epoch: 12 AUC-val 0.721  AUC-train 0.905
Stats - Epoch: 13 AUC-val 0.717  AUC-train 0.910
Stats - Epoch: 14 AUC-val 0.723  AUC-train 0.915
Stats - Epoch: 15 AUC-val 0.719  AUC-train 0.918
Stats - Epoch: 16 AUC-val 0.714  AUC-train 0.923
Stats - Epoch: 17 AUC-val 0.707  AUC-train 0.927
Stats - Epoch: 18 AUC-val 0.706  AUC-train 0.930
Stats - Epoch: 19 AUC-val 0.712  AUC-train 0.935
Stats - Epoch: 20 AUC-val 0.708  AUC-train 0.939
Stats - Epoch: 21 AUC-val 0.7

# Dropout = 0.2

In [20]:
import os
    # Simulation params
file_path = '/Users/choejunhoe/Desktop/Special Project/data/model_data/seq_fc1_reps5.csv';    
filename = file_path.split("/")[-1]

f=open(filename, "w")
epochs = 100;
    
train_end_year=1980;
train_start_year=1870;
test_start_year=1981; # Define test set
test_end_year=2017;
reps=50
    
all_predictors=['Slope of Yield Curves',
                        'Slope of Yield Curves*',
                        'Public Debt to GDP',
                        'Credit to GDP Change',
                        'Credit to GDP Change*',
                        'Current Account Change',
                        'Investment to GDP Change',
                        'Exchange Rates Change',
                        'Capital Asset Ratio Change',
                        'Noncore Funding Ratio Change',
                        'Inflation Rate',
                        'Equity Growth',
                        'Consumption Growth',
                        'Real Houseprice Growth'];    
    
    
df3=df_norm;

#Logit    
# f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=1,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
# f.write("\n");f.flush();     
#Logit(5)
# f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
#f.write("\n");f.flush();
    
#RNN
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=3,df=df3,fcast_horizon=1,dropout=0.2,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#LSTM
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=4,df=df3,fcast_horizon=1,dropout=0.2,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#GRU
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=5,df=df3,fcast_horizon=1,dropout=0.2,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
        
        
    
f.close()

Crises train:18
Crises test:44



Stats - Epoch: 1 AUC-val 0.698  AUC-train 0.712
Stats - Epoch: 2 AUC-val 0.746  AUC-train 0.798
Stats - Epoch: 3 AUC-val 0.747  AUC-train 0.849
Stats - Epoch: 4 AUC-val 0.745  AUC-train 0.862
Stats - Epoch: 5 AUC-val 0.735  AUC-train 0.867
Stats - Epoch: 6 AUC-val 0.733  AUC-train 0.871
Stats - Epoch: 7 AUC-val 0.726  AUC-train 0.875
Stats - Epoch: 8 AUC-val 0.720  AUC-train 0.877
Stats - Epoch: 9 AUC-val 0.712  AUC-train 0.880
Stats - Epoch: 10 AUC-val 0.703  AUC-train 0.882
Stats - Epoch: 11 AUC-val 0.703  AUC-train 0.883
Stats - Epoch: 12 AUC-val 0.692  AUC-train 0.885
Stats - Epoch: 13 AUC-val 0.688  AUC-train 0.886
Stats - Epoch: 14 AUC-val 0.681  AUC-train 0.887
Stats - Epoch: 15 AUC-val 0.682  AUC-train 0.889
Stats - Epoch: 16 AUC-val 0.683  AUC-train 0.890
Stats - Epoch: 17 AUC-val 0.678  AUC-train 0.891
Stats - Epoch: 18 AUC-val 0.680  AUC-train 0.892
Stats - Epoch: 19 AUC-val 0.672  AUC-train 0.893
Stats - Epoch: 20 AUC-val 0.677  AUC-train 0.894
Stats - Epoch: 21 AUC-val 0.6
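
In the log above, validation AUC peaks early (0.747 at epoch 3) while training AUC keeps rising, which is the usual sign of overfitting. The following is a minimal sketch of picking the best epoch from output of this kind. It assumes the lines keep the `Stats - Epoch: N AUC-val X  AUC-train Y` layout; `best_epoch` is a hypothetical helper and not part of the notebook.

```python
import re

# Hypothetical helper (not from the notebook): parse "Stats - Epoch" lines
# and return the (epoch, auc_val, auc_train) tuple with the highest AUC-val.
STATS_RE = re.compile(
    r"Stats - Epoch: (\d+) AUC-val (\d+\.\d+)\s+AUC-train (\d+\.\d+)"
)

def best_epoch(log_text):
    rows = [(int(e), float(v), float(t)) for e, v, t in STATS_RE.findall(log_text)]
    if not rows:
        raise ValueError("no complete 'Stats - Epoch' lines found")
    # max() keeps the earliest epoch when several epochs tie on AUC-val.
    return max(rows, key=lambda r: r[1])

# First lines of the log above, plus the truncated last line,
# which is skipped because it has no AUC-train field.
log = """
Stats - Epoch: 1 AUC-val 0.698  AUC-train 0.712
Stats - Epoch: 2 AUC-val 0.746  AUC-train 0.798
Stats - Epoch: 3 AUC-val 0.747  AUC-train 0.849
Stats - Epoch: 4 AUC-val 0.745  AUC-train 0.862
Stats - Epoch: 21 AUC-val 0.6
"""

epoch, auc_val, auc_train = best_epoch(log)
print(epoch, auc_val, auc_train)
```

Selecting the epoch on validation AUC after the fact is only a diagnostic; if the same validation set is also used for the final comparison, the reported score will be optimistic.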


Stats - Epoch: 1 AUC-val 0.674  AUC-train 0.815
Stats - Epoch: 2 AUC-val 0.619  AUC-train 0.888
Stats - Epoch: 3 AUC-val 0.588  AUC-train 0.897
Stats - Epoch: 4 AUC-val 0.572  AUC-train 0.903
Stats - Epoch: 5 AUC-val 0.562  AUC-train 0.908
Stats - Epoch: 6 AUC-val 0.551  AUC-train 0.911
Stats - Epoch: 7 AUC-val 0.543  AUC-train 0.914
Stats - Epoch: 8 AUC-val 0.540  AUC-train 0.918
Stats - Epoch: 9 AUC-val 0.531  AUC-train 0.921
Stats - Epoch: 10 AUC-val 0.522  AUC-train 0.924
Stats - Epoch: 11 AUC-val 0.523  AUC-train 0.928
Stats - Epoch: 12 AUC-val 0.515  AUC-train 0.931
Stats - Epoch: 13 AUC-val 0.508  AUC-train 0.936
Stats - Epoch: 14 AUC-val 0.508  AUC-train 0.939
Stats - Epoch: 15 AUC-val 0.503  AUC-train 0.943
Stats - Epoch: 16 AUC-val 0.508  AUC-train 0.945
Stats - Epoch: 17 AUC-val 0.508  AUC-train 0.948
Stats - Epoch: 18 AUC-val 0.507  AUC-train 0.951
Stats - Epoch: 19 AUC-val 0.510  AUC-train 0.953
Stats - Epoch: 20 AUC-val 0.501  AUC-train 0.956
Stats - Epoch: 21 AUC-val 0.5


Stats - Epoch: 1 AUC-val 0.723  AUC-train 0.831
Stats - Epoch: 2 AUC-val 0.675  AUC-train 0.872
Stats - Epoch: 3 AUC-val 0.637  AUC-train 0.876
Stats - Epoch: 4 AUC-val 0.620  AUC-train 0.880
Stats - Epoch: 5 AUC-val 0.615  AUC-train 0.882
Stats - Epoch: 6 AUC-val 0.604  AUC-train 0.885
Stats - Epoch: 7 AUC-val 0.598  AUC-train 0.887
Stats - Epoch: 8 AUC-val 0.593  AUC-train 0.890
Stats - Epoch: 9 AUC-val 0.595  AUC-train 0.892
Stats - Epoch: 10 AUC-val 0.589  AUC-train 0.896
Stats - Epoch: 11 AUC-val 0.592  AUC-train 0.897
Stats - Epoch: 12 AUC-val 0.589  AUC-train 0.900
Stats - Epoch: 13 AUC-val 0.589  AUC-train 0.903
Stats - Epoch: 14 AUC-val 0.589  AUC-train 0.905
Stats - Epoch: 15 AUC-val 0.585  AUC-train 0.906
Stats - Epoch: 16 AUC-val 0.583  AUC-train 0.908
Stats - Epoch: 17 AUC-val 0.587  AUC-train 0.911
Stats - Epoch: 18 AUC-val 0.582  AUC-train 0.913
Stats - Epoch: 19 AUC-val 0.581  AUC-train 0.915
Stats - Epoch: 20 AUC-val 0.580  AUC-train 0.917
Stats - Epoch: 21 AUC-val 0.5

In [26]:
# Split data by 1990
import os

# Simulation params
# NOTE: the file name says "reps5" while reps is set to 50 below.
file_path = '/Users/choejunhoe/Desktop/Special Project/data/model_data/seq_fc1_reps5.csv'
# Only the base name is used, so the CSV is written to the current working directory.
filename = os.path.basename(file_path)

f = open(filename, "w")
epochs = 100

train_start_year = 1870
train_end_year = 1990
test_start_year = 1991  # Define test set
test_end_year = 2017
reps = 50
    
all_predictors = ['Slope of Yield Curves',
                  'Slope of Yield Curves*',
                  'Public Debt to GDP',
                  'Credit to GDP Change',
                  'Credit to GDP Change*',
                  'Current Account Change',
                  'Investment to GDP Change',
                  'Exchange Rates Change',
                  'Capital Asset Ratio Change',
                  'Noncore Funding Ratio Change',
                  'Inflation Rate',
                  'Equity Growth',
                  'Consumption Growth',
                  'Real Houseprice Growth']

df3 = df_norm

#Logit    
# f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=1,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
# f.write("\n");f.flush();     
#Logit(5)
# f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
#f.write("\n");f.flush();
    
#RNN
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=3,df=df3,fcast_horizon=1,dropout=0.2,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#LSTM
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=4,df=df3,fcast_horizon=1,dropout=0.2,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
#GRU
f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=5,df=df3,fcast_horizon=1,dropout=0.2,plot_reliability=False,epochs=epochs));
f.write("\n");f.flush();
        
        
    
f.close()

Crises train:36
Crises test:26
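
The two counts above come from cutting the sample at a year boundary; both splits shown in this notebook add up to the same 62 crisis observations (18 + 44 and 36 + 26). The sketch below illustrates that kind of split on toy records. The field names `year` and `crisis` and the helpers `split_by_year` and `count_crises` are assumptions made for illustration; the notebook's own `sequential_evaluation` is defined elsewhere and may differ.

```python
def split_by_year(rows, train_start, train_end, test_start, test_end):
    """Split country-year records into train/test by inclusive year ranges."""
    train = [r for r in rows if train_start <= r["year"] <= train_end]
    test = [r for r in rows if test_start <= r["year"] <= test_end]
    return train, test

def count_crises(rows):
    """Number of records flagged as a crisis (crisis == 1)."""
    return sum(r["crisis"] for r in rows)

# Toy records, made up for illustration (not the study's data).
toy = [
    {"country": "A", "year": 1989, "crisis": 0},
    {"country": "A", "year": 1990, "crisis": 1},
    {"country": "A", "year": 1991, "crisis": 0},
    {"country": "A", "year": 1992, "crisis": 1},
    {"country": "B", "year": 1990, "crisis": 0},
    {"country": "B", "year": 1991, "crisis": 1},
]

train, test = split_by_year(toy, 1870, 1990, 1991, 2017)
print("Crises train:", count_crises(train))
print("Crises test:", count_crises(test))
```

Because the split is purely chronological, no observation after `train_end` can leak into training, which is the point of a sequential (out-of-sample) evaluation.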



Stats - Epoch: 1 AUC-val 0.676  AUC-train 0.755
Stats - Epoch: 2 AUC-val 0.718  AUC-train 0.830
Stats - Epoch: 3 AUC-val 0.753  AUC-train 0.846
Stats - Epoch: 4 AUC-val 0.764  AUC-train 0.853
Stats - Epoch: 5 AUC-val 0.769  AUC-train 0.855
Stats - Epoch: 6 AUC-val 0.771  AUC-train 0.858
Stats - Epoch: 7 AUC-val 0.777  AUC-train 0.861
Stats - Epoch: 8 AUC-val 0.776  AUC-train 0.863
Stats - Epoch: 9 AUC-val 0.779  AUC-train 0.864
Stats - Epoch: 10 AUC-val 0.782  AUC-train 0.865
Stats - Epoch: 11 AUC-val 0.783  AUC-train 0.866
Stats - Epoch: 12 AUC-val 0.780  AUC-train 0.868
Stats - Epoch: 13 AUC-val 0.784  AUC-train 0.868
Stats - Epoch: 14 AUC-val 0.782  AUC-train 0.869
Stats - Epoch: 15 AUC-val 0.782  AUC-train 0.871
Stats - Epoch: 16 AUC-val 0.779  AUC-train 0.871
Stats - Epoch: 17 AUC-val 0.778  AUC-train 0.872
Stats - Epoch: 18 AUC-val 0.780  AUC-train 0.871
Stats - Epoch: 19 AUC-val 0.783  AUC-train 0.872
Stats - Epoch: 20 AUC-val 0.781  AUC-train 0.872
Stats - Epoch: 21 AUC-val 0.7


Stats - Epoch: 1 AUC-val 0.629  AUC-train 0.834
Stats - Epoch: 2 AUC-val 0.623  AUC-train 0.867
Stats - Epoch: 3 AUC-val 0.643  AUC-train 0.874
Stats - Epoch: 4 AUC-val 0.657  AUC-train 0.879
Stats - Epoch: 5 AUC-val 0.665  AUC-train 0.884
Stats - Epoch: 6 AUC-val 0.664  AUC-train 0.887
Stats - Epoch: 7 AUC-val 0.668  AUC-train 0.891
Stats - Epoch: 8 AUC-val 0.662  AUC-train 0.895
Stats - Epoch: 9 AUC-val 0.669  AUC-train 0.898
Stats - Epoch: 10 AUC-val 0.674  AUC-train 0.901
Stats - Epoch: 11 AUC-val 0.677  AUC-train 0.904
Stats - Epoch: 12 AUC-val 0.668  AUC-train 0.907
Stats - Epoch: 13 AUC-val 0.670  AUC-train 0.910
Stats - Epoch: 14 AUC-val 0.672  AUC-train 0.913
Stats - Epoch: 15 AUC-val 0.674  AUC-train 0.916
Stats - Epoch: 16 AUC-val 0.660  AUC-train 0.917
Stats - Epoch: 17 AUC-val 0.659  AUC-train 0.921
Stats - Epoch: 18 AUC-val 0.654  AUC-train 0.922
Stats - Epoch: 19 AUC-val 0.644  AUC-train 0.924
Stats - Epoch: 20 AUC-val 0.653  AUC-train 0.927
Stats - Epoch: 21 AUC-val 0.6


Stats - Epoch: 1 AUC-val 0.646  AUC-train 0.845
Stats - Epoch: 2 AUC-val 0.670  AUC-train 0.863
Stats - Epoch: 3 AUC-val 0.688  AUC-train 0.865
Stats - Epoch: 4 AUC-val 0.701  AUC-train 0.868
Stats - Epoch: 5 AUC-val 0.717  AUC-train 0.869
Stats - Epoch: 6 AUC-val 0.722  AUC-train 0.870
Stats - Epoch: 7 AUC-val 0.730  AUC-train 0.872
Stats - Epoch: 8 AUC-val 0.736  AUC-train 0.874
Stats - Epoch: 9 AUC-val 0.734  AUC-train 0.875
Stats - Epoch: 10 AUC-val 0.730  AUC-train 0.878
Stats - Epoch: 11 AUC-val 0.735  AUC-train 0.879
Stats - Epoch: 12 AUC-val 0.737  AUC-train 0.880
Stats - Epoch: 13 AUC-val 0.731  AUC-train 0.882
Stats - Epoch: 14 AUC-val 0.735  AUC-train 0.884
Stats - Epoch: 15 AUC-val 0.734  AUC-train 0.885
Stats - Epoch: 16 AUC-val 0.729  AUC-train 0.886
Stats - Epoch: 17 AUC-val 0.728  AUC-train 0.889
Stats - Epoch: 18 AUC-val 0.721  AUC-train 0.890
Stats - Epoch: 19 AUC-val 0.723  AUC-train 0.893
Stats - Epoch: 20 AUC-val 0.724  AUC-train 0.894
Stats - Epoch: 21 AUC-val 0.7

In [24]:
import os
import time  # needed for the time.time() calls below
all_predictors = ['Slope of Yield Curves',
                  'Slope of Yield Curves*',
                  'Public Debt to GDP',
                  'Credit to GDP Change',
                  'Credit to GDP Change*',
                  'Current Account Change',
                  'Investment to GDP Change',
                  'Exchange Rates Change',
                  'Capital Asset Ratio Change',
                  'Noncore Funding Ratio Change',
                  'Inflation Rate',
                  'Equity Growth',
                  'Consumption Growth',
                  'Real Houseprice Growth']

df3 = df_norm


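# Cross validation
# NOTE: the file name says "reps50" while reps is set to 5 below.
file_path = '/Users/choejunhoe/Desktop/Special Project/data/model_data/cross_fc1_reps50_1870_2017.csv'
filename = file_path.split("/")[-1]

f = open(filename, "w")
reps = 5
epochs = 100
for dates in [[1870, 2017]]:
        end_year = dates[1]
        start_year = dates[0]

        # Timer is started once, so every elapsed value printed below is cumulative.
        t = time.time()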
        
        
      
        #logit(1)
       # f.write(cross_validation2(reps=1,mm=0,epochs=1,nlags=1,reg_weight=[0],df=df3,fcast_horizon=2,time_start=start_year,time_end=end_year));
       # f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        #logit(5)
       # f.write(cross_validation2(reps=1,mm=0,epochs=1,nlags=5,reg_weight=[0],df=df3,fcast_horizon=1,time_start=start_year,time_end=end_year));
       # f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        
        #RNN
        f.write(cross_validation2(reps=reps,mm=3,df=df3,fcast_horizon=1,epochs=epochs,time_start=start_year,dropout=0.2,time_end=end_year));
        f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        #LSTM
        f.write(cross_validation2(reps=reps,mm=4,df=df3,fcast_horizon=1,epochs=epochs,time_start=start_year,dropout=0.2,time_end=end_year));
        f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        #GRU
        f.write(cross_validation2(reps=reps,mm=5,df=df3,fcast_horizon=1,epochs=epochs,time_start=start_year,dropout=0.2,time_end=end_year));
        f.write("\n");f.flush();elapsed = time.time() - t;print(elapsed);
        
f.close()

# Sequential Evaluation
#file_path = '/Users/choejunhoe/Desktop/Special Project/data/model_data/seq_fc1_reps5.csv';    
#filename = file_path.split("/")[-1]

#f=open(filename, "w")
#epochs = 1;
    
#train_end_year=1980;
#train_start_year=1870;
#test_start_year=1981; # Define test set
#test_end_year=2017;
#reps=50
#print(dates)    


#Logit    
#f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=1,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
#f.write("\n");f.flush();     

#Logit(5)
# f.write(sequential_evaluation(reg_weight=[0.0],reps=1,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=0,nlags=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=1));
#f.write("\n");f.flush();
    
#RNN
#f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=3,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
#f.write("\n");f.flush();

#LSTM
#f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=4,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
#f.write("\n");f.flush();

#GRU
#f.write(sequential_evaluation(reps=reps,test_start_year=test_start_year,test_end_year=test_end_year,train_start_year=train_start_year,train_end_year=train_end_year,mm=5,df=df3,fcast_horizon=1,plot_reliability=False,epochs=epochs));
#f.write("\n");f.flush();
        
        
    
#f.close()
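
The RNN, LSTM and GRU calls in the cells above differ only in the `mm` code (3, 4 and 5). As a minimal sketch, they could be driven from a single loop. Here `run_all` and `fake_evaluate` are hypothetical names introduced for illustration; `fake_evaluate` merely stands in for `cross_validation2` or `sequential_evaluation`, which are defined elsewhere in the notebook and are assumed to return a string.

```python
import io

# Model codes as used in the calls above: mm=3 (RNN), mm=4 (LSTM), mm=5 (GRU).
MODELS = {"RNN": 3, "LSTM": 4, "GRU": 5}

def run_all(evaluate, out, **common):
    """Call `evaluate` once per model and write one result line for each."""
    for name, mm in MODELS.items():
        out.write(evaluate(mm=mm, **common))
        out.write("\n")
        out.flush()

# Stand-in for the notebook's evaluation functions (assumption: they return a string).
def fake_evaluate(mm, reps, epochs, **_ignored):
    return f"mm={mm},reps={reps},epochs={epochs}"

buf = io.StringIO()
run_all(fake_evaluate, buf, reps=5, epochs=100, dropout=0.2)
print(buf.getvalue())
```

With a real output file, the same helper can be wrapped in `with open(filename, "w") as f:` so the file is closed even if one of the runs raises.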





Results 100 AUC-val 0.845  AUC-train 0.910
18905.828763008118



Results 100 AUC-val 0.843  AUC-train 0.998
22363.39267206192



Results 100 AUC-val 0.853  AUC-train 0.935
25676.7450299263
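
The three elapsed values above are measured from a single `t = time.time()` set before the first model, so they appear to be cumulative rather than per-model. The sketch below recovers per-model durations and the train/validation AUC gap from the printed numbers. It assumes the outputs appear in the order of the calls in the cell (RNN, LSTM, GRU); `per_model_seconds` is a hypothetical helper.

```python
# Values copied from the output above: (model, AUC-val, AUC-train, cumulative seconds).
# The model order is assumed to follow the calls in the cell: RNN, LSTM, GRU.
results = [
    ("RNN", 0.845, 0.910, 18905.828763008118),
    ("LSTM", 0.843, 0.998, 22363.39267206192),
    ("GRU", 0.853, 0.935, 25676.7450299263),
]

def per_model_seconds(rows):
    """Convert cumulative elapsed times into per-model durations."""
    durations = {}
    previous = 0.0
    for name, _auc_val, _auc_train, cumulative in rows:
        durations[name] = cumulative - previous
        previous = cumulative
    return durations

durations = per_model_seconds(results)
for name, auc_val, auc_train, _cumulative in results:
    print(
        f"{name:5s} AUC-val {auc_val:.3f}  AUC-train {auc_train:.3f}  "
        f"gap {auc_train - auc_val:.3f}  time {durations[name] / 3600:.2f} h"
    )
```

Read this way, the first run accounts for most of the wall-clock time, and the LSTM shows the widest gap between training and validation AUC.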
