# An Exercise of Financial Crises Prediction Using Machine Learning Techniques
## Inspired by Bluwstein et al. (2019)
# Authors:                Ana Margarida Silva da Costa & Lea Katharina Paoli
# Matriculation Nr.      Q00087 & E12499
# Purpose:               Assignment for Advanced Financial Economics 

# Installation of necessary packages

In [1]:
pip install numpy

Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install --upgrade pandas-datareader

Requirement already up-to-date: pandas-datareader in c:\users\magui\anaconda3\lib\site-packages (0.10.0)
Note: you may need to restart the kernel to use updated packages.


In [3]:
pip install -U scikit-learn

Requirement already up-to-date: scikit-learn in c:\users\magui\anaconda3\lib\site-packages (1.0.1)
Note: you may need to restart the kernel to use updated packages.


In [4]:
pip install openpyxl

Note: you may need to restart the kernel to use updated packages.


In [5]:
pip install seaborn

Note: you may need to restart the kernel to use updated packages.


In [6]:
pip install plotly

Note: you may need to restart the kernel to use updated packages.


Now to check whether the installation worked properly:

In [7]:
pip list

Package                            Version
---------------------------------- -------------------
alabaster                          0.7.12
anaconda-client                    1.7.2
anaconda-navigator                 2.0.3
anaconda-project                   0.8.3
argh                               0.26.2
argon2-cffi                        20.1.0
asn1crypto                         1.4.0
astroid                            2.4.2
astropy                            4.0.2
async-generator                    1.10
atomicwrites                       1.4.0
attrs                              20.3.0
autopep8                           1.5.4
Babel                              2.8.1
backcall                           0.2.0
backports.functools-lru-cache      1.6.1
backports.shutil-get-terminal-size 1.0.0
backports.tempfile                 1.0
backports.weakref                  1.0.post1
bcrypt                             3.2.0
beautifulsoup4                     4.9.3
bitarray                           1

The installation appears to have worked, since the packages installed above show up in the list.

# Importing the necessary Python modules

In [22]:
import pickle # this module allows us to store training data in a file, so that previous code does not have to be re-run
import numpy as np
import pandas as pd 
import plotly.graph_objects as go

from sklearn import (
    model_selection, linear_model, ensemble, metrics, neural_network, pipeline,
    tree, preprocessing
)
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LogisticRegressionCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold #as opposed to KFold
import matplotlib.pyplot as plt
import seaborn as sn 

# Preparatory work

Loading the dataset:

In [9]:
df=pd.read_excel("JSTdatasetR4.xlsx",sheet_name="Data")

Creating the desired variables in a new DataFrame (with some changes compared to what was recommended in the PDF):

In [10]:
#let's make a copy, in order to preserve the original dataset
df_copy=df.copy()
print("this is the original:")
print(df.head())
print("this is the copy:")
print(df_copy.head())

#let's create new (temporary) columns with the transformed variables we need:
#-slope of the yield curve
df_copy["slope_yield_curve"]=df_copy["ltrate"]/100-df_copy["stir"]/100

#-credit: loans to the private sector / gdp
df_copy["credit"]=df_copy["tloans"]/df_copy["gdp"]

#-debt service ratio: credit * long term interest rate
df_copy["debt_serv_ratio"]= (df_copy["tloans"]/df_copy["gdp"])*df_copy["ltrate"]/100

#-broad money over gdp
df_copy["bmoney_gdp"]=df_copy["money"]/df_copy["gdp"]

#-current account over gdp
df_copy["curr_acc_gdp"]=df_copy["ca"]/df_copy["gdp"]


# Now we need to compute 1-year absolute variations and percentage variations for a few variables.
# This must be done country-wise, so we cannot act on the dataframe as it is.
# A convenient way of doing this is the pandas method 'groupby()'
df_copy_group=df_copy.groupby("iso") # 'iso' is the country code

# create 1 year-variation of credit from grouped dataframe and add back to initial dataframe
df_copy["delta_credit"]=df_copy_group["credit"].diff(periods=1)

# create 1 year-variation of debt service ratio from grouped dataframe and add back to initial dataframe
df_copy["delta_debt_serv_ratio"]=df_copy_group["debt_serv_ratio"].diff(periods=1)

# create 1 year-variation of investment/gdp from grouped dataframe and add back to initial dataframe
df_copy["delta_investm_ratio"]=df_copy_group["iy"].diff(periods=1)

# create 1 year-variation of public debt/gdp from grouped dataframe and add back to initial dataframe
df_copy["delta_pdebt_ratio"]=df_copy_group["debtgdp"].diff(periods=1)

# create 1 year-variation of broad money / gdp from grouped dataframe and add back to initial dataframe
df_copy["delta_bmoney_gdp"]=df_copy_group["bmoney_gdp"].diff(periods=1)

# create 1 year-variation of current account / gdp from grouped dataframe and add back to initial dataframe
df_copy["delta_curr_acc_gdp"]=df_copy_group["curr_acc_gdp"].diff(periods=1)

# now we need to create new variables which are 1-year growth rates of existing ones
# we will need this function to apply to the columns of the dataframe
def lag_pct_change(x):
    """ Computes the percentage change of x relative to its previous value """
    lag = np.array(pd.Series(x).shift(1))
    return ((x - lag) / lag)


# create 1 year growth rate of CPI from grouped dataframe and add back to initial dataframe
# (transform keeps the original row index, so the result aligns with df_copy)
df_copy["growth_cpi"]=df_copy_group["cpi"].transform(lag_pct_change)

# create 1 year growth rate of consumption per capita from grouped dataframe and add back to initial dataframe
df_copy["growth_cons"]=df_copy_group["rconpc"].transform(lag_pct_change)

# Now let's create the crisis early-warning label: a dummy variable which takes value one if there is a crisis in either of the next two years
# temporary array of zeros, dimension = number of rows in database
temp_array=np.zeros(len(df_copy))

# loop to create dummy
for i in np.arange(0,len(df_copy)-2):
    temp_array[i]= 1 \
    if ( (df_copy.loc[i+1,'crisisJST']== 1) or (df_copy.loc[i+2,'crisisJST']== 1) ) else 0

#put the dummy in the dataframe
df_copy["crisis_warning"]=temp_array.astype("int64")
print("this is the copy with all of the variables we have added:")
print(df_copy.head())

# create a smaller dataframe including only the variables we are interested in: the first ten are predictors (X) and the last one is the output, or label (y). Also, drop the observations where some variable is missing
variables=["slope_yield_curve","delta_credit","delta_debt_serv_ratio","delta_investm_ratio","delta_pdebt_ratio","delta_bmoney_gdp","delta_curr_acc_gdp","growth_cpi","growth_cons","eq_tr","crisis_warning"]
df_final=df_copy[variables].dropna()
print("this is the smaller dataframe with just the variables we are interested in:")
print(df_final.head())

# let's also create a version of our dataframe which includes the year
df_final_withyear=df_copy[["year"]+variables].dropna()
print("this is the previous dataframe, but it includes the year:")
print(df_final_withyear.head())

this is the original:
   year    country  iso  ifs     pop      rgdpmad     rgdppc     rconpc  \
0  1870  Australia  AUS  193  1775.0  3273.239437  13.836157  21.449734   
1  1871  Australia  AUS  193  1675.0  3298.507463  13.936864  19.930801   
2  1872  Australia  AUS  193  1722.0  3553.426249  15.044247  21.085006   
3  1873  Australia  AUS  193  1769.0  3823.629169  16.219443  23.254910   
4  1874  Australia  AUS  193  1822.0  3834.796926  16.268228  23.458050   

      gdp        iy  ...  eq_capgain     eq_dp  eq_capgain_interp  \
0  208.78  0.109266  ...   -0.070045  0.071417                NaN   
1  211.56  0.104579  ...    0.041654  0.065466                NaN   
2  227.40  0.130438  ...    0.108945  0.062997                NaN   
3  266.54  0.124986  ...    0.083086  0.064484                NaN   
4  287.58  0.141960  ...    0.119389  0.063503                NaN   

   eq_tr_interp  eq_dp_interp  bond_rate  eq_div_rtn  capital_tr  risky_tr  \
0           NaN           NaN   0.
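The loop that builds `crisis_warning` looks one and two rows ahead without regard to the country code, so the last two years of one country can pick up the first years of the next country in the panel. A minimal sketch of a country-wise alternative, using a small made-up panel (the `toy` frame and its values are assumptions for illustration only, not JST data):

```python
import pandas as pd

# Hypothetical toy panel: two countries, four years each (illustrative values only)
toy = pd.DataFrame({
    "iso": ["AAA"] * 4 + ["BBB"] * 4,
    "year": [2000, 2001, 2002, 2003] * 2,
    "crisisJST": [0, 0, 0, 0, 1, 0, 0, 1],
})

# Look one and two years ahead within each country only
grouped = toy.groupby("iso")["crisisJST"]
next1 = grouped.shift(-1)
next2 = grouped.shift(-2)

# Missing leads (the last years of each country) compare as False, i.e. label 0
toy["crisis_warning"] = ((next1 == 1) | (next2 == 1)).astype("int64")
print(toy)
```

In this toy panel the last rows of country AAA stay at 0, whereas a purely row-based look-ahead would have labelled them 1 because of the crisis in the first year of BBB.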

# Parametrization
In this section we set the parameters needed for splitting the sample: not only the share of observations held out in the random split into test and training sets, but also the number of folds (k-folds) which are used to test the models.

In [11]:
# Percentage test dataset 
percent_test = 0.25
print("percent_test is set to " + str(percent_test))

# Cross-Validation Splits
number_of_splits = 5
kf = StratifiedKFold(n_splits=number_of_splits)
print("number_of_splits is set to " + str(number_of_splits))

percent_test is set to 0.25
number_of_splits is set to 5
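Stratified folds matter here because crises are rare: `StratifiedKFold` keeps the share of each class roughly equal across folds. A minimal sketch on made-up labels (`X_demo`, `y_demo` and `kf_demo` are hypothetical names, and the 20/80 class balance is an assumption for illustration):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical imbalanced labels: 20 "crisis" observations out of 100
y_demo = np.array([1] * 20 + [0] * 80)
X_demo = np.arange(100).reshape(-1, 1)  # placeholder feature, its values are irrelevant here

kf_demo = StratifiedKFold(n_splits=5)
fold_shares = []
for train_index, test_index in kf_demo.split(X_demo, y_demo):
    # share of label 1 in the held-out fold
    fold_shares.append(y_demo[test_index].mean())

print(fold_shares)
```

Note that `split` needs the labels as its second argument, since the stratification is based on them.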


# Necessary functions

In [12]:
# these functions are needed to plot the trees-graphs 
def surface_scatter_plot(X,y,f, ngrid=50, width=860, height=700): 
    scatter = go.Scatter3d(x=X.iloc[:,0],y=X.iloc[:,1],z=y,
                           mode='markers',
                           marker=dict(size=2, opacity=0.3)
                        )

    # one grid per axis, covering the observed range of the first two columns of X
    x1grid = np.linspace(X.iloc[:,0].min(), X.iloc[:,0].max(), ngrid)
    x2grid = np.linspace(X.iloc[:,1].min(), X.iloc[:,1].max(), ngrid)
    ey = np.zeros((ngrid,ngrid))
    for i in range(ngrid):
        for j in range(ngrid):
            # f may return a one-element array (e.g. model.predict), so take its first entry
            ey[j,i] = np.ravel(f([x1grid[i],x2grid[j]]))[0]
    
    # the colorscale and the template are built-in plotly names (an assumption; any valid choice works)
    surface = go.Surface(x=x1grid, y=x2grid, z=ey, colorscale="Viridis", opacity=1.0)
    
    fig = go.FigureWidget(
        data=[scatter, surface],
        layout = go.Layout(
            autosize=True,
            scene=dict(
                xaxis_title='X1',
                yaxis_title='X2',
                zaxis_title='Y'
            ),
            width=width,
            height=height,
            template="plotly_white",
        )
    )
    return fig


def plot_roc(mod, X, y):
    # predicted_probs is an N x 2 array, where N is number of observations and 2 is number of classes
    predicted_probs = mod.predict_proba(X)

    # keep the second column, for label=1
    predicted_prob1 = predicted_probs[:, 1]

    fpr, tpr, _ = metrics.roc_curve(y, predicted_prob1)

    # Plot ROC curve
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1], "k--")
    ax.plot(fpr, tpr)
    ax.set_xlabel("False Positive Rate")
    ax.set_ylabel("True Positive Rate")
    ax.set_title("ROC Curve")

    return fig, ax

# Analysis

In [13]:
print(df_copy.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2499 entries, 0 to 2498
Data columns (total 65 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   year                      2499 non-null   int64  
 1   country                   2499 non-null   object 
 2   iso                       2499 non-null   object 
 3   ifs                       2499 non-null   int64  
 4   pop                       2499 non-null   float64
 5   rgdpmad                   2499 non-null   float64
 6   rgdppc                    2499 non-null   float64
 7   rconpc                    2411 non-null   float64
 8   gdp                       2474 non-null   float64
 9   iy                        2279 non-null   float64
 10  cpi                       2499 non-null   float64
 11  ca                        2344 non-null   float64
 12  imports                   2458 non-null   float64
 13  exports                   2458 non-null   float64
 14  narrowm 

## 1) Split sample (test and training sets)

In [14]:
# First we take a look at the dataset to get an impression of its structure -> df.shape returns (x1, x2), where x1 = number of rows and x2 = number of columns
print("shape")
print(df.shape)
print("head")
print(df.head())
print("df_final.columns")
print(df_final.columns)
X = df_final.drop("crisis_warning", axis=1)
print("X.columns")
print(X.columns) # X contains all columns of df_final except the label "crisis_warning"

shape
(2499, 51)
head
   year    country  iso  ifs     pop      rgdpmad     rgdppc     rconpc  \
0  1870  Australia  AUS  193  1775.0  3273.239437  13.836157  21.449734   
1  1871  Australia  AUS  193  1675.0  3298.507463  13.936864  19.930801   
2  1872  Australia  AUS  193  1722.0  3553.426249  15.044247  21.085006   
3  1873  Australia  AUS  193  1769.0  3823.629169  16.219443  23.254910   
4  1874  Australia  AUS  193  1822.0  3834.796926  16.268228  23.458050   

      gdp        iy  ...  eq_capgain     eq_dp  eq_capgain_interp  \
0  208.78  0.109266  ...   -0.070045  0.071417                NaN   
1  211.56  0.104579  ...    0.041654  0.065466                NaN   
2  227.40  0.130438  ...    0.108945  0.062997                NaN   
3  266.54  0.124986  ...    0.083086  0.064484                NaN   
4  287.58  0.141960  ...    0.119389  0.063503                NaN   

   eq_tr_interp  eq_dp_interp  bond_rate  eq_div_rtn  capital_tr  risky_tr  \
0           NaN           NaN   0.

In [15]:
y = df_final["crisis_warning"] 
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=percent_test)

Verify that the split worked:

In [16]:
N=len(X)
print("number of observation = " + str(N))
N2= len(X_test) + len(X_train)
print(N2)
ptimesN=percent_test*N
oneminusptimesN=(1-percent_test)*N
print("number of obs. in test data set should be " + str(ptimesN))
print("number of obs. in test data set is " + str(len(X_test))) #should equal percent_test*N
print("number of obs. in training data set should be " + str(oneminusptimesN))
print("number of obs. in training data set is " + str(len(X_train))) #should equal (1-percent_test)*N

number of observation = 1680
1680
number of obs. in test data set should be 420.0
number of obs. in test data set is 420
number of obs. in training data set should be 1260.0
number of obs. in training data set is 1260


## 2) Fitting of models

### 2.1.) Logistic regression

Remember from the previous code that:
- X = df_final.drop("crisis_warning", axis=1) (cell In [14])
- y = df_final["crisis_warning"] (cell In [15])

In [40]:
logistic_model = linear_model.LogisticRegression(solver="lbfgs")
logistic_model.fit(X_train, y_train)

beta_0 = logistic_model.intercept_[0]
beta_1 = logistic_model.coef_[0][0]
print(f"Logistic regression: y(x) = {beta_0:.4f} + {beta_1:.4f} X")

Logistic regression: y(x) = -2.6246 + -0.6531 X
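The printed equation reports only the intercept and the first of the ten coefficients. A minimal sketch of how all coefficients can be paired with their column names, shown on synthetic data (the columns `x1`, `x2`, `x3` and the data-generating rule are assumptions for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x1", "x2", "x3"])
# by construction the label depends positively on x1 and negatively on x2
y_demo = (X_demo["x1"] - X_demo["x2"] + rng.normal(scale=0.5, size=200) > 0).astype(int)

demo_model = LogisticRegression(solver="lbfgs").fit(X_demo, y_demo)

# coef_ has shape (1, n_features) for a binary problem; pair it with the column names
coef_table = pd.Series(demo_model.coef_[0], index=X_demo.columns, name="coefficient")
print("intercept:", demo_model.intercept_[0])
print(coef_table)
```

The same two lines at the end would work with `logistic_model` and `X_train.columns`.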


Alternative version in which we can see all coefficients individually (better option for answering question 5):

In [17]:
import statsmodels.api as sm
logit_model=sm.Logit(y_train,X_train)
result=logit_model.fit()
print(result.summary2())

# used the instructions of the following website: https://towardsdatascience.com/building-a-logistic-regression-in-python-step-by-step-becd4d56c9c8

Optimization terminated successfully.
         Current function value: 0.400266
         Iterations 8
                             Results: Logit
Model:                 Logit               Pseudo R-squared:    -0.520   
Date:                  2021-12-03 15:28    BIC:                 1080.0591
No. Observations:      1260                Log-Likelihood:      -504.34  
Df Model:              9                   LL-Null:             -331.86  
Df Residuals:          1250                LLR p-value:         1.0000   
Converged:             1.0000              Scale:               1.0000   
No. Iterations:        8.0000                                            
-------------------------------------------------------------------------
                       Coef.   Std.Err.    z     P>|z|   [0.025   0.975] 
-------------------------------------------------------------------------
slope_yield_curve     -39.6666   4.8123  -8.2428 0.0000 -49.0984 -30.2347
delta_credit           -5.4177   2.8064 
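The negative pseudo R-squared in the summary follows directly from the two log-likelihoods it reports. Assuming the summary uses McFadden's definition (the reported numbers are consistent with it):

```latex
R^2_{\text{McFadden}}
  = 1 - \frac{\ln L_{\text{model}}}{\ln L_{\text{null}}}
  = 1 - \frac{-504.34}{-331.86}
  \approx 1 - 1.520
  = -0.520
```

`sm.Logit` does not add an intercept on its own and `X_train` contains none, so the fitted model is not nested in the intercept-only null model and its log-likelihood can fall below the null one. Passing `sm.add_constant(X_train)` as the regressor matrix would include the intercept.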

### 2.2.) Logistic regression with LASSO

In [29]:
n_folds = 5
# candidate regularization parameters: a smaller C means a heavier penalty, so the coefficients are shrunk more strongly toward zero.
C_values = [0.001, 0.01, 0.05, 0.1, 1., 100.]
# define model
my_l1reg_logistic = LogisticRegressionCV(Cs=C_values, cv=n_folds, penalty='l1', 
                           refit=True, scoring='roc_auc', 
                           solver='liblinear', random_state=0,
                           fit_intercept=True)
# fit the model
my_l1reg_logistic.fit(X_train, y_train)
# these are already the best coefficients
coefs = my_l1reg_logistic.coef_
# cross-validation scores for class "1": one row per fold, one column per candidate C
scores = my_l1reg_logistic.scores_[1]
mean_scores = np.mean(scores, axis=0)
# mean_scores holds the average score of each C value; the one with the highest average is selected by the cross-validation
coefs

array([[-2.35593916e+01,  9.82545373e+00,  6.77380718e+01,
         1.67866194e+01, -6.64037236e+00,  3.06020818e+00,
         1.30441532e+00, -2.54070430e+00, -1.06442667e+01,
         5.08466184e-02]])
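To see how the cross-validation arrives at its choice, the mean score per candidate C and the coefficients set exactly to zero can be printed. A minimal sketch on synthetic data (`X_demo`, `y_demo`, `C_demo` and `demo_l1` are hypothetical names; only the first two columns carry signal by construction):

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

rng = np.random.default_rng(0)
n_obs = 400
X_demo = rng.normal(size=(n_obs, 5))
# by construction only the first two columns carry signal
y_demo = (2.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(size=n_obs) > 0).astype(int)

C_demo = [0.001, 0.01, 0.05, 0.1, 1.0, 100.0]
demo_l1 = LogisticRegressionCV(Cs=C_demo, cv=5, penalty="l1", scoring="roc_auc",
                               solver="liblinear", random_state=0).fit(X_demo, y_demo)

# scores_[1] has one row per fold and one column per candidate C
mean_scores_demo = demo_l1.scores_[1].mean(axis=0)
for C, score in zip(demo_l1.Cs_, mean_scores_demo):
    print(f"C = {C:>7}: mean AUROC = {score:.3f}")
print("selected C:", demo_l1.C_[0])

# coefficients set exactly to zero by the L1 penalty
zeroed = np.flatnonzero(demo_l1.coef_[0] == 0.0)
print("indices of zeroed coefficients:", zeroed)
```

Applied to `my_l1reg_logistic`, the same check on `coef_` shows which predictors are dropped at the selected C.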

### 2.3.) Random trees

We will begin by fitting a tree to some simulated data.

In [29]:
import numpy as np
# Simulate some data and plot it
n = 1000
Xsim = np.random.rand(n,2)
def Ey_x(x):
    return 1/3*(np.sin(5*x[0])*np.sqrt(x[1])*np.exp(-(x[1]-0.5)**2))

ysim = np.apply_along_axis(Ey_x, 1, Xsim) + np.random.randn(n)*0.1

In [30]:
import plotly.graph_objects as go

In [31]:
import plotly.express as px

# 'colors' and 'plotly_template' use plotly's built-in palette and template (an assumption; any valid choice works)
colors = px.colors.qualitative.Plotly
plotly_template = "plotly_white"

def surface_scatter_plot(X,y,f, xlo=0., xhi=1., ngrid=50,
                         width=860, height=700, f0=Ey_x, show_f0=False):
    scatter = go.Scatter3d(x=X[:,0],y=X[:,1],z=y,
                           mode='markers',
                           marker=dict(size=2, opacity=0.3)
    )
    xgrid = np.linspace(xlo,xhi,ngrid)
    ey = np.zeros((len(xgrid),len(xgrid)))
    ey0 = np.zeros((len(xgrid),len(xgrid)))
    colorscale = [[0, colors[0]], [1, colors[2]]]
    for i in range(len(xgrid)):
        for j in range(len(xgrid)):
            ey[j,i] = f([xgrid[i],xgrid[j]])
            ey0[j,i]= f0([xgrid[i],xgrid[j]])
    surface = go.Surface(x=xgrid, y=xgrid, z=ey, colorscale=colorscale, opacity=1.0)
    if (show_f0):
        surface0 = go.Surface(x=xgrid, y=xgrid, z=ey0, opacity=0.8, colorscale=colorscale)
        layers = [scatter, surface, surface0]
    else:
        layers = [scatter, surface]
    fig = go.FigureWidget(
        data=layers,
        layout = go.Layout(
            autosize=True,
            scene=dict(
                xaxis_title='X1',
                yaxis_title='X2',
                zaxis_title='Y'
            ),
            width=width,
            height=height,
            template=plotly_template,
        )
    )
    return fig

fig = surface_scatter_plot(Xsim, ysim, Ey_x)
fig.show()
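The cells in this subsection simulate and plot the data but do not yet fit a tree. A minimal sketch of that step, assuming a regression tree of depth 3 (the depth is an arbitrary illustrative choice; `X_tree`, `y_tree`, `Ey_x_demo` and `fitted_tree` are hypothetical names that mirror the simulation above):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n_demo = 1000
X_tree = rng.random((n_demo, 2))

def Ey_x_demo(x):
    # same conditional mean as Ey_x in the simulation
    return 1/3*(np.sin(5*x[0])*np.sqrt(x[1])*np.exp(-(x[1]-0.5)**2))

y_tree = np.apply_along_axis(Ey_x_demo, 1, X_tree) + rng.normal(size=n_demo)*0.1

# max_depth=3 is an arbitrary illustrative choice, not a tuned value
fitted_tree = DecisionTreeRegressor(max_depth=3).fit(X_tree, y_tree)

# a depth-3 tree partitions the unit square into at most 2**3 = 8 rectangles
n_leaves = fitted_tree.get_n_leaves()
r2_in_sample = fitted_tree.score(X_tree, y_tree)
print("number of leaves:", n_leaves)
print("in-sample R^2:", r2_in_sample)
```

The fitted tree can be drawn as a surface by passing `lambda x: fitted_tree.predict([x])[0]` as the `f` argument of `surface_scatter_plot`.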

### 2.4.) Random Forests

In [17]:
from sklearn.ensemble import RandomForestRegressor
# the surface plot needs exactly two predictors, so the forest is fitted on the first two columns only
X_train_2d = X_train.iloc[:, :2].to_numpy()
forest = RandomForestRegressor(n_estimators = 10).fit(X_train_2d,y_train)
fig=surface_scatter_plot(X_train_2d,y_train.to_numpy(),lambda x: forest.predict([x])[0],
                         xlo=X_train_2d.min(), xhi=X_train_2d.max())
fig.show()

### 2.5.) Neural Networks

In [18]:

import itertools

hiddenlayers = 2
nrneurons_1stlayer = np.array([100, 500, 1000]) 
nrneurons_2ndlayer = np.array([100, 500, 1000]) 
# The ith element represents the number of neurons in the ith hidden layer.


# 'lbfgs' is an optimizer in the family of quasi-Newton methods.
# Alpha is the parameter of the regularization (penalty) term, which combats overfitting by constraining the size of the weights.
# https://scikit-learn.org/stable/auto_examples/neural_networks/plot_mlp_alpha.html

# loop over the cartesian product of the neuron counts of the two layers
for i, j in itertools.product(nrneurons_1stlayer, nrneurons_2ndlayer): 
    print("number of hiddenlayers is equal to "+ str(hiddenlayers))
    print("number of neurons per layer is equal to "+ str(i) + " and "+ str(j) + " respect.")
    model5=neural_network.MLPClassifier((i, j), activation="logistic", verbose=True, solver="lbfgs", alpha=0.0)
    model5.fit(X_train, y_train)
    # for 0/1 labels the mean squared error equals the share of misclassified observations
    mse_model5 = metrics.mean_squared_error(y, model5.predict(X))
    print(str(mse_model5))


hiddenlayers = 1
nrneurons_1stlayer = np.array([100, 500, 1000]) 

for i in nrneurons_1stlayer: 
    print("number of hiddenlayers is equal to "+ str(hiddenlayers))
    print("number of neurons per layer is equal to "+ str(i))
    model5=neural_network.MLPClassifier((i,), activation="logistic", verbose=True, solver="lbfgs", alpha=0.0)
    model5.fit(X_train, y_train)
    mse_model5 = metrics.mean_squared_error(y, model5.predict(X))
    print(str(mse_model5))
    
# Standardize 
print("STANDARDIZATION")
model5_scaled = pipeline.make_pipeline(
    preprocessing.StandardScaler(),  # this will do the input scaling
    neural_network.MLPClassifier((30, 20)) 
)

hiddenlayers = 1
nrneurons_1stlayer = np.array([100, 500, 1000]) 

for i in nrneurons_1stlayer: 
    print("number of hiddenlayers is equal to "+ str(hiddenlayers))
    print("number of neurons per layer is equal to "+ str(i))
    model5_scaled=pipeline.make_pipeline(
        preprocessing.StandardScaler(),  # this will do the input scaling
        neural_network.MLPClassifier((i,), activation="logistic", verbose=True, solver="lbfgs", alpha=0.0)
        )
    model5_scaled.fit(X_train, y_train)
    mse_model5_scaled = metrics.mean_squared_error(y, model5_scaled.predict(X))
    print(str(mse_model5_scaled))

## 3) ROC curves

In [28]:
#Plot the ROC curves for the best versions of your models and compute the AUROC. 

# Cross-Validation
def get_score(model, X_train, X_test, y_train, y_test):
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)

# score on the train/test split created in section 1
print(get_score(LogisticRegression(solver="lbfgs"), X_train, X_test, y_train, y_test))

# score on each of the stratified folds of the full sample
for train_index, test_index in kf.split(X, y):
    X_train_fold, X_test_fold = X.iloc[train_index], X.iloc[test_index]
    y_train_fold, y_test_fold = y.iloc[train_index], y.iloc[test_index]
    print(get_score(LogisticRegression(solver="lbfgs"), X_train_fold, X_test_fold, y_train_fold, y_test_fold))

#-------- Confusion Matrices Comparison
#Plot the true value of y against the predicted value 
# the logistic regression is used here as an example model (an assumption; any fitted classifier works)
model = LogisticRegression(solver="lbfgs").fit(X_train, y_train)
y_predicted = model.predict(X_test)
cm = confusion_matrix(y_test, y_predicted)

%matplotlib inline
plt.figure(figsize=(10,7))
sn.heatmap(cm, annot=True)
plt.xlabel('Predicted Value')
plt.ylabel('True Value')
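The AUROC itself can be computed from the predicted probabilities of the positive class. A minimal sketch on synthetic data (`X_demo`, `y_demo` and the `Xd_*`/`yd_*` splits are hypothetical names, and the logistic regression merely stands in for whichever model is being evaluated):

```python
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(600, 4))
# imbalanced binary label driven by the first two columns plus noise
y_demo = (X_demo[:, 0] + 0.5 * X_demo[:, 1] + rng.normal(size=600) > 1.0).astype(int)

Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0)

demo_model = LogisticRegression(solver="lbfgs").fit(Xd_train, yd_train)
# probability of label 1 for each test observation
prob1 = demo_model.predict_proba(Xd_test)[:, 1]

# points of the ROC curve and the area under it
fpr, tpr, _ = metrics.roc_curve(yd_test, prob1)
auroc = metrics.roc_auc_score(yd_test, prob1)
print(f"AUROC = {auroc:.3f}")
```

The `fpr` and `tpr` arrays are what `plot_roc` draws; an AUROC of 0.5 corresponds to the dashed diagonal and 1.0 to a perfect ranking.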

## 5) Which variables 'survive' in the logistic regression with L1 regularization ?

Considering the output of the logistic regression with L1 regularization, most coefficients remain far from 0.
The exception is the last coefficient (eq_tr), which refers to the nominal total equity return: at about 0.05 it is close to zero, although not exactly zero. This suggests that this variable adds little to predicting whether there will be an economic crisis, whilst all others "survive" the L1 regularization.
Comparing the coefficient values with and without the L1 regularization, we can see that the beta of each of the variables that survived has changed, in some cases including its sign.
- The slope_yield_curve coefficient went from -39.67 to -23.56. Even with the regularization, this variable remains quite relevant for predicting crises.
- The delta_credit coefficient went from -5.42 to 9.83. With the regularization the variable remains relevant, but its impact is reversed: whereas previously a 1-year variation of credit led to a lower likelihood of a crisis, the opposite holds with the regularization.
- The delta_debt_serv_ratio coefficient went from 35.46 to 67.74. With the L1 regularization, the impact of the 1-year variation of the debt service ratio on the likelihood of a crisis has increased.
- The delta_investm_ratio coefficient went from 15.97 to 16.79. After the L1 regularization, the impact of a 1-year variation of the investment/GDP ratio has hardly changed.
- The delta_pdebt_ratio coefficient went from -3.72 to -6.64. With the L1 regularization, the impact of the 1-year variation of public debt/GDP on the likelihood of a crisis has increased in magnitude.
- The delta_bmoney_gdp coefficient went from -12.11 to 3.06. With the L1 regularization, the impact of a 1-year variation of broad money/GDP is reversed: a higher variation of this variable now leads to a higher probability of a crisis, whereas without the regularization the effect is the opposite.
- The delta_curr_acc_gdp coefficient went from -9.84 to 1.30. After the L1 regularization, the impact of the 1-year variation of the current account/GDP on the likelihood of a crisis has changed sign: it used to be negative and is now positive.
- The growth_cpi coefficient went from -22.99 to -2.54. The impact of annual inflation is lower in magnitude, but its sign has not changed.
- The growth_cons coefficient is -10.64 with the L1 regularization, so higher consumption growth goes together with a lower likelihood of a crisis.
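Pairing the coefficient array from section 2.2 with the predictor names makes this comparison easier to check. A minimal sketch, with the coefficients copied from the output above (the cut-off of 0.1 for "close to zero" is an assumption for illustration, not a rule):

```python
import numpy as np

predictors = ["slope_yield_curve", "delta_credit", "delta_debt_serv_ratio",
              "delta_investm_ratio", "delta_pdebt_ratio", "delta_bmoney_gdp",
              "delta_curr_acc_gdp", "growth_cpi", "growth_cons", "eq_tr"]

# coefficients copied from the output of the L1-regularized fit in section 2.2
l1_coefs = np.array([-2.35593916e+01, 9.82545373e+00, 6.77380718e+01,
                     1.67866194e+01, -6.64037236e+00, 3.06020818e+00,
                     1.30441532e+00, -2.54070430e+00, -1.06442667e+01,
                     5.08466184e-02])

# a variable "survives" if its coefficient is not exactly zero
survivors = [name for name, b in zip(predictors, l1_coefs) if b != 0.0]

# hypothetical cut-off for "close to zero"; 0.1 is an assumption
near_zero = [name for name, b in zip(predictors, l1_coefs) if abs(b) < 0.1]

for name, b in zip(predictors, l1_coefs):
    print(f"{name:<24}{b:>10.2f}")
print("surviving:", len(survivors), "of", len(predictors))
print("close to zero:", near_zero)
```

Since the predictors were not standardized before fitting, the size of a coefficient also reflects the scale of its variable, so magnitudes are not directly comparable across predictors.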