Regularization: add a small amount of bias to reduce variance by penalizing the coefficients of the less important features

In [1]:
import pandas as pd
import sklearn

from sklearn.linear_model import LinearRegression
from sklearn.linear_model import Lasso
from sklearn.linear_model import Ridge

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

In [2]:
df_happiness = pd.read_csv("../ml_pro/data/happiness.csv")
df_happiness.describe()

,rank,score,high,low,gdp,family,lifexp,freedom,generosity,corruption,dystopia
count,155.0,155.0,155.0,155.0,155.0,155.0,155.0,155.0,155.0,155.0,155.0
mean,78.0,5.354019,5.452326,5.255713,0.984718,1.188898,0.551341,0.408786,0.246883,0.12312,1.850238
std,44.888751,1.13123,1.118542,1.14503,0.420793,0.287263,0.237073,0.149997,0.13478,0.101661,0.500028
min,1.0,2.693,2.864884,2.521116,0.0,0.0,0.0,0.0,0.0,0.0,0.377914
25%,39.5,4.5055,4.608172,4.374955,0.663371,1.042635,0.369866,0.303677,0.154106,0.057271,1.591291
50%,78.0,5.279,5.370032,5.193152,1.064578,1.253918,0.606042,0.437454,0.231538,0.089848,1.83291
75%,116.5,6.1015,6.1946,6.006527,1.318027,1.414316,0.723008,0.516561,0.323762,0.153296,2.144654
max,155.0,7.537,7.62203,7.479556,1.870766,1.610574,0.949492,0.658249,0.838075,0.464308,3.117485


In [3]:
X = df_happiness[['gdp', 'family', 'lifexp','freedom','corruption','generosity','dystopia']]
y = df_happiness[['score']]
print(X.shape)
print(y.shape)

(155, 7)
(155, 1)


In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=45)

modelLinear = LinearRegression().fit(X_train, y_train)
y_predict_linear = modelLinear.predict(X_test)

Implementing the regularization models. The alpha parameter controls the severity of the penalty

In [6]:
modelLasso = Lasso(alpha = 0.02).fit(X_train, y_train)
y_predict_lasso = modelLasso.predict(X_test)

modelRidge = Ridge(alpha = 1).fit(X_train, y_train)
y_predict_ridge = modelRidge.predict(X_test)
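
To illustrate how alpha controls the severity of the penalty, here is a minimal sketch on synthetic data (an assumption made for illustration, not the happiness dataset). The helper `count_zero_coefs` is hypothetical and simply counts how many Lasso coefficients end up exactly at zero for a given alpha.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data (assumption): 5 features, only the first two drive the target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = 3.0 * X_demo[:, 0] + 1.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

def count_zero_coefs(alpha):
    """Fit a Lasso with the given alpha and count coefficients that are exactly zero."""
    model = Lasso(alpha=alpha).fit(X_demo, y_demo)
    return int(np.sum(model.coef_ == 0))

# Sweep alpha from a very mild to a very strong penalty.
zeros_by_alpha = {alpha: count_zero_coefs(alpha) for alpha in (0.001, 0.1, 1.0, 10.0)}
for alpha, n_zero in zeros_by_alpha.items():
    print(f"alpha={alpha}: {n_zero} of 5 coefficients are zero")
```

With a large enough alpha every coefficient is driven to zero and the model predicts only the intercept, so alpha is normally tuned rather than fixed by hand.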

Comparing the mean squared error (MSE) of the three predictions

In [11]:
linear_loss = mean_squared_error(y_test, y_predict_linear)
print("Linear loss: ", linear_loss)

lasso_loss = mean_squared_error(y_test, y_predict_lasso)
print("Lasso loss: ", lasso_loss)

ridge_loss = mean_squared_error(y_test, y_predict_ridge)
print("Ridge loss: ", ridge_loss)


Linear loss:  8.748205928591658e-08
Lasso loss:  0.042820650791422644
Ridge loss:  0.00685687368857917


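In this case, regularization did not reduce the test loss (MSE): plain linear regression scores best by several orders of magnitude. Its near-zero MSE suggests that the score is almost an exact linear combination of the seven features, so every feature matters and any penalty only pulls the coefficients away from the best fit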
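
The pattern above can be reproduced on synthetic data. The sketch below assumes a target that is the exact sum of seven features (a stand-in chosen for illustration, not the happiness dataset) and reuses the same alpha values and split settings as the notebook. Names such as `X_sum` and `losses` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in (assumption): the target is the exact sum of 7 features.
rng = np.random.default_rng(45)
X_sum = rng.uniform(0.0, 2.0, size=(155, 7))
y_sum = X_sum.sum(axis=1)

Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_sum, y_sum, test_size=0.3, random_state=45
)

models = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.02),
    "ridge": Ridge(alpha=1),
}

# Fit each model and record its test MSE.
losses = {}
for name, model in models.items():
    model.fit(Xs_train, ys_train)
    losses[name] = mean_squared_error(ys_test, model.predict(Xs_test))
    print(f"{name} loss: {losses[name]:.3e}")
```

When the relationship is exactly linear and noise-free, ordinary least squares recovers it up to floating-point error, and any penalty can only move the coefficients away from that solution.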

In [9]:
print("Lasso coefficients: ", modelLasso.coef_)


Lasso coefficients:  [1.25237102 0.85856018 0.43605845 0.81300757 0.         0.40526095
 0.90969398]


The Lasso model set the coefficient of the 'corruption' feature to zero, effectively removing it

In [10]:
print("Ridge coefficients: ", modelRidge.coef_)

Ridge coefficients:  [[1.05812344 0.94848039 0.84262862 0.89571211 0.60527691 0.79231583
  0.96260432]]


The Ridge model did not set any coefficient to zero, so it kept every feature
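
Reading the raw arrays requires remembering the column order of X. As a small sketch, the coefficients printed above can be paired with the feature names to list which features each model dropped. The helper `dropped_features` is hypothetical, and the values are copied from the notebook outputs rather than refitted.

```python
import numpy as np

# Feature order used to build X in the notebook.
feature_names = ['gdp', 'family', 'lifexp', 'freedom', 'corruption', 'generosity', 'dystopia']

# Coefficients copied from the printed outputs above.
lasso_coef = np.array([1.25237102, 0.85856018, 0.43605845, 0.81300757,
                       0.0, 0.40526095, 0.90969398])
ridge_coef = np.array([[1.05812344, 0.94848039, 0.84262862, 0.89571211,
                        0.60527691, 0.79231583, 0.96260432]])

def dropped_features(coef, names):
    """Return the names of features whose coefficient is exactly zero."""
    flat = np.ravel(coef)  # the Ridge output above is 2-D, so flatten it first
    return [name for name, value in zip(names, flat) if value == 0]

lasso_dropped = dropped_features(lasso_coef, feature_names)
ridge_dropped = dropped_features(ridge_coef, feature_names)
print("Dropped by Lasso:", lasso_dropped)  # ['corruption']
print("Dropped by Ridge:", ridge_dropped)  # []
```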

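Implementing ElasticNet, which combines the Ridge (L2) and Lasso (L1) penalties. The mix is set by l1_ratio, where 0 is a pure Ridge penalty and 1 is a pure Lasso penalty, while alpha sets the overall penalty strength. The loop below varies alpha and keeps the default l1_ratio of 0.5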

In [16]:
from sklearn.linear_model import ElasticNet

alphas = range(1,10)

for alpha in alphas:
    
    modelElastic = ElasticNet(random_state=0, alpha=alpha/10)

    modelElastic.fit(X_train, y_train)

    y_predict_elastic = modelElastic.predict(X_test)

    elastic_loss = mean_squared_error(y_test, y_predict_elastic)

    print("ElasticNet loss (alpha= ", alpha/10, "):", elastic_loss)

ElasticNet loss (alpha=  0.1 ): 0.2798739258014397
ElasticNet loss (alpha=  0.2 ): 0.6220627853993117
ElasticNet loss (alpha=  0.3 ): 0.9390881415508358
ElasticNet loss (alpha=  0.4 ): 1.1934073459155696
ElasticNet loss (alpha=  0.5 ): 1.3894033552098712
ElasticNet loss (alpha=  0.6 ): 1.5477756811502728
ElasticNet loss (alpha=  0.7 ): 1.600565118936508
ElasticNet loss (alpha=  0.8 ): 1.600565118936508
ElasticNet loss (alpha=  0.9 ): 1.600565118936508
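
The loop above varies only alpha. In scikit-learn's ElasticNet, the balance between the L1 and L2 penalties is set by the separate l1_ratio parameter (default 0.5). The sketch below, on synthetic data assumed for illustration, sweeps l1_ratio at a fixed alpha and checks that l1_ratio=1.0 gives the same fit as Lasso. Names such as `X_en` and `zero_counts` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic data (assumption): 6 features, only the first two drive the target.
rng = np.random.default_rng(0)
X_en = rng.normal(size=(200, 6))
y_en = 2.0 * X_en[:, 0] - 1.0 * X_en[:, 1] + rng.normal(scale=0.1, size=200)

# Sweep the L1/L2 mix while keeping the overall penalty strength fixed.
zero_counts = {}
for l1_ratio in (0.1, 0.5, 1.0):
    model = ElasticNet(alpha=0.5, l1_ratio=l1_ratio, random_state=0).fit(X_en, y_en)
    zero_counts[l1_ratio] = int(np.sum(model.coef_ == 0))
    print(f"l1_ratio={l1_ratio}: coef={np.round(model.coef_, 3)}, zeros={zero_counts[l1_ratio]}")

# With l1_ratio=1.0 the penalty is pure L1, so the fit should match Lasso.
enet_as_lasso = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X_en, y_en)
lasso = Lasso(alpha=0.5).fit(X_en, y_en)
same_fit = np.allclose(enet_as_lasso.coef_, lasso.coef_)
print("ElasticNet(l1_ratio=1.0) matches Lasso:", same_fit)
```

The identical losses from alpha=0.7 onward in the output above would be consistent with every coefficient having been shrunk to zero, leaving only the intercept. Printing modelElastic.coef_ inside the loop would confirm this.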
