## Ridge Regression
You just learned about lasso regression, which introduces a penalty that can shrink some coefficients to exactly zero, effectively eliminating those features from the model. Ridge regression takes an alternative approach: its penalty grows with the squared magnitude of the weights. As a result, the optimization process reduces the size of the coefficients without driving them all the way to zero.
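
To make the penalty concrete, here is a minimal sketch on synthetic data (the arrays `X_demo` and `y_demo`, the weights in `true_w`, and the value `alpha = 10.0` are assumptions chosen for illustration, not part of the exercise). It assumes no intercept, in which case the ridge weights have the closed form w = (XᵀX + αI)⁻¹Xᵀy, and compares that formula with scikit-learn's `Ridge` and with an unpenalized fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# synthetic data, for illustration only
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
true_w = np.array([3.0, -2.0, 0.5, 0.0, 1.0])
y_demo = X_demo @ true_w + rng.normal(scale=0.5, size=200)

alpha = 10.0  # penalty strength (assumed value)

# closed-form ridge solution without an intercept:
# w = (X^T X + alpha * I)^-1 X^T y
n_features = X_demo.shape[1]
w_closed = np.linalg.solve(
    X_demo.T @ X_demo + alpha * np.eye(n_features),
    X_demo.T @ y_demo,
)

# the same fit with scikit-learn, plus an unpenalized baseline
w_ridge = Ridge(alpha=alpha, fit_intercept=False).fit(X_demo, y_demo).coef_
w_ols = LinearRegression(fit_intercept=False).fit(X_demo, y_demo).coef_

print('closed form matches Ridge:', np.allclose(w_closed, w_ridge, atol=1e-6))
print('OLS weight norm:  ', np.linalg.norm(w_ols))
print('ridge weight norm:', np.linalg.norm(w_ridge))
```

The ridge weights come out with a smaller norm than the unpenalized ones, but none of them is forced to exactly zero.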

## Fixing Model Overfitting Using Ridge Regression
The goal of this exercise is to teach you how to identify when your model starts overfitting, and to use ridge regression to fix overfitting in your model.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures

In [2]:
_df = pd.read_csv('https://raw.githubusercontent.com/'\
                 'PacktWorkshops/The-Data-Science-Workshop/'\
                 'master/Chapter07/Dataset/ccpp.csv')
_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9568 entries, 0 to 9567
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   AT      9568 non-null   float64
 1   V       9568 non-null   float64
 2   AP      9568 non-null   float64
 3   RH      9568 non-null   float64
 4   PE      9568 non-null   float64
dtypes: float64(5)
memory usage: 373.9 KB


In [3]:
# features and labels
X = _df.drop(['PE'], axis=1).values
y = _df['PE'].values

In [4]:
# split data into training and evaluation sets
train_X, eval_X, train_y, eval_y = train_test_split(X, y, train_size=0.8, random_state=0)

In [5]:
# instantiate LinearRegression
lr_model_1 = LinearRegression()

# fit model
lr_model_1.fit(train_X, train_y)

LinearRegression()

In [6]:
# make predictions on the evaluation dataset
lr_model_1_preds = lr_model_1.predict(eval_X)

In [7]:
# R2 of the model
print('lr_model_1 Score: {}'.format(lr_model_1.score(eval_X, eval_y)))

lr_model_1 Score: 0.9325315554761303


In [8]:
# MSE
print('lr_model_1 MSE: {}'.format(mean_squared_error(eval_y, lr_model_1_preds)))

lr_model_1 MSE: 19.733699303497637


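The first model was trained on the four original features. You will now train a new model on a degree-3 polynomial expansion of those four features, which produces 35 terms (including the constant term).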
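
As a quick check on how fast the expansion grows, the sketch below counts the columns that `PolynomialFeatures` generates for four inputs at several degrees and compares them with the binomial coefficient C(n + d, d). The `dummy_X` array is a placeholder used only so the transformer can be fitted.

```python
from math import comb

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

n_features = 4
dummy_X = np.zeros((1, n_features))  # placeholder input, values are irrelevant

counts = {}
for degree in (2, 3, 10):
    poly = PolynomialFeatures(degree=degree).fit(dummy_X)
    counts[degree] = poly.n_output_features_
    # number of monomials of total degree <= d in n variables is C(n + d, d)
    print(degree, counts[degree], comb(n_features + degree, degree))
```

For four features this gives 15 terms at degree 2, 35 at degree 3, and 1001 at degree 10.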

In [9]:
# create a list of tuples to serve as a pipeline
steps = [('scaler', MinMaxScaler()),
         ('poly', PolynomialFeatures(degree=3)),
         ('lr', LinearRegression())]

In [10]:
# create an instance of a pipeline
lr_model_2 = Pipeline(steps)

In [12]:
# train pipeline instance
lr_model_2.fit(train_X, train_y)

Pipeline(steps=[('scaler', MinMaxScaler()),
                ('poly', PolynomialFeatures(degree=3)),
                ('lr', LinearRegression())])

In [13]:
# R2 of model 2
print('lr_model_2 R2 score: {}'.format(lr_model_2.score(eval_X, eval_y)))

lr_model_2 R2 score: 0.9443678654045206


In [14]:
# preds of model 2
lr_model_2_preds = lr_model_2.predict(eval_X)

In [15]:
# MSE of model 2
print('lr_model_2 MSE: {}'.format(mean_squared_error(eval_y, lr_model_2_preds)))

lr_model_2 MSE: 16.27172263220768


In [16]:
# inspect model coefficients (weights)
print(lr_model_2[-1].coef_)

[ 7.72661789e-14 -1.77278028e+02 -4.60337188e+01 -1.60520675e+02
 -1.23076123e+02  6.23358210e+00  8.19655844e+00  1.45478576e+02
  1.88658651e+02  2.43740192e+01  1.80553150e+02 -1.08058561e+02
  1.09713294e+02  1.79121906e+02  1.06460596e+02  2.67290613e+01
  7.79833654e+01  3.69241324e+01 -1.13863997e+02 -1.42673215e+02
 -9.69606773e+01  1.90706809e+02 -5.56429546e+01 -1.32595225e+02
 -9.41682917e+01  9.40112729e+01 -1.18732510e+02 -7.64871610e+01
 -4.18714081e+01  6.36772260e+01  4.42340977e+01 -3.81114691e+01
 -4.71547759e+01 -9.16797074e+01 -2.52346805e+01]


In [17]:
# check for the number of coefficients in this model
print(len(lr_model_2[-1].coef_))

35


In [18]:
# create a steps list with PolynomialFeatures of degree 10
steps = [('scaler', MinMaxScaler()),
         ('poly', PolynomialFeatures(degree=10)),
         ('lr', LinearRegression())]

In [19]:
# create model 3
lr_model_3 = Pipeline(steps)

In [20]:
# fit model 3
lr_model_3.fit(train_X, train_y)

Pipeline(steps=[('scaler', MinMaxScaler()),
                ('poly', PolynomialFeatures(degree=10)),
                ('lr', LinearRegression())])

In [21]:
# R2 of model 3
print('lr_model_3 R2 score: {}'.format(lr_model_3.score(eval_X, eval_y)))

lr_model_3 R2 score: 0.5683445811859165


The preceding output shows that the R2 score has dropped to about 0.57, down from 0.944 for lr_model_2. Adding many more polynomial terms made the model considerably worse on the evaluation set, which is a sign that it is overfitting the training data.
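
One common way to spot overfitting is to compare the score on the training data with the score on the evaluation data. The following is a minimal sketch of that check on a small synthetic dataset, not on the CCPP data. The helper `r2_train_eval`, the arrays `X_demo` and `y_demo`, and the chosen degrees are all hypothetical names and values introduced for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures


def r2_train_eval(model, train_X, train_y, eval_X, eval_y):
    """Fit the model and return (train R2, eval R2)."""
    model.fit(train_X, train_y)
    return model.score(train_X, train_y), model.score(eval_X, eval_y)


# small noisy synthetic dataset, for illustration only
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1, 1, size=(40, 1))
y_demo = np.sin(3 * X_demo[:, 0]) + rng.normal(scale=0.3, size=40)
tr_X, ev_X, tr_y, ev_y = train_test_split(
    X_demo, y_demo, train_size=0.5, random_state=0)

results = {}
for degree in (1, 3, 8):
    pipe = Pipeline([('scaler', MinMaxScaler()),
                     ('poly', PolynomialFeatures(degree=degree)),
                     ('lr', LinearRegression())])
    results[degree] = r2_train_eval(pipe, tr_X, tr_y, ev_X, ev_y)
    print('degree {}: train R2 = {:.3f}, eval R2 = {:.3f}'.format(
        degree, *results[degree]))
```

A training score that keeps rising while the evaluation score falls suggests that the extra terms are fitting noise rather than signal.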

In [22]:
# preds for model 3
lr_model_3_preds = lr_model_3.predict(eval_X)

In [23]:
# MSE of model 3
print('lr_model_3_preds MSE: {}'.format(mean_squared_error(eval_y, lr_model_3_preds)))

lr_model_3_preds MSE: 126.25395913179554


In [24]:
# print the number of coefficients in model 3
print(len(lr_model_3[-1].coef_))

1001


In [25]:
# inspect the first 35 weights to get a sense of the individual magnitudes
print(lr_model_3[-1].coef_[:35])

[ 3.92505101e+05 -6.90885527e+07 -4.12732195e+07  2.27924135e+07
 -4.76789946e+07  2.96662372e+08  2.73270121e+08  1.07845355e+08
  3.73718730e+08  8.79697148e+07 -2.35335367e+07  2.46253911e+08
 -2.61103329e+08  1.86100397e+07  1.41131427e+08 -6.53882597e+08
 -8.90637240e+08 -1.06074229e+09 -1.29264008e+09 -4.28439276e+08
  5.31443921e+07 -1.30409864e+09  4.41022600e+08 -8.86227463e+08
 -8.78158963e+08 -1.97159667e+06 -5.39373932e+08 -3.68353344e+08
  9.82102192e+08 -2.76731667e+08 -6.28828639e+08  8.14253722e+08
  5.43202144e+08 -2.03046371e+08 -2.42929104e+08]


In [26]:
# pipeline with ridge (PolynomialFeatures() defaults to degree=2)
steps = [('scaler', MinMaxScaler()),
         ('poly', PolynomialFeatures()),
         ('lr', Ridge(alpha=0.9))]

In [27]:
# create instance of pipeline
ridge_model = Pipeline(steps)

In [28]:
# fit pipeline on training data
ridge_model.fit(train_X, train_y)

Pipeline(steps=[('scaler', MinMaxScaler()), ('poly', PolynomialFeatures()),
                ('lr', Ridge(alpha=0.9))])

In [29]:
# ridge R2
print('ridge_model R2 score: {}'.format(ridge_model.score(eval_X, eval_y)))

ridge_model R2 score: 0.940908326527359


In [30]:
# predictions for ridge
ridge_model_preds = ridge_model.predict(eval_X)

In [32]:
# ridge MSE
print('ridge_model MSE: {}'.format(mean_squared_error(eval_y, ridge_model_preds)))

ridge_model MSE: 17.28359567022492


In [33]:
# number of ridge weights
print(len(ridge_model[-1].coef_))

15


In [35]:
# inspect ridge weights (only 15 exist, so the [:35] slice returns all of them)
print(ridge_model[-1].coef_[:35])

[ 0.00000000e+00 -6.68392539e+01 -2.67323709e+01  1.90794176e+01
  1.84448707e+01  1.02928525e+01  1.54794956e+01 -4.37677751e+00
 -2.24824899e+01  5.42873597e-02  5.96517701e+00 -3.16100954e-01
 -1.02587561e+01 -8.30009697e+00 -9.54415165e+00]
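
The ridge pipeline in this exercise uses a single penalty strength, alpha=0.9. As a rough sketch of how that setting affects the weights, the following sweeps a few alpha values and prints the norm of the fitted coefficient vector. It runs on synthetic data rather than the CCPP dataset; `X_demo`, `y_demo`, the degree-3 expansion, and the alpha grid are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures

# synthetic data with four features, for illustration only
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(300, 4))
y_demo = (X_demo @ np.array([4.0, -3.0, 2.0, 1.0])
          + rng.normal(scale=0.2, size=300))

alphas = [0.01, 0.1, 1.0, 10.0, 100.0]  # assumed grid of penalty strengths
norms = []
for alpha in alphas:
    pipe = Pipeline([('scaler', MinMaxScaler()),
                     ('poly', PolynomialFeatures(degree=3)),
                     ('lr', Ridge(alpha=alpha))])
    pipe.fit(X_demo, y_demo)
    # overall size of the weight vector for this alpha
    norms.append(float(np.linalg.norm(pipe[-1].coef_)))
    print('alpha = {:>6}: coefficient norm = {:.4f}'.format(alpha, norms[-1]))
```

Larger alpha values shrink the weights more strongly. In practice, alpha is usually chosen by comparing evaluation scores across candidate values rather than by looking at the weights alone.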
