In [3]:
import numpy as np
import sklearn

# Prerequisite: pip install import_ipynb (lets us load our own notebooks as modules)
from IPython import get_ipython
import import_ipynb
get_ipython().run_line_magic('run','03_LASSO_Regression.ipynb')
# Note: %run executes the whole notebook, so its output is printed below in addition to its names being loaded

Data shape: (150, 4)

Labels shape: (150,)

Coefficients: array([ 0.        , -0.        ,  0.40811896,  0.        ])

Intercept: -0.5337110569441172

R2: 0.895821120274704



# A. Bayesian techniques
So far, we've discussed hyperparameter optimization through cross-validation. Another way to optimize the hyperparameters of a regularized regression model is with Bayesian techniques.

In Bayesian statistics, the main idea is to make certain assumptions about the probability distributions of a model's parameters before the model is fitted to data. These initial distribution assumptions are called priors for the model's parameters.

In a Bayesian ridge regression model, there are two hyperparameters to optimize: α and λ. Unlike in regular ridge regression, where α directly scales the penalty term, scikit-learn's BayesianRidge treats α as the precision of the noise in the labels. The strength of the penalty is instead governed by the ratio λ/α.

The λ hyperparameter acts as the precision (the inverse of the variance) of the model's weights. In other words, the smaller the λ value, the greater the variance among the individual weight values.
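The link between precision and variance can be checked numerically. The sketch below assumes the standard Bayesian ridge formulation, in which each weight has a zero-mean Gaussian prior with variance 1/λ. The helper `sample_weights` and the λ values are illustrative, not part of scikit-learn.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_weights(lam, n=100_000):
    """Draw n weights from a zero-mean Gaussian with precision lam (variance 1 / lam)."""
    return rng.normal(loc=0.0, scale=np.sqrt(1.0 / lam), size=n)

# Illustrative lambda values: smaller precision should give more spread-out weights
for lam in (0.1, 1.0, 10.0):
    w = sample_weights(lam)
    print('lambda = {}: empirical variance = {:.3f}, 1/lambda = {:.3f}'.format(
        lam, w.var(), 1.0 / lam))
```

The empirical variance of each sample should track 1/λ, so the λ = 0.1 draw is far more spread out than the λ = 10 draw.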

# B. Hyperparameter priors
Both the α and λ hyperparameters have gamma distribution priors, meaning we assume both values come from a gamma probability distribution.

There's no need to know the specifics of a gamma distribution, other than the fact that it's a probability distribution over positive values defined by two parameters: a shape parameter and a scale parameter (or, equivalently, its inverse, the rate).

Specifically, the α hyperparameter has prior:

$$\Gamma(\alpha_1, \alpha_2)$$

and the λ hyperparameter has prior:

$$\Gamma(\lambda_1, \lambda_2)$$

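where Γ(k, θ) represents a gamma distribution with shape parameter k and second parameter θ. Note that scikit-learn documents this second parameter as the inverse scale (rate) rather than the scale.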
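To get a feel for a gamma prior, here is a minimal sketch that samples from one with NumPy. The `shape` and `rate` values are illustrative assumptions, not scikit-learn defaults. NumPy's sampler expects a scale, so the rate is inverted before sampling.

```python
import numpy as np

rng = np.random.default_rng(0)

shape = 2.0  # illustrative shape parameter
rate = 4.0   # illustrative inverse scale (rate) parameter

# rng.gamma takes (shape, scale), so convert the rate to a scale
samples = rng.gamma(shape, 1.0 / rate, size=200_000)

# For a gamma distribution, mean = shape / rate and variance = shape / rate**2
print('Empirical mean: {:.3f} (theory: {:.3f})'.format(samples.mean(), shape / rate))
print('Empirical variance: {:.3f} (theory: {:.3f})'.format(samples.var(), shape / rate**2))
print('All samples positive: {}'.format(bool((samples > 0).all())))
```

Because every draw is positive, a gamma distribution is a natural prior for precision values such as α and λ.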

# C. Tuning the model
When fitting a Bayesian ridge regression model to an input dataset, the α and λ hyperparameters are optimized jointly with the weights, based on their prior distributions and the input data.

This can all be done with the BayesianRidge object (part of the linear_model module). Like all the previous regression objects, this one can be initialized with no required arguments.

In [7]:
# predefined dataset from previous chapter
print('Data shape: {}\n'.format(data.shape))
print('Labels shape: {}\n'.format(labels.shape))

from sklearn import linear_model
reg = linear_model.BayesianRidge()
reg.fit(data, labels)
print('Coefficients: {}\n'.format(repr(reg.coef_)))
print('Intercept: {}\n'.format(reg.intercept_))
print('R2: {}\n'.format(reg.score(data, labels)))
print('Alpha: {}\n'.format(reg.alpha_))
print('Lambda: {}\n'.format(reg.lambda_))

Data shape: (150, 4)

Labels shape: (150,)

Coefficients: array([-0.11362625, -0.03526763,  0.24468776,  0.57300547])

Intercept: 0.16501980374055758

R2: 0.9303174820768509

Alpha: 20.975705701144673

Lambda: 9.53356207176252
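One way to interpret the fitted `alpha_` and `lambda_` values is through their ratio. In scikit-learn's parameterization, `alpha_` is the estimated noise precision and `lambda_` the estimated weight precision, so `lambda_ / alpha_` plays the role of the penalty in regular ridge regression. The sketch below checks this on synthetic stand-in data (an assumption, since the chapter's dataset is loaded from another notebook).

```python
import numpy as np
from sklearn import linear_model

# Synthetic stand-in data (an assumption; not the chapter's dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
true_w = np.array([1.5, -2.0, 0.0, 0.5])
y = X @ true_w + 0.3 + rng.normal(scale=0.5, size=150)

bayes = linear_model.BayesianRidge()
bayes.fit(X, y)

# Effective ridge penalty implied by the fitted precisions
penalty = bayes.lambda_ / bayes.alpha_
ridge = linear_model.Ridge(alpha=penalty)
ridge.fit(X, y)

print('Effective penalty: {}\n'.format(penalty))
print('BayesianRidge coefficients: {}\n'.format(repr(bayes.coef_)))
print('Ridge coefficients: {}\n'.format(repr(ridge.coef_)))
```

The two sets of coefficients should agree closely, which suggests that Bayesian ridge regression behaves like ridge regression with a penalty learned from the data.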



We can manually specify the $\alpha_1$ and $\alpha_2$ gamma parameters for α with the alpha_1 and alpha_2 keyword arguments when initializing BayesianRidge. Similarly, we can manually set $\lambda_1$ and $\lambda_2$ with the lambda_1 and lambda_2 keyword arguments. The default value for each of the four gamma parameters is $10^{-6}$.
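As a minimal sketch of these keyword arguments, the example below fits one model with the defaults and one with larger gamma parameters. The synthetic data and the value 1e-2 are illustrative assumptions, not recommendations.

```python
import numpy as np
from sklearn import linear_model

# Synthetic stand-in data (an assumption; not the chapter's dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4))
y = X @ np.array([1.5, -2.0, 0.0, 0.5]) + 0.3 + rng.normal(scale=0.5, size=150)

# Default gamma parameters (each 1e-6) versus manually chosen ones
reg_default = linear_model.BayesianRidge()
reg_custom = linear_model.BayesianRidge(
    alpha_1=1e-2, alpha_2=1e-2, lambda_1=1e-2, lambda_2=1e-2)

for name, reg in (('default', reg_default), ('custom', reg_custom)):
    reg.fit(X, y)
    print('{} priors'.format(name))
    print('Alpha: {}'.format(reg.alpha_))
    print('Lambda: {}\n'.format(reg.lambda_))
```

Comparing the fitted `alpha_` and `lambda_` values of the two models shows how much the choice of prior parameters influences the result on a given dataset.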