<img src="../logo.png" width=200 height=60>


## Optimizing $\alpha$ value for Ridge

Ridge regression is least squares with an L2 regularization penalty, whose strength is controlled by $\alpha$. We usually don't know in advance which value of $\alpha$ will give the best results, so we try several values and select the one with the lowest cross-validation error.

In [1]:
# Create made up data
import numpy as np

from sklearn.datasets import make_regression

X, y = make_regression(n_features=3, effective_rank=2, noise=10)
# effective_rank is the approximate number of singular vectors
# needed to explain most of the input data through linear
# combinations, so the generated features are correlated
# with each other.

# noise is the standard deviation of the Gaussian noise applied
# to the output
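
To get a feel for what `effective_rank` does, one option is to look at the singular values of the generated input matrix. The sketch below assumes a fixed `random_state` (not used in the cell above) so that it is reproducible, and uses the hypothetical names `X_demo` and `y_demo` to avoid overwriting the notebook's data. How quickly the values fall off depends on `effective_rank` and on the `tail_strength` parameter of `make_regression`.

```python
import numpy as np
from sklearn.datasets import make_regression

# random_state is an assumption added here for reproducibility
X_demo, y_demo = make_regression(
    n_features=3, effective_rank=2, noise=10, random_state=0
)

# Singular values of the input matrix, returned largest first
singular_values = np.linalg.svd(X_demo, compute_uv=False)
print(singular_values)
```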

In [2]:
alpha_grid = np.linspace(0.1, 1, 10)
alpha_grid

array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])

In [3]:
from sklearn.linear_model import RidgeCV

clf = RidgeCV(alphas=alpha_grid, store_cv_values=True)
clf.fit(X, y)
print("Best alpha: {}".format(clf.alpha_))
print("Costs: {}".format(clf.cv_values_[:2]))

Best alpha: 0.1
Costs: [[2.14472645e+01 1.07975748e+01 4.75876414e+00 1.58200543e+00
  2.24238378e-01 4.03349487e-02 6.19464371e-01 1.69378453e+00
  3.08537575e+00 4.67431768e+00]
 [8.05372323e+01 8.38260387e+01 8.54738373e+01 8.61833547e+01
  8.63413419e+01 8.61706111e+01 8.58041286e+01 8.53233981e+01
  8.47793690e+01 8.42043106e+01]]


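`RidgeCV` runs cross-validation internally for every value in the grid and selects the alpha with the lowest average cost. With the default settings (`cv=None`) it uses an efficient form of leave-one-out cross-validation, so `cv_values_` has one row per sample and one column per alpha. The two rows printed above are the squared errors of the first two samples for each of the ten alphas.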
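
The selection rule can be sketched by hand: run leave-one-out cross-validation for each alpha with a plain `Ridge`, average the squared errors, and take the alpha with the smallest average. This is only an illustration of the idea, not how `RidgeCV` computes it internally (it uses a faster closed-form route). The data is regenerated with an assumed `random_state`, and `X_demo`, `y_demo`, `mean_costs`, and `manual_best_alpha` are hypothetical names.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import LeaveOneOut, cross_val_score

# random_state is an assumption added here for reproducibility
X_demo, y_demo = make_regression(
    n_features=3, effective_rank=2, noise=10, random_state=0
)
alpha_grid = np.linspace(0.1, 1, 10)

# Average leave-one-out squared error for each candidate alpha
mean_costs = []
for alpha in alpha_grid:
    scores = cross_val_score(
        Ridge(alpha=alpha), X_demo, y_demo,
        cv=LeaveOneOut(), scoring="neg_mean_squared_error",
    )
    mean_costs.append(-scores.mean())  # flip the sign back to a cost
mean_costs = np.array(mean_costs)

# The alpha with the smallest average cost wins
manual_best_alpha = alpha_grid[np.argmin(mean_costs)]

# RidgeCV on the same data, for comparison
ridge_cv = RidgeCV(alphas=alpha_grid).fit(X_demo, y_demo)
print("Manual leave-one-out choice: {}".format(manual_best_alpha))
print("RidgeCV choice: {}".format(ridge_cv.alpha_))
```

The two choices would normally be expected to agree, although small numerical differences between the two routes are possible.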

## Scoring with Mean Absolute Error

In [4]:
from sklearn.metrics import mean_absolute_error, make_scorer
l1_error = make_scorer(mean_absolute_error, greater_is_better=False)

In [5]:
clf = RidgeCV(alphas=alpha_grid, store_cv_values=True, scoring=l1_error)
clf.fit(X, y)
print("Best alpha: {}".format(clf.alpha_))
print("Costs: {}".format(clf.cv_values_[:2]))

Best alpha: 0.2
Costs: [[-13.78000134 -12.43484854 -11.33034137 -10.40666017  -9.62242035
   -8.94804657  -8.36182161  -7.84742743  -7.39235843  -6.98686516]
 [ -1.08635116  -1.2677534   -1.35730348  -1.39559635  -1.40410149
   -1.39490997  -1.3751491   -1.34916383  -1.31966859  -1.28838798]]


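This time the best alpha is different (0.2 instead of 0.1), because the alphas are now ranked by mean absolute error (MAE) instead of the default mean squared error (MSE) used in the previous example. Note that the printed values are no longer costs: when `scoring` is set, `cv_values_` holds the leave-one-out predictions for each alpha rather than per-sample errors, which is why they are on a different scale and can be negative.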
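
To compare the two metrics on equal footing, one can compute both from the same leave-one-out predictions. The sketch below does this with `cross_val_predict` and a plain `Ridge` rather than through `RidgeCV`; the data is regenerated with an assumed `random_state`, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# random_state is an assumption added here for reproducibility
X_demo, y_demo = make_regression(
    n_features=3, effective_rank=2, noise=10, random_state=0
)
alpha_grid = np.linspace(0.1, 1, 10)

mse_per_alpha, mae_per_alpha = [], []
for alpha in alpha_grid:
    # One out-of-sample prediction per data point
    preds = cross_val_predict(
        Ridge(alpha=alpha), X_demo, y_demo, cv=LeaveOneOut()
    )
    residuals = y_demo - preds
    mse_per_alpha.append(np.mean(residuals ** 2))
    mae_per_alpha.append(np.mean(np.abs(residuals)))
mse_per_alpha = np.array(mse_per_alpha)
mae_per_alpha = np.array(mae_per_alpha)

print("Best alpha by MSE: {}".format(alpha_grid[np.argmin(mse_per_alpha)]))
print("Best alpha by MAE: {}".format(alpha_grid[np.argmin(mae_per_alpha)]))
```

The two metrics live on different scales (squared units versus the units of `y`), and they need not pick the same alpha, since MSE penalizes large residuals more heavily.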

The same technique for choosing $\alpha$ also applies to __Lasso__, through __LassoCV__.
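
A minimal sketch of the Lasso version, assuming the same alpha grid and 5-fold cross-validation (the `cv=5` choice and the `random_state` are assumptions, not taken from the notebook). Unlike `RidgeCV`, `LassoCV` uses k-fold cross-validation and, as far as I know, always ranks alphas by mean squared error, so the custom scorer from the previous section does not carry over.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# random_state is an assumption added here for reproducibility
X_demo, y_demo = make_regression(
    n_features=3, effective_rank=2, noise=10, random_state=0
)
alpha_grid = np.linspace(0.1, 1, 10)

# cv=5 is an assumed choice of folds
lasso = LassoCV(alphas=alpha_grid, cv=5)
lasso.fit(X_demo, y_demo)

print("Best alpha: {}".format(lasso.alpha_))
# mse_path_ has one row per alpha (in the order of lasso.alphas_)
# and one column per fold
print("Mean CV error per alpha: {}".format(lasso.mse_path_.mean(axis=1)))
```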

<p>&copy; 2018 Stacklabs</p>
