## 5.18.2 Plot Ridge coefficients as a function of the regularization

Shows the effect of collinearity on the coefficients of an estimator.

Ridge Regression is the estimator used in this example. Each color represents a different component of the coefficient vector, plotted as a function of the regularization parameter.

This example also shows the usefulness of applying Ridge regression to highly ill-conditioned matrices. For such matrices, a slight change in the target variable can cause huge variances in the calculated weights. In such cases, it is useful to set a certain regularization (alpha) to reduce this variation (noise). 

When alpha is very large, the regularization effect dominates the squared loss function and the coefficients tend to zero. At the end of the path, as alpha tends toward zero and the solution tends towards ordinary least squares, the coefficients exhibit big oscillations. In practice it is necessary to tune alpha so that a balance is maintained between the two.
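To make the ill-conditioning concrete, the following sketch computes the condition number of the same 10x10 Hilbert matrix and compares the size of the ridge coefficients for a small and a large alpha. The helper `ridge_svd` is a hypothetical name introduced here; it assumes the standard closed-form ridge solution without intercept, written through the SVD, and is not the solver that `linear_model.Ridge` necessarily uses internally.

```python
import numpy as np

# Same 10x10 Hilbert matrix and target as in the example
X = 1.0 / (np.arange(1, 11) + np.arange(0, 10)[:, np.newaxis])
y = np.ones(10)

# A large condition number indicates an ill-conditioned matrix
cond = np.linalg.cond(X)


def ridge_svd(X, y, alpha):
    """Hypothetical helper: ridge solution without intercept,
    w = V diag(s / (s^2 + alpha)) U^T y, minimizing
    ||y - Xw||^2 + alpha * ||w||^2."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ ((s / (s ** 2 + alpha)) * (U.T @ y))


w_small = ridge_svd(X, y, alpha=1e-10)  # close to ordinary least squares
w_large = ridge_svd(X, y, alpha=1e-2)   # strongly regularized

norm_small = np.linalg.norm(w_small)
norm_large = np.linalg.norm(w_large)

print("condition number: {:.2e}".format(cond))
print("||w|| at alpha=1e-10: {:.2f}".format(norm_small))
print("||w|| at alpha=1e-2:  {:.2f}".format(norm_large))
```

The coefficient norm shrinks as alpha grows, which is the behavior the plotted regularization path shows component by component.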


In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
from sklearn import linear_model

# X is the 10x10 Hilbert matrix
X = 1./(np.arange(1,11) + np.arange(0,10)[:,np.newaxis])
y = np.ones(10)

n_alphas = 200
alphas = np.logspace(-10, -2, n_alphas)

coefs = []
ridge = linear_model.Ridge(fit_intercept=False)
for a in alphas:
    ridge.alpha = a
    ridge.fit(X,y)
    coefs.append(ridge.coef_)

# Display results
ax = plt.gca()
ax.plot(alphas,coefs)
ax.set_xscale('log')
ax.set_xlim(ax.get_xlim()[::-1]) # reverse axis
plt.xlabel('alpha')
plt.ylabel('weights')
plt.title('Ridge coefficients as a function of the regularization') 
plt.axis('tight')
plt.show()

## 5.18.13 Linear Regression Example

This example uses only a single feature of the diabetes dataset (column 2 in the code), so that the regression can be shown as a two-dimensional plot. The straight line in the plot shows how linear regression attempts to draw the line that best minimizes the residual sum of squares between the observed responses in the dataset and the responses predicted by the linear approximation.

The coefficients, the mean squared error and the variance score (the coefficient of determination, $R^2$) are also calculated.
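As a rough illustration of what minimizing the residual sum of squares means, the sketch below fits a line with `np.linalg.lstsq` on a tiny made-up dataset and computes the mean squared error and $R^2$ by hand. The arrays `x_demo` and `y_demo` are invented for illustration and are not part of the diabetes data.

```python
import numpy as np

# Made-up toy data, roughly following y = 2x + 1 (illustration only)
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Design matrix with a column of ones so the fit includes an intercept
A = np.column_stack([x_demo, np.ones_like(x_demo)])

# Least-squares solution: minimizes the residual sum of squares ||A w - y||^2
(slope, intercept), *_ = np.linalg.lstsq(A, y_demo, rcond=None)

# Residual and total sums of squares for the fitted line
y_fit = slope * x_demo + intercept
ss_res = np.sum((y_demo - y_fit) ** 2)
ss_tot = np.sum((y_demo - y_demo.mean()) ** 2)

mse = ss_res / len(y_demo)   # mean squared error
r2 = 1.0 - ss_res / ss_tot   # coefficient of determination

print("slope: {:.2f}, intercept: {:.2f}".format(slope, intercept))
print("Mean squared error: {:.4f}".format(mse))
print("R^2: {:.4f}".format(r2))
```

On the same inputs, `mean_squared_error` and `r2_score` from `sklearn.metrics` compute these two quantities.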

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
from sklearn import datasets, linear_model
from sklearn.metrics import mean_squared_error, r2_score

# load the diabetes dataset
diabetes = datasets.load_diabetes()

# Use only one feature
diabetes_X = diabetes.data[:,np.newaxis,2]
print(diabetes_X.shape)

# Split the data into training/testing sets
diabetes_X_train = diabetes_X[:-20]
diabetes_X_test = diabetes_X[-20:]
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]

# Create linear regression object 
regr = linear_model.LinearRegression()

# Train the model
regr.fit(diabetes_X_train, diabetes_y_train)

# Make predictions using the test set
diabetes_y_pred = regr.predict(diabetes_X_test)

# The coefficient
print('Coefficients: \n', regr.coef_)

# The mean squared error
print("Mean squared error: {:.2f}".format(
                mean_squared_error(diabetes_y_test,diabetes_y_pred)))

# Coefficient of determination (R^2): 1 is perfect prediction
print("Variance score: {:.2f}".format(
                r2_score(diabetes_y_test,diabetes_y_pred)))

# Plot output
plt.figure(figsize=(10,6))
plt.scatter(diabetes_X_test, diabetes_y_test, color='black')
plt.plot(diabetes_X_test, diabetes_y_pred, color='blue',linewidth=3)
plt.xticks(())
plt.yticks(())
plt.show()

## 5.18.23 Lasso and Elastic Net for Sparse Signals

Estimates Lasso and Elastic-Net regression models on a manually generated sparse signal corrupted with additive noise. Estimated coefficients are compared with the ground truth.
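For orientation, the two penalties can be written out directly. The sketch below evaluates the objective functions as given in the scikit-learn documentation for `Lasso` and `ElasticNet`; the function names and the small demo arrays are hypothetical and exist only for this illustration.

```python
import numpy as np


def lasso_objective(X, y, w, alpha):
    """(1 / (2 * n_samples)) * ||y - Xw||^2 + alpha * ||w||_1"""
    n_samples = X.shape[0]
    residual = y - X @ w
    return residual @ residual / (2 * n_samples) + alpha * np.abs(w).sum()


def enet_objective(X, y, w, alpha, l1_ratio):
    """Squared loss plus a mix of L1 and L2 penalties set by l1_ratio."""
    n_samples = X.shape[0]
    residual = y - X @ w
    l1 = alpha * l1_ratio * np.abs(w).sum()
    l2 = 0.5 * alpha * (1 - l1_ratio) * (w @ w)
    return residual @ residual / (2 * n_samples) + l1 + l2


# Made-up demo data: a noise-free signal with a sparse coefficient vector
rng = np.random.RandomState(0)
X_demo = rng.randn(20, 5)
w_demo = np.array([1.0, -0.5, 0.0, 0.0, 0.25])
y_demo = X_demo @ w_demo

lasso_value = lasso_objective(X_demo, y_demo, w_demo, alpha=0.1)
enet_value = enet_objective(X_demo, y_demo, w_demo, alpha=0.1, l1_ratio=0.7)
# With l1_ratio=1.0 the L2 term vanishes and the two objectives coincide
enet_as_lasso = enet_objective(X_demo, y_demo, w_demo, alpha=0.1, l1_ratio=1.0)

print(lasso_value, enet_value, enet_as_lasso)
```

The L1 term is what drives coefficients exactly to zero, while the added L2 term in Elastic-Net shrinks the remaining ones; `l1_ratio` sets the balance between the two.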

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
from sklearn.metrics import r2_score

# Generate some sparse data to play with 
np.random.seed(42)
n_samples, n_features = 50, 100
X = np.random.randn(n_samples,n_features)

# Decreasing coef w. alternated signs for visualization
idx = np.arange(n_features)
coef = (-1)** idx * np.exp(-idx/10)
coef[10:] = 0 # sparsify coef
y = np.dot(X, coef)

# Add noise
y += 0.01 * np.random.normal(size=n_samples)

n_samples = X.shape[0]
X_train, y_train = X[:n_samples // 2], y[:n_samples // 2]
X_test, y_test = X[n_samples // 2:], y[n_samples // 2:]

# Lasso
from sklearn.linear_model import Lasso

alpha = 0.1
lasso = Lasso(alpha=alpha)

y_pred_lasso = lasso.fit(X_train, y_train).predict(X_test)
r2_score_lasso = r2_score(y_test, y_pred_lasso)
print(lasso)
print("r^2 on test data : %f" % r2_score_lasso)

# ElasticNet
from sklearn.linear_model import ElasticNet

enet = ElasticNet(alpha=alpha, l1_ratio=0.7)
y_pred_enet = enet.fit(X_train, y_train).predict(X_test)
r2_score_enet = r2_score(y_test, y_pred_enet)
print(enet)
print("r^2 on test data : %f" % r2_score_enet)

m, s, _ = plt.stem(np.where(enet.coef_)[0],
                   enet.coef_[enet.coef_ != 0],
                   markerfmt='x',
                   label='Elastic net coefficients')
plt.setp([m, s], color='#2ca02c')
m, s, _ = plt.stem(np.where(lasso.coef_)[0],
                   lasso.coef_[lasso.coef_ != 0],
                   markerfmt='x',
                   label='Lasso coefficients')
plt.setp([m, s], color='#ff7f0e')
plt.stem(np.where(coef)[0],
         coef[coef != 0],
         markerfmt='bx',
         label='True coefficients')
plt.legend(loc='best')
plt.title("Lasso $R^2$: %.3f, Elastic Net $R^2$: %.3f" 
          % (r2_score_lasso, r2_score_enet))
plt.show()