<img src="../Pics/MLSb-T.png" width="160">
<br><br>
<center><u><H1>Lasso, Ridge and Elastic Net</H1></u></center>

In [1]:
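# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import requires an older scikit-learn release.
from sklearn.datasets import load_boston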
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

In [2]:
boston = load_boston()
X = boston.data
y = boston.target
lr = LinearRegression()

In [3]:
print(boston.DESCR)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
      

In [None]:
print(boston.feature_names)

In [None]:
# Build a DataFrame with named feature columns plus the target
df = pd.DataFrame(boston.data, columns=boston.feature_names)
df['target'] = boston.target
df

In [None]:
# Fit on the full dataset and predict on the same rows (no train/test split)
lr.fit(X, y)
pred = lr.predict(X)

In [None]:
# plotting predictions vs actual price
plt.scatter(pred, y)
plt.xlabel('Predicted price')
plt.ylabel('Actual price')
plt.show()

In [None]:
from sklearn.metrics import r2_score
r2 = r2_score(y, lr.predict(X))
print("R2 on training data: {:.2}".format(r2))
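
The score above is measured on the same rows the model was fit on, so it says little about how the model behaves on unseen data. A minimal sketch of a held-out evaluation follows. It uses synthetic data from `make_regression` as a stand-in (an assumption, so that the sketch runs on its own), and the `_demo` names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; not the Boston data used above)
X_demo, y_demo = make_regression(
    n_samples=500, n_features=13, noise=20.0, random_state=0
)

# Hold out 25% of the rows so the score is measured on unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

lr_demo = LinearRegression().fit(X_train, y_train)
r2_train = r2_score(y_train, lr_demo.predict(X_train))
r2_test = r2_score(y_test, lr_demo.predict(X_test))
print("R2 on training data: {:.2f}".format(r2_train))
print("R2 on test data: {:.2f}".format(r2_test))
```

The same split could be applied to `X` and `y` from this notebook; the training score is usually the more optimistic of the two.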

In [None]:
from sklearn.metrics import mean_squared_error

In [None]:
mse = mean_squared_error(y, lr.predict(X))
# root mean square error(RMSE):
rmse = np.sqrt(mse)
print('RMSE on training data: {:.3}'.format(rmse))

IMPORTANT NOTE: 
RMSE corresponds approximately to an estimate of the standard deviation of the prediction errors. Since most data lie within two standard deviations of the mean, doubling the RMSE gives a rough confidence interval: in this case we can expect the estimated price to differ from the real price by no more than about USD 9,300 for most houses.
This is only completely valid if the errors are normally distributed, but it is often roughly correct even when they are not.
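
The "double the RMSE" rule can be checked numerically. The sketch below uses hypothetical prices and normally distributed errors generated with NumPy (illustrative values, not this notebook's results) and counts how many errors fall within twice the RMSE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical actual prices and predictions (illustrative only)
y_true = rng.uniform(10.0, 50.0, size=1000)
y_hat = y_true + rng.normal(0.0, 4.0, size=1000)  # errors with std 4

errors = y_true - y_hat
rmse_demo = np.sqrt(np.mean(errors ** 2))

# Fraction of predictions that fall within 2 * RMSE of the true value
within = np.mean(np.abs(errors) <= 2 * rmse_demo)
print("RMSE: {:.2f}".format(rmse_demo))
print("Share of errors within 2*RMSE: {:.1%}".format(within))
```

For normally distributed errors the share should be close to 95%; with heavier-tailed errors it can differ, which is why the interval is only a rough guide.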

## L1 and L2 penalties
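
For reference, the objectives minimised by the three estimators used below, as stated in the scikit-learn documentation. Here $n$ is the number of samples, $w$ the coefficient vector, $\alpha$ the `alpha` parameter and $\rho$ the `l1_ratio` parameter of `ElasticNet` (default 0.5). The intercept is not penalised.

$$
\text{Ridge (L2):}\quad \min_w \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2
$$

$$
\text{Lasso (L1):}\quad \min_w \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
$$

$$
\text{Elastic Net:}\quad \min_w \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
$$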

In [None]:
from sklearn.linear_model import ElasticNet, Lasso, Ridge

In [None]:
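# ElasticNet: alpha sets the overall penalty strength; l1_ratio (default 0.5)
# sets the mix between the L1 and L2 penalties
e_net = ElasticNet(alpha=0.5)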

In [None]:
e_net.fit(X,y)
pred_enet = e_net.predict(X)

In [None]:
r2 = r2_score(y, pred_enet)
print("R2 on training data: {:.2}".format(r2))

In [None]:
mse = mean_squared_error(y, pred_enet)
# root mean square error(RMSE):
rmse = np.sqrt(mse)
print('RMSE on training data: {:.3}'.format(rmse))

In [None]:
#Lasso
lasso = Lasso()
lasso.fit(X,y)
pred_lasso = lasso.predict(X)
r2 = r2_score(y, pred_lasso)
print("R2 on training data: {:.2}".format(r2))

In [None]:
#Ridge
ridge = Ridge()
ridge.fit(X,y)
pred_ridge = ridge.predict(X)
r2 = r2_score(y, pred_ridge)
print("R2 on training data: {:.2}".format(r2))
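
R2 alone does not show how the three penalties differ. One practical difference is that an L1 penalty can set coefficients exactly to zero, while an L2 penalty only shrinks them. The sketch below illustrates this on synthetic data from `make_regression` (an assumption, chosen so that only 3 of 10 features matter). The `alpha=5.0` value and the `_demo` names are illustrative, not tuned.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Synthetic data: 10 features, only 3 of which influence the target
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

models = {
    "Ridge": Ridge(alpha=5.0),
    "Lasso": Lasso(alpha=5.0),
    "ElasticNet": ElasticNet(alpha=5.0, l1_ratio=0.5),
}

# Count how many coefficients each model sets exactly to zero
zero_counts = {}
for name, model in models.items():
    model.fit(X_demo, y_demo)
    zero_counts[name] = int(np.sum(model.coef_ == 0))
    print("{}: {} of {} coefficients are exactly zero".format(
        name, zero_counts[name], model.coef_.size))
```

Running the same loop on `X` and `y` from this notebook and inspecting `coef_` would show which of the Boston features each model keeps.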

## References:

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.ElasticNet.html

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html