# Lesson 1 - Linear Regression - Normalization
In today's lesson we will explore new kinds of linear regression, in the form of Lasso and Ridge Regression.
Your task will include:
- creating a train/test split of the Boston Dataset
- implementing Lasso Regression
- implementing Ridge Regression
- comparing results

As you can see, today's lesson uses Jupyter, a Python package that lets us write interactive notebooks divided into cells. Each cell can be executed independently; however, remember that cells share state, so executing a cell continues from whatever code you have already run.

## Task 1 - Train/Test splits of the Dataset
It is common practice to divide the data you are working with into different sections (called splits).
This is particularly necessary when working with Deep Learning methods, since they are prone to overfitting.
Overfitting means that a model learns one set of data especially well but, in return, loses the ability to generalize, so it performs worse on data it has never seen.

Splitting the dataset in two will allow us to train a model on the train_split and test its performance on the test_split.
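
As a rough illustration of why the test split matters, the sketch below fits an ordinary linear regression on a small synthetic dataset with many features. The data comes from `make_regression` and is only a stand-in (an assumption, not the Boston Dataset); the variable names are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): few samples and many features
# make the gap between train and test performance easy to see.
X_demo, y_demo = make_regression(
    n_samples=80, n_features=50, n_informative=5, noise=25.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_tr, y_tr)

# Score the same fitted model on the data it saw and on the data it did not.
train_r2 = r2_score(y_tr, model.predict(X_tr))
test_r2 = r2_score(y_te, model.predict(X_te))
print("train R2:", train_r2)
print("test R2:", test_r2)
```

A train score that is clearly higher than the test score is the usual symptom of overfitting.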

### Tips for this task
- We have already seen how to load the Boston Dataset and how to convert it to a NumPy matrix
- SKLearn has a built-in function to split datasets ;)

In [18]:
# imports for general purposes
import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')

In [19]:
## Import the boston dataset, create X and Y, the numpy arrays of data and target
# remember to scale the data!

from sklearn.preprocessing import StandardScaler
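# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import needs an older scikit-learn version.
from sklearn.datasets import load_boston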

X, y = load_boston(return_X_y=True)

# Standardizing data
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X)

In [20]:
## generate X_train, X_test, y_train, y_test, the numpy arrays containing the split data and target
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

(455, 13) (455,)
(51, 13) (51,)
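
The cells above fit the scaler on the whole dataset before splitting, and call `train_test_split` without a `random_state`, so the scores change on every run. One possible variation is sketched below, using random stand-in data with the same shape as the Boston features (an assumption): it fixes the seed and fits the scaler on the training rows only, so that no statistics from the test rows reach the model.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Random stand-in data with the Boston shape (assumption, not the real data).
rng = np.random.default_rng(0)
X_raw = rng.normal(loc=5.0, scale=3.0, size=(506, 13))
y_raw = rng.normal(size=506)

# A fixed random_state makes the split, and hence the scores, reproducible.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_raw, y_raw, test_size=0.1, random_state=42
)

# Fit the scaler on the training rows only, then apply it to both splits.
demo_scaler = StandardScaler().fit(X_tr)
X_tr_scaled = demo_scaler.transform(X_tr)
X_te_scaled = demo_scaler.transform(X_te)

print(X_tr_scaled.shape, X_te_scaled.shape)
```

With this ordering the training columns have zero mean and unit variance, while the test columns are only approximately standardized, which is expected.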


## Task 2 - Implement Ridge Regression, Lasso Regression, Least Squares Regression
In this section you are expected to implement the 3 kinds of linear regression seen in class: Lasso, Ridge and Least Squares Linear Regression.
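
For reference, the three objectives can be sketched as below. The scaling follows what scikit-learn documents for `LinearRegression`, `Ridge` and `Lasso`; the notation used in class may scale the terms differently, so take this as an assumption rather than the course's definition. Here $\theta$ is the weight vector, $n$ the number of training samples, $\alpha$ the regularization strength (1 by default in scikit-learn), and the intercept is omitted for brevity.

```latex
\begin{aligned}
\text{Least Squares:}\quad & \min_{\theta}\ \lVert y - X\theta \rVert_2^2 \\
\text{Ridge:}\quad & \min_{\theta}\ \lVert y - X\theta \rVert_2^2 + \alpha \lVert \theta \rVert_2^2 \\
\text{Lasso:}\quad & \min_{\theta}\ \frac{1}{2n} \lVert y - X\theta \rVert_2^2 + \alpha \lVert \theta \rVert_1
\end{aligned}
```

The only difference between the three is the penalty on $\theta$: none, squared L2 norm, or L1 norm.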

Additionally, we want to compare performance on the train split and on the test split.
In particular, we want to train our models on the train split and then measure their performance on both splits, to see how well each model generalizes.

We want to compare different factors of the 3 methods:
- Accuracy, using MSE and R2
- Time taken to get the solution
- The weight matrices (thetas)

### Tips for this task(s)
- We have seen how to calculate the time for regression, and the different scores
- SKLearn has all the functions you are looking for ;)

#### Task 2.1 - Regression and Results
In this first part, you are tasked with implementing the 3 different regressions and calculating the 3 scores seen in class (R2, MSE, MAE).
- You must train the regressors on the train split and calculate the results on the test split.
- Which model is better?
- Additionally, calculate the time it takes each of the 3 methods to compute the weight matrix. Which one is faster? Can you explain why?

In [22]:
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.metrics import r2_score, mean_squared_error
# compare results between regression methods

# Standard Linear Regression
reg_lin = LinearRegression()
reg_lin.fit(X_train,y_train)
pred_lin = reg_lin.predict(X_test)
print("R2 score Reg-Lin", r2_score(y_test, pred_lin))
print("MSE Reg-Lin", mean_squared_error(y_test, pred_lin))


# Lasso Regression
reg_las = Lasso()
reg_las.fit(X_train,y_train)
pred_las = reg_las.predict(X_test)
print("R2 score Reg-Las", r2_score(y_test, pred_las))
print("MSE Reg-Las", mean_squared_error(y_test, pred_las))

# Ridge Regression
reg_rid = Ridge()
reg_rid.fit(X_train,y_train)
pred_rid = reg_rid.predict(X_test)
print("R2 score Reg-Rid", r2_score(y_test, pred_rid))
print("MSE Reg-Rid", mean_squared_error(y_test, pred_rid))

R2 score Reg-Lin 0.7596438759923646
MSE Reg-Lin 27.836948462205537
R2 score Reg-Las 0.7136077540750553
MSE Reg-Las 33.16864183387616
R2 score Reg-Rid 0.7598825906763125
MSE Reg-Rid 27.80930161783383


#### Task 2.2
In this subtask we want to compare the different weights that each model produces.
You can use the weights calculated in the previous section to make your calculation, or you can re-calculate them.
- Look at the weight matrices: are there features whose weights tend to have a low absolute value?
- Try running the regression on the same data with those features removed. Compare the results.
- Use the whole dataset without splitting, to reduce randomness

In [23]:
# Compare the weight matrices
print('Linear Matrix')
print(reg_lin.coef_)
print('\n\nLasso Matrix')
print(reg_las.coef_)
print('\n\nRidge Matrix')
print(reg_rid.coef_)
print('\n\n')

Linear Matrix
[-0.87805578  1.12222753  0.23821264  0.68315784 -2.125061    2.8022863
  0.0221266  -3.00390348  2.80007274 -2.1694357  -1.97367975  0.89755551
 -3.72367369]


Lasso Matrix
[-0.          0.         -0.          0.06926065 -0.          2.7530966
 -0.         -0.         -0.         -0.         -1.23066838  0.19489447
 -3.51999848]


Ridge Matrix
[-0.86933063  1.10692859  0.21283102  0.68738644 -2.09430916  2.81139193
  0.01380454 -2.97670886  2.71683151 -2.09244825 -1.9642439   0.89737787
 -3.70804342]
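
Printed one after another, the three weight arrays are hard to compare by eye. One way to line them up is a small pandas table, sketched below on synthetic stand-in data (an assumption; only 4 of the 13 features are informative by construction). The column names are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Lasso, Ridge

# Synthetic stand-in (assumption): 13 features, 4 of them informative.
X_demo, y_demo = make_regression(
    n_samples=506, n_features=13, n_informative=4, noise=5.0, random_state=0
)

# One column per model, one row per feature.
coefs = pd.DataFrame({
    "linear": LinearRegression().fit(X_demo, y_demo).coef_,
    "lasso": Lasso().fit(X_demo, y_demo).coef_,
    "ridge": Ridge().fit(X_demo, y_demo).coef_,
})
coefs.index.name = "feature"
print(coefs.round(3))

# Indices of the features whose Lasso weight is exactly zero.
zeroed = np.flatnonzero(coefs["lasso"].to_numpy() == 0)
print("zeroed by Lasso:", zeroed)
```

In the output above, Ridge stays close to the plain linear weights while Lasso sets several of them to exactly zero; the table makes the same pattern visible row by row.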





In [25]:
## Remove from the entire dataset the columns that Lasso sets to 0

X_train_slim = X_train[:,reg_las.coef_ != 0]
X_test_slim = X_test[:,reg_las.coef_ != 0]


# Compare Lasso regression on the reduced data with Lasso regression on the original data.
# The column mask above was computed before reg_las is refit, so it is not affected below.

# Lasso on the reduced data (first R2/MSE pair printed)
reg_las = Lasso()
reg_las.fit(X_train_slim,y_train)
pred_las = reg_las.predict(X_test_slim)
print("R2 score Reg-Las", r2_score(y_test, pred_las))
print("MSE Reg-Las", mean_squared_error(y_test, pred_las))

# Lasso on the original data (second R2/MSE pair printed)
reg_las = Lasso()
reg_las.fit(X_train,y_train)
pred_las = reg_las.predict(X_test)
print("R2 score Reg-Las", r2_score(y_test, pred_las))
print("MSE Reg-Las", mean_squared_error(y_test, pred_las))


R2 score Reg-Las 0.7136084936400932
MSE Reg-Las 33.168556180831416
R2 score Reg-Las 0.7136077540750553
MSE Reg-Las 33.16864183387616
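
The task also suggests using the whole dataset without splitting, to reduce randomness. A minimal sketch of that variant is shown below on synthetic stand-in data (an assumption, not the Boston Dataset): Lasso is fitted once on all rows, the zero-weight columns are dropped, and Lasso is refitted on the reduced matrix.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score

# Synthetic stand-in (assumption): 13 features, 4 of them informative.
X_demo, y_demo = make_regression(
    n_samples=506, n_features=13, n_informative=4, noise=5.0, random_state=0
)

# Fit on every row, then keep only the columns with a non-zero weight.
lasso_full = Lasso().fit(X_demo, y_demo)
keep = lasso_full.coef_ != 0
X_slim = X_demo[:, keep]
lasso_slim = Lasso().fit(X_slim, y_demo)

# Both scores are computed on the data used for fitting.
r2_full = r2_score(y_demo, lasso_full.predict(X_demo))
r2_slim = r2_score(y_demo, lasso_slim.predict(X_slim))
print("kept features:", int(keep.sum()), "of", keep.size)
print("R2 full:", r2_full)
print("R2 slim:", r2_slim)
```

The two scores should agree almost exactly, just as the two results above do: a column whose weight is already zero contributes nothing to the prediction, so removing it leaves the Lasso solution essentially unchanged. Note that without a test split these scores say nothing about generalization.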
