## Building the Linear Model
We'll start with the final version of the Ames Housing dataset that we prepared in the feature engineering sections. The goal is to build a linear regression model, tune its regularized Elastic Net variant with a grid search, and then evaluate each model's performance on a test set.

In [1]:
#import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
#read the final csv file
df = pd.read_csv("../DATA/AMES_Final_DF.csv")

In [3]:
df.head()

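| Unnamed: 0.1 | Unnamed: 0 | Lot Frontage | Lot Area | Overall Qual | Overall Cond | Year Built | Year Remod/Add | Mas Vnr Area | BsmtFin SF 1 | BsmtFin SF 2 | ... | Sale Type_ConLw | Sale Type_New | Sale Type_Oth | Sale Type_VWD | Sale Type_WD | Sale Condition_AdjLand | Sale Condition_Alloca | Sale Condition_Family | Sale Condition_Normal | Sale Condition_Partial |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 141.0 | 31770 | 6 | 5 | 1960 | 1960 | 112.0 | 639.0 | 0.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 1 | 1 | 80.0 | 11622 | 5 | 6 | 1961 | 1961 | 0.0 | 468.0 | 144.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 2 | 2 | 81.0 | 14267 | 6 | 6 | 1958 | 1958 | 108.0 | 923.0 | 0.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 3 | 3 | 93.0 | 11160 | 7 | 5 | 1968 | 1968 | 0.0 | 1065.0 | 0.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 4 | 4 | 74.0 | 13830 | 5 | 5 | 1997 | 1998 | 0.0 | 791.0 | 0.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |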


**The label we are trying to predict is the SalePrice column. Separate out the data into X features and y labels.**

In [4]:
X = df.drop('SalePrice',axis=1)
y = df['SalePrice']

**Use scikit-learn to split up X and y into a training set and test set. Since we will later be using a Grid Search strategy, set your test proportion to 10%.**

In [5]:
from sklearn.model_selection import train_test_split

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.10, random_state=101)
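
A small hold-out set is workable here because the grid search used later validates on folds of the training data, so no separate validation split is needed. As a quick sanity check on the proportions, the sketch below runs the same call on a synthetic stand-in; the arrays `X_demo` and `y_demo` and their 1,000-row size are made up for illustration and are not the Ames data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the features and labels (hypothetical data)
rng = np.random.default_rng(101)
X_demo = rng.normal(size=(1000, 5))
y_demo = rng.normal(size=1000)

# Same arguments as the notebook cell: 10% held out, fixed seed
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.10, random_state=101
)

print(X_tr.shape, X_te.shape)
```

Fixing `random_state` makes the split reproducible, which keeps the reported metrics comparable between runs.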

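The dataset's features have a variety of scales and units. For optimal regression performance, we scale the X features. Scaling matters most for regularized models such as Elastic Net, whose penalties depend on the magnitude of the coefficients.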

In [7]:
from sklearn.preprocessing import StandardScaler

In [8]:
#create the standard scaler 
scaler = StandardScaler()

In [9]:
scaler.fit(X_train)

In [10]:
#the scaler was fit on the training data only
#apply that same transformation to both the training and testing data
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
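
To make the "fit on train, transform both" step concrete, here is a minimal sketch on tiny made-up arrays (`X_train_demo` and `X_test_demo` are hypothetical, not Ames rows). It also checks the transform against a manual calculation using the training mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Tiny made-up feature matrices (hypothetical values, not Ames data)
X_train_demo = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test_demo = np.array([[4.0, 400.0]])

scaler_demo = StandardScaler()
scaler_demo.fit(X_train_demo)  # learns mean and std from the training rows only

train_scaled = scaler_demo.transform(X_train_demo)
test_scaled = scaler_demo.transform(X_test_demo)

# Manual equivalent: subtract the training mean, divide by the training std
mu = X_train_demo.mean(axis=0)
sigma = X_train_demo.std(axis=0)  # population std (ddof=0), as StandardScaler uses
manual_test = (X_test_demo - mu) / sigma

print(train_scaled.mean(axis=0))  # approximately 0 for each column
print(test_scaled, manual_test)
```

Because the test rows are scaled with training statistics, no information from the test set influences the preprocessing.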

**Let's try the Linear Regression Model and see how it performs.**

In [11]:
from sklearn.linear_model import LinearRegression

In [13]:
#create the model
model = LinearRegression()

In [15]:
#fit the model on training data
model.fit(X_train,y_train)

In [16]:
#evaluate the model on test data
predictions = model.predict(X_test)

In [17]:
from sklearn.metrics import mean_absolute_error,mean_squared_error

In [20]:
#calculate the evaluation metrics
MAE = mean_absolute_error(y_test,predictions)
MSE = mean_squared_error(y_test,predictions)
RMSE = np.sqrt(MSE)

In [21]:
MAE

14591.398716877513

In [22]:
RMSE

20855.793910415614

The linear regression model achieved an MAE of $\$$14591.40 and an RMSE of $\$$20855.79.
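
For readers who want to see what these two metrics compute, the following sketch works them out by hand on four made-up prices (`y_true` and `y_hat` are hypothetical values chosen only for illustration).

```python
import math

# Made-up true prices and predictions (hypothetical, for illustration only)
y_true = [200000.0, 150000.0, 300000.0, 120000.0]
y_hat = [210000.0, 140000.0, 270000.0, 125000.0]

errors = [t - p for t, p in zip(y_true, y_hat)]

# MAE: average absolute error, in the same units as the target (dollars)
mae = sum(abs(e) for e in errors) / len(errors)

# MSE: average squared error; RMSE: its square root, back in dollars
mse = sum(e ** 2 for e in errors) / len(errors)
rmse = math.sqrt(mse)

print(mae, rmse)
```

RMSE is never smaller than MAE, and the gap widens when a few predictions miss by a lot, since squaring weights large errors more heavily.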

From the linear family, let's explore the Elastic Net model.

In [23]:
from sklearn.linear_model import ElasticNet

In [34]:
base_elastic_net = ElasticNet()

The Elastic Net model has two main hyperparameters: alpha, the overall regularization strength, and l1_ratio, the mix between the L1 and L2 penalties. Let's create a dictionary parameter grid of values for the ElasticNet.

In [39]:
param_grid = {'alpha':[0.1,1,5,10,50,100],
             'l1_ratio':[.1,.5,.7,.9,.95,.99,1]}
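
For reference, the objective that scikit-learn documents for `ElasticNet` shows how the two grid parameters interact. In the expression below, $\alpha$ is `alpha`, $\rho$ is `l1_ratio`, $n$ is the number of training samples, and $w$ is the coefficient vector; the symbol names are chosen here for illustration.

$$
\min_{w}\; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2 \;+\; \alpha\,\rho\,\lVert w \rVert_1 \;+\; \frac{\alpha\,(1-\rho)}{2}\,\lVert w \rVert_2^2
$$

With $\rho = 1$ the L2 term drops out and the model reduces to Lasso; as $\rho$ approaches 0 the penalty becomes Ridge-like.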

Using scikit-learn, create a GridSearchCV object and run a grid search for the best parameters for your model on the scaled training data.

In [36]:
from sklearn.model_selection import GridSearchCV

In [41]:
#create our grid model
grid_model = GridSearchCV(estimator=base_elastic_net,
                          param_grid=param_grid,
                          scoring='neg_mean_squared_error',cv=5,verbose=2)
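
One possible refinement, offered as a sketch rather than as the approach this notebook takes: wrapping the scaler and the model in a pipeline lets each cross-validation fold fit its own scaler, so validation rows never influence the scaling statistics. The synthetic data, the reduced grid, and the `max_iter=10_000` setting below are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data (hypothetical, not Ames)
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, noise=10.0, random_state=101
)

# Scaler and model in one estimator, so each CV fold fits its own scaler.
# The larger max_iter is an assumption meant to give the solver more room.
pipe = make_pipeline(StandardScaler(), ElasticNet(max_iter=10_000))

# Parameter names are prefixed with the lowercased step name
small_grid = {
    "elasticnet__alpha": [0.1, 1, 10],
    "elasticnet__l1_ratio": [0.1, 0.5, 1],
}

grid_demo = GridSearchCV(
    estimator=pipe,
    param_grid=small_grid,
    scoring="neg_mean_squared_error",
    cv=5,
)
grid_demo.fit(X_demo, y_demo)

# best_score_ is a negated MSE, so flip the sign before taking the root
cv_rmse = np.sqrt(-grid_demo.best_score_)
print(grid_demo.best_params_, cv_rmse)
```

With `scoring='neg_mean_squared_error'`, higher (closer to zero) scores are better, which is why the MSE is negated.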

In [42]:
#fit the model
grid_model.fit(X_train,y_train)

Fitting 5 folds for each of 42 candidates, totalling 210 fits
[CV] END ............................alpha=0.1, l1_ratio=0.1; total time=   0.7s
[CV] END ............................alpha=0.1, l1_ratio=0.1; total time=   0.8s
[CV] END ............................alpha=0.1, l1_ratio=0.1; total time=   0.7s
[CV] END ............................alpha=0.1, l1_ratio=0.1; total time=   0.7s
[CV] END ............................alpha=0.1, l1_ratio=0.1; total time=   0.8s
...
[CV] END .............................alpha=50, l1_ratio=0.5; total time=   0.0s
[remaining verbose output truncated; many fits with l1_ratio near 1 were interleaved with scikit-learn warnings raised from cd_fast.enet_coordinate_descent and took roughly 1 to 3 seconds each]
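
The counts reported at the top of the verbose output (42 candidates, 210 fits) follow directly from the grid and the number of folds. A minimal sketch of that arithmetic, using only the standard library:

```python
from itertools import product

# Same grid as the notebook's param_grid
param_grid = {
    "alpha": [0.1, 1, 5, 10, 50, 100],
    "l1_ratio": [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1],
}
cv_folds = 5

# Every (alpha, l1_ratio) pair is one candidate
candidates = list(product(param_grid["alpha"], param_grid["l1_ratio"]))
n_candidates = len(candidates)
n_fits = n_candidates * cv_folds

print(n_candidates, n_fits)  # 42 210
```

With the default `refit=True`, GridSearchCV then fits the best candidate once more on the full training set, which is the model used by `grid_model.predict`.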

In [43]:
#display the best combination of parameters
grid_model.best_params_

{'alpha': 100, 'l1_ratio': 1}
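
Two things are worth noting about this result. An `l1_ratio` of 1 means the selected model is effectively a Lasso, and both chosen values sit at the upper edge of the grid, which may suggest trying larger `alpha` values. The sketch below illustrates the first point on synthetic data; the dataset and `alpha=1.0` are assumptions chosen for the toy problem, not values from this notebook.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic stand-in data (hypothetical, not Ames)
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, n_informative=4, noise=5.0, random_state=101
)

# alpha=1.0 is chosen only for this toy problem
enet = ElasticNet(alpha=1.0, l1_ratio=1).fit(X_demo, y_demo)
lasso = Lasso(alpha=1.0).fit(X_demo, y_demo)

# With l1_ratio=1 the L2 term vanishes, so both fit the same objective
print(np.allclose(enet.coef_, lasso.coef_))
```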

In [44]:
#evaluate the performance of the model
y_pred = grid_model.predict(X_test)

In [45]:
#calculate the evaluation metrics
MAE = mean_absolute_error(y_test,y_pred)
MSE = mean_squared_error(y_test,y_pred)
RMSE = np.sqrt(MSE)

In [46]:
MAE

14195.354916674847

In [47]:
RMSE

20558.508635060203

The Elastic Net model performed slightly better than the linear regression model, achieving an MAE of $\$$14195.35 and an RMSE of $\$$20558.51.

Now the question arises: is this actually a good prediction? To answer that, let's calculate the mean of the SalePrice column in df.

In [48]:
np.mean(df['SalePrice'])

180815.53743589742

Relative to that mean, the Elastic Net MAE of about $\$$14195 is roughly 8% of the average sale price, so a typical prediction is off by a little under 10%.
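
The relative error can be worked out from the numbers already printed in the notebook. The sketch below copies those outputs into plain variables (the variable names are made up here) and expresses each error as a percentage of the mean sale price.

```python
# Values copied from the notebook outputs above
mean_price = 180815.53743589742
mae_linear = 14591.398716877513
mae_enet = 14195.354916674847
rmse_enet = 20558.508635060203

# Express each error as a percentage of the mean sale price
mae_linear_pct = 100 * mae_linear / mean_price
mae_enet_pct = 100 * mae_enet / mean_price
rmse_enet_pct = 100 * rmse_enet / mean_price

print(f"Linear MAE: {mae_linear_pct:.1f}% of mean price")
print(f"ElasticNet MAE: {mae_enet_pct:.1f}% of mean price")
print(f"ElasticNet RMSE: {rmse_enet_pct:.1f}% of mean price")
```

The RMSE comes out noticeably higher than the MAE in relative terms, which hints that a minority of houses are mispredicted by a wide margin.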