# Linear Regression Project Exercise 

---
### Imports

In [33]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

### Data

In [34]:
df = pd.read_csv("AMES_Final_DF.csv")

In [35]:
df.head()

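| | Lot Frontage | Lot Area | Overall Qual | Overall Cond | Year Built | Year Remod/Add | Mas Vnr Area | BsmtFin SF 1 | BsmtFin SF 2 | Bsmt Unf SF | ... | Sale Type_ConLw | Sale Type_New | Sale Type_Oth | Sale Type_VWD | Sale Type_WD | Sale Condition_AdjLand | Sale Condition_Alloca | Sale Condition_Family | Sale Condition_Normal | Sale Condition_Partial |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 141.0 | 31770 | 6 | 5 | 1960 | 1960 | 112.0 | 639.0 | 0.0 | 441.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 1 | 80.0 | 11622 | 5 | 6 | 1961 | 1961 | 0.0 | 468.0 | 144.0 | 270.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 2 | 81.0 | 14267 | 6 | 6 | 1958 | 1958 | 108.0 | 923.0 | 0.0 | 406.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 3 | 93.0 | 11160 | 7 | 5 | 1968 | 1968 | 0.0 | 1065.0 | 0.0 | 1045.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 4 | 74.0 | 13830 | 5 | 5 | 1997 | 1998 | 0.0 | 791.0 | 0.0 | 137.0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |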


In [36]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2925 entries, 0 to 2924
Columns: 274 entries, Lot Frontage to Sale Condition_Partial
dtypes: float64(11), int64(263)
memory usage: 6.1 MB


**The label we are trying to predict is the SalePrice column. Separate out the data into X features and y labels**

In [37]:
X = df.drop(columns=['SalePrice'])
y = df['SalePrice']

**Use scikit-learn to split up X and y into a training set and test set. Since we will later be using a Grid Search strategy, set your test proportion to 10%. To get the same data split as the solutions notebook, you can specify random_state = 101**

In [38]:
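from sklearn.model_selection import train_test_split
# Note: random_state=42 is used here instead of the suggested 101,
# so the split (and the metrics below) will differ from the solutions notebook.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=42)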

print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

(2632, 273)
(2632,)
(293, 273)
(293,)
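The shapes above follow from the 10% test proportion applied to the 2925 rows. A quick check of the arithmetic, assuming a fractional test size is rounded up to a whole number of rows (which is consistent with the 293 test rows printed above):

```python
import math

n_samples = 2925   # rows reported by df.info()
test_size = 0.1    # test proportion passed to train_test_split

# Assumption: a fractional test size is rounded up to a whole number of rows.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 2632 293
```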


**The dataset features have a variety of scales and units. For optimal regression performance, scale the X features. Take careful note of what to use for .fit() vs what to use for .transform()**

In [39]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Fit on the training set only, then apply the same scaling to the test set
X_train_trf = scaler.fit_transform(X_train)
X_test_trf = scaler.transform(X_test)
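The reason for fitting on the training set only is that the test set should not influence the scaling statistics. Below is a minimal NumPy sketch of the same idea on made-up data. The `toy_` arrays are illustrative and not part of the exercise, and the manual mean/std computation is a stand-in for what a standard scaler does, not its actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up data standing in for X_train / X_test
toy_train = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
toy_test = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

# "fit": learn the statistics from the training data only
mean = toy_train.mean(axis=0)
std = toy_train.std(axis=0)  # population standard deviation (ddof=0)

# "transform": apply the training statistics to both sets
toy_train_scaled = (toy_train - mean) / std
toy_test_scaled = (toy_test - mean) / std

# Training columns end up with mean 0 and std 1 (up to floating-point error);
# test columns are only approximately standardized.
print(toy_train_scaled.mean(axis=0).round(6), toy_train_scaled.std(axis=0).round(6))
print(toy_test_scaled.mean(axis=0).round(3), toy_test_scaled.std(axis=0).round(3))
```

Computing the mean and standard deviation from the test set as well would leak information about unseen data into the preprocessing step.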

**We will use an Elastic Net model. Create an instance of the default ElasticNet model with scikit-learn**

In [40]:
from sklearn.linear_model import ElasticNet
el_net = ElasticNet()

In [41]:
help(el_net)

Help on ElasticNet in module sklearn.linear_model._coordinate_descent object:

class ElasticNet(sklearn.base.MultiOutputMixin, sklearn.base.RegressorMixin, sklearn.linear_model._base.LinearModel)
 |  ElasticNet(alpha=1.0, *, l1_ratio=0.5, fit_intercept=True, precompute=False, max_iter=1000, copy_X=True, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic')
 |  
 |  Linear regression with combined L1 and L2 priors as regularizer.
 |  
 |  Minimizes the objective function::
 |  
 |          1 / (2 * n_samples) * ||y - Xw||^2_2
 |          + alpha * l1_ratio * ||w||_1
 |          + 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2
 |  
 |  If you are interested in controlling the L1 and L2 penalty
 |  separately, keep in mind that this is equivalent to::
 |  
 |          a * ||w||_1 + 0.5 * b * ||w||_2^2
 |  
 |  where::
 |  
 |          alpha = a + b and l1_ratio = a / (a + b)
 |  
 |  The parameter l1_ratio corresponds to alpha in the glmnet R package while
 |  alp
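The objective in the help text can be written out directly. The sketch below uses a hypothetical helper, `elastic_net_objective`, on made-up data. It is meant to illustrate the formula and the `a`/`b` reparameterization, not to reproduce scikit-learn's solver, and it ignores the intercept.

```python
import numpy as np

def elastic_net_objective(X, y, w, alpha=1.0, l1_ratio=0.5):
    """Evaluate the ElasticNet objective from the help text for weights w (no intercept)."""
    n_samples = X.shape[0]
    residual = y - X @ w
    data_term = residual @ residual / (2 * n_samples)
    l1_term = alpha * l1_ratio * np.abs(w).sum()
    l2_term = 0.5 * alpha * (1 - l1_ratio) * (w @ w)
    return data_term + l1_term + l2_term

# Separate penalties a (L1) and b (L2) map to alpha and l1_ratio
a, b = 0.3, 0.7
alpha = a + b
l1_ratio = a / (a + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
w = np.array([1.0, 0.0, -2.0, 0.5])
y = X @ w  # noiseless target, so the data term is zero at w

# With a zero data term, only a*||w||_1 + 0.5*b*||w||_2^2 remains
penalty = a * np.abs(w).sum() + 0.5 * b * (w @ w)
print(elastic_net_objective(X, y, w, alpha, l1_ratio), penalty)
```

The two printed values should agree, which confirms that `alpha` and `l1_ratio` are just another way of writing the two separate penalty weights.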

**The Elastic Net model has two main parameters, alpha and the L1 ratio. Create a dictionary parameter grid of values for the ElasticNet. Feel free to play around with these values; keep in mind that your choices may not match the solution's exactly**

In [42]:
params = dict(
    alpha=[0, 0.1, 0.5, 1.0, 5, 8, 10, 20, 50, 100],
    l1_ratio=[0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0],
)
params

{'alpha': [0, 0.1, 0.5, 1.0, 5, 8, 10, 20, 50, 100],
 'l1_ratio': [0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]}
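A grid search tries every pairing of the listed values, so the number of candidate models is the product of the list lengths. The sketch below counts the combinations for the grid above, assuming 5-fold cross-validation for the total number of fits.

```python
from itertools import product

params = dict(
    alpha=[0, 0.1, 0.5, 1.0, 5, 8, 10, 20, 50, 100],
    l1_ratio=[0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0],
)

# Every (alpha, l1_ratio) pairing is one candidate model
candidates = list(product(params["alpha"], params["l1_ratio"]))
n_folds = 5  # assumed number of cross-validation folds

print(len(candidates))            # 90
print(len(candidates) * n_folds)  # 450
```

Adding values to either list multiplies the search cost, so it can be worth starting with a coarse grid and refining around the best region.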

**Using scikit-learn, create a GridSearchCV object and run a grid search for the best parameters for your model based on your scaled training data. If you are curious about the warnings you may receive for certain parameter combinations, see [this discussion](https://stackoverflow.com/questions/20681864/lasso-on-sklearn-does-not-converge)**

In [43]:
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(estimator=el_net, param_grid=params, cv=5, scoring='neg_root_mean_squared_error', verbose=2, n_jobs=-1)


In [44]:
grid.fit(X_train_trf, y_train)

Fitting 5 folds for each of 90 candidates, totalling 450 fits
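Because scikit-learn scorers follow a higher-is-better convention, `neg_root_mean_squared_error` reports the RMSE with its sign flipped. Below is a small sketch on synthetic data showing how to recover the cross-validated RMSE from `best_score_`. The `toy_` names, the synthetic data, and the reduced grid are illustrative only.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
toy_X = rng.normal(size=(120, 5))
toy_y = toy_X @ np.array([3.0, 0.0, -1.5, 0.0, 2.0]) + rng.normal(scale=0.5, size=120)

toy_grid = GridSearchCV(
    estimator=ElasticNet(),
    param_grid={"alpha": [0.1, 1.0], "l1_ratio": [0.5, 1.0]},
    cv=3,
    scoring="neg_root_mean_squared_error",
)
toy_grid.fit(toy_X, toy_y)

# best_score_ is the mean negated RMSE across folds, so flip the sign
cv_rmse = -toy_grid.best_score_
print(toy_grid.best_params_, cv_rmse)
```

The same sign flip applies to `grid.best_score_` in this notebook, which gives a cross-validated RMSE in dollars to compare against the test-set RMSE.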


**Display the best combination of parameters for your model**

In [45]:
grid.best_params_

{'alpha': 100, 'l1_ratio': 1.0}
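An `l1_ratio` of 1.0 removes the L2 term from the objective, so the selected model is effectively a Lasso. Below is a short sketch on synthetic data comparing the two estimators. The `toy_` names, coefficients, and `alpha=0.5` are illustrative and not taken from the exercise.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
toy_X = rng.normal(size=(100, 6))
toy_y = toy_X @ np.array([4.0, 0.0, 0.0, -2.0, 0.0, 1.0]) + rng.normal(scale=0.3, size=100)

enet = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(toy_X, toy_y)
lasso = Lasso(alpha=0.5).fit(toy_X, toy_y)

# With l1_ratio=1.0 the two models should agree on the coefficients
print(np.allclose(enet.coef_, lasso.coef_))
# A pure L1 penalty can set some coefficients exactly to zero
print(np.sum(enet.coef_ == 0))
```

When the best value sits at the edge of the grid, as it does here for both parameters, it may be worth extending the grid in that direction to check whether a larger `alpha` does even better.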

**Evaluate your model's performance on the unseen 10% scaled test set. In the solutions notebook we achieved an MAE of \$14149 and an RMSE of \$20532**

In [46]:
y_pred = grid.predict(X_test_trf)

In [47]:
from sklearn.metrics import mean_absolute_error, mean_squared_error
mean_absolute_error(y_test, y_pred)

13696.987251039965

In [48]:
mean_squared_error(y_test, y_pred)

468796928.72163576

In [49]:
np.sqrt(mean_squared_error(y_test, y_pred))

21651.718839889727
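MAE and RMSE are both expressed in dollars, but RMSE weights large errors more heavily, which is why it is the larger of the two here. Below is a hand computation on three made-up prices (illustrative values only) that spells out each metric.

```python
import numpy as np

toy_true = np.array([100.0, 200.0, 300.0])
toy_pred = np.array([110.0, 190.0, 330.0])

errors = toy_pred - toy_true     # 10, -10, 30
mae = np.mean(np.abs(errors))    # mean absolute error
mse = np.mean(errors ** 2)       # mean squared error (squared units)
rmse = np.sqrt(mse)              # back in the original units

print(round(mae, 3), round(mse, 3), round(rmse, 3))  # 16.667 366.667 19.149
```

The single 30-unit miss dominates the RMSE, while the MAE treats all three errors in proportion to their size.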