# 4 Modeling - Investigating San Francisco Housing Prices Through Police Incident Reports and 311 Cases<a id='Modeling'></a>

## 1 Contents<a id='1_Contents'></a>
* [Modeling - Investigating San Francisco Housing Prices Through Police Incident Reports and 311 Cases](#Modeling)
  * [1 Contents](#1_Contents)
  * [2 Introduction](#2_Introduction)
  * [3 Imports](#3_Imports)
  * [4 Load The Data](#4_Load_The_Data)
  * [5 Create Train and Test Splits](#5_Create_Train_and_Test_Splits)
  * [6 Models](#6_Models)
    * [6.0 Dummy Model 0 - Standard Scaler, PCA, DummyRegressor](#6.0_Model_0_-_StandardScaler_PCA_DummyRegressor)
    * [6.1 Model 1 - Standard Scaler, PCA, Linear Regression](#6.1_Model_1_-_StandardScaler_PCA_LinearRegression)
    * [6.2 Model 2 - Robust Scaler, PCA, Linear Regression](#6.2_Model_2_-_RobustScaler_PCA_LinearRegression)
    * [6.3 Model 3 - Robust Scaler, SelectKBest, Linear Regression](#6.3_Model_3_-_RobustScaler_SelectKBest_LinearRegression)
    * [6.4 Model 4 - Robust Scaler, PCA, Random Forest](#6.4_Model_4_-_RobustScaler_PCA_RandomForest)
    * [6.5 Model 5 - Robust Scaler, PCA, Ridge](#6.5_Model_5_-_RobustScaler_PCA_Ridge)
    * [6.6 Model 6 - Robust Scaler, PCA, Lasso](#6.6_Model_6_-_RobustScaler_PCA_Lasso)
    * [6.7 Model 7 - Robust Scaler, PCA, LassoLars](#6.7_Model_7_-_RobustScaler_PCA_LassoLars)
    * [6.8 Model 8 - Robust Scaler, PCA, ElasticNet](#6.8_Model_8_-_RobustScaler_PCA_ElasticNet)
  * [7 Model Evaluation](#7_Model_Evaluation)

## 2 Introduction<a id='2_Introduction'></a>

In this notebook, we build several machine learning models, tune their hyperparameters, and compare them to find the model that most accurately predicts San Francisco housing prices from police incident reports and 311 cases.

We will use the file `Post_EDA_SF_Combined_SFPD_311_Housing.csv` that was created in an earlier Jupyter Notebook, `2-Exploratory_Data_Analysis.ipynb`. This file contains all SF police incident reports, 311 cases, and housing sales data aggregated by month and by neighborhood, from January 2018 through September 2020. Each row is an observation for a distinct month-neighborhood pair, and each column is a candidate feature for modeling.

## 3 Imports<a id='3_Imports'></a>

In [1]:
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import __version__ as sklearn_version
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_validate, GridSearchCV, learning_curve
from sklearn.preprocessing import scale, RobustScaler, StandardScaler
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression, Ridge, Lasso, LassoLars, ElasticNet
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error


## 4 Load The Data<a id='4_Load_The_Data'></a>

In [2]:
df = pd.read_csv('../data/Post_EDA_SF_Combined_SFPD_311_Housing.csv')

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1188 entries, 0 to 1187
Data columns (total 82 columns):
 #   Column                                        Non-Null Count  Dtype  
---  ------                                        --------------  -----  
 0   Year Month                                    1188 non-null   int64  
 1   Neighborhood                                  1188 non-null   object 
 2   Arson                                         1188 non-null   int64  
 3   Assault                                       1188 non-null   int64  
 4   Burglary                                      1188 non-null   int64  
 5   Case Closure                                  1188 non-null   int64  
 6   Civil Sidewalks                               1188 non-null   int64  
 7   Courtesy Report                               1188 non-null   int64  
 8   Disorderly Conduct                            1188 non-null   int64  
 9   Drug Offense                                  1188 non-null   i

## 5 Create Train and Test Splits<a id='5_Create_Train_and_Test_Splits'></a>

In our previous notebook `3-Preprocessing_and_Training.ipynb`, we manually created the train and test splits so as to avoid data leakage.

Here, we will split out 85% of the data into the training set, leaving 15% of the data in the test set.

In [4]:
# Look at how many months of data we have
print('Earliest date:',df['Year Month'].min(), 'Latest date:', df['Year Month'].max(), 'Number of months:', len(df['Year Month'].unique()))

Earliest date: 201801 Latest date: 202009 Number of months: 33


In [5]:
# We will split the data with 28 months in train and 5 months in test, approx 85/15 train/test split
train = df[df['Year Month'] < 202005]
test = df[df['Year Month'] >= 202005]

In [6]:
train.shape, test.shape

((1008, 82), (180, 82))
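
The cutoff `202005` above was picked by hand from the month count. As a minimal sketch, the same chronological split could be derived from a target fraction instead. The helper name `split_by_month` and the toy frame below are hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd

def split_by_month(frame, month_col, train_frac):
    """Split chronologically: the earliest `train_frac` of distinct months go to train."""
    months = np.sort(frame[month_col].unique())
    n_train = int(round(len(months) * train_frac))
    cutoff = months[n_train]  # first month that belongs to the test set
    return frame[frame[month_col] < cutoff], frame[frame[month_col] >= cutoff]

# Toy frame (made-up values): 10 months, 2 rows per month
toy = pd.DataFrame({
    "Year Month": np.repeat(np.arange(201801, 201811), 2),
    "value": np.arange(20),
})
toy_train, toy_test = split_by_month(toy, "Year Month", 0.8)
print(toy_train.shape, toy_test.shape)
```

With 33 months and `train_frac=0.85`, this rounding would place 28 months in train, which matches the manual split. The sketch does not handle `train_frac=1.0`.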

We will set X to be all the features, and y to be the Median Sale Price __in thousands of $__.

In [7]:
X_train = train.drop(columns='Median Sale Price')
X_test = test.drop(columns='Median Sale Price')
y_train = train['Median Sale Price'] / 1000
y_test = test['Median Sale Price'] / 1000

In [8]:
X_train.shape, X_test.shape

((1008, 81), (180, 81))

In [9]:
y_train.shape, y_test.shape

((1008,), (180,))

In [10]:
# Save the 'Year Month' and 'Neighborhood' columns from the train/test data into ids_train and ids_test
# and drop these from 'X_train' and 'X_test'
ids_list = ['Year Month', 'Neighborhood']
ids_train = X_train[ids_list]
ids_test = X_test[ids_list]
X_train.drop(columns=ids_list, inplace=True)
X_test.drop(columns=ids_list, inplace=True)
X_train.shape, X_test.shape

((1008, 79), (180, 79))

## 6 Models<a id='6_Models'></a>

We're ready to create and evaluate some models. In order to do this, for each model, we will create a pipeline, then pass it into `GridSearchCV` to determine our optimal parameters for that model.

Each pipeline will consist of:
  * scaler: since our features contain numbers that vary by orders of magnitude, we must scale them in preparation for PCA
  * dimensionality reducer (e.g. PCA, SelectKBest): reduce the dimensionality of the data or select a subset of features
  * regressor: our prediction component
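
To illustrate why scaling matters before PCA, here is a small sketch on synthetic data (not the notebook's features). The array `X_demo` and both ratio variables are made up for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two independent synthetic features on very different scales
X_demo = np.column_stack([
    rng.normal(0, 1000, size=500),  # large-magnitude feature
    rng.normal(0, 1, size=500),     # small-magnitude feature
])

# Without scaling, the first component is dominated by the large feature
ratio_raw = PCA(n_components=2).fit(X_demo).explained_variance_ratio_[0]

# After standardizing, both features contribute comparably
X_scaled = StandardScaler().fit_transform(X_demo)
ratio_scaled = PCA(n_components=2).fit(X_scaled).explained_variance_ratio_[0]

print(f"first-component share, raw: {ratio_raw:.4f}, scaled: {ratio_scaled:.4f}")
```

PCA picks directions of maximum variance, so an unscaled feature with a large numeric range can absorb almost all of the first component regardless of how informative it is.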

### 6.0 Dummy Model 0 - Standard Scaler, PCA, DummyRegressor<a id='6.0_Model_0_-_StandardScaler_PCA_DummyRegressor'></a>

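Let's find a baseline with a `DummyRegressor`. In our previous notebook `3-Preprocessing_and_Training.ipynb`, we determined that with PCA `n_components=4`, 91.5% of the variance is already explained. Note that `DummyRegressor` ignores its input features, so the scaler and PCA steps do not change its predictions.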
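
For reference, a figure like "91.5% of the variance at 4 components" is typically read off the cumulative explained variance ratio. The sketch below shows that calculation on synthetic data. The 90% threshold and the array `X_demo` are assumptions for illustration, so the numbers will not match the notebook's.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Synthetic stand-in for X_train: 200 rows, 10 correlated columns built from 3 latent factors
latent = rng.normal(size=(200, 3))
X_demo = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(200, 10))

pca = PCA(random_state=42).fit(StandardScaler().fit_transform(X_demo))
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative share reaches 90%
n_needed = int(np.argmax(cumulative >= 0.90)) + 1
print(n_needed, cumulative.round(3))
```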

In [11]:
pipe_0 = make_pipeline(
    StandardScaler(), 
    PCA(n_components=4, random_state=42),
    DummyRegressor() )
pipe_0.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'standardscaler', 'pca', 'dummyregressor', 'standardscaler__copy', 'standardscaler__with_mean', 'standardscaler__with_std', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'dummyregressor__constant', 'dummyregressor__quantile', 'dummyregressor__strategy'])

In [12]:
param_grid_0 = {'dummyregressor__strategy': ['mean','median']}
model_0 = GridSearchCV(pipe_0, param_grid_0, cv=5)
model_0.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('standardscaler', StandardScaler()),
                                       ('pca',
                                        PCA(n_components=4, random_state=42)),
                                       ('dummyregressor', DummyRegressor())]),
             param_grid={'dummyregressor__strategy': ['mean', 'median']})

In [13]:
print("Model 0: Best Score: " + str(model_0.best_score_))
print("Model 0: Best Parameters: " + str(model_0.best_params_))

Model 0: Best Score: -0.0014654209162468846
Model 0: Best Parameters: {'dummyregressor__strategy': 'mean'}
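
The slightly negative best score is expected for a mean-predicting baseline under cross-validation: each validation fold is scored against the mean of the training folds, which cannot beat the fold's own mean, so R² is at most zero. A small sketch on synthetic data (the arrays below are made up) shows the same behavior:

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))                  # ignored by DummyRegressor
y_demo = rng.normal(loc=1000, scale=300, size=100)  # synthetic target

# Default scoring for regressors is R^2; every fold is predicted with the
# mean of the *training* folds rather than the fold's own mean
scores = cross_val_score(DummyRegressor(strategy="mean"), X_demo, y_demo, cv=5)
print(scores)
```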


### 6.1 Model 1 - Standard Scaler, PCA, Linear Regression<a id='6.1_Model_1_-_StandardScaler_PCA_LinearRegression'></a>

Our first model uses a basic pipeline consisting of `StandardScaler`, `PCA`, and `LinearRegression`.

In [14]:
pipe_1 = make_pipeline(
    StandardScaler(), 
    PCA(random_state=42),
    LinearRegression() )
pipe_1.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'standardscaler', 'pca', 'linearregression', 'standardscaler__copy', 'standardscaler__with_mean', 'standardscaler__with_std', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'linearregression__copy_X', 'linearregression__fit_intercept', 'linearregression__n_jobs', 'linearregression__normalize'])

In [15]:
param_grid_1 = {'pca__n_components': np.arange(4,20)}
model_1 = GridSearchCV(pipe_1, param_grid_1, cv=5)
model_1.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('standardscaler', StandardScaler()),
                                       ('pca', PCA(random_state=42)),
                                       ('linearregression',
                                        LinearRegression())]),
             param_grid={'pca__n_components': array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])})

In [16]:
print("Model 1: Best Score: " + str(model_1.best_score_))
print("Model 1: Best Parameters: " + str(model_1.best_params_))

Model 1: Best Score: 0.2376077027842245
Model 1: Best Parameters: {'pca__n_components': 17}


### 6.2 Model 2 - Robust Scaler, PCA, Linear Regression<a id='6.2_Model_2_-_RobustScaler_PCA_LinearRegression'></a>

Our second model will try to improve on the first by switching out the `StandardScaler` for the `RobustScaler`.
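
As a quick illustration of how the two scalers differ, the sketch below applies both to a made-up column containing one extreme value. It is only meant to show the mechanics, not to explain any score difference in this notebook.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

# Made-up column with one extreme value
col = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

standard = StandardScaler().fit_transform(col).ravel()
robust = RobustScaler().fit_transform(col).ravel()

# StandardScaler uses the mean and standard deviation, both pulled by the outlier;
# RobustScaler uses the median and interquartile range, which are not
print("standard:", standard.round(3))
print("robust:  ", robust.round(3))
```

With `RobustScaler`, the four ordinary values stay spread out around zero while the outlier remains far away. With `StandardScaler`, the ordinary values are squeezed together because the outlier inflates the standard deviation.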

In [17]:
pipe_2 = make_pipeline(
    RobustScaler(), 
    PCA(random_state=42),
    LinearRegression() )
pipe_2.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'robustscaler', 'pca', 'linearregression', 'robustscaler__copy', 'robustscaler__quantile_range', 'robustscaler__with_centering', 'robustscaler__with_scaling', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'linearregression__copy_X', 'linearregression__fit_intercept', 'linearregression__n_jobs', 'linearregression__normalize'])

In [18]:
param_grid_2 = {'pca__n_components': np.arange(4,20)}
model_2 = GridSearchCV(pipe_2, param_grid_2, cv=5)
model_2.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('robustscaler', RobustScaler()),
                                       ('pca', PCA(random_state=42)),
                                       ('linearregression',
                                        LinearRegression())]),
             param_grid={'pca__n_components': array([ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])})

In [19]:
print("Model 2: Best Score: " + str(model_2.best_score_))
print("Model 2: Best Parameters: " + str(model_2.best_params_))

Model 2: Best Score: 0.26785265231223887
Model 2: Best Parameters: {'pca__n_components': 17}


### 6.3 Model 3 - Robust Scaler, SelectKBest, Linear Regression<a id='6.3_Model_3_-_RobustScaler_SelectKBest_LinearRegression'></a>

In our third model, we'll swap out `PCA` for `SelectKBest`.
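
One caveat worth sketching: `mutual_info_regression` is a randomized estimator, so scores (and possibly the selected features) can vary between runs unless `random_state` is fixed. The example below uses synthetic data and `functools.partial` to pin the seed. This is an illustrative variation, not what the cell below does.

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 5))
# Only column 0 drives the synthetic target
y_demo = 3 * X_demo[:, 0] + rng.normal(scale=0.1, size=300)

# Fixing random_state makes the mutual information scores repeatable
score_func = partial(mutual_info_regression, random_state=0)
selector = SelectKBest(score_func, k=2).fit(X_demo, y_demo)

print(selector.get_support())            # boolean mask of kept columns
print(selector.transform(X_demo).shape)  # rows x k selected columns
```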

In [20]:
pipe_3 = make_pipeline(
    RobustScaler(), 
    SelectKBest(mutual_info_regression),
    LinearRegression() )
pipe_3.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'robustscaler', 'selectkbest', 'linearregression', 'robustscaler__copy', 'robustscaler__quantile_range', 'robustscaler__with_centering', 'robustscaler__with_scaling', 'selectkbest__k', 'selectkbest__score_func', 'linearregression__copy_X', 'linearregression__fit_intercept', 'linearregression__n_jobs', 'linearregression__normalize'])

In [21]:
param_grid_3 = {'selectkbest__k': np.arange(30,60) }
model_3 = GridSearchCV(pipe_3, param_grid_3, cv=5)
model_3.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('robustscaler', RobustScaler()),
                                       ('selectkbest',
                                        SelectKBest(score_func=<function mutual_info_regression at 0x000001727C56ECA0>)),
                                       ('linearregression',
                                        LinearRegression())]),
             param_grid={'selectkbest__k': array([30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,
       47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59])})

In [22]:
print("Model 3: Best Score: " + str(model_3.best_score_))
print("Model 3: Best Parameters: " + str(model_3.best_params_))

Model 3: Best Score: 0.3275323745682913
Model 3: Best Parameters: {'selectkbest__k': 46}


### 6.4 Model 4 - Robust Scaler, PCA, Random Forest<a id='6.4_Model_4_-_RobustScaler_PCA_RandomForest'></a>

In Models 1 and 2, the grid search selected 17 PCA components, so we fix `n_components=17` here. In the fourth model, we swap out `LinearRegression` for `RandomForestRegressor`.

In [23]:
pipe_4 = make_pipeline(
    RobustScaler(), 
    PCA(random_state=42, n_components=17),
    RandomForestRegressor(random_state=42) )
pipe_4.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'robustscaler', 'pca', 'randomforestregressor', 'robustscaler__copy', 'robustscaler__quantile_range', 'robustscaler__with_centering', 'robustscaler__with_scaling', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'randomforestregressor__bootstrap', 'randomforestregressor__ccp_alpha', 'randomforestregressor__criterion', 'randomforestregressor__max_depth', 'randomforestregressor__max_features', 'randomforestregressor__max_leaf_nodes', 'randomforestregressor__max_samples', 'randomforestregressor__min_impurity_decrease', 'randomforestregressor__min_impurity_split', 'randomforestregressor__min_samples_leaf', 'randomforestregressor__min_samples_split', 'randomforestregressor__min_weight_fraction_leaf', 'randomforestregressor__n_estimators', 'randomforestregressor__n_jobs', 'randomforestregressor__oob_score', 'randomforestregressor__random_state', 'randomforestregressor__verbose

In [24]:
param_grid_4 = {'randomforestregressor__n_estimators': [int(n) for n in np.logspace(start=1, stop=3, num=5)],
                'randomforestregressor__max_depth': [2, 5, 10, 20]}
model_4 = GridSearchCV(pipe_4, param_grid_4, cv=5, n_jobs=-1)
model_4.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('robustscaler', RobustScaler()),
                                       ('pca',
                                        PCA(n_components=17, random_state=42)),
                                       ('randomforestregressor',
                                        RandomForestRegressor(random_state=42))]),
             n_jobs=-1,
             param_grid={'randomforestregressor__max_depth': [2, 5, 10, 20],
                         'randomforestregressor__n_estimators': [10, 31, 100,
                                                                 316, 1000]})

In [25]:
print("Model 4: Best Score: " + str(model_4.best_score_))
print("Model 4: Best Parameters: " + str(model_4.best_params_))

Model 4: Best Score: 0.46627746345200843
Model 4: Best Parameters: {'randomforestregressor__max_depth': 20, 'randomforestregressor__n_estimators': 1000}


### 6.5 Model 5 - Robust Scaler, PCA, Ridge<a id='6.5_Model_5_-_RobustScaler_PCA_Ridge'></a>

Here we will try the `Ridge` regressor. We also widen the search range for `pca__n_components`, since a regularized model such as Ridge can make use of more features without overfitting as readily.

In [26]:
pipe_5 = make_pipeline(
    RobustScaler(), 
    PCA(random_state=42),
    Ridge() )
pipe_5.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'robustscaler', 'pca', 'ridge', 'robustscaler__copy', 'robustscaler__quantile_range', 'robustscaler__with_centering', 'robustscaler__with_scaling', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'ridge__alpha', 'ridge__copy_X', 'ridge__fit_intercept', 'ridge__max_iter', 'ridge__normalize', 'ridge__random_state', 'ridge__solver', 'ridge__tol'])

In [27]:
param_grid_5 = {'pca__n_components': np.arange(4,80,4),
                'ridge__alpha': [0.001, 0.01, 0.1, 1, 10, 100, 300, 500, 1000]}
model_5 = GridSearchCV(pipe_5, param_grid_5, cv=5, n_jobs=-1)
model_5.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('robustscaler', RobustScaler()),
                                       ('pca', PCA(random_state=42)),
                                       ('ridge', Ridge())]),
             n_jobs=-1,
             param_grid={'pca__n_components': array([ 4,  8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68,
       72, 76]),
                         'ridge__alpha': [0.001, 0.01, 0.1, 1, 10, 100, 300,
                                          500, 1000]})

In [28]:
print("Model 5: Best Score: " + str(model_5.best_score_))
print("Model 5: Best Parameters: " + str(model_5.best_params_))

Model 5: Best Score: 0.3266967615387865
Model 5: Best Parameters: {'pca__n_components': 76, 'ridge__alpha': 100}


### 6.6 Model 6 - Robust Scaler, PCA, Lasso<a id='6.6_Model_6_-_RobustScaler_PCA_Lasso'></a>

Let's swap out `Ridge` for `Lasso`.
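
The practical difference between the two penalties can be sketched on synthetic data: Ridge shrinks coefficients toward zero, while Lasso can set some of them exactly to zero. The data and `alpha` values below are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
# Only the first two columns matter in this synthetic target
y_demo = 5 * X_demo[:, 0] + 3 * X_demo[:, 1] + rng.normal(scale=0.5, size=200)

ridge = Ridge(alpha=1.0).fit(X_demo, y_demo)
lasso = Lasso(alpha=0.5).fit(X_demo, y_demo)

# Count coefficients that are exactly zero under each penalty
print("ridge zeros:", int(np.sum(ridge.coef_ == 0)))
print("lasso zeros:", int(np.sum(lasso.coef_ == 0)))
```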

In [29]:
pipe_6 = make_pipeline(
    RobustScaler(), 
    PCA(random_state=42),
    Lasso() )
pipe_6.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'robustscaler', 'pca', 'lasso', 'robustscaler__copy', 'robustscaler__quantile_range', 'robustscaler__with_centering', 'robustscaler__with_scaling', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'lasso__alpha', 'lasso__copy_X', 'lasso__fit_intercept', 'lasso__max_iter', 'lasso__normalize', 'lasso__positive', 'lasso__precompute', 'lasso__random_state', 'lasso__selection', 'lasso__tol', 'lasso__warm_start'])

In [30]:
param_grid_6 = {'pca__n_components': np.arange(20,80,4),
                'lasso__alpha': [0.1, 0.25, 0.5, 1, 5, 10]}
model_6 = GridSearchCV(pipe_6, param_grid_6, cv=5, n_jobs=-1)
model_6.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('robustscaler', RobustScaler()),
                                       ('pca', PCA(random_state=42)),
                                       ('lasso', Lasso())]),
             n_jobs=-1,
             param_grid={'lasso__alpha': [0.1, 0.25, 0.5, 1, 5, 10],
                         'pca__n_components': array([20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76])})

In [31]:
print("Model 6: Best Score: " + str(model_6.best_score_))
print("Model 6: Best Parameters: " + str(model_6.best_params_))

Model 6: Best Score: 0.32941407138721235
Model 6: Best Parameters: {'lasso__alpha': 5, 'pca__n_components': 76}


### 6.7 Model 7 - Robust Scaler, PCA, LassoLars<a id='6.7_Model_7_-_RobustScaler_PCA_LassoLars'></a>

Let's swap out `Lasso` for `LassoLars`.

In [32]:
pipe_7 = make_pipeline(
    RobustScaler(), 
    PCA(random_state=42),
    LassoLars() )
pipe_7.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'robustscaler', 'pca', 'lassolars', 'robustscaler__copy', 'robustscaler__quantile_range', 'robustscaler__with_centering', 'robustscaler__with_scaling', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'lassolars__alpha', 'lassolars__copy_X', 'lassolars__eps', 'lassolars__fit_intercept', 'lassolars__fit_path', 'lassolars__jitter', 'lassolars__max_iter', 'lassolars__normalize', 'lassolars__positive', 'lassolars__precompute', 'lassolars__random_state', 'lassolars__verbose'])

In [33]:
param_grid_7 = {'pca__n_components': np.arange(20,80,4),
                'lassolars__alpha': [0.1, 0.15, 0.25, 0.3, 0.5, 0.9]}
model_7 = GridSearchCV(pipe_7, param_grid_7, cv=5, n_jobs=-1)
model_7.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('robustscaler', RobustScaler()),
                                       ('pca', PCA(random_state=42)),
                                       ('lassolars', LassoLars())]),
             n_jobs=-1,
             param_grid={'lassolars__alpha': [0.1, 0.15, 0.25, 0.3, 0.5, 0.9],
                         'pca__n_components': array([20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76])})

In [34]:
print("Model 7: Best Score: " + str(model_7.best_score_))
print("Model 7: Best Parameters: " + str(model_7.best_params_))

Model 7: Best Score: 0.3257572758939543
Model 7: Best Parameters: {'lassolars__alpha': 0.3, 'pca__n_components': 76}


### 6.8 Model 8 - Robust Scaler, PCA, ElasticNet<a id='6.8_Model_8_-_RobustScaler_PCA_ElasticNet'></a>

Let's try `ElasticNet`, which combines the L1 and L2 penalties and may help with sparse data.
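
To make the role of `l1_ratio` concrete, the sketch below (synthetic data, made-up `alpha`) checks that `l1_ratio=1.0` reduces `ElasticNet` to `Lasso`, while a smaller value mixes in the L2 term.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
y_demo = 4 * X_demo[:, 0] - 2 * X_demo[:, 1] + rng.normal(scale=0.5, size=200)

# scikit-learn's ElasticNet penalty is
#   alpha * l1_ratio * ||w||_1 + 0.5 * alpha * (1 - l1_ratio) * ||w||_2^2
# so l1_ratio=1 leaves only the L1 (Lasso) term
enet_l1 = ElasticNet(alpha=0.3, l1_ratio=1.0).fit(X_demo, y_demo)
lasso = Lasso(alpha=0.3).fit(X_demo, y_demo)

# A smaller l1_ratio mixes in the L2 term and changes the solution
enet_mixed = ElasticNet(alpha=0.3, l1_ratio=0.5).fit(X_demo, y_demo)

print(enet_l1.coef_.round(3))
print(enet_mixed.coef_.round(3))
```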

In [35]:
pipe_8 = make_pipeline(
    RobustScaler(), 
    PCA(random_state=42),
    ElasticNet() )
pipe_8.get_params().keys()

dict_keys(['memory', 'steps', 'verbose', 'robustscaler', 'pca', 'elasticnet', 'robustscaler__copy', 'robustscaler__quantile_range', 'robustscaler__with_centering', 'robustscaler__with_scaling', 'pca__copy', 'pca__iterated_power', 'pca__n_components', 'pca__random_state', 'pca__svd_solver', 'pca__tol', 'pca__whiten', 'elasticnet__alpha', 'elasticnet__copy_X', 'elasticnet__fit_intercept', 'elasticnet__l1_ratio', 'elasticnet__max_iter', 'elasticnet__normalize', 'elasticnet__positive', 'elasticnet__precompute', 'elasticnet__random_state', 'elasticnet__selection', 'elasticnet__tol', 'elasticnet__warm_start'])

In [36]:
param_grid_8 = {'pca__n_components': np.arange(20,80,4),
                'elasticnet__alpha': [0.1, 0.15, 0.25, 0.3, 0.5, 0.9],
                'elasticnet__l1_ratio': [0.05, 0.1, 0.5, 0.7, 0.9, 0.95, 0.99] }
model_8 = GridSearchCV(pipe_8, param_grid_8, cv=5, n_jobs=-1)
model_8.fit(X_train, y_train)

GridSearchCV(cv=5,
             estimator=Pipeline(steps=[('robustscaler', RobustScaler()),
                                       ('pca', PCA(random_state=42)),
                                       ('elasticnet', ElasticNet())]),
             n_jobs=-1,
             param_grid={'elasticnet__alpha': [0.1, 0.15, 0.25, 0.3, 0.5, 0.9],
                         'elasticnet__l1_ratio': [0.05, 0.1, 0.5, 0.7, 0.9,
                                                  0.95, 0.99],
                         'pca__n_components': array([20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76])})

In [37]:
print("Model 8: Best Score: " + str(model_8.best_score_))
print("Model 8: Best Parameters: " + str(model_8.best_params_))

Model 8: Best Score: 0.327975091308277
Model 8: Best Parameters: {'elasticnet__alpha': 0.9, 'elasticnet__l1_ratio': 0.9, 'pca__n_components': 76}


## 7 Model Evaluation<a id='7_Model_Evaluation'></a>

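Let's evaluate the models against each other. Note that the `R2(train)` column below reports `best_score_`, which is the mean cross-validated R² over the 5 folds of the training set rather than a score on the full training set. The remaining columns are computed on the test set, with MAE and SQRT(MSE) in thousands of dollars.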

In [38]:
models = [model_1, model_2, model_3, model_4, model_5, model_6, model_7, model_8]
print("          R2(train)  R2(test)   MAE          SQRT(MSE)")
for i, model in enumerate(models, start=1):
    y_pred = model.best_estimator_.predict(X_test)
    print("Model {}:  {:0.6f}   {:0.6f}   {:0,.6f}   {:0,.6f}".format(i, model.best_score_, r2_score(y_test, y_pred), mean_absolute_error(y_test, y_pred), (mean_squared_error(y_test, y_pred))**0.5))

          R2(train)  R2(test)   MAE          SQRT(MSE)
Model 1:  0.237608   0.273431   291.680729   386.709726
Model 2:  0.267853   0.239206   298.509300   395.712791
Model 3:  0.327532   0.091984   299.629686   432.308039
Model 4:  0.466277   0.242836   282.534852   394.767824
Model 5:  0.326697   0.292975   287.645985   381.473176
Model 6:  0.329414   0.245116   293.949241   394.172812
Model 7:  0.325757   0.175289   300.829798   412.000335
Model 8:  0.327975   0.293744   287.554110   381.265704
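
The same comparison can be collected into a DataFrame, which is easier to sort and label than formatted prints. The following is a minimal sketch on synthetic data. The helper name `summarize`, the column labels, and the two toy searches are hypothetical, and in the notebook one would pass `model_1` through `model_8` with `X_test` and `y_test` instead.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV

def summarize(named_models, X_eval, y_eval):
    """Return one row per fitted GridSearchCV: CV R^2 plus held-out metrics."""
    rows = []
    for name, search in named_models.items():
        pred = search.predict(X_eval)  # uses the refit best_estimator_
        rows.append({
            "model": name,
            "R2 (CV)": search.best_score_,
            "R2 (test)": r2_score(y_eval, pred),
            "MAE": mean_absolute_error(y_eval, pred),
            "RMSE": np.sqrt(mean_squared_error(y_eval, pred)),
        })
    return pd.DataFrame(rows).set_index("model")

# Synthetic stand-in data for illustration only
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 4))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=120)
X_tr, X_te, y_tr, y_te = X_demo[:100], X_demo[100:], y_demo[:100], y_demo[100:]

searches = {
    "linear": GridSearchCV(LinearRegression(), {"fit_intercept": [True, False]}, cv=5).fit(X_tr, y_tr),
    "ridge": GridSearchCV(Ridge(), {"alpha": [0.1, 1, 10]}, cv=5).fit(X_tr, y_tr),
}
summary = summarize(searches, X_te, y_te)
print(summary.round(3))
```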


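None of these models performs very well. The best test R² is only about 0.29 (Models 5 and 8), with a mean absolute error of roughly $288K on the median sale price. Model 4 (Random Forest) has the highest cross-validated score (0.466), but its test R² drops to 0.243, and Model 3 falls from 0.328 in cross-validation to 0.092 on the test set.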