### Hyperparameter Tuning

**Hyperparameters** are settings chosen before training rather than learned from the data, and they can make or break an ML model. This section covers one popular hyperparameter tuning method: grid search with cross-validation, using scikit-learn's GridSearchCV.

### Grid Search CV

**Crime Rate- Linear Regression**\
*Target Variable: Crime Rate (Regression Based)*

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
%matplotlib inline


In [2]:
from sklearn.model_selection import GridSearchCV
# import train_test_split
from sklearn.model_selection import train_test_split

crime = pd.read_csv("https://raw.githubusercontent.com/dphi-official/ML_Models/master/Performance_Evaluation/Standard%20Metropolitan%20Areas%20Data%20-%20train_data.csv")


train, test = train_test_split(crime)
x_train = train.drop('crime_rate', axis = 1)
y_train = train.crime_rate
x_test = test.drop('crime_rate', axis = 1)
y_test = test.crime_rate

**Performance without grid search:**

In [3]:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x_train, y_train)
y_pred = lr.predict(x_test)

In [4]:
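# import mean_squared_error
from sklearn.metrics import mean_squared_error
# RMSE as the square root of the MSE; the squared=False argument is deprecated and later removed in newer scikit-learn releases
np.sqrt(mean_squared_error(y_test, y_pred))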

10.257135158548357

**Performance with grid search:**

**Step 1: Define a parameter Space**

In [5]:
parameters = {'fit_intercept': [True, False], 'copy_X': [True, False], 'n_jobs': [-1, 1, 10, 15]}  # parameter grid to search

**Step 2: Fit the grid search on the training data to find the best hyperparameters, optionally choosing the score to optimise (by default GridSearchCV uses the estimator's own score method, which is R² for LinearRegression)**

In [6]:
grid = GridSearchCV(lr,parameters, cv=3)
grid.fit(x_train, y_train)
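
Step 2 mentions selecting a scorer, but the cell above relies on the default. As a minimal sketch of how an explicit scorer could be passed, the example below uses synthetic data from `make_regression` as a stand-in (an assumption, not the crime dataset) and the built-in `neg_root_mean_squared_error` scoring string. The names `X_demo`, `y_demo`, and `scored_grid` are illustrative only.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption: not the crime dataset used above)
X_demo, y_demo = make_regression(
    n_samples=120, n_features=5, noise=10.0, random_state=0
)

param_grid = {"fit_intercept": [True, False]}

# scikit-learn scorers follow "higher is better", so RMSE is exposed negated
scored_grid = GridSearchCV(
    LinearRegression(),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
scored_grid.fit(X_demo, y_demo)

# Flip the sign back to obtain a positive mean cross-validated RMSE
best_rmse = -scored_grid.best_score_
print(scored_grid.best_params_, best_rmse)
```

Choosing the scorer to match the metric reported on the test set (RMSE here) keeps the search and the final evaluation consistent.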

**Step 3: Print the best estimator found**

In [7]:
grid.best_estimator_
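
Besides `best_estimator_`, a fitted `GridSearchCV` object exposes `best_params_`, `best_score_`, and `cv_results_`, and with the default `refit=True` it can predict directly with the best model. The sketch below shows these on synthetic stand-in data (an assumption, as are the names `demo_grid` and `results`).

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption: not the crime dataset used above)
X_demo, y_demo = make_regression(
    n_samples=150, n_features=4, noise=5.0, random_state=1
)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=1)

demo_grid = GridSearchCV(LinearRegression(), {"fit_intercept": [True, False]}, cv=3)
demo_grid.fit(X_tr, y_tr)

print(demo_grid.best_params_)  # dict of the winning parameter values
print(demo_grid.best_score_)   # mean cross-validated score (R^2 by default here)

# One row per parameter combination, with its mean CV score and rank
results = pd.DataFrame(demo_grid.cv_results_)[
    ["params", "mean_test_score", "rank_test_score"]
]
print(results)

# With refit=True (the default), predict() delegates to the best estimator
grid_pred = demo_grid.predict(X_te)
```

Using `predict` on the grid object avoids copying the best parameters by hand into a new estimator.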

In [8]:
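# refit using the parameters reported by grid.best_estimator_
grid_lr = LinearRegression(copy_X=True, fit_intercept=True, n_jobs=-1)
grid_lr.fit(x_train, y_train)
y_pred = grid_lr.predict(x_test)
# RMSE as the square root of the MSE (squared=False is deprecated and later removed in newer scikit-learn releases)
np.sqrt(mean_squared_error(y_test, y_pred))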

10.257135158548357

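The performance does not change: the RMSE is identical to the baseline.

Linear Regression has very few hyperparameters, and of those searched here only fit_intercept changes the fitted model; copy_X and n_jobs affect memory use and computation, not the predictions. The selected fit_intercept=True is also the default, so the tuned model gives the same result as the untuned one (on this specific dataset).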
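
For contrast, here is a hedged sketch of a grid search over a hyperparameter that does change the fitted model: the regularisation strength `alpha` of `Ridge`. This swaps in a different estimator and synthetic data purely for illustration; the candidate `alpha` values and the names `ridge_grid` and `alpha_grid` are assumptions, not part of the original notebook.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data with many features relative to samples (assumption)
X_demo, y_demo = make_regression(
    n_samples=60, n_features=30, noise=20.0, random_state=0
)

# Candidate regularisation strengths (illustrative values)
alpha_grid = {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}

ridge_grid = GridSearchCV(
    Ridge(), alpha_grid, cv=3, scoring="neg_root_mean_squared_error"
)
ridge_grid.fit(X_demo, y_demo)

# Negate the scores to recover positive mean cross-validated RMSE values
scores = -ridge_grid.cv_results_["mean_test_score"]
for alpha, rmse in zip(alpha_grid["alpha"], scores):
    print(f"alpha={alpha:<6} mean CV RMSE={rmse:.2f}")
print("best:", ridge_grid.best_params_)
```

Here the cross-validated RMSE differs across candidates, which is the situation where grid search earns its cost.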