# Improve Performance with Algorithm Tuning
Machine learning models are parameterized so that their behavior can be tuned for a given
problem. Models can have many parameters and finding the best combination of parameters can
be treated as a search problem.

This section covers:

1. Why algorithm parameter tuning matters for improving algorithm performance.
2. How to use a grid search algorithm tuning strategy.
3. How to use a random search algorithm tuning strategy.

## Machine Learning Algorithm Parameters
- Algorithm tuning is a final step in the process of applied machine learning before finalizing your model.
- It is sometimes called hyperparameter optimization. In that terminology, the algorithm parameters you set before training are called hyperparameters, whereas the coefficients found by the machine learning algorithm itself are called parameters.
- The word "optimization" points to the search nature of the problem. Phrased as a search problem, you can use different search strategies to find a good and robust parameter or set of parameters for an algorithm on a given problem.
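The distinction between hyperparameters and parameters can be seen directly on a fitted model. The following is a minimal sketch on synthetic data (the data, the variable names `X_demo` and `y_demo`, and the choice `alpha=1.0` are assumptions made for illustration, not part of the original example):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (an assumption for illustration): y = 3*x0 - 2*x1 + small noise
rng = np.random.RandomState(0)
X_demo = rng.normal(size=(200, 2))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

# alpha is a hyperparameter: it is chosen by us before fitting
model = Ridge(alpha=1.0)
model.fit(X_demo, y_demo)

# coef_ and intercept_ are parameters: they are learned from the data by fit()
print("hyperparameter alpha:", model.alpha)
print("learned coefficients:", model.coef_)
print("learned intercept:", model.intercept_)
```

Tuning means searching over values such as `alpha`; the coefficients are re-learned for every candidate value.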

## Grid Search Parameter Tuning
Grid search is an approach to parameter tuning that methodically builds and evaluates a
model for each combination of algorithm parameters specified in a grid. The example below
evaluates different alpha values for the Ridge Regression algorithm on the Pima Indians
diabetes dataset. Because only alpha is varied, this is a one-dimensional grid search.


In [1]:
# Pima Indians Diabetes Dataset
import pandas as pd
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Load the dataset, supplying the column names explicitly
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
df = pd.read_csv('pima-indians-diabetes.data', names=names)

# Separate the DataFrame into input features (X) and the output column (Y)
X = df.drop('class', axis='columns')
Y = df['class']

In [3]:
alphas = np.array([1,0.1,0.01,0.001,0.0001,0])

param_grid = dict(alpha=alphas)

In [4]:
param_grid

{'alpha': array([  1.00000000e+00,   1.00000000e-01,   1.00000000e-02,
          1.00000000e-03,   1.00000000e-04,   0.00000000e+00])}

In [5]:
grid = GridSearchCV(estimator=Ridge(), param_grid=param_grid)
grid.fit(X, Y)

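# best_score_ is the mean cross-validated score of the best setting
# (R^2 by default for a regressor such as Ridge)
print(grid.best_score_)
# alpha value that achieved that score
print(grid.best_estimator_.alpha)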

0.279617559313
1.0
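To make "one model per combination" concrete, the sketch below spells out roughly what a grid search does, using a two-dimensional grid. It is an illustration under assumptions, not the internals of `GridSearchCV`: the data is synthetic (`make_regression`), `fit_intercept` is added only to give the grid a second dimension, and 5-fold cross-validation is an arbitrary choice.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import ParameterGrid, cross_val_score

# Synthetic regression data (an assumption, so the sketch runs without the CSV file)
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=7)

# Two hyperparameters: 3 alpha values x 2 fit_intercept values = 6 combinations
grid_spec = {"alpha": [1.0, 0.1, 0.01], "fit_intercept": [True, False]}

results = []
for params in ParameterGrid(grid_spec):
    # Build and evaluate one model per combination with 5-fold cross-validation
    scores = cross_val_score(Ridge(**params), X_demo, y_demo, cv=5)
    results.append((scores.mean(), params))

# Keep the combination with the highest mean cross-validated score
best_score, best_params = max(results, key=lambda r: r[0])
print(len(results), "combinations evaluated")
print(best_score, best_params)
```

The number of models grows multiplicatively with each added hyperparameter, which is why large grids become expensive quickly.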


## Random Search Parameter Tuning
- Random search is an approach to parameter tuning that samples algorithm parameters from a random distribution (e.g. uniform) for a fixed number of iterations.
- A model is constructed and evaluated for each combination of parameters chosen.
- The example below evaluates different random alpha values between 0 and 1 for the Ridge Regression algorithm on the Pima Indians diabetes dataset.
- A total of 100 iterations are performed, with alpha values drawn uniformly at random from the range 0 to 1. This range is a choice made for the example; Ridge accepts any non-negative alpha.

In [6]:
from scipy.stats import uniform
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

In [7]:
param_grid = {'alpha': uniform()}

In [8]:
param_grid

{'alpha': <scipy.stats._distn_infrastructure.rv_frozen at 0x7782325470>}

In [9]:
rsearch = RandomizedSearchCV(estimator=Ridge(), param_distributions=param_grid, n_iter=100,
                             random_state=7)
rsearch.fit(X, Y)
print(rsearch.best_score_)
print(rsearch.best_estimator_.alpha)

0.279617127031
0.977989511997
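The sampling step behind a random search can be sketched by hand. This is a simplified illustration under assumptions, not the internals of `RandomizedSearchCV`: the data is synthetic, only 20 samples are drawn to keep it quick, and 5-fold cross-validation is an arbitrary choice.

```python
import numpy as np
from scipy.stats import uniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression data (an assumption, so the sketch runs without the CSV file)
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=7)

# uniform(loc=0, scale=1) is the uniform distribution on [0, 1],
# the same distribution that uniform() with no arguments gives
alpha_dist = uniform(loc=0, scale=1)
sampled_alphas = alpha_dist.rvs(size=20, random_state=7)

best_score, best_alpha = -np.inf, None
for alpha in sampled_alphas:
    # Evaluate one model per sampled alpha with 5-fold cross-validation
    score = cross_val_score(Ridge(alpha=alpha), X_demo, y_demo, cv=5).mean()
    if score > best_score:
        best_score, best_alpha = score, alpha

print(best_score, best_alpha)
```

Unlike a grid, the candidate values are not fixed in advance, so the number of iterations controls the cost directly, regardless of how many hyperparameters are sampled.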


Algorithm parameter tuning is an important step for improving algorithm performance right
before presenting results or preparing a system for production.
