# Improve Performance with Algorithm Tuning

**ML models are parameterized** so that their behavior can be tuned for a given problem. Models can have many parameters, and **finding the best combination of parameters** is an optimization task that can be treated as a search problem, as the two strategies below show.

We will see how to tune the parameters of ML algorithms in Python using scikit-learn. 

The goal is to learn:
1. The importance of algorithm parameter tuning to improve algorithm performance
2. How to use a grid search algorithm tuning strategy
3. How to use a random search algorithm tuning strategy.

# ML Algorithm Hyperparameter Optimization

Algorithm tuning is a final step in the process of applied ML before finalizing your model. It is sometimes called ***hyperparameter optimization***.

Python scikit-learn provides two simple methods for algorithm parameter tuning:
1. Grid Search parameter tuning
2. Random Search parameter tuning.
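
To make the difference between the two strategies concrete, here is a minimal sketch using only the Python standard library. It is not how scikit-learn implements them; the search space, the helper names `grid_candidates` and `random_candidates`, and the seed are hypothetical choices made for illustration. Grid search enumerates every combination of the listed values, while random search draws a fixed number of candidates from a distribution.

```python
import itertools
import random

# Hypothetical search space, for illustration only.
grid_space = {
    "alpha": [1.0, 0.1, 0.01],
    "fit_intercept": [True, False],
}

def grid_candidates(space):
    """Return every combination of the listed values (exhaustive search)."""
    names = sorted(space)
    return [dict(zip(names, values))
            for values in itertools.product(*(space[n] for n in names))]

def random_candidates(n_iter, seed=7):
    """Draw n_iter candidates, sampling alpha uniformly from [0, 1)."""
    rng = random.Random(seed)
    return [{"alpha": rng.random(), "fit_intercept": rng.choice([True, False])}
            for _ in range(n_iter)]

grid = grid_candidates(grid_space)
sampled = random_candidates(n_iter=5)

print(len(grid))     # 3 * 2 = 6 combinations
print(len(sampled))  # exactly n_iter = 5 candidates
```

The cost of the grid grows multiplicatively with each added parameter, whereas the cost of random search is fixed by the number of iterations you choose.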

## 1. Grid Search Parameter Tuning

You can perform a grid search using the `GridSearchCV` class (documented [here](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html)).

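The example below evaluates different alpha values for the Ridge Regression algorithm on the Pima Indians diabetes dataset. Only one parameter is varied, so this is a one-dimensional grid search. Because Ridge is a regressor and no scoring metric is specified, the reported `best_score_` is the mean cross-validated $R^2$.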

In [1]:
import numpy
from pandas import read_csv
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV                   # <--

In [2]:
# Load the dataset; column names are supplied explicitly
filename = 'pima-indians-diabetes.data.csv'
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = read_csv(filename, names=names)
array = dataframe.values
X = array[:,0:8]   # the eight input columns
Y = array[:,8]     # the 'class' column, used as the target

In [3]:
# Grid Search for Algorithm Tuning
alphas = numpy.array([1,0.1,0.01,0.001,0.0001,0])
param_grid = dict(alpha=alphas)
model = Ridge()
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid.fit(X, Y)
print(grid.best_score_)
print(grid.best_estimator_.alpha)

0.279617559313
1.0
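
A grid can cover more than one parameter. The sketch below is one possible extension, not part of the original example: it uses synthetic data from `make_regression` instead of the diabetes file, picks `fit_intercept` as an arbitrary second parameter, and sets `scoring` and `cv` explicitly rather than relying on defaults. All variable names are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (an assumption for illustration, not the diabetes data).
X_demo, y_demo = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=7)

# Two parameters give a 4 x 2 grid of 8 candidate settings.
param_grid_2d = {
    "alpha": [1.0, 0.1, 0.01, 0.001],
    "fit_intercept": [True, False],
}

grid_2d = GridSearchCV(estimator=Ridge(), param_grid=param_grid_2d, scoring="r2", cv=5)
grid_2d.fit(X_demo, y_demo)

print(grid_2d.best_params_)
print(grid_2d.best_score_)

# Mean cross-validated score of every candidate, not just the winner.
for params, score in zip(grid_2d.cv_results_["params"],
                         grid_2d.cv_results_["mean_test_score"]):
    print(params, round(score, 4))
```

Looking at `cv_results_` rather than only `best_score_` shows how sensitive the model is to each setting, which helps when deciding whether a finer grid is worth the extra fits.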


## 2. Random Search Parameter Tuning

You can perform a random search for algorithm parameters using the `RandomizedSearchCV` class (documented [here](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html)).

The example below evaluates random alpha values for the Ridge Regression algorithm on the same diabetes dataset. A total of 100 iterations are performed, each with an alpha value drawn uniformly at random from the range 0 to 1. Ridge itself accepts any non-negative alpha; this example simply restricts the search to that interval.

In [4]:
from pandas import read_csv
from scipy.stats import uniform
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV                   # <--

In [5]:
# Randomized Search for Algorithm Tuning
# uniform() with default arguments samples from the interval [0, 1]
param_grid = { 'alpha' : uniform()}
model = Ridge()
rsearch = RandomizedSearchCV(estimator=model, param_distributions=param_grid, n_iter=100,
    random_state=7)
rsearch.fit(X, Y)
print(rsearch.best_score_)
print(rsearch.best_estimator_.alpha)

0.279617127031
0.977989511997
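
When a regularization parameter such as alpha can plausibly span several orders of magnitude, a log-uniform distribution is a common alternative to a uniform one. The sketch below is an assumption-laden variation, not the original example: it uses synthetic data from `make_regression`, an arbitrary range of 1e-4 to 1e2, 50 iterations, and illustrative variable names. It relies on `scipy.stats.loguniform`, which requires a reasonably recent SciPy.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data (an assumption for illustration, not the diabetes data).
X_demo, y_demo = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=7)

# Sample alpha so that each order of magnitude between 1e-4 and 1e2 is equally likely.
param_dist = {"alpha": loguniform(1e-4, 1e2)}

rsearch_log = RandomizedSearchCV(
    estimator=Ridge(),
    param_distributions=param_dist,
    n_iter=50,
    scoring="r2",
    cv=5,
    random_state=7,
)
rsearch_log.fit(X_demo, y_demo)

print(rsearch_log.best_params_["alpha"])
print(rsearch_log.best_score_)
```

With a plain uniform distribution over the same range, nearly all samples would land above 1, so small alpha values would rarely be tried.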


## Summary

What we did:

* We saw that algorithm parameter tuning is an important step for improving algorithm performance, done right before presenting results or preparing a system for production. We explored two methods that you can use right now in Python and scikit-learn to improve your algorithm results: grid search parameter tuning (`GridSearchCV`) and random search parameter tuning (`RandomizedSearchCV`).

## What's next 

You now have all the ingredients you need: we have covered the techniques you can use to improve the performance of algorithms on your dataset. Next, and finally, you will see how to finalize your model for use on unseen data.