# Improve Performance with Algorithm Tuning

**ML models are parameterized** so that their behavior can be tuned for a given problem. Models can have many parameters, and **finding the best combination of parameters** is an optimization task that can be treated as a search problem (more on this later).

We will see how to tune the parameters of ML algorithms in Python using scikit-learn. 

The goal is to learn:
1. The importance of algorithm parameter tuning to improve algorithm performance
2. How to use a grid search algorithm tuning strategy
3. How to use a random search algorithm tuning strategy.

# ML Algorithm Hyperparameter Optimization

Algorithm tuning is a final step in the process of applied ML before finalizing your model. It is sometimes called ***hyperparameter optimization*** where:

* the _algorithm parameters_ are referred to as **hyperparameters**
* the _coefficients_ found by the ML algorithm itself are referred to as **parameters**. 

The word "optimization" hints at the search nature of the problem. Phrased as a search problem, you can use different search strategies to find a good and robust parameter, or set of parameters, for an algorithm on a given problem.
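To make the distinction between hyperparameters and parameters concrete, here is a minimal sketch. It assumes a small synthetic dataset (the names `X_demo`, `y_demo` and `model_demo` are made up for illustration and are not part of this tutorial's data).

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data, assumed only for this illustration
rng = np.random.RandomState(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

# alpha is a hyperparameter: we choose it before training
model_demo = Ridge(alpha=1.0)
print(model_demo.get_params()["alpha"])

# coef_ and intercept_ are parameters: the algorithm learns them from the data
model_demo.fit(X_demo, y_demo)
print(model_demo.coef_)
print(model_demo.intercept_)
```

Tuning means searching over values like `alpha`; the coefficients are re-learned by the algorithm for each candidate value.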

Python's scikit-learn provides two simple methods for algorithm parameter tuning:
1. Grid Search parameter tuning
2. Random Search parameter tuning.

## 0. Import the data

In [0]:
import pandas as pd

url = 'https://raw.githubusercontent.com/dbonacorsi/AML_basic_AA1920/master/datasets/pima-indians-diabetes.data.csv'

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = pd.read_csv(url, names=names)
data

## 1. Grid Search Parameter Tuning

Grid search is an approach to parameter tuning that will **methodically build and evaluate a model for each combination of algorithm parameters specified in a grid**. 
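As a rough sketch of what "build and evaluate a model for each combination" means, the loop below does a grid search by hand. It assumes synthetic data from `make_regression` and an arbitrary two-parameter grid chosen only for illustration; the `*_demo` names are hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import ParameterGrid, cross_val_score

# Synthetic regression data, assumed only for this illustration
X_demo, y_demo = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

# 3 alpha values x 2 fit_intercept values = 6 combinations
grid_demo = ParameterGrid({"alpha": [0.01, 1.0, 100.0], "fit_intercept": [True, False]})

results = []
for params in grid_demo:
    # one model per combination, scored with 5-fold cross-validation
    scores = cross_val_score(Ridge(**params), X_demo, y_demo, cv=5)
    results.append((scores.mean(), params))

# keep the combination with the highest mean score
best_score, best_params = max(results, key=lambda item: item[0])
print(len(results))
print(best_score, best_params)
```

The number of models grows multiplicatively with each parameter added to the grid, which is why grids are usually kept small.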

You can perform a grid search using the GridSearchCV class (documented [here](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html)).

The example below evaluates different alpha values for the Ridge Regression algorithm on the Pima Indians diabetes dataset loaded above. Because only `alpha` varies, this is a one-dimensional grid search.

In [0]:
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV  # <-- the key import for this section

In [0]:
array = data.values
X = array[:, 0:8]  # the 8 input features
Y = array[:, 8]    # the target column ('class')

In [0]:
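# Grid Search for Algorithm Tuning
alphas = np.array([24, 6, 0.3, 0.002])  # candidate alpha values to try
param_grid = dict(alpha=alphas)
model = Ridge()
# evaluate every alpha in the grid with 5-fold cross-validation
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=5)
grid.fit(X, Y)
print(grid.best_score_)            # best mean cross-validated score
print(grid.best_estimator_.alpha)  # alpha that achieved it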

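Running the example prints the best score achieved and the `alpha` value in the grid that achieved it. For a regressor such as `Ridge`, the default score is the coefficient of determination (R²), averaged over the cross-validation folds. In this case, the best `alpha` among those explicitly listed is found.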
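Beyond the single best value, a fitted `GridSearchCV` object also stores the score of every candidate in its `cv_results_` attribute. The sketch below shows one way to inspect it; it assumes synthetic data from `make_regression` rather than the tutorial's dataset, and the `*_demo` names are hypothetical.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic regression data, assumed only for this illustration
X_demo, y_demo = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

grid_demo = GridSearchCV(estimator=Ridge(), param_grid={"alpha": [24, 6, 0.3, 0.002]}, cv=5)
grid_demo.fit(X_demo, y_demo)

# one row per candidate alpha, with its mean and spread across the folds
columns = ["param_alpha", "mean_test_score", "std_test_score", "rank_test_score"]
results_demo = pd.DataFrame(grid_demo.cv_results_)[columns]
print(results_demo.sort_values("rank_test_score"))
print(grid_demo.best_params_)
```

Looking at the whole table helps to judge whether the best candidate is clearly better than the others or only marginally so.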

## <font color='red'>Exercise 1</font>

Can you find a better value than the one found above? How would you do it?



## <font color='green'>Solution 1</font>

In [0]:
# put your code here

## 2. Random Search Parameter Tuning

Random search is an approach to parameter tuning that will **sample algorithm parameters from a random distribution (e.g. uniform) for a fixed number of iterations**.

A model is constructed and evaluated for each combination of parameters chosen. You can perform a random search for algorithm parameters using the `RandomizedSearchCV` class (documented [here](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html)).
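The idea can be sketched by hand with scikit-learn's `ParameterSampler`, which draws parameter settings from the given distributions. This is only an illustration under assumptions: the data is synthetic, 10 iterations is an arbitrary choice, and the `*_demo` names are hypothetical.

```python
from scipy.stats import uniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import ParameterSampler, cross_val_score

# Synthetic regression data, assumed only for this illustration
X_demo, y_demo = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

# draw 10 alpha values uniformly from [0, 1]
sampler_demo = ParameterSampler({"alpha": uniform()}, n_iter=10, random_state=7)

results = []
for params in sampler_demo:
    # one model per sampled alpha, scored with 5-fold cross-validation
    score = cross_val_score(Ridge(**params), X_demo, y_demo, cv=5).mean()
    results.append((score, params["alpha"]))

# keep the sampled alpha with the highest mean score
best_score, best_alpha = max(results)
print(best_score, best_alpha)
```

Unlike a grid, the cost here is fixed by the number of iterations, not by the number of parameters being searched.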

Suppose training time is not a constraint, and you would rather not use the overall best `alpha` if it is too aggressive (too large). How do you find a good value restricted to the [0, 1] interval?


In [0]:
# Grid Search for Algorithm Tuning
alphas = np.array([1., 0.1, 0.01, 0.001])
param_grid = dict(alpha=alphas)
model = Ridge()
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=5)
grid.fit(X, Y)
print(grid.best_score_)
print(grid.best_estimator_.alpha)

Sure, you found that `1.0` is the best value, but only among the ones you typed. The best floating-point value in the [0, 1] interval is still unknown, and you cannot manually type every real value. So what can you do?

The example below evaluates random alpha values between 0 and 1 for the Ridge Regression algorithm on the same diabetes dataset. A total of 100 iterations are performed, each with an alpha value drawn uniformly from [0, 1], which is the default range of `scipy.stats.uniform()` (Ridge itself accepts any non-negative alpha).

In [0]:
from scipy.stats import uniform
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV  # <-- the key import for this section

In [0]:
# Randomized Search for Algorithm Tuning
param_dist = {'alpha': uniform()}  # sample alpha uniformly from [0, 1]
model = Ridge()
rsearch = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=100, random_state=7)
rsearch.fit(X, Y)
print(rsearch.best_score_)
print(rsearch.best_estimator_.alpha)

Running the example produces results much like those in the grid search example above. An
optimal `alpha` value near `1.0` is discovered.
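The `uniform()` call used above relies on scipy's defaults, `loc=0` and `scale=1`, which give the interval [0, 1]. As a small sketch, the snippet below checks that range and shows how a different interval could be requested: `uniform(loc, scale)` covers `[loc, loc + scale]`. The interval [0.5, 10] is an arbitrary choice for illustration, not a recommendation.

```python
from scipy.stats import uniform

unit = uniform()                    # loc=0, scale=1  -> samples in [0, 1]
wide = uniform(loc=0.5, scale=9.5)  # samples in [0.5, 0.5 + 9.5] = [0.5, 10]

unit_samples = unit.rvs(size=1000, random_state=7)
wide_samples = wide.rvs(size=1000, random_state=7)

print(unit_samples.min(), unit_samples.max())
print(wide_samples.min(), wide_samples.max())
```

Either distribution object can be passed as the value for `'alpha'` in `param_distributions`.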

## Summary

What we did:

* We saw that algorithm parameter tuning is an important step for improving algorithm performance, right before presenting results or preparing a system for production.
* We explored two methods that you can use right now in Python with scikit-learn to improve your results: Grid Search Parameter Tuning and Random Search Parameter Tuning.