<a href="https://colab.research.google.com/github/marcelounb/ML-Mastery-with-Python-Course/blob/master/chap16_Improve_Performance_Algorithm_Tuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Python scikit-learn provides two simple methods for algorithm parameter tuning:

- Grid Search Parameter Tuning
- Random Search Parameter Tuning


In [0]:
import numpy 
from pandas import read_csv 
from sklearn.linear_model import Ridge 
from sklearn.model_selection import GridSearchCV 

from scipy.stats import uniform 
from sklearn.model_selection import RandomizedSearchCV

In [0]:
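# Load the CSV; the file has no header row, so column names are supplied here
filename = '/content/diabetes_moddd.csv'
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = read_csv(filename, names=names)
array = dataframe.values
# The first eight columns are the input features; the last column is the target
X = array[:,0:8]
Y = array[:,8]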

**Grid Search Parameter Tuning** - The example below evaluates different alpha values for the Ridge Regression algorithm on the standard diabetes dataset. This is a one-dimensional grid search.

In [3]:
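# Candidate alpha values; each one is evaluated with cross-validation
alphas = numpy.array([1,0.1,0.01,0.001,0.0001,0])
param_grid = dict(alpha=alphas)
model = Ridge()
grid = GridSearchCV(estimator=model, param_grid=param_grid)
# Fit one model per alpha value and per fold, then refit the best one on all the data
grid.fit(X, Y)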

GridSearchCV(cv=None, error_score=nan,
             estimator=Ridge(alpha=1.0, copy_X=True, fit_intercept=True,
                             max_iter=None, normalize=False, random_state=None,
                             solver='auto', tol=0.001),
             iid='deprecated', n_jobs=None,
             param_grid={'alpha': array([1.e+00, 1.e-01, 1.e-02, 1.e-03, 1.e-04, 0.e+00])},
             pre_dispatch='2*n_jobs', refit=True, return_train_score=False,
             scoring=None, verbose=0)

In [4]:
grid.best_score_

0.27610844129292433

In [5]:
grid.best_estimator_.alpha

1.0

Running the example lists the best mean cross-validated score achieved and the parameter value in the grid that achieved it. In this case, the best alpha value is 1.0.
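Beyond the single best score, the fitted search object also stores the score of every candidate in the grid. The following is a minimal sketch of reading those per-candidate scores. It assumes synthetic data from `make_regression` as a stand-in for the diabetes CSV, a similar alpha grid without the zero entry, and an explicit `cv=5`; the names `X_demo`, `y_demo`, and `grid_demo` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (an assumption; not the diabetes CSV used above)
X_demo, y_demo = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=7)

# One-dimensional grid over alpha
alphas = [1.0, 0.1, 0.01, 0.001, 0.0001]
grid_demo = GridSearchCV(estimator=Ridge(), param_grid={"alpha": alphas}, cv=5)
grid_demo.fit(X_demo, y_demo)

# cv_results_ holds one entry per candidate, in grid order.
# With scoring left unset, the score is Ridge's default score (R^2).
results = grid_demo.cv_results_
for alpha, score in zip(results["param_alpha"], results["mean_test_score"]):
    print(f"alpha={alpha}: mean R^2 = {score:.4f}")

print("best:", grid_demo.best_params_, grid_demo.best_score_)
```

Printing every candidate's score makes it easier to see whether the best value sits at the edge of the grid, which can suggest extending the grid in that direction.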


**Random Search Parameter Tuning** - Random search is an approach to parameter tuning that samples algorithm parameters from a random distribution (e.g. uniform) for a fixed number of iterations. A model is constructed and evaluated for each combination of parameters chosen. You can perform a random search for algorithm parameters using the RandomizedSearchCV class. The example below evaluates different random alpha values between 0 and 1 for the Ridge Regression algorithm on the standard diabetes dataset. A total of 100 iterations are performed with alpha values drawn uniformly from the range 0 to 1, which is the range that uniform() covers with its default arguments (Ridge itself accepts any non-negative alpha).


In [6]:
param_grid = {'alpha': uniform()} 
model = Ridge() 
rsearch = RandomizedSearchCV(estimator=model, param_distributions=param_grid, n_iter=100, random_state=7) 
rsearch.fit(X, Y) 

RandomizedSearchCV(cv=None, error_score=nan,
                   estimator=Ridge(alpha=1.0, copy_X=True, fit_intercept=True,
                                   max_iter=None, normalize=False,
                                   random_state=None, solver='auto',
                                   tol=0.001),
                   iid='deprecated', n_iter=100, n_jobs=None,
                   param_distributions={'alpha': <scipy.stats._distn_infrastructure.rv_frozen object at 0x7f1fe8054e48>},
                   pre_dispatch='2*n_jobs', random_state=7, refit=True,
                   return_train_score=False, scoring=None, verbose=0)
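The distribution passed to the random search determines which alpha values can be drawn. As a small sketch of what `uniform()` from `scipy.stats` produces, the block below draws a few samples directly; the `loc` and `scale` values in the second call are illustrative assumptions, not settings used in this notebook.

```python
from scipy.stats import uniform

# uniform() with default arguments is the uniform distribution on [0, 1]
samples = uniform().rvs(size=5, random_state=7)
print(samples)

# uniform(loc, scale) covers [loc, loc + scale]; here that is [0.5, 2.5]
wide = uniform(loc=0.5, scale=2.0).rvs(size=1000, random_state=7)
print(wide.min(), wide.max())
```

Changing `loc` and `scale` is one way to search a wider or shifted range of alpha values.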

In [7]:
# Best mean cross-validated score found by the random search
rsearch.best_score_

In [8]:
# Alpha value that achieved that score
rsearch.best_estimator_.alpha
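To check which alpha values a random search actually tried, the sampled candidates can be read back from the fitted object. This is a minimal sketch under stated assumptions: it uses synthetic data from `make_regression` rather than the diabetes CSV, a smaller `n_iter=20`, and an explicit `cv=5`; the names `X_demo`, `y_demo`, and `rsearch_demo` are hypothetical.

```python
from scipy.stats import uniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (an assumption; not the diabetes CSV used above)
X_demo, y_demo = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=7)

rsearch_demo = RandomizedSearchCV(
    estimator=Ridge(),
    param_distributions={"alpha": uniform()},  # alpha drawn uniformly from [0, 1]
    n_iter=20,        # number of sampled candidates
    cv=5,
    random_state=7,   # makes the sampled alphas reproducible
)
rsearch_demo.fit(X_demo, y_demo)

# One sampled alpha per iteration is recorded in cv_results_
sampled = [float(a) for a in rsearch_demo.cv_results_["param_alpha"]]
print(len(sampled), min(sampled), max(sampled))
print("best:", rsearch_demo.best_params_, rsearch_demo.best_score_)
```

Because the candidates are sampled, the best alpha is one of the drawn values rather than a point on a fixed grid.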