[View in Colaboratory](https://colab.research.google.com/github/Tathagatd96/Hyperparameter-Tuning/blob/master/hyperparameter_tuning.ipynb)

**Import the required libraries**

In [0]:
import warnings
warnings.filterwarnings('ignore')

In [4]:
import numpy as np
from xgboost.sklearn import XGBClassifier
# GridSearchCV, RandomizedSearchCV and StratifiedKFold live in sklearn.model_selection;
# the older sklearn.grid_search and sklearn.cross_validation modules were removed in scikit-learn 0.20.
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, StratifiedKFold
from sklearn.datasets import make_classification



**We will use the xgboost classifier as our estimator model**

**Creating the dataset**
We use the make_classification function from scikit-learn to create an artificial classification dataset with a chosen number of samples and features.

We will use the same dataset for both approaches so that the results can be compared.

In [0]:
x, y = make_classification(n_samples=1000, n_features=20, shuffle=True, random_state=101)

In [0]:
# 10-fold stratified cross-validation; the labels are supplied later, when the search calls cv.split(x, y)
cv = StratifiedKFold(n_splits=10, shuffle=True)
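
To see what the stratified splitter does with this dataset, the following is a minimal sketch, assuming a recent scikit-learn in which `StratifiedKFold` is imported from `sklearn.model_selection` and takes `n_splits`. The `random_state` on the splitter and the names `fold_sizes` and `fold_positive_rates` are additions for illustration only. It checks that every fold keeps roughly the same share of positive labels as the full dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold

# Same synthetic dataset as in the notebook
x, y = make_classification(n_samples=1000, n_features=20, shuffle=True, random_state=101)

# random_state is an assumption added here so the folds are repeatable
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=101)

fold_sizes = []
fold_positive_rates = []
for train_idx, test_idx in cv.split(x, y):
    # Size of the held-out fold and the fraction of class-1 labels in it
    fold_sizes.append(len(test_idx))
    fold_positive_rates.append(y[test_idx].mean())

print("Overall positive rate:", y.mean())
print("Fold sizes:", fold_sizes)
print("Per-fold positive rates:", np.round(fold_positive_rates, 3))
```

Because the folds are stratified, the per-fold rates should stay close to the overall rate, which keeps the accuracy of each fold comparable.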

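We define two variables to pass into the search.
***params_fixed*** holds the parameters that stay constant across all iterations.
***params_grid*** lists the candidate values for each parameter being tuned. GridSearchCV evaluates every combination of these values, here 5 × 5 × 3 = 75 candidates, each scored with 10-fold cross-validation.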
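
The size of such a grid can be checked before fitting anything. This is a small sketch using scikit-learn's `ParameterGrid` helper, which enumerates the same combinations a grid search would try. The variable names `n_candidates`, `n_folds` and `n_fits` are illustrative, and the fold count of 10 is taken from the cross-validation setup above.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# Same candidate values as the notebook's grid search
params_grid = {
    'max_depth': [1, 2, 3, 4, 5],
    'n_estimators': [5, 10, 15, 20, 25],
    'learning_rate': np.linspace(1e-16, 1, 3),
}

grid = ParameterGrid(params_grid)
n_candidates = len(grid)          # number of parameter combinations
n_folds = 10                      # folds used by the cross-validation splitter
n_fits = n_candidates * n_folds   # model fits during the search (excluding the final refit)

print("Candidates:", n_candidates)
print("Cross-validated fits:", n_fits)
print("First candidate:", list(grid)[0])
```

The number of fits grows multiplicatively with every parameter added to the grid, which is the main cost of an exhaustive search.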

In [13]:
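# Parameters held constant for every candidate; 'verbosity': 0 silences logging
# (it takes the place of the older 'silent' flag in recent xgboost releases)
params_fixed = {'objective': 'binary:logistic', 'verbosity': 0}
# Candidate values to search; learning_rate takes the values 1e-16, 0.5 and 1.0
params_grid = {'max_depth': [1, 2, 3, 4, 5],
               'n_estimators': [5, 10, 15, 20, 25],
               'learning_rate': np.linspace(1e-16, 1, 3)}

# Exhaustive search over every combination in params_grid
bst_grid = GridSearchCV(estimator=XGBClassifier(**params_fixed), param_grid=params_grid, cv=cv, scoring='accuracy')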

bst_grid.fit(x,y)

print("Best Accuracy Achieved : {}".format(bst_grid.best_score_))
print("Params")
for key,value in bst_grid.best_params_.items():
    print("\t{}:{}".format(key,value))

Best Accuracy Achieved : 0.958
Params
	learning_rate:1.0
	max_depth:4
	n_estimators:20


In [0]:
from scipy.stats import randint as sp_randint
from scipy.stats import uniform

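The main difference between **param_distributions** and **param_grid** is that in RandomizedSearchCV each parameter can be given as a random distribution (for example one from scipy.stats) or as a list of values to sample from. Instead of testing every combination, the search draws **n_iter** random candidates.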
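
The sampling step can be previewed with scikit-learn's `ParameterSampler`, which draws candidates the same way a randomized search does. This is a sketch under a few assumptions: the continuous `uniform(loc=0.01, scale=0.98)` distribution for `learning_rate` and the `random_state` are illustrative choices and are not taken from the notebook, and `candidates` is a hypothetical name.

```python
from scipy.stats import randint as sp_randint
from scipy.stats import uniform
from sklearn.model_selection import ParameterSampler

param_distributions = {
    "max_depth": [1, 2, 3, 4, 5],                    # list: one value is picked uniformly at random
    "n_estimators": sp_randint(1, 1001),             # integers from 1 to 1000 inclusive
    "learning_rate": uniform(loc=0.01, scale=0.98),  # assumption: continuous values in [0.01, 0.99]
}

# Draw 10 candidates, matching n_iter=10 in a randomized search
candidates = list(ParameterSampler(param_distributions, n_iter=10, random_state=101))

for candidate in candidates:
    print(candidate)
```

Each printed dictionary is one parameter setting that a randomized search would evaluate with cross-validation.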

In [14]:
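# Parameters held constant for every candidate; 'verbosity': 0 silences logging
params_fixed = {'objective': 'binary:logistic', 'verbosity': 0}
# max_depth: sampled from a list; n_estimators: random integer from 1 to 1000;
# learning_rate: sampled from a list of 10 evenly spaced values between 0.01 and 0.99
params_dist = {'max_depth': [3, 1, 2, 4, 5],
               'n_estimators': sp_randint(1, 1001),
               'learning_rate': np.linspace(uniform.ppf(0.01), uniform.ppf(0.99), 10)}

# Randomized search: only n_iter=10 sampled candidates are evaluated
bst_grid = RandomizedSearchCV(estimator=XGBClassifier(**params_fixed), param_distributions=params_dist, n_iter=10, cv=cv, scoring='accuracy')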

bst_grid.fit(x,y)

print("Best Accuracy Achieved : {}".format(bst_grid.best_score_))
print("Params")
for key,value in bst_grid.best_params_.items():
    print("\t{}:{}".format(key,value))

Best Accuracy Achieved : 0.96
Params
	learning_rate:0.44555555555555554
	max_depth:4
	n_estimators:962
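
The two approaches can also be compared side by side on the same dataset. The sketch below is not the notebook's experiment: it swaps in scikit-learn's `DecisionTreeClassifier` as a stand-in estimator so it runs quickly without xgboost, and the parameter ranges, 5 folds and `random_state` values are assumptions chosen for illustration. It reports how many candidates each search evaluated alongside its best cross-validated accuracy.

```python
from scipy.stats import randint as sp_randint
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Same synthetic dataset as in the notebook
x, y = make_classification(n_samples=1000, n_features=20, shuffle=True, random_state=101)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=101)

# Stand-in estimator (assumption): a decision tree instead of XGBClassifier
estimator = DecisionTreeClassifier(random_state=101)

# Exhaustive search: 5 * 3 = 15 candidates
grid_search = GridSearchCV(
    estimator,
    param_grid={"max_depth": [1, 2, 3, 4, 5], "min_samples_leaf": [1, 5, 10]},
    cv=cv, scoring="accuracy",
)
grid_search.fit(x, y)

# Randomized search: only n_iter=5 sampled candidates
random_search = RandomizedSearchCV(
    estimator,
    param_distributions={"max_depth": [1, 2, 3, 4, 5], "min_samples_leaf": sp_randint(1, 21)},
    n_iter=5, cv=cv, scoring="accuracy", random_state=101,
)
random_search.fit(x, y)

for name, search in [("grid", grid_search), ("random", random_search)]:
    n_candidates = len(search.cv_results_["params"])
    print("{}: {} candidates, best accuracy {:.3f}, best params {}".format(
        name, n_candidates, search.best_score_, search.best_params_))
```

The randomized search covers a wider range of values with fewer fits, at the price of not guaranteeing that the best point of any fixed grid is visited.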
