# Hyperparameter Tuning & Cross-Validation
- **Hyperparameter Tuning:** the process of finding the combination of hyperparameters (settings chosen before
training, such as the maximum depth of a tree) that gives the best model performance. It applies to models such as
decision trees, random forests, and support vector machines.
- **Cross-Validation:** a technique for estimating how well a machine learning model performs on unseen data. The
available data is split into several folds; the model is trained on all but one fold and tested on the held-out fold,
and the process repeats until every fold has served as the test set. Averaging the fold scores gives a more reliable
estimate than a single train/test split and helps detect overfitting.
- **Grid Search:** a hyperparameter tuning technique that evaluates every combination in a predefined grid of
hyperparameter values and keeps the combination with the best cross-validated score.
- **Random Search:** a hyperparameter tuning technique that evaluates a fixed number of hyperparameter combinations
sampled at random from predefined lists or distributions and keeps the combination with the best cross-validated score.
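
As a minimal sketch of the cross-validation idea described above, the snippet below scores a random forest on the iris data with 5 folds using `cross_val_score`. The values `n_estimators=50` and `random_state=0` are assumptions chosen only to keep the example quick and repeatable; they are not settings taken from this notebook.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# random_state is fixed only so that repeated runs give the same numbers
clf = RandomForestClassifier(n_estimators=50, random_state=0)

# cv=5: the data is split into 5 folds and each fold is used once as the test set
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")

print(scores)         # one accuracy value per fold
print(scores.mean())  # averaged estimate of performance on unseen data
```

The mean of the fold scores is the same kind of number that `GridSearchCV` and `RandomizedSearchCV` report as `best_score_`.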


### Methods used in Hyperparameter Tuning:
- Grid Search
- Random Search
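
The difference between the two methods can be sketched without training any model: `ParameterGrid` enumerates every combination, while `ParameterSampler` draws only `n_iter` of them. The small search space below is a hypothetical example, not the grid used later in this notebook.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Hypothetical search space used only for illustration
space = {
    "max_depth": [4, 6, 8],
    "criterion": ["gini", "entropy"],
}

# Grid search: every combination, 3 * 2 = 6 candidates
grid_candidates = list(ParameterGrid(space))

# Random search: only n_iter combinations, drawn at random from the same space
random_candidates = list(ParameterSampler(space, n_iter=4, random_state=0))

print(len(grid_candidates))    # 6
print(len(random_candidates))  # 4
for candidate in random_candidates:
    print(candidate)
```

Grid search cost grows multiplicatively with every added hyperparameter, whereas random search cost is fixed by `n_iter`.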


***Import the libraries of sklearn***

In [30]:
#Import the libraries of sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score, precision_score, f1_score, recall_score

***Load the dataset from sklearn***

In [31]:
#Load the iris dataset from sklearn
from sklearn.datasets import load_iris
iris = load_iris()

In [32]:
#Split the data into X & y:
X = iris.data
y = iris.target

In [33]:
#Create the Random Forest model:
model = RandomForestClassifier()

In [34]:
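#Make a parameter grid:
param_grid = {
    #'n_estimators' : [50, 100, 200, 300, 400, 500],
    #Note: 'auto' was removed as a max_features option in scikit-learn 1.3; on newer
    #versions every candidate using it fails to fit, which is the likely cause of
    #the failed fits reported in the output below.
    'max_features' : ['auto', 'sqrt', 'log2'],
    'max_depth' : [4, 5, 6, 7, 8, 9, 10],
    'criterion' : ['gini', 'entropy'],
    #'bootstrap' : [True, False]
}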

In [35]:
#Set up GridSearchCV:
grid = GridSearchCV(
    estimator=model,
    param_grid=param_grid,
    cv=5,
    scoring='accuracy',
    verbose=1,
    n_jobs=-1
)

In [36]:
#Fit the model:
grid.fit(X, y)

Fitting 5 folds for each of 42 candidates, totalling 210 fits


70 fits failed out of a total of 210.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
60 fits failed with the following error:
Traceback (most recent call last):
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\pythonl_ml\lib\site-packages\sklearn\model_selection\_validation.py", line 888, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\pythonl_ml\lib\site-packages\sklearn\base.py", line 1466, in wrapper
    estimator._validate_params()
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\pythonl_ml\lib\site-packages\sklearn\base.py", line 666, in _validate_params
    validate_parameter_constraints(
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\python
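
The counts in this log are consistent with `'auto'` being the rejected value: 14 of the 42 candidates use `max_features='auto'`, and 14 candidates × 5 folds = 70 failed fits. The sketch below reruns the search without that value, assuming this diagnosis is right. The names `fixed_param_grid` and `fixed_grid` are illustrative, and `n_estimators=20` and `random_state=0` are assumptions made only to keep the example quick and repeatable.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, ParameterGrid

X, y = load_iris(return_X_y=True)

# Same grid as above, minus 'auto' (assumed to be the value that is rejected)
fixed_param_grid = {
    "max_features": ["sqrt", "log2"],
    "max_depth": [4, 5, 6, 7, 8, 9, 10],
    "criterion": ["gini", "entropy"],
}

# 2 * 7 * 2 = 28 candidates, so 28 * 5 = 140 fits with cv=5
n_candidates = len(ParameterGrid(fixed_param_grid))
print(n_candidates)

fixed_grid = GridSearchCV(
    estimator=RandomForestClassifier(n_estimators=20, random_state=0),
    param_grid=fixed_param_grid,
    cv=5,
    scoring="accuracy",
    error_score="raise",  # raise on the first failing fit instead of storing nan
)
fixed_grid.fit(X, y)

# Count candidates whose mean score is nan (there should be none)
n_failed = int(np.isnan(fixed_grid.cv_results_["mean_test_score"]).sum())
print(n_failed)
```

Setting `error_score="raise"`, as the warning itself suggests, makes an invalid parameter value stop the search immediately rather than being silently scored as `nan`.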

In [37]:
#Print the best parameters:
print(grid.best_params_)

{'criterion': 'gini', 'max_depth': 5, 'max_features': 'log2'}


In [38]:
#Print the best score:
print(grid.best_score_)

0.9666666666666668


In [39]:
#Print the best estimator:
print(grid.best_estimator_)

RandomForestClassifier(max_depth=5, max_features='log2')
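
The search above is fitted on all of `X` and `y`, so `best_score_` is a cross-validated estimate and no data is left over for a final check. One possible variation, sketched below, holds out a test set with `train_test_split` and evaluates `best_estimator_` on it. The split ratio, the reduced grid, `n_estimators=20`, and the random seeds are all assumptions made for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)

# Keep 20% of the data aside; stratify preserves the class balance in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

search = GridSearchCV(
    RandomForestClassifier(n_estimators=20, random_state=0),
    param_grid={"max_depth": [4, 6, 8], "criterion": ["gini", "entropy"]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)  # tuning only ever sees the training portion

# best_estimator_ is already refitted on all of X_train with the best parameters
y_pred = search.best_estimator_.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)

print(test_accuracy)
print(classification_report(y_test, y_pred))
```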


In [40]:
#Print the best index:
print(grid.best_index_)

5
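
`best_index_` is a zero-based position into `cv_results_`, so an index of 5 refers to the sixth candidate tried. The sketch below shows how the index ties `best_params_` and `best_score_` together; the small grid, `n_estimators=20`, and `random_state=0` are assumptions used only for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

search = GridSearchCV(
    RandomForestClassifier(n_estimators=20, random_state=0),
    param_grid={"max_depth": [4, 6, 8], "criterion": ["gini", "entropy"]},
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

results = search.cv_results_
i = search.best_index_

# The entry at best_index_ holds the winning parameters and their mean CV score
params_at_best = results["params"][i]
score_at_best = results["mean_test_score"][i]

print(params_at_best == search.best_params_)
print(score_at_best == search.best_score_)
```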


***Import RandomizedSearchCV***

In [41]:
#Import RandomizedSearchCV:
from sklearn.model_selection import RandomizedSearchCV

In [42]:
#Set up RandomizedSearchCV (grid_1 samples n_iter=20 of the 42 combinations in param_grid):
grid_1 = RandomizedSearchCV(
    estimator=model,
    param_distributions=param_grid,
    cv=5,
    scoring='accuracy',
    verbose=1,
    n_iter=20,
    n_jobs=1
)

In [43]:
#Fit the model:
grid_1.fit(X, y)

Fitting 5 folds for each of 20 candidates, totalling 100 fits


35 fits failed out of a total of 100.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
35 fits failed with the following error:
Traceback (most recent call last):
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\pythonl_ml\lib\site-packages\sklearn\model_selection\_validation.py", line 888, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\pythonl_ml\lib\site-packages\sklearn\base.py", line 1466, in wrapper
    estimator._validate_params()
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\pythonl_ml\lib\site-packages\sklearn\base.py", line 666, in _validate_params
    validate_parameter_constraints(
  File "c:\Users\Al Hafiz Enterprises\miniconda3\envs\python

In [44]:
#Print the best parameters:
print(grid_1.best_params_)

{'max_features': 'sqrt', 'max_depth': 7, 'criterion': 'gini'}


In [45]:
#Print the best score:
print(grid_1.best_score_)

0.9666666666666668
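
The run above passes plain lists to `param_distributions`, so it samples from the same discrete grid as the grid search. `RandomizedSearchCV` also accepts distribution objects, which is where random search differs most from grid search. The sketch below uses `scipy.stats.randint` for `max_depth`. The distribution bounds, `n_iter=10`, `n_estimators=20`, and the seeds are assumptions chosen for illustration, and `'auto'` is left out because recent scikit-learn versions reject it.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_iris(return_X_y=True)

param_distributions = {
    "max_depth": randint(4, 11),  # integers from 4 to 10 (upper bound is exclusive)
    "max_features": ["sqrt", "log2"],
    "criterion": ["gini", "entropy"],
}

random_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(n_estimators=20, random_state=0),
    param_distributions=param_distributions,
    n_iter=10,        # number of sampled candidates
    cv=5,
    scoring="accuracy",
    random_state=0,   # makes the sampled candidates repeatable
)
random_search.fit(X, y)

print(random_search.best_params_)
print(random_search.best_score_)
```

Fixing `random_state` on the search makes the sampled candidates the same on every run, which helps when comparing results between runs.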
