<div class='alert alert-info'>
    <h1>
        <center>
            <font color='Darkblue'> HyperParameter Tuning </font>
        </center>
    </h1>
</div>

# <font color='Darkblue'> What are Hyperparameters?</font>

        1. Hyperparameters are settings chosen before training a machine learning model.
        
        2. Unlike model parameters (such as the weights in a linear model), which are learned from the data, hyperparameters control how the model learns.
        
        3. They affect model performance, complexity, and training speed.
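
As a small illustration of the difference, the sketch below fits `y = w * x` by gradient descent in plain Python. The function `train` and the toy data are made up for this example: `learning_rate` and `n_epochs` are hyperparameters fixed before training, while the weight `w` is a model parameter learned from the data.

```python
# Toy data generated by y = 2 * x, so the ideal learned weight is 2.0.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def train(learning_rate, n_epochs):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # model parameter: learned during training
    for _ in range(n_epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w

# Same data, same number of epochs, different hyperparameter values.
w_slow = train(learning_rate=0.001, n_epochs=20)  # still far from 2.0
w_fast = train(learning_rate=0.05, n_epochs=20)   # very close to 2.0
print(w_slow, w_fast)
```

The learned weight depends on the hyperparameters even though the data never changes, which is why tuning them matters.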

# <font color='#FF33A1'>🔵 Examples of Hyperparameters:</font>

<table>
    <tr>
        <th>Algorithm</th>
        <th>Example Hyperparameters</th>
    </tr>
    <tr>
        <th>Decision Tree</th>
        <td>max_depth, min_samples_split, criterion</td>
    </tr>
    <tr>
        <th>Random Forest</th>
        <td>n_estimators, max_features, max_depth</td>
    </tr>
    <tr>
        <th>KNN</th>
        <td>n_neighbors, weights, metric</td>
    </tr>
    <tr>
        <th>SVM</th>
        <td>C, kernel, gamma</td>
    </tr>
    <tr>
        <th>Gradient Boosting</th>
        <td>learning_rate, n_estimators, max_depth</td>
    </tr>
    <tr>
        <th>Neural Networks</th>
        <td>hidden_layers, neurons, learning_rate</td>
    </tr>
</table>

# <font color='#FF33A1'>🔷 Why Hyperparameter Tuning?</font>
        1. To maximize model performance.
        2. To prevent underfitting (model too simple) and overfitting (model too complex).
        3. To find the best configuration for a given dataset.


# <font color='#FF33A1'>🔷 Steps in Hyperparameter Tuning:</font>

        1. Select a model (e.g., RandomForest, XGBoost).
        
        2. Choose the hyperparameters to tune.
        
        3. Select a tuning method (Grid, Random, Bayesian).
        
        4. Cross-validate to estimate performance.
        
        5. Train the final model with the best hyperparameters.
        
        6. Evaluate on held-out test data.
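
The six steps can be sketched end to end in plain Python. This is only an illustration under stated assumptions: the data is synthetic, the model is a hand-written k-nearest-neighbours classifier, and the helper names (`knn_predict`, `accuracy`, `cv_score`) are invented for this example.

```python
import random

rng = random.Random(0)
# Synthetic 1-D data: label is 1 when x > 5, with about 10% of labels flipped.
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    label = int(x > 5)
    if rng.random() < 0.1:
        label = 1 - label
    data.append((x, label))

train, test = data[:160], data[160:]  # keep a test set aside for step 6

def knn_predict(train_rows, x, k):
    """Step 1 (model): majority vote among the k nearest training points."""
    nearest = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > k)

def accuracy(train_rows, eval_rows, k):
    hits = sum(knn_predict(train_rows, x, k) == y for x, y in eval_rows)
    return hits / len(eval_rows)

def cv_score(rows, k, n_folds=5):
    """Step 4: mean accuracy over n_folds train/validation splits."""
    fold_size = len(rows) // n_folds
    scores = []
    for i in range(n_folds):
        val = rows[i * fold_size:(i + 1) * fold_size]
        fit = rows[:i * fold_size] + rows[(i + 1) * fold_size:]
        scores.append(accuracy(fit, val, k))
    return sum(scores) / n_folds

candidates = [1, 3, 5, 9, 15]                             # step 2: values of k to try
cv_scores = {k: cv_score(train, k) for k in candidates}   # steps 3-4: grid + CV
best_k = max(cv_scores, key=cv_scores.get)
test_accuracy = accuracy(train, test, best_k)             # steps 5-6: final model, test score
print(best_k, test_accuracy)
```

The test rows are used only once, after the hyperparameter has been chosen, so the final score is not biased by the search.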


# <font color='#FF33A1'>🔶 Common Hyperparameter Tuning Methods:</font>

# <font color='Darkblue'>1. Random Search CV</font>

### 1. What is it?
        Randomly samples a fixed number of combinations from the parameter grid (or from parameter distributions).
### 2. Pros:
        Faster than grid search; works well in large hyperparameter spaces.
### 3. Cons:
        May miss the optimal combination because it does not evaluate every possibility.
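
To see what "a fixed number of combinations" means in practice, the sketch below enumerates the XGBoost grid used later in this notebook and draws 10 combinations from it with the standard library. It only mimics the sampling step; scikit-learn's `ParameterSampler` provides similar behaviour, and the seed value here is an arbitrary choice.

```python
import itertools
import random

param_grid = {
    'learning_rate': [0.1, 0.15, 0.2, 0.25, 0.3],
    'gamma': [0, 0.1, 0.2, 0.3, 0.4],
    'max_depth': [3, 4, 5, 6, 7],
    'min_child_weight': [1, 2, 3, 4, 5, 6],
}

names = list(param_grid)
# Every possible combination: 5 * 5 * 5 * 6 = 750.
all_combos = list(itertools.product(*param_grid.values()))

rng = random.Random(42)  # fixed seed so the draw is repeatable
sampled = [dict(zip(names, combo)) for combo in rng.sample(all_combos, 10)]

print(len(all_combos), "possible combinations,", len(sampled), "evaluated")
for params in sampled[:3]:
    print(params)
```

Only 10 of the 750 combinations are ever trained, which explains both the speed and the risk of missing the best one.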

# <font color='Darkblue'> Implementation of Random Search CV</font>

In [37]:
# Importing the required Libraries 
import numpy as np
import pandas as pd
import seaborn as sns

In [38]:
# Load the dataset
dataset = pd.read_csv('Churn_Modelling.csv')
# Goal: use these customer features to predict whether a customer will leave the bank (Exited = 1).

In [39]:
dataset.head()

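<table>
    <tr><th></th><th>RowNumber</th><th>CustomerId</th><th>Surname</th><th>CreditScore</th><th>Geography</th><th>Gender</th><th>Age</th><th>Tenure</th><th>Balance</th><th>NumOfProducts</th><th>HasCrCard</th><th>IsActiveMember</th><th>EstimatedSalary</th><th>Exited</th></tr>
    <tr><th>0</th><td>1</td><td>15634602</td><td>Hargrave</td><td>619</td><td>France</td><td>Female</td><td>42</td><td>2</td><td>0.0</td><td>1</td><td>1</td><td>1</td><td>101348.88</td><td>1</td></tr>
    <tr><th>1</th><td>2</td><td>15647311</td><td>Hill</td><td>608</td><td>Spain</td><td>Female</td><td>41</td><td>1</td><td>83807.86</td><td>1</td><td>0</td><td>1</td><td>112542.58</td><td>0</td></tr>
    <tr><th>2</th><td>3</td><td>15619304</td><td>Onio</td><td>502</td><td>France</td><td>Female</td><td>42</td><td>8</td><td>159660.8</td><td>3</td><td>1</td><td>0</td><td>113931.57</td><td>1</td></tr>
    <tr><th>3</th><td>4</td><td>15701354</td><td>Boni</td><td>699</td><td>France</td><td>Female</td><td>39</td><td>1</td><td>0.0</td><td>2</td><td>0</td><td>0</td><td>93826.63</td><td>0</td></tr>
    <tr><th>4</th><td>5</td><td>15737888</td><td>Mitchell</td><td>850</td><td>Spain</td><td>Female</td><td>43</td><td>2</td><td>125510.82</td><td>1</td><td>1</td><td>1</td><td>79084.1</td><td>0</td></tr>
</table>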


In [40]:
dataset.isna().sum()


RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

In [35]:
# Feature matrix (columns 3-12: CreditScore through EstimatedSalary) and target (Exited)
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, -1].values

In [41]:
X[0]

array([619, 'France', 'Female', 42, 2, 0.0, 1, 1, 1, 101348.88],
      dtype=object)

        Geography and Gender are categorical, so they must be converted to
    numeric values.

In [42]:
# Encode the binary Gender column (index 2) as integers with LabelEncoder (Female -> 0, Male -> 1).
from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()
X[: , 2] = labelencoder.fit_transform(X[: , 2])

In [43]:

X[0]

array([619, 'France', 0, 42, 2, 0.0, 1, 1, 1, 101348.88], dtype=object)

In [44]:
# One-hot encode Geography (column 1) with ColumnTransformer and OneHotEncoder. It has more than two categories, so integer labels would imply a false ordering.
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers = [('encoder' , OneHotEncoder() , [1])] , remainder = 'passthrough')
X = np.array(ct.fit_transform(X))

In [45]:
X[0]

array([1.0, 0.0, 0.0, 619, 0, 42, 2, 0.0, 1, 1, 1, 101348.88],
      dtype=object)

In [46]:
# The one-hot columns are linearly dependent (the dummy variable trap), so drop the first one.
X = X[:, 1:]

In [47]:
X[0]

array([0.0, 0.0, 619, 0, 42, 2, 0.0, 1, 1, 1, 101348.88], dtype=object)
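
The sketch below shows, with plain Python lists, why dropping the first one-hot column loses no information. The sample values are illustrative; in particular, `'Germany'` is assumed here as the third category, since only France and Spain appear in the rows shown above.

```python
# Illustrative Geography values; 'Germany' is an assumed third category.
geography = ['France', 'Spain', 'France', 'Germany', 'Spain']
categories = sorted(set(geography))  # alphabetical order, France first

# One column per category, 1.0 where the row belongs to that category.
one_hot = [[1.0 if value == c else 0.0 for c in categories] for value in geography]

# Each row sums to 1, so the first column equals 1 minus the others.
assert all(sum(row) == 1.0 for row in one_hot)

# Dropping the first column removes that redundancy:
# France becomes [0.0, 0.0], matching the first two entries of X[0] above.
reduced = [row[1:] for row in one_hot]
for value, row in zip(geography, reduced):
    print(value, row)
```

The redundancy mostly matters for linear models; tree ensembles are far less sensitive to it, so the drop here is a harmless habit rather than a requirement.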

In [49]:
# Split the dataset into training and test sets.
from sklearn.model_selection import train_test_split
x_train , x_test , y_train , y_test = train_test_split(X , y , test_size=0.2 , random_state =0)

In [50]:
# XGBClassifier is a gradient-boosting ensemble: it adds decision trees sequentially, each one correcting the errors of the previous ones, so that many weak learners combine into a strong one.
from xgboost import XGBClassifier
classifier = XGBClassifier()
classifier.fit(x_train , y_train)

In [51]:
y_pred=classifier.predict(x_test)

In [52]:
from sklearn.metrics import accuracy_score
print('The accuracy score of our model is : {} '.format(accuracy_score(y_test , y_pred)))

The accuracy score of our model is : 0.853 


# <font color='Darkblue'>RandomizedSearchCV</font>

In [56]:
from sklearn.model_selection import RandomizedSearchCV
parameters = {
    'learning_rate': [0.1, 0.15, 0.2, 0.25, 0.3],  # shrinks each tree's contribution; smaller values help limit overfitting
    'gamma': [0, 0.1, 0.2, 0.3, 0.4],              # minimum loss reduction required to make a further split (default 0)
    'max_depth': [3, 4, 5, 6, 7],                  # maximum depth of each tree
    'min_child_weight': [1, 2, 3, 4, 5, 6]         # minimum sum of instance weights needed in a child (default 1)
}

# The parameter names depend on the model being tuned; these are XGBoost hyperparameters.

In [57]:
randomcv = RandomizedSearchCV(estimator=classifier, param_distributions=parameters, cv=10, n_jobs=-1)  # n_jobs=-1 runs the fits in parallel on all CPU cores; n_iter defaults to 10 sampled combinations

In [58]:
randomcv.fit(x_train , y_train)

In [59]:

randomcv.best_estimator_

In [60]:

randomcv.best_params_

{'min_child_weight': 1, 'max_depth': 4, 'learning_rate': 0.1, 'gamma': 0.4}

        RandomizedSearchCV returns the best of the sampled combinations, ranked by mean cross-validated accuracy on the training set.

In [62]:
randomcv.best_score_

0.8643750000000001
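
Note that `best_score_` is a mean cross-validation score computed on the training data, so it is not directly comparable to the test accuracy printed earlier. A minimal sketch of the remaining step, scoring the tuned model on held-out data, is shown below. It is self-contained and makes several assumptions: synthetic data from `make_classification` stands in for the churn dataset, and scikit-learn's `GradientBoostingClassifier` stands in for `XGBClassifier`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (assumption: not the churn dataset).
X_demo, y_demo = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)

search = RandomizedSearchCV(
    estimator=GradientBoostingClassifier(n_estimators=30, random_state=0),
    param_distributions={'learning_rate': [0.05, 0.1, 0.2, 0.3], 'max_depth': [2, 3, 4]},
    n_iter=5,         # sample 5 of the 12 possible combinations
    cv=3,
    random_state=0,   # makes the sampled combinations repeatable
)
search.fit(X_tr, y_tr)

cv_accuracy = search.best_score_  # mean CV accuracy on the training split
# best_estimator_ is refit on the whole training split, so it can predict directly.
test_accuracy = accuracy_score(y_te, search.best_estimator_.predict(X_te))
print(search.best_params_, cv_accuracy, test_accuracy)
```

In this notebook the equivalent call would be `randomcv.best_estimator_.predict(x_test)` followed by `accuracy_score`.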

# <font color='Darkblue'> 2. Grid Search CV</font>

# <font color='#FF33A1'>Grid Search (Exhaustive Search)</font>
### What is it?
        Tries every combination of the specified hyperparameter values.
### Pros:
        Guaranteed to find the best combination within the specified grid.
### Cons:
        Very slow and computationally expensive, especially for large search spaces.
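
The cost is easy to quantify: the number of model fits is the product of the number of values per hyperparameter, multiplied by the number of CV folds. The sketch below counts fits for the random forest grid used in this notebook and for a hypothetical larger grid (the two extra hyperparameters and their values are added only for illustration).

```python
import itertools

param_grid = {'max_depth': [3, 5, 10], 'n_estimators': [100, 200, 300]}
cv_folds = 10

# Enumerate every combination, as an exhaustive grid search does.
grid = [dict(zip(param_grid, values)) for values in itertools.product(*param_grid.values())]
n_fits = len(grid) * cv_folds
print(len(grid), "combinations ->", n_fits, "fits")

# Hypothetical extension: two more hyperparameters multiply the cost.
bigger_grid = dict(
    param_grid,
    max_features=['sqrt', 'log2', None],
    min_samples_split=[2, 5, 10, 20],
)
n_bigger = len(list(itertools.product(*bigger_grid.values())))
n_bigger_fits = n_bigger * cv_folds
print(n_bigger, "combinations ->", n_bigger_fits, "fits")
```

Going from two to four hyperparameters raises the count from 90 to 1080 fits, which is why exhaustive search is usually reserved for small grids.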

# <font color='darkblue'>Implementation of GridSearchCV</font>

In [66]:
import numpy as np
import pandas as pd
import seaborn as sns


In [67]:
dataset = pd.read_csv('Churn_Modelling.csv')

In [68]:

dataset.head()

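<table>
    <tr><th></th><th>RowNumber</th><th>CustomerId</th><th>Surname</th><th>CreditScore</th><th>Geography</th><th>Gender</th><th>Age</th><th>Tenure</th><th>Balance</th><th>NumOfProducts</th><th>HasCrCard</th><th>IsActiveMember</th><th>EstimatedSalary</th><th>Exited</th></tr>
    <tr><th>0</th><td>1</td><td>15634602</td><td>Hargrave</td><td>619</td><td>France</td><td>Female</td><td>42</td><td>2</td><td>0.0</td><td>1</td><td>1</td><td>1</td><td>101348.88</td><td>1</td></tr>
    <tr><th>1</th><td>2</td><td>15647311</td><td>Hill</td><td>608</td><td>Spain</td><td>Female</td><td>41</td><td>1</td><td>83807.86</td><td>1</td><td>0</td><td>1</td><td>112542.58</td><td>0</td></tr>
    <tr><th>2</th><td>3</td><td>15619304</td><td>Onio</td><td>502</td><td>France</td><td>Female</td><td>42</td><td>8</td><td>159660.8</td><td>3</td><td>1</td><td>0</td><td>113931.57</td><td>1</td></tr>
    <tr><th>3</th><td>4</td><td>15701354</td><td>Boni</td><td>699</td><td>France</td><td>Female</td><td>39</td><td>1</td><td>0.0</td><td>2</td><td>0</td><td>0</td><td>93826.63</td><td>0</td></tr>
    <tr><th>4</th><td>5</td><td>15737888</td><td>Mitchell</td><td>850</td><td>Spain</td><td>Female</td><td>43</td><td>2</td><td>125510.82</td><td>1</td><td>1</td><td>1</td><td>79084.1</td><td>0</td></tr>
</table>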


In [69]:

dataset.isna().sum()

RowNumber          0
CustomerId         0
Surname            0
CreditScore        0
Geography          0
Gender             0
Age                0
Tenure             0
Balance            0
NumOfProducts      0
HasCrCard          0
IsActiveMember     0
EstimatedSalary    0
Exited             0
dtype: int64

In [70]:
X = dataset.iloc[: , 3:13].values
y=dataset.iloc[: , -1].values

In [71]:

X[0]

array([619, 'France', 'Female', 42, 2, 0.0, 1, 1, 1, 101348.88],
      dtype=object)

In [72]:

from sklearn.preprocessing import LabelEncoder
labelencoder = LabelEncoder()
X[: , 2] = labelencoder.fit_transform(X[: , 2])

In [73]:
X[0]

array([619, 'France', 0, 42, 2, 0.0, 1, 1, 1, 101348.88], dtype=object)

In [74]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers = [('encoder' , OneHotEncoder() , [1])] , remainder = 'passthrough')
X = np.array(ct.fit_transform(X))

In [75]:

X[0]

array([1.0, 0.0, 0.0, 619, 0, 42, 2, 0.0, 1, 1, 1, 101348.88],
      dtype=object)

In [76]:
# Avoid the dummy variable trap by dropping the first one-hot column.
X=X[:,1:]

In [77]:
X[0]

array([0.0, 0.0, 619, 0, 42, 2, 0.0, 1, 1, 1, 101348.88], dtype=object)

In [78]:
from sklearn.model_selection import train_test_split
x_train , x_test , y_train , y_test = train_test_split(X , y , test_size=0.2 , random_state =0)

In [81]:

from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier()
classifier.fit(x_train , y_train)

In [82]:

y_pred=classifier.predict(x_test)

In [83]:
y_pred

array([0, 0, 0, ..., 0, 0, 0], dtype=int64)

In [84]:
from sklearn.metrics import accuracy_score
print('The accuracy score of our model is : {} '.format(accuracy_score(y_test , y_pred)))

The accuracy score of our model is : 0.87 


In [89]:
# Run GridSearchCV
from sklearn.model_selection import GridSearchCV

In [87]:
parameters = {
   'max_depth': [3, 5, 10],
    'n_estimators': [100, 200, 300]
}

In [95]:
# Evaluate all 3 x 3 = 9 combinations with 10-fold CV (90 fits, plus one refit of the best model on the full training set).
gridsearch = GridSearchCV(estimator=classifier, param_grid=parameters, cv=10, n_jobs=-1)
gridsearch.fit(x_train, y_train)

In [97]:

gridsearch.best_estimator_

In [99]:

gridsearch.best_params_

{'max_depth': 10, 'n_estimators': 100}

In [100]:

gridsearch.best_score_

0.8623749999999999

# <font color='Darkblue'>3. Bayesian Optimization (Probabilistic Search)</font>
### What is it?
        1. Builds a probabilistic surrogate model (e.g., a Gaussian process or a tree-structured Parzen estimator) of the validation score as a function of the hyperparameters, and uses it to choose the next hyperparameter set to evaluate.
### Pros:
        1. Usually needs fewer evaluations than grid or random search to reach a good configuration.
        2. Learns from previous results to focus on promising regions.
### Cons:
        1. More complex to implement.
### Libraries: Optuna, Hyperopt, Scikit-Optimize (skopt).

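In [None]:
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def objective(trial):
    # Optuna proposes a value for each hyperparameter within the given range.
    max_depth = trial.suggest_int('max_depth', 3, 10)
    n_estimators = trial.suggest_int('n_estimators', 100, 500)
    model = RandomForestClassifier(max_depth=max_depth, n_estimators=n_estimators)
    # Return the mean 3-fold cross-validated accuracy on the training split created above.
    return cross_val_score(model, x_train, y_train, cv=3).mean()

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)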

# <font color='Darkblue'>4. Gradient-Based Optimization (Advanced)</font>
### What is it?
        Uses gradients of the validation loss with respect to the hyperparameters to adjust them (mainly explored for neural networks).
### Pros:
        Fast when applicable.
### Cons:
        Requires a differentiable objective and continuous hyperparameters; rarely used for traditional ML models.
### Example: Tuning a learning rate or regularization strength by differentiating the validation loss.
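
To make the idea concrete, here is a minimal sketch on a toy one-dimensional ridge regression, where the fitted weight has the closed form `w(lam) = sum(x*y) / (sum(x*x) + lam)`. Because the weight is a differentiable function of the regularization strength `lam`, the validation loss can be differentiated with respect to `lam` and minimized by gradient descent. The data, the function names, and the step size are all assumptions chosen for this illustration.

```python
# Toy training and validation data (made up for this sketch).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [1.2, 1.9, 3.2, 3.8]
val_x = [1.5, 2.5, 3.5]
val_y = [1.4, 2.6, 3.4]

sxy = sum(x * y for x, y in zip(train_x, train_y))
sxx = sum(x * x for x in train_x)

def fit_weight(lam):
    """Closed-form 1-D ridge solution for regularization strength lam."""
    return sxy / (sxx + lam)

def val_loss(lam):
    """Mean squared error on the validation set for the weight fitted with lam."""
    w = fit_weight(lam)
    return sum((w * x - y) ** 2 for x, y in zip(val_x, val_y)) / len(val_x)

def val_loss_grad(lam):
    """d(val_loss)/d(lam) via the chain rule: dL/dw * dw/dlam."""
    w = fit_weight(lam)
    dloss_dw = sum(2 * (w * x - y) * x for x, y in zip(val_x, val_y)) / len(val_x)
    dw_dlam = -w / (sxx + lam)
    return dloss_dw * dw_dlam

lam = 5.0
start_loss = val_loss(lam)
for _ in range(200):
    # The gradient with respect to lam is small here, hence the large step size.
    lam = max(0.0, lam - 50.0 * val_loss_grad(lam))

print(lam, start_loss, val_loss(lam))
```

No grid or random sampling is involved: each update moves `lam` directly in the direction that lowers the validation loss, which is only possible because the hyperparameter is continuous and the objective is differentiable.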
