# Hyperparameter Tuning in Machine Learning


*   Hyperparameter tuning is the process of selecting the set of hyperparameters (settings fixed before training, such as the number of trees in a random forest) that gives the best model performance.

*   Many algorithms benefit from tuned hyperparameters, which can improve accuracy, generalization, and efficiency.

*   The sections below cover four common methods for hyperparameter tuning:

# 1. Manual Search

### How It Works:

Manually choose combinations of hyperparameters based on intuition or experience.
Train and evaluate the model for each combination.

### Advantages:

Simple and quick for small parameter spaces.

### Disadvantages:

Time-consuming and inefficient for large parameter spaces.
Might miss the optimal combination.
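
The manual procedure described above can be sketched as a short loop. This is a minimal illustration, assuming the iris dataset and three hand-picked candidate settings; neither comes from the original text, and any dataset or candidates could be substituted.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Example data (an assumption; any labelled dataset works)
X, y = load_iris(return_X_y=True)

# Hand-picked candidates, chosen from intuition rather than a systematic rule
candidates = [
    {"n_estimators": 50, "max_depth": 3},
    {"n_estimators": 100, "max_depth": 5},
    {"n_estimators": 100, "max_depth": None},
]

results = []
for params in candidates:
    model = RandomForestClassifier(random_state=42, **params)
    # Mean accuracy over 5 cross-validation folds
    score = cross_val_score(model, X, y, cv=5).mean()
    results.append((score, params))
    print(params, "->", round(score, 3))

# Keep the candidate with the highest mean score
best_score, best_params = max(results, key=lambda r: r[0])
print("Best manual choice:", best_params)
```

Only the listed candidates are ever evaluated, which is why a better combination outside the list can be missed.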

# 2. Grid Search
### How It Works:

Exhaustively evaluates every combination from a user-specified grid of hyperparameter values.
For each combination, the model is trained and evaluated, typically with cross-validation.

### Advantages:

Guarantees that every combination in the grid is tested, so the best setting within the grid is not missed.

### Disadvantages:

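Computationally expensive for large parameter spaces, because the number of combinations is the product of the number of values tried for each parameter.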

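```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Example data (an assumption; substitute your own training set)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Define the model (fixed seed so results are reproducible)
model = RandomForestClassifier(random_state=42)

# Define hyperparameters to search
param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [10, 20, None],
    'max_features': ['sqrt', 'log2', None]
}

# Perform Grid Search: every combination is scored with 5-fold cross-validation
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)

print("Best Hyperparameters:", grid_search.best_params_)
```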
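
To make the cost of grid search concrete, the number of model fits can be counted before running anything. The sketch below repeats the grid from this section and uses scikit-learn's `ParameterGrid` to count combinations; the variable names are illustrative.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the grid search example
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [10, 20, None],
    "max_features": ["sqrt", "log2", None],
}

n_combinations = len(ParameterGrid(param_grid))  # 3 * 3 * 3
n_folds = 5
# One fit per combination per fold (the final refit on all training data adds one more)
n_fits = n_combinations * n_folds

print("Combinations:", n_combinations)        # 27
print("Fits with 5-fold CV:", n_fits)         # 135
```

Adding a fourth parameter with three values would triple both numbers, which is the multiplicative growth that makes large grids expensive.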


# 3. Random Search
### How It Works:

Randomly samples combinations of hyperparameters from the specified lists or distributions.
Only a fixed number of combinations is evaluated (`n_iter` in scikit-learn), regardless of the size of the space.

### Advantages:

Faster than grid search for large spaces.
Can find good hyperparameters without testing all combinations.

### Disadvantages:

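Results can vary between runs due to randomness unless a seed such as `random_state` is fixed, and the best combination may never be sampled.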


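```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Example data (an assumption; substitute your own training set)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Define the model (fixed seed so results are reproducible)
model = RandomForestClassifier(random_state=42)

# Define hyperparameters to search (81 possible combinations)
param_dist = {
    'n_estimators': [100, 200, 300],
    'max_depth': [10, 20, None],
    'max_features': ['sqrt', 'log2', None],
    'min_samples_split': [2, 5, 10]
}

# Perform Random Search: only 20 randomly chosen combinations are evaluated
random_search = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=20, cv=5, random_state=42)
random_search.fit(X_train, y_train)

print("Best Hyperparameters:", random_search.best_params_)
```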
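
The example in this section samples from fixed lists, but `RandomizedSearchCV` also accepts distributions, which lets the search reach values a list would not contain. The following is a minimal sketch, assuming the iris dataset and ranges chosen only for illustration; `scipy.stats.randint(low, high)` draws integers from `low` up to `high - 1`.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Example data (an assumption; any labelled dataset works)
X, y = load_iris(return_X_y=True)

# Illustrative ranges; distributions are sampled, lists are drawn from uniformly
param_dist = {
    "n_estimators": randint(50, 201),        # integers 50..200
    "max_depth": randint(2, 11),             # integers 2..10
    "min_samples_split": randint(2, 11),     # integers 2..10
    "max_features": ["sqrt", "log2", None],
}

random_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=10,          # number of sampled combinations
    cv=3,
    random_state=42,    # fixes which combinations are sampled
)
random_search.fit(X, y)

print("Best Hyperparameters:", random_search.best_params_)
print("Best CV accuracy:", round(random_search.best_score_, 3))
```

The budget is set by `n_iter` alone, so widening a range does not increase the number of fits.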


# 4. Bayesian Optimization
### How It Works:

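Builds a probabilistic (surrogate) model of how the evaluation score depends on the hyperparameters, updated after each trial.
Uses that model to choose the next combination to try, concentrating on the most promising areas of the parameter space.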

### Advantages:

Efficient exploration of the parameter space.
Often finds optimal or near-optimal parameters in fewer evaluations than grid or random search.

### Disadvantages:

More complex to implement, and usually relies on an additional library such as scikit-optimize (`skopt`).


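```python
# Requires the scikit-optimize package, which provides the skopt module
from skopt import BayesSearchCV
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Example data (an assumption; substitute your own training set)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Define the model (fixed seed so results are reproducible)
model = RandomForestClassifier(random_state=42)

# Define the search space: (low, high) tuples are integer ranges, lists are categorical choices
param_dist = {
    'n_estimators': (100, 500),
    'max_depth': (10, 50),
    'max_features': ['sqrt', 'log2', None],
}

# Perform Bayesian Search: 20 trials, each scored with 5-fold cross-validation
bayes_search = BayesSearchCV(estimator=model, search_spaces=param_dist, n_iter=20, cv=5, random_state=42)
bayes_search.fit(X_train, y_train)

print("Best Hyperparameters:", bayes_search.best_params_)
```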
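
To show what "using a probabilistic model to focus on promising areas" can look like, here is a hand-rolled sketch of the loop for a single hyperparameter. It is an illustration under stated assumptions, not a description of how `BayesSearchCV` works internally: the surrogate is a scikit-learn Gaussian process, the acquisition rule is expected improvement, and the objective (cross-validated accuracy of an `SVC` on iris as a function of `log10(C)`) was chosen only because it is cheap to evaluate.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_iris
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Example data (an assumption; any labelled dataset works)
X, y = load_iris(return_X_y=True)

def objective(log_c):
    """Mean 5-fold CV accuracy of an SVC with C = 10**log_c."""
    return cross_val_score(SVC(C=10.0 ** log_c), X, y, cv=5).mean()

rng = np.random.default_rng(42)
# Candidate values of log10(C) that the surrogate is asked about
candidates = np.linspace(-3, 3, 200).reshape(-1, 1)

# Start from a few random evaluations
tried = [float(v) for v in rng.uniform(-3, 3, size=3)]
scores = [objective(v) for v in tried]

for _ in range(10):
    # Surrogate: a Gaussian process fitted to all scores observed so far
    gp = GaussianProcessRegressor(
        kernel=Matern(nu=2.5), alpha=1e-4, normalize_y=True, random_state=0
    )
    gp.fit(np.array(tried).reshape(-1, 1), np.array(scores))
    mean, std = gp.predict(candidates, return_std=True)

    # Expected improvement over the best score seen so far (maximization)
    best = max(scores)
    std = np.maximum(std, 1e-9)
    z = (mean - best) / std
    ei = (mean - best) * norm.cdf(z) + std * norm.pdf(z)

    # Evaluate the candidate the surrogate considers most promising
    next_value = float(candidates[np.argmax(ei), 0])
    tried.append(next_value)
    scores.append(objective(next_value))

best_log_c = tried[int(np.argmax(scores))]
print("Best log10(C):", round(best_log_c, 2))
print("Best CV accuracy:", round(max(scores), 3))
```

Each new trial is chosen using everything learned from earlier trials, which is the key difference from grid and random search, where trials are independent of one another.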


## Best Practices for Hyperparameter Tuning


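1.   Start with grid search or random search for simple problems with small parameter spaces.
2.   Use Bayesian optimization when the parameter space is large or each model fit is expensive, so that fewer evaluations are needed.
3.   Evaluate using cross-validation to get a more reliable estimate of how well the model generalizes.
4.   Monitor resource usage and limit the search budget (for example, the number of iterations or folds) to prevent excessive computation time.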
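
The last two practices can be checked directly in code. The sketch below is one possible way to do it, assuming the iris dataset and a deliberately small search: it compares the cross-validated score with the score on a held-out test set and reads the timing information that scikit-learn records during the search.

```python
import time

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Example data (an assumption); the test split is never seen during tuning
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=42),
    param_distributions={"n_estimators": [50, 100], "max_depth": [3, 5, None]},
    n_iter=4,   # small, explicit budget
    cv=5,
    random_state=42,
)

start = time.perf_counter()
search.fit(X_train, y_train)
elapsed = time.perf_counter() - start

cv_score = search.best_score_              # mean cross-validated accuracy of the best setting
test_score = search.score(X_test, y_test)  # accuracy of the refit model on held-out data
mean_fit_times = search.cv_results_["mean_fit_time"]  # seconds per fit, one entry per setting

print("CV accuracy:", round(cv_score, 3))
print("Test accuracy:", round(test_score, 3))
print("Total search time (s):", round(elapsed, 2))
print("Slowest setting, mean fit time (s):", round(float(max(mean_fit_times)), 3))
```

A test score far below the cross-validated score suggests the search has overfit to the validation folds, and the per-setting fit times show which parts of the space dominate the cost.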






