# Hyperparameter Optimization in Machine Learning

Hyperparameter optimization is a crucial step in the machine learning pipeline. It is the search for the hyperparameter values that give a learning algorithm its best performance on a specific task, usually measured on held-out validation data.

## What Are Hyperparameters?
- **Definition**: Hyperparameters are parameters set before the learning process begins. They are not learned from the data but rather tuned manually or automatically.
- **Examples**:
  - Learning rate in gradient-based optimization.
  - Number of layers and neurons in a neural network.
  - Depth of a decision tree.
  - Number of clusters in k-means.
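
To make the distinction concrete, here is a minimal sketch that contrasts a hyperparameter with what the model learns. It assumes scikit-learn's bundled Iris data (`load_iris`) and a decision tree with `max_depth=2`; both choices are illustrative only.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth is a hyperparameter: it is chosen before fit() is called.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
hyperparams = tree.get_params()

tree.fit(X, y)

# The tree structure (how many nodes, how deep it actually grows) is learned from the data.
learned_depth = tree.get_depth()
learned_node_count = tree.tree_.node_count

print("hyperparameter max_depth:", hyperparams["max_depth"])
print("depth actually grown:", learned_depth)
print("nodes learned from data:", learned_node_count)
```

The hyperparameter only caps the depth; the splits themselves come out of training.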

## Why Is Hyperparameter Optimization Important?
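- Improves model performance by finding a configuration suited to the data and task.
- Helps balance underfitting and overfitting, since many hyperparameters (tree depth, regularization strength) control model capacity.
- Improves generalization to unseen data, provided candidates are compared on validation data rather than on the training set.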

## Common Methods for Hyperparameter Optimization
### 1. **Grid Search**
- Explores all possible combinations of hyperparameters within a predefined grid.
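- **Pros**: Simple, easy to parallelize, and exhaustive over the chosen grid.
- **Cons**: Computationally expensive, because the number of combinations is the product of the number of values per hyperparameter; values between grid points are never tried.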
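
The cost can be estimated before running anything. The sketch below enumerates a small grid with `itertools.product`; the grid values and the 5-fold setting are assumptions made for illustration.

```python
from itertools import product

# Hypothetical grid, chosen only for illustration.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["rbf", "linear", "poly"],
    "gamma": ["scale", "auto"],
}

# Build every combination as a dict of hyperparameter name -> value.
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_folds = 5  # assumed number of cross-validation folds
print("combinations:", len(combinations))  # 3 * 3 * 2
print("model fits with 5-fold CV:", len(combinations) * n_folds)
```

Adding one more hyperparameter with four values would multiply both numbers by four.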

### 2. **Random Search**
- Samples hyperparameters randomly within predefined ranges.
- **Pros**: Often faster than grid search; more effective when only a few hyperparameters are influential.
- **Cons**: Results vary between runs unless the random seed is fixed, and a large budget may still be needed to cover the space.
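
One way to see why random search helps when only a few hyperparameters matter: with the same budget, it tries more distinct values of each individual hyperparameter than a grid does. The sketch below uses two hypothetical hyperparameters scaled to the range 0 to 1 and a budget of 9 trials.

```python
import numpy as np

rng = np.random.default_rng(0)
budget = 9  # same number of trials for both strategies

# Grid: 3 values per hyperparameter, 3 x 3 = 9 trials.
grid_a, grid_b = np.meshgrid(np.linspace(0, 1, 3), np.linspace(0, 1, 3))
grid_points = np.column_stack([grid_a.ravel(), grid_b.ravel()])

# Random: 9 independent draws for each hyperparameter.
random_points = rng.uniform(0, 1, size=(budget, 2))

# Count how many different values of the first hyperparameter each strategy tested.
distinct_grid = len(np.unique(grid_points[:, 0]))
distinct_random = len(np.unique(random_points[:, 0]))

print("distinct values of the first hyperparameter, grid:", distinct_grid)
print("distinct values of the first hyperparameter, random:", distinct_random)
```

If the second hyperparameter turns out to be irrelevant, the grid has effectively tested the important one at only three settings.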

### 3. **Bayesian Optimization**
- Builds a probabilistic model (e.g., Gaussian Process) of the objective function and uses it to select the most promising hyperparameters.
- **Pros**: Efficient for expensive evaluations.
- **Cons**: More complex to implement, and inherently sequential, since each new trial depends on the results of earlier ones.
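
The loop behind this idea can be sketched in a few lines. The example below is a simplified illustration, not a production optimizer: it assumes a one-dimensional toy `objective` standing in for a validation loss, a Gaussian Process surrogate from scikit-learn, and expected improvement as the acquisition function evaluated on a fixed candidate grid.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def objective(x):
    # Hypothetical stand-in for a validation loss as a function of one hyperparameter.
    return (x - 0.3) ** 2 + 0.1 * np.sin(15 * x)


def expected_improvement(candidates, gp, best_y):
    # Expected improvement for minimization, from the surrogate's mean and std.
    mean, std = gp.predict(candidates, return_std=True)
    std = np.maximum(std, 1e-9)
    z = (best_y - mean) / std
    return (best_y - mean) * norm.cdf(z) + std * norm.pdf(z)


rng = np.random.default_rng(0)
candidates = np.linspace(0, 1, 201).reshape(-1, 1)

# Start from a few random evaluations.
X_obs = rng.uniform(0, 1, size=(3, 1))
y_obs = objective(X_obs).ravel()

for _ in range(10):
    # Refit the surrogate on everything observed so far.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
    gp.fit(X_obs, y_obs)
    # Evaluate the candidate with the highest expected improvement next.
    ei = expected_improvement(candidates, gp, y_obs.min())
    x_next = candidates[np.argmax(ei)].reshape(1, 1)
    X_obs = np.vstack([X_obs, x_next])
    y_obs = np.append(y_obs, objective(x_next).ravel())

best_x = X_obs[np.argmin(y_obs), 0]
print("best hyperparameter found:", best_x)
print("best objective value:", y_obs.min())
```

Each iteration spends one expensive evaluation where the surrogate predicts either a low loss or high uncertainty.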

### 4. **Gradient-Based Optimization**
- Uses gradients to optimize continuous hyperparameters (e.g., learning rate).
- **Pros**: Fast for differentiable hyperparameters.
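- **Cons**: Limited to continuous hyperparameters for which the validation objective is differentiable; discrete choices such as the number of layers cannot be tuned this way.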
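
As a small worked case, ridge regression has a closed-form solution, so the gradient of the validation loss with respect to the regularization strength can be written down exactly. The sketch below assumes synthetic data and tunes `log(lam)` by plain gradient descent; the step size of 0.5 and the 50 iterations are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = rng.normal(size=5)
X_train = rng.normal(size=(40, 5))
y_train = X_train @ true_w + rng.normal(scale=0.5, size=40)
X_val = rng.normal(size=(40, 5))
y_val = X_val @ true_w + rng.normal(scale=0.5, size=40)


def val_loss_and_grad(log_lam):
    """Validation MSE of ridge regression and its gradient w.r.t. log(lam)."""
    lam = np.exp(log_lam)
    A = X_train.T @ X_train + lam * np.eye(5)
    w = np.linalg.solve(A, X_train.T @ y_train)  # ridge solution
    residual = X_val @ w - y_val
    loss = np.mean(residual ** 2)
    dw_dlam = -np.linalg.solve(A, w)  # derivative of the solution w.r.t. lam
    dloss_dlam = 2 * np.mean(residual * (X_val @ dw_dlam))
    return loss, dloss_dlam * lam  # chain rule: d/dlog(lam) = lam * d/dlam


log_lam = 0.0
history = []
for _ in range(50):
    loss, grad = val_loss_and_grad(log_lam)
    history.append(loss)
    log_lam -= 0.5 * grad  # gradient descent step on the hyperparameter

print("final lam:", np.exp(log_lam))
print("validation loss, first and last step:", history[0], history[-1])
```

Optimizing in log space keeps the regularization strength positive without any explicit constraint.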

### 5. **Evolutionary Algorithms**
- Uses techniques inspired by natural evolution (e.g., mutation, crossover) to optimize hyperparameters.
- **Pros**: Effective for non-differentiable and complex search spaces.
- **Cons**: Can be computationally intensive.
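
A minimal sketch of the idea, using only the standard library: the `validation_loss` function is a hypothetical stand-in for training and validating a model, and the population size, mutation rates, and hyperparameter ranges are arbitrary assumptions.

```python
import random

random.seed(0)


def validation_loss(params):
    # Hypothetical objective; in practice this would train and validate a model.
    return (params["learning_rate"] - 0.1) ** 2 * 100 + abs(params["max_depth"] - 6)


def random_individual():
    return {
        "learning_rate": random.uniform(0.001, 1.0),
        "max_depth": random.randint(1, 12),
    }


def crossover(a, b):
    # Each hyperparameter is inherited from one of the two parents.
    return {key: random.choice([a[key], b[key]]) for key in a}


def mutate(individual):
    child = dict(individual)
    if random.random() < 0.5:
        scaled = child["learning_rate"] * random.uniform(0.5, 1.5)
        child["learning_rate"] = min(1.0, max(0.001, scaled))
    if random.random() < 0.5:
        child["max_depth"] = min(12, max(1, child["max_depth"] + random.choice([-1, 1])))
    return child


population = [random_individual() for _ in range(10)]
best_per_generation = []

for _ in range(15):
    population.sort(key=validation_loss)
    best_per_generation.append(validation_loss(population[0]))
    parents = population[:4]  # selection: the 4 fittest survive unchanged
    children = [mutate(crossover(*random.sample(parents, 2))) for _ in range(6)]
    population = parents + children

best = min(population, key=validation_loss)
print("best configuration:", best)
print("best loss per generation:", [round(v, 3) for v in best_per_generation])
```

Because the fittest individuals are carried over unchanged, the best loss never gets worse from one generation to the next.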

### 6. **Early Stopping Techniques**
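- Halts training when performance on validation data stops improving, which in effect tunes the number of training iterations.
- Strictly a regularization technique rather than a search method, so it is usually combined with the methods above; schedulers such as successive halving apply the same idea to drop unpromising trials early.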
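
The stopping rule is commonly implemented with a patience counter. The sketch below is one possible version; the function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the loss values are all made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and keep training.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs in a row: stop here.
            return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, one per epoch.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.49, 0.48]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print("stopped at epoch", stop_epoch, "with best epoch", best_epoch)
```

Note the trade-off this exposes: with `patience=3` training stops at epoch 6 and never reaches the lower losses at epochs 7 and 8, so patience is itself a hyperparameter.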

## Best Practices
1. **Start Simple**: Begin with default hyperparameters and gradually fine-tune.
2. **Use Cross-Validation**: Score each candidate with cross-validation for a more reliable estimate of generalization, and keep a separate test set for the final check.
3. **Scale Resources**: Use cloud computing or distributed systems for large-scale hyperparameter optimization.
4. **Automated Tools**: Leverage libraries like:
   - `GridSearchCV` and `RandomizedSearchCV` in scikit-learn.
   - `Optuna` for sampler-based search (its default sampler is TPE) with pruning of unpromising trials.
   - `Hyperopt` for TPE-based search and `Ray Tune` for distributed, scalable optimization.
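
As a sketch of the cross-validation practice, the loop below scores three candidate values of `C` for an RBF-kernel `SVC` with 5-fold stratified cross-validation. It assumes scikit-learn's bundled Iris data rather than a local CSV file, and the candidate values are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Fixed seed so every candidate is scored on the same folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

mean_scores = {}
for C in [0.1, 1, 10]:
    scores = cross_val_score(SVC(C=C, kernel="rbf"), X, y, cv=cv)
    mean_scores[C] = scores.mean()
    print(f"C={C}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")

best_C = max(mean_scores, key=mean_scores.get)
print("best C by cross-validation:", best_C)
```

Reusing the same folds for every candidate keeps the comparison fair, since differences in score then come from the hyperparameter and not from the split.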

## Summary
Hyperparameter optimization improves model performance by searching systematically for good configurations. Grid and random search are straightforward to apply, while Bayesian optimization is more sample-efficient when each evaluation is expensive. Careful tuning can significantly affect prediction quality and the overall success of a machine learning project.


In [3]:
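import pandas as pd
# Load the Iris data. 'Id' is a row identifier and 'Species' is the target,
# so both are dropped from the feature matrix.
data = pd.read_csv('iris.csv')
x = data.drop(['Species', 'Id'], axis=1)
y = data['Species']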

In [5]:
x.sample(5)

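|    | SepalLengthCm | SepalWidthCm | PetalLengthCm | PetalWidthCm |
|----|---------------|--------------|---------------|--------------|
| 74 | 6.4           | 2.9          | 4.3           | 1.3          |
| 60 | 5.0           | 2.0          | 3.5           | 1.0          |
| 88 | 5.6           | 3.0          | 4.1           | 1.3          |
| 11 | 4.8           | 3.4          | 1.6           | 0.2          |
| 35 | 5.0           | 3.2          | 1.2           | 0.2          |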


In [7]:
y.sample(5)

11        Iris-setosa
53    Iris-versicolor
85    Iris-versicolor
60    Iris-versicolor
9         Iris-setosa
Name: Species, dtype: object

In [11]:
from sklearn.model_selection import train_test_split
# No random_state is set, so the split (and every score below) changes between runs.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)

In [13]:
from sklearn.svm import SVC
# Baseline model with hand-picked hyperparameters; a small C means strong regularization.
model = SVC(C=0.1, kernel='rbf')

In [15]:
model.fit(x_train , y_train)

In [17]:
model.score(x_test , y_test)

0.8888888888888888

### GridSearchCV

In [20]:
from sklearn.model_selection import GridSearchCV

In [22]:
param_grid = {'C':[0.1, 1, 10],
             'kernel':['rbf', 'linear', 'poly']}

In [24]:
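# In recent scikit-learn versions cv defaults to 5 folds: 9 combinations x 5 folds = 45 fits,
# plus one final refit of the best combination on all of x_train.
grid_search = GridSearchCV(estimator=model, param_grid=param_grid)
grid_search.fit(x_train, y_train)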

In [26]:
grid_search.best_params_

{'C': 1, 'kernel': 'rbf'}

In [28]:
grid_search.score(x_test, y_test)

0.9777777777777777
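
Beyond `best_params_`, the fitted search object keeps the score of every combination in `cv_results_`, which helps show how close the runners-up were. The sketch below is self-contained and therefore makes some assumptions that differ from the cells above: it loads Iris through `load_iris` instead of `iris.csv` and fixes `random_state=0`, so its numbers will not match the outputs shown here.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

search = GridSearchCV(
    SVC(),
    {"C": [0.1, 1, 10], "kernel": ["rbf", "linear", "poly"]},
    cv=5,
)
search.fit(X_train, y_train)

# One row per combination, best-ranked first.
columns = ["param_C", "param_kernel", "mean_test_score", "std_test_score", "rank_test_score"]
results = pd.DataFrame(search.cv_results_)[columns].sort_values("rank_test_score")
print(results.to_string(index=False))
```

When several combinations score within one standard deviation of each other, the simplest of them is often a reasonable choice.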

### RandomizedSearchCV

In [31]:
from sklearn.model_selection import RandomizedSearchCV

In [33]:
param_dist = {'C':[0.1, 1, 10],
             'kernel':['rbf', 'linear', 'poly']}

In [35]:
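# Every entry in param_dist is a list, so the 8 candidates are drawn without
# replacement from the 9 possible combinations.
randomized_search = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=8)
randomized_search.fit(x_train, y_train)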

In [37]:
randomized_search.best_params_

{'kernel': 'rbf', 'C': 1}

In [39]:
randomized_search.score(x_test, y_test)

0.9777777777777777
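
With only list-valued entries, the randomized search above samples from the same nine combinations as the grid. Random search becomes more useful when a hyperparameter is drawn from a continuous distribution. The sketch below assumes `scipy.stats.loguniform` for `C` over the range 0.01 to 100, loads Iris through `load_iris` instead of `iris.csv`, and fixes the seeds; these are illustrative choices, so the results will differ from the outputs above.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# C is sampled on a log scale; kernel is still picked from a list.
param_dist = {
    "C": loguniform(1e-2, 1e2),
    "kernel": ["rbf", "linear", "poly"],
}

search = RandomizedSearchCV(
    SVC(),
    param_distributions=param_dist,
    n_iter=20,
    cv=5,
    random_state=0,
)
search.fit(X_train, y_train)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)
print("test accuracy:", search.score(X_test, y_test))
```

A log-uniform distribution spends as many samples between 0.01 and 0.1 as between 10 and 100, which suits hyperparameters such as `C` whose effect spans orders of magnitude.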