
# Grid Search and Random Search

**Introduction to Grid Search and Random Search**

- What is Grid Search?

  - A hyperparameter tuning method that systematically evaluates every combination of hyperparameter values in a predefined grid

  - How it works:

    - Define a range of values for each hyperparameter

    - Train and evaluate the model on each combination of hyperparameter values

    - Select the combination that yields the best performance

- What is Random Search?

  - An alternative method that samples hyperparameter combinations at random from the specified ranges or distributions

  - How it works:

    - Define ranges or distributions for each hyperparameter

    - Randomly sample a specified number of combinations

    - Train and evaluate the model for each sampled combination, then select the best-performing one
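
The two procedures above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not how scikit-learn implements its searches: `toy_score` is a hypothetical stand-in for training and evaluating a model, the helper names are invented, and the random sampler draws with replacement for simplicity.

```python
import itertools
import random

# Hypothetical search space; the values are chosen only for illustration.
space = {
    "n_estimators": [100, 200, 300],
    "min_samples_split": [2, 5, 10],
}

def toy_score(params):
    """Stand-in for 'train and evaluate the model' (an assumption, not a real metric)."""
    return (-abs(params["n_estimators"] - 200) / 100
            - abs(params["min_samples_split"] - 5) / 10)

def grid_candidates(space):
    """Yield every combination of values in the grid."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))

def random_candidates(space, n_iter, seed=0):
    """Yield n_iter combinations, drawing each value independently at random."""
    rng = random.Random(seed)
    for _ in range(n_iter):
        yield {name: rng.choice(values) for name, values in space.items()}

grid_list = list(grid_candidates(space))
random_list = list(random_candidates(space, n_iter=4))

# Both methods finish the same way: keep the best-scoring candidate.
best_grid = max(grid_list, key=toy_score)
best_random = max(random_list, key=toy_score)

print(f"Grid search tried {len(grid_list)} combinations; best: {best_grid}")
print(f"Random search tried {len(random_list)} combinations; best: {best_random}")
```

Grid search is guaranteed to visit the best point in the grid, while random search only finds it if it happens to be sampled within the budget.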

**Pros and Cons of Each Method**

| Feature | Grid Search | Random Search |
|----------|----------|----------|
| Exhaustiveness    | Evaluates all combinations    | Randomly samples combinations    |
| Time Efficiency    | Computationally expensive for large grids    | Faster for large parameter spaces    |
| Exploration    | Limited to predefined grid    | Explores more diverse ranges    |
| Best use case    | Small parameter spaces    | Large parameter spaces with time constraints    |
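
The time-efficiency row can be made concrete by counting model fits. The sketch below assumes k-fold cross-validation and ignores the final refit on the full training set; the helper names are hypothetical, and the example values mirror the search spaces used in this notebook's code cell.

```python
import math

def n_grid_fits(param_grid, cv):
    """Fits for an exhaustive grid: every combination times every CV fold."""
    return math.prod(len(values) for values in param_grid.values()) * cv

def n_random_fits(n_iter, cv):
    """Fits for random search: set by the sampling budget, not by the size of the space."""
    return n_iter * cv

small_grid = {"n_estimators": [100, 200, 300], "min_samples_split": [2, 5, 10]}
large_space = {
    "n_estimators": [50, 100, 150, 200],
    "max_depth": [None, 5, 10, 15],
    "min_samples_split": [2, 5, 10, 20],
}

print(n_grid_fits(small_grid, cv=5))    # 3 * 3 * 5 = 45
print(n_grid_fits(large_space, cv=5))   # 4 * 4 * 4 * 5 = 320
print(n_random_fits(n_iter=20, cv=5))   # 20 * 5 = 100
```

Adding a hyperparameter multiplies the grid cost, whereas the random search cost stays at `n_iter * cv` however large the space becomes.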


**Practical Guidance on Choosing Search Ranges**

- Start with Broad Ranges

  - Use Random Search to explore large parameter spaces and identify promising ranges

- Refine with Grid Search

  - Narrow the search based on the Random Search results and perform an exhaustive Grid Search for fine-tuning

- Understand Model Sensitivity

  - Some hyperparameters (e.g., learning rate) require fine granularity, while others (e.g., number of trees) can use coarser steps
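
One possible way to apply this guidance is to sample the sensitive hyperparameter on a log scale during the broad search, then build a narrow grid around the best value found. The sketch below uses only the standard library; `log_uniform`, `refine_around`, and every numeric range are hypothetical, and `best_learning_rate` is a placeholder for a Random Search result, not a measured value.

```python
import random

def log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return low * (high / low) ** rng.random()

def refine_around(center, factor, num):
    """Return num log-spaced values from center / factor to center * factor."""
    low, high = center / factor, center * factor
    return [low * (high / low) ** (i / (num - 1)) for i in range(num)]

rng = random.Random(42)

# Stage 1: broad ranges. Fine-grained, log-scale draws for the learning rate,
# coarse steps for the number of trees.
broad_candidates = [
    {
        "learning_rate": log_uniform(rng, 1e-4, 1.0),
        "n_estimators": rng.choice([100, 300, 500]),
    }
    for _ in range(10)
]

# Stage 2: narrow grid around the region the broad search favoured.
# Placeholder value (assumption): in practice this comes from the search results.
best_learning_rate = 0.05
fine_grid = {
    "learning_rate": refine_around(best_learning_rate, factor=2.0, num=5),
    "n_estimators": [200, 300, 400],
}

print(fine_grid["learning_rate"])
```

Log spacing spends the same number of candidates on each order of magnitude, which suits parameters whose effect changes multiplicatively.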

**Objective**

- Implement both Grid Search and Random Search for hyperparameter tuning, compare their efficiency, and analyze their impact on model performance

In [2]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import numpy as np

# Load Dataset
data = load_iris()
X, y = data.data, data.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Display dataset info
print(f"Feature Names:{data.feature_names}")
print(f"Class Names:{data.target_names}")

# Define hyperparameter grid for RandomForestClassifier
param_grid = {
    "n_estimators": [100, 200, 300],
    "min_samples_split": [2, 5, 10]
}

# Initialize Grid Search
grid_search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    scoring="accuracy",
    cv=5,
    n_jobs=-1
)

# Perform Grid Search
grid_search.fit(X_train, y_train)

# Evaluate best model
best_grid_model = grid_search.best_estimator_
y_pred_grid = best_grid_model.predict(X_test)
accuracy_grid = accuracy_score(y_test, y_pred_grid)

print(f"Best Hyperparameters(Grid Search): {grid_search.best_params_}")
print(f"Grid Search Accuracy: {accuracy_grid:.4f}")

# Define the search space; RandomizedSearchCV samples uniformly from each list
param_dist = {
    "n_estimators": [50, 100, 150, 200],
    "max_depth": [None, 5, 10, 15],
    "min_samples_split": [2, 5, 10, 20]
}

# Initialize Random Search
random_search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions=param_dist,
    n_iter=20,
    scoring="accuracy",
    cv=5,
    n_jobs=-1,
    random_state=42  # Seed the sampler so the draws are reproducible
)

# Perform Random Search
random_search.fit(X_train, y_train)

# Evaluate best model
best_random_model = random_search.best_estimator_
y_pred_random = best_random_model.predict(X_test)
accuracy_random = accuracy_score(y_test, y_pred_random)

print(f"Best Hyperparameters(Random Search): {random_search.best_params_}")
print(f"Random Search Accuracy: {accuracy_random:.4f}")

Feature Names:['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Class Names:['setosa' 'versicolor' 'virginica']
Best Hyperparameters(Grid Search): {'min_samples_split': 2, 'n_estimators': 200}
Grid Search Accuracy: 1.0000
Best Hyperparameters(Random Search): {'n_estimators': 50, 'min_samples_split': 5, 'max_depth': 15}
Random Search Accuracy: 1.0000
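
Both searches reach the same test accuracy, so the output above does not separate them on its own. One way to compare efficiency is to time each search and read its mean cross-validated score from `best_score_`. The sketch below is an assumption-laden illustration: `timed_search` is a hypothetical helper, and the search spaces, `cv=3`, and `n_iter=5` are deliberately small so that it runs quickly.

```python
import time

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split

def timed_search(search, X, y):
    """Fit a search object and summarise its cost and cross-validated score."""
    start = time.perf_counter()
    search.fit(X, y)
    elapsed = time.perf_counter() - start
    return {
        "candidates": len(search.cv_results_["params"]),
        "seconds": elapsed,
        "best_cv_accuracy": search.best_score_,
        "best_params": search.best_params_,
    }

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Deliberately small search spaces (assumption) to keep the run short.
small_grid = {"n_estimators": [10, 20], "min_samples_split": [2, 5]}
small_dist = {
    "n_estimators": [10, 20, 30],
    "max_depth": [None, 5],
    "min_samples_split": [2, 5, 10],
}

grid_summary = timed_search(
    GridSearchCV(RandomForestClassifier(random_state=42), small_grid,
                 scoring="accuracy", cv=3),
    X_train, y_train,
)
random_summary = timed_search(
    RandomizedSearchCV(RandomForestClassifier(random_state=42), small_dist,
                       n_iter=5, scoring="accuracy", cv=3, random_state=42),
    X_train, y_train,
)

for name, summary in [("Grid", grid_summary), ("Random", random_summary)]:
    print(f"{name} Search: {summary['candidates']} candidates, "
          f"{summary['seconds']:.2f}s, best CV accuracy {summary['best_cv_accuracy']:.4f}")
```

To compare the searches from the cell above, pass `param_grid` and `param_dist` with the original `cv=5` and `n_iter=20` instead; timings will vary by machine and by the `n_jobs` setting.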
