<a href="https://colab.research.google.com/github/ShrieVarshini2004/Machine-Learning-Basics/blob/main/GridsearchCV.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

1️⃣ Cross-Validation (CV)

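✅ Purpose: Estimates how well a model performs on unseen data by repeatedly splitting the data into training and validation parts and averaging the scores.

✅ Detects Overfitting: Scoring on held-out folds shows whether the model generalizes to unseen data instead of memorizing the training set.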

✅ Types of CV:

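K-Fold CV: Splits data into K folds; each fold serves once as the test set while the other K-1 folds are used for training.
Stratified K-Fold: Preserves the class proportions of the full dataset in each fold (useful for imbalanced datasets).
Leave-One-Out CV (LOO-CV): Trains on all but one sample and tests on that sample, repeating for every sample.
Time Series CV: For sequential data; each split trains on earlier samples and tests on later ones, so no future information leaks into training.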
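
The four strategies above map onto splitter classes in `sklearn.model_selection`. The sketch below is only an illustration: it scores an `SVC` on iris with the first three splitters, and shows `TimeSeriesSplit` on a made-up ordered sequence (`toy`, an assumption for this example) because iris has no time order.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import (
    KFold,
    LeaveOneOut,
    StratifiedKFold,
    TimeSeriesSplit,
    cross_val_score,
)
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
model = SVC()

# Splitters for the first three CV types; iris is sorted by class, so shuffle
splitters = {
    "K-Fold": KFold(n_splits=5, shuffle=True, random_state=42),
    "Stratified K-Fold": StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    "Leave-One-Out": LeaveOneOut(),
}

cv_scores = {}
for name, splitter in splitters.items():
    # One accuracy value per split
    scores = cross_val_score(model, X, y, cv=splitter, scoring="accuracy")
    cv_scores[name] = scores
    print(f"{name}: {len(scores)} splits, mean accuracy = {scores.mean():.3f}")

# Time Series CV on a toy ordered sequence (hypothetical data, not iris):
# every training index comes before every test index
toy = np.arange(12).reshape(-1, 1)
ts_splits = list(TimeSeriesSplit(n_splits=3).split(toy))
for train_idx, test_idx in ts_splits:
    print("train:", train_idx, "test:", test_idx)
```

Leave-One-Out fits the model once per sample (150 fits here), which is why it is rarely used on large datasets.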

2️⃣ GridSearchCV

✅ Purpose: Finds the best hyperparameters for a model by testing every combination of values in a user-defined grid.

✅ Works by:
Defining a grid of hyperparameter values.
Using Cross-Validation (CV) to test each combination.
Selecting the best combination based on a scoring metric (e.g., accuracy).
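
The three steps above can be written out by hand to see what the search does. This is a simplified sketch, not how `GridSearchCV` is implemented: it uses `ParameterGrid` and `cross_val_score`, a smaller grid chosen for illustration, and skips the refit on the full data and the parallel execution that `GridSearchCV` provides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import ParameterGrid, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step 1: define a grid of hyperparameter values (illustrative, smaller grid)
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

best_score, best_params = -1.0, None
for params in ParameterGrid(param_grid):
    # Step 2: cross-validate this combination
    scores = cross_val_score(SVC(**params), X, y, cv=5, scoring="accuracy")
    mean_score = scores.mean()
    # Step 3: keep the combination with the highest mean score
    if mean_score > best_score:
        best_score, best_params = mean_score, params

print("Combinations tried:", len(ParameterGrid(param_grid)))
print("Best parameters:", best_params)
print("Best mean CV accuracy:", round(best_score, 4))
```

With 6 combinations and 5 folds this loop trains 30 models; the cost grows with the product of the list lengths in the grid.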

✅ Example Use-Case: Tuning hyperparameters like C, kernel, gamma in an SVM.

✅ Alternative: RandomizedSearchCV (samples a fixed number of random combinations, usually faster for large grids).
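
A minimal sketch of the randomized alternative, assuming the same SVM setup on iris. The log-uniform range for `C` and `n_iter=10` are illustrative choices, not values from this notebook.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Lists are sampled uniformly; distributions are sampled via their rvs() method
param_distributions = {
    "C": loguniform(1e-2, 1e2),  # assumed search range for C
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", "auto"],
}

# n_iter fixes the budget: only 10 combinations are evaluated
random_search = RandomizedSearchCV(
    SVC(),
    param_distributions,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=42,
)
random_search.fit(X, y)

print("Best Parameters:", random_search.best_params_)
print("Best Score:", random_search.best_score_)
```

Unlike a grid, the budget here is set by `n_iter`, so adding more hyperparameters does not increase the number of models trained.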

In [1]:
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)

# Define model
model = SVC()

# Define hyperparameter grid
param_grid = {
    'C': [0.1, 1, 10],   # Regularization parameter
    'kernel': ['linear', 'rbf'],  # Kernel type
    'gamma': ['scale', 'auto']  # Kernel coefficient (ignored by the linear kernel)
}

# Apply GridSearchCV
grid_search = GridSearchCV(model, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Print best parameters
print("Best Parameters:", grid_search.best_params_)
print("Best Score:", grid_search.best_score_)


Best Parameters: {'C': 1, 'gamma': 'scale', 'kernel': 'linear'}
Best Score: 0.9583333333333334
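
The best score above is a cross-validation average computed on the training split only; `X_test` and `y_test` are never used in the cell. A hedged sketch of one way to check the tuned model on that held-out split, repeating the same setup so the block runs on its own:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", "auto"],
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
grid_search.fit(X_train, y_train)

# With the default refit=True, the best combination is retrained on all of
# X_train, and score() evaluates that refitted model with the chosen metric
test_accuracy = grid_search.score(X_test, y_test)
print("Held-out test accuracy:", test_accuracy)
```

Because the test split played no part in selecting the hyperparameters, this number is a fairer estimate of performance on new data than `best_score_`.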


In [2]:
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load dataset
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)

# Define model
model = SVC()

# Define hyperparameter grid
param_grid = {
    'C': [0.1, 1, 10],   # Regularization parameter
    'kernel': ['linear', 'rbf'],  # Kernel type
    'gamma': ['scale', 'auto']  # Kernel coefficient (ignored by the linear kernel)
}

# Apply GridSearchCV
grid_search = GridSearchCV(model, param_grid, cv=5, scoring='accuracy', return_train_score=True)
grid_search.fit(X_train, y_train)

# Extract results into a DataFrame
results_df = pd.DataFrame(grid_search.cv_results_)

# Select relevant columns for comparison
comparison_df = results_df[['param_C', 'param_kernel', 'param_gamma', 'mean_test_score', 'rank_test_score']]

# Sort by best test score
comparison_df = comparison_df.sort_values(by="rank_test_score")

# Display the comparison matrix
print("\nEvaluation Matrix:")
print(comparison_df)
print("\nBest Hyperparameters:", grid_search.best_params_)
print("Best Score:", grid_search.best_score_)



Evaluation Matrix:
    param_C param_kernel param_gamma  mean_test_score  rank_test_score
4       1.0       linear       scale         0.958333                1
6       1.0       linear        auto         0.958333                1
11     10.0          rbf        auto         0.958333                1
5       1.0          rbf       scale         0.950000                4
7       1.0          rbf        auto         0.950000                4
8      10.0       linear       scale         0.950000                4
9      10.0          rbf       scale         0.950000                4
10     10.0       linear        auto         0.950000                4
0       0.1       linear       scale         0.941667                9
2       0.1       linear        auto         0.941667                9
3       0.1          rbf        auto         0.916667               11
1       0.1          rbf       scale         0.891667               12

Best Hyperparameters: {'C': 1, 'gamma': 'scale', 'kernel