<font color="turquoise" size="6">Hyperparameter Tuning</font>
<p><font color="turquoise" size="6">1. Grid Search</font>

Grid search is a hyperparameter tuning technique that exhaustively evaluates every combination in a predefined grid of hyperparameter values. Each combination is scored with cross-validation, and the combination with the best score is selected.

<b><font color="Red">How Does Grid Search Work?</font></b>
<ol>
  <li>Define a grid of hyperparameter values.</li>
  <li>Train the model on every combination of hyperparameter values using cross-validation.</li>
  <li>Evaluate the model's performance for each combination using a scoring metric (e.g., accuracy, precision, recall).</li>
  <li>Select the combination that yields the best score.</li>
</ol>
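
The four steps above can be sketched by hand before handing the work to `GridSearchCV`. This is a minimal illustration, not the notebook's own method: the name `small_grid` and its values are assumptions chosen to keep the loop short, and `ParameterGrid` and `cross_val_score` are the scikit-learn helpers used to enumerate combinations and score them.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import ParameterGrid, cross_val_score

X, y = load_wine(return_X_y=True)

# Step 1: define a (deliberately small, illustrative) grid
small_grid = {"n_estimators": [10, 50], "max_depth": [None, 5]}

best_score, best_params = -1.0, None
for params in ParameterGrid(small_grid):
    # Steps 2-3: cross-validate this combination and average the fold scores
    model = RandomForestClassifier(random_state=42, **params)
    score = cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()
    # Step 4: remember the best combination seen so far
    if score > best_score:
        best_score, best_params = score, params

print("Best parameters:", best_params)
print("Best mean CV accuracy:", round(best_score, 3))
```

`GridSearchCV` performs the same loop, and by default also refits the best combination on the full training data.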

In [3]:
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import classification_report


In [4]:
# Step 1: Load the dataset
wine = load_wine()
X = wine.data
y = wine.target


In [5]:
# Step 2: Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 3: Define the model
clf = RandomForestClassifier(random_state=42)


In [6]:
# Step 4: Define the parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],      # Number of trees in the forest
    'max_depth': [None, 10, 20],         # Maximum depth of each tree (None = no limit)
    'min_samples_split': [2, 5, 10],     # Minimum samples required to split an internal node
    'min_samples_leaf': [1, 2, 4]        # Minimum samples required at a leaf node
}

In [7]:
# Step 5: Create a GridSearchCV object
grid_search = GridSearchCV(estimator=clf, param_grid=param_grid, 
                           cv=3, scoring='accuracy', verbose=2, n_jobs=-1)

In [8]:
# Step 6: Fit the GridSearchCV object
grid_search.fit(X_train, y_train)


Fitting 3 folds for each of 81 candidates, totalling 243 fits
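
The 81 candidates and 243 fits reported above follow directly from the grid: four hyperparameters with three values each give 3 × 3 × 3 × 3 = 81 combinations, and each is trained once per fold. A small sketch that checks the arithmetic, using scikit-learn's `ParameterGrid` to enumerate the same grid (the variable names `n_candidates` and `n_folds` are only for illustration):

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
}

n_candidates = len(ParameterGrid(param_grid))  # one candidate per combination
n_folds = 3                                    # matches cv=3 in GridSearchCV

print(n_candidates, "candidates,", n_candidates * n_folds, "fits")
```

The count grows multiplicatively: adding a fifth hyperparameter with three values would triple it.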


In [9]:
# Step 7: Get the best parameters
print("Best Parameters:", grid_search.best_params_)

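# Step 8: Evaluate the best model on the held-out test set
# (best_estimator_ has been refit on the full training set, since refit=True by default)
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)
print("Classification Report:\n", classification_report(y_test, y_pred))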

Best Parameters: {'max_depth': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 100}
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       1.00      1.00      1.00        14
           2       1.00      1.00      1.00         8

    accuracy                           1.00        36
   macro avg       1.00      1.00      1.00        36
weighted avg       1.00      1.00      1.00        36
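
With only 36 test samples, a perfect report is a coarse estimate, so it can help to look at the cross-validated scores that the search itself recorded. The sketch below is a hedged illustration rather than part of the original notebook: it reruns the search on a reduced grid (`small_grid`, an assumption chosen only for speed) and reads the fitted object's `best_score_` and `cv_results_` attributes.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Reduced grid (assumption: smaller than the grid above, to keep this quick)
small_grid = {"n_estimators": [50, 100], "max_depth": [None, 10]}
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    small_grid, cv=3, scoring="accuracy",
)
search.fit(X_train, y_train)

# Mean cross-validated accuracy of the winning combination
print("Best mean CV accuracy:", round(search.best_score_, 3))
# Accuracy of the refit best model on the held-out test set
print("Test accuracy:", round(search.score(X_test, y_test), 3))

# Per-candidate mean and standard deviation across the folds
results = search.cv_results_
for params, mean, std in zip(
    results["params"], results["mean_test_score"], results["std_test_score"]
):
    print(params, round(mean, 3), round(std, 3))
```

Comparing the mean and spread across candidates shows whether the winner is clearly better or merely tied with its neighbours.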



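<b>Cross-validation</b> is a technique used in machine learning to estimate how well a model will perform on unseen data. The data is split into k folds; the model is trained on k-1 of them and validated on the remaining fold, rotating until every fold has served as the validation set once, and the scores are averaged. Because it uses multiple splits of the data, it gives a more reliable estimate of model performance than a single train-test split.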
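
As a minimal sketch of cross-validation on its own, `cross_val_score` from scikit-learn returns one score per fold. The choice of 5 folds and of the default random forest here is an assumption made for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_wine(return_X_y=True)
clf = RandomForestClassifier(random_state=42)

# One accuracy value per fold; each fold is held out exactly once
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")

print("Fold accuracies:", scores)
print("Mean:", scores.mean(), "Std:", scores.std())
```

The spread of the fold scores gives a rough sense of how much the estimate depends on the particular split.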