# Day 19: Hyperparameter Tuning

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Load dataset
data = load_iris()
X = data.data
y = data.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define the model
rf = RandomForestClassifier(random_state=42)

# Hyperparameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Perform GridSearchCV
grid_search = GridSearchCV(estimator=rf, param_grid=param_grid, cv=5, n_jobs=-1)
grid_search.fit(X_train, y_train)

# Best parameters and model
print("Best hyperparameters:", grid_search.best_params_)

# Evaluate the model
best_rf = grid_search.best_estimator_
y_pred = best_rf.predict(X_test)
# print("Accuracy:", accuracy_score(y_test, y_pred))


Best hyperparameters: {'max_depth': None, 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 100}
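
To get a feel for what the grid search above costs, the sketch below counts the candidates in the same grid using only the standard library. The names `candidates`, `n_candidates`, and `n_cv_fits` are illustrative and not part of scikit-learn.

```python
from itertools import product

# Same grid as in the search above.
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
}
cv_folds = 5  # matches cv=5 in the GridSearchCV call

# Every combination of one value per hyperparameter is one candidate.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_candidates = len(candidates)        # 3 * 4 * 3 * 3
n_cv_fits = n_candidates * cv_folds   # each candidate is fit once per fold

print("Candidates:", n_candidates)          # Candidates: 108
print("Cross-validation fits:", n_cv_fits)  # Cross-validation fits: 540
```

With the default `refit=True`, GridSearchCV also trains the best candidate once more on the full training set, so the search above performs 541 fits in total. Each extra value added to the grid multiplies this cost.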



In machine learning, hyperparameters are settings chosen before the learning process begins. They control aspects of training such as model complexity, training time, and learning rate, and they can significantly affect the model's performance. Unlike model parameters (such as the weights in a neural network), which are learned from the data during training, hyperparameters are not learned and must be tuned separately to get the best performance.
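
The difference between a parameter and a hyperparameter can be sketched with a tiny gradient descent loop. This is a minimal illustration on made-up data, not code from the notebook above: the function `train` and the toy values are assumptions. The weight `w` is learned from the data, while `learning_rate` and `n_steps` are chosen by hand before training.

```python
# Toy data generated from y = 3x, so the ideal weight is 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]


def train(learning_rate, n_steps):
    """Fit y = w * x by gradient descent and return the learned weight."""
    w = 0.0  # model parameter: updated from the data during training
    for _ in range(n_steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w


# Same data, same model, different hyperparameters.
w_good = train(learning_rate=0.1, n_steps=50)    # ends very close to 3
w_slow = train(learning_rate=0.001, n_steps=50)  # still far from 3 after 50 steps

print(f"learning_rate=0.1:   w = {w_good:.4f}")
print(f"learning_rate=0.001: w = {w_slow:.4f}")
```

Nothing in the data changed between the two runs; only the hyperparameters did, and that alone decides whether training reaches a good weight.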

Examples of Hyperparameters:
Learning Rate: Controls how much to change the model in response to the estimated error each time the model weights are updated.
Number of Trees (for Random Forest): The number of trees in the ensemble model.
Max Depth (for Decision Trees): The maximum depth of the tree; capping it limits model complexity and helps prevent overfitting.
Batch Size (for Neural Networks): Number of samples per gradient update in deep learning models.
Regularization Strength (like L2): Controls overfitting by penalizing large coefficients.
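
As a rough guide to where these hyperparameters show up in practice, the sketch below sets each one on a scikit-learn estimator. The specific values are arbitrary and chosen only for illustration, and the choice of `LogisticRegression` and `MLPClassifier` as the regularized model and the neural network is an assumption, not something used elsewhere in this post.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Number of trees in a random forest.
forest = RandomForestClassifier(n_estimators=200, random_state=42)

# Maximum depth of a single decision tree.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)

# Regularization strength: C is the inverse of the strength,
# so a smaller C means a stronger penalty on large coefficients.
logreg = LogisticRegression(C=0.1)

# Learning rate and batch size for a small neural network.
mlp = MLPClassifier(learning_rate_init=0.01, batch_size=32, random_state=42)

# get_params() lists every hyperparameter an estimator exposes.
for model in (forest, tree, logreg, mlp):
    print(type(model).__name__, "has", len(model.get_params()), "hyperparameters")
```

All of these are passed to the constructor before `fit` is called, which is what makes them hyperparameters rather than learned parameters.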
    
Importance of Hyperparameter Tuning in ML
Hyperparameter tuning matters because these settings directly affect how well the model generalizes to unseen data. Choosing them well gives the model its best chance of strong performance while avoiding both underfitting and overfitting.

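Overfitting: If the model is too complex (trees that are too deep, leaves that hold too few samples, too little regularization), it can learn noise in the training data and fail to generalize to new data. Adding more trees to a random forest mainly increases training cost rather than overfitting, and a learning rate that is too high usually makes training unstable rather than making the model more complex.
Underfitting: If the model is too simple (too few trees, shallow depth, too much regularization, or a learning rate too low to converge in the available training steps), it may not capture the underlying patterns in the data.
Good hyperparameter tuning lets a model perform at its best by balancing bias and variance.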
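
One way to see this trade-off is to compare training accuracy with cross-validated accuracy as a single complexity hyperparameter changes. The sketch below does this for `max_depth` on a single decision tree, using the same Iris split as the grid search. The list of depths and the `results` dictionary are illustrative choices, and on a dataset as small and clean as Iris the gap between the two scores may be modest.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

results = {}
for depth in [1, 2, 3, 5, None]:
    tree = DecisionTreeClassifier(max_depth=depth, random_state=42)
    # Cross-validated accuracy estimates performance on unseen data.
    cv_acc = cross_val_score(tree, X_train, y_train, cv=5).mean()
    # Training accuracy shows how closely the tree fits the data it saw.
    train_acc = tree.fit(X_train, y_train).score(X_train, y_train)
    results[depth] = (train_acc, cv_acc)
    print(f"max_depth={depth}: train={train_acc:.3f}, cv={cv_acc:.3f}")
```

A tree that scores poorly on both is underfitting. A tree whose training score keeps rising while the cross-validated score stalls or drops is starting to overfit.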

#100DaysOfCodeDay19 #HyperparameterTuning #MachineLearning