<font color="turquoise" size="6">Hyperparameter Tuning</font>
<p><font color="turquoise" size="6">2_Randomized Search</font></p>

What is Randomized Search?

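    Instead of trying every possible combination of hyperparameters (as grid search does), randomized search samples a fixed number of combinations at random from the hyperparameter space.
    Usually faster and computationally cheaper than grid search, because the number of model fits is set by the sampling budget (n_iter) rather than by the size of the grid.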
    Provided by RandomizedSearchCV in scikit-learn.
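
To make the difference in cost concrete, here is a minimal standard-library sketch of the idea. The `space` dictionary and the budget `n_iter = 20` are illustrative assumptions, and the sampling is a simplification (each value is drawn independently and uniformly), not scikit-learn's actual implementation.

```python
import random

# Hypothetical search space, chosen only for illustration.
space = {
    "n_estimators": list(range(50, 200)),
    "max_depth": [None, 10, 20, 30],
    "min_samples_split": list(range(2, 10)),
    "min_samples_leaf": list(range(1, 5)),
}

# Grid search would have to evaluate every combination of values.
grid_size = 1
for values in space.values():
    grid_size *= len(values)

# Randomized search evaluates only a fixed budget of random combinations.
rng = random.Random(42)
n_iter = 20
candidates = [
    {name: rng.choice(values) for name, values in space.items()}
    for _ in range(n_iter)
]

print("Grid combinations:", grid_size)
print("Random candidates:", len(candidates))
```

With this space a full grid has 19,200 combinations, while the random budget stays at 20 regardless of how many values each hyperparameter can take.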

In [1]:
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.metrics import classification_report
from scipy.stats import randint

In [2]:
# Step 1: Load the dataset
wine = load_wine()
X = wine.data
y = wine.target

# Step 2: Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 3: Define the model
clf = RandomForestClassifier(random_state=42)

In [3]:
# Step 4: Define the parameter distribution
param_dist = {
    'n_estimators': randint(50, 200),       # Number of trees in the forest (integers 50 to 199; upper bound is exclusive)
    'max_depth': [None, 10, 20, 30],       # Maximum depth of each tree (None means no limit)
    'min_samples_split': randint(2, 10),   # Minimum samples required to split a node (2 to 9)
    'min_samples_leaf': randint(1, 5)      # Minimum samples required at a leaf node (1 to 4)
}
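
To see what kind of candidates such a distribution produces before running a full search, one option is scikit-learn's `ParameterSampler`, which draws parameter dictionaries from the same kind of specification. The sketch below repeats the distribution so that it runs on its own; the choice of five draws and the name `samples` are assumptions for illustration.

```python
from scipy.stats import randint
from sklearn.model_selection import ParameterSampler

# Same distribution as in the cell above, repeated so this block is self-contained.
param_dist = {
    'n_estimators': randint(50, 200),
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': randint(2, 10),
    'min_samples_leaf': randint(1, 5),
}

# Draw five candidate settings (an arbitrary number, for inspection only).
samples = list(ParameterSampler(param_dist, n_iter=5, random_state=42))

for candidate in samples:
    print(candidate)
```

Lists are sampled uniformly, while `randint(low, high)` draws integers from `low` up to but not including `high`.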

In [4]:
# Step 5: Create a RandomizedSearchCV object
random_search = RandomizedSearchCV(
    estimator=clf,
    param_distributions=param_dist,
    n_iter=20,                # Number of random combinations to try
    cv=3,                     # 3-fold cross-validation
    scoring='accuracy',       # Use accuracy as the evaluation metric
    random_state=42,
    n_jobs=-1,                # Use all available CPU cores
    verbose=2                 # Print progress during search
)

In [5]:
# Step 6: Fit the RandomizedSearchCV object
random_search.fit(X_train, y_train)

Fitting 3 folds for each of 20 candidates, totalling 60 fits


In [6]:
# Step 7: Get the best parameters
print("Best Parameters:", random_search.best_params_)

Best Parameters: {'max_depth': None, 'min_samples_leaf': 2, 'min_samples_split': 9, 'n_estimators': 87}
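
Besides `best_params_`, a fitted search object also keeps the cross-validated score of every sampled candidate in `cv_results_`, which helps judge how much the best setting actually stands out. The sketch below is a self-contained, scaled-down variant of the search above: the name `search`, the smaller budget (`n_iter=5`), and the two-parameter space are assumptions made to keep it quick, so its numbers will not match the run shown in this notebook.

```python
from scipy.stats import randint
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Reduced search (assumption: smaller budget and space, for speed only).
search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_distributions={
        'n_estimators': randint(50, 200),
        'min_samples_split': randint(2, 10),
    },
    n_iter=5,
    cv=3,
    scoring='accuracy',
    random_state=42,
)
search.fit(X_train, y_train)

# cv_results_ is a dict of arrays with one entry per sampled candidate.
results = search.cv_results_
order = results['rank_test_score'].argsort()  # best-ranked candidate first

for i in order[:3]:
    print(
        results['params'][i],
        'mean accuracy: %.3f' % results['mean_test_score'][i],
        'std: %.3f' % results['std_test_score'][i],
    )

print('Best cross-validated accuracy: %.3f' % search.best_score_)
```

If several candidates score within one standard deviation of each other, the choice among them matters little, and a simpler or cheaper setting may be just as good.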


In [7]:
# Step 8: Evaluate the model
best_model = random_search.best_estimator_
y_pred = best_model.predict(X_test)
print("Classification Report:\n", classification_report(y_test, y_pred))

Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       1.00      1.00      1.00        14
           2       1.00      1.00      1.00         8

    accuracy                           1.00        36
   macro avg       1.00      1.00      1.00        36
weighted avg       1.00      1.00      1.00        36

