
# Hyperparameter Tuning in Machine Learning
A machine learning model is a mathematical construct that learns parameters from data. However, it also relies on hyperparameters—settings predetermined before training that influence the model's learning process.
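The distinction between parameters and hyperparameters can be made concrete with a small sketch. The choice of `LogisticRegression` and the value `C=0.5` are illustrative assumptions, not part of the original notebook; the dataset mirrors the one generated later in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Same kind of synthetic dataset as used later in this notebook
X, Y = make_classification(n_samples=200, n_classes=2, n_features=10,
                           n_redundant=0, random_state=1)

# C (inverse regularization strength) is a hyperparameter:
# it is chosen before training and is not changed by fit().
model = LogisticRegression(C=0.5, max_iter=1000)
model.fit(X, Y)

# coef_ and intercept_ are parameters: they are estimated from the data by fit().
print("Hyperparameter C:", model.get_params()["C"])
print("Learned coefficients shape:", model.coef_.shape)
print("Learned intercept shape:", model.intercept_.shape)
```

Changing `C` and refitting would typically yield different learned coefficients, which is the sense in which a hyperparameter influences the learning process.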
#### **What is Hyperparameter Tuning?**
Hyperparameter tuning is the process of selecting the hyperparameter values, such as the learning rate, the number of layers, or the regularization strength, that give the best model performance on validation data.
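#### **Techniques for Hyperparameter Tuning**
Two common techniques in scikit-learn are `GridSearchCV`, which exhaustively evaluates every combination in a predefined grid, and `RandomizedSearchCV`, which evaluates a fixed number of combinations sampled at random from specified ranges or distributions. Both score each candidate with cross-validation.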
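The code cell further down demonstrates `GridSearchCV` only. As a complement, here is a minimal sketch of `RandomizedSearchCV` over the same two Random Forest hyperparameters. The sampling ranges, `n_iter=10`, and the fixed `random_state` values are assumptions chosen for illustration.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, Y = make_classification(n_samples=200, n_classes=2, n_features=10,
                           n_redundant=0, random_state=1)

# Distributions to sample from; randint(low, high) excludes the upper bound
param_distributions = {
    "max_features": randint(1, 6),     # integers 1..5
    "n_estimators": randint(10, 201),  # integers 10..200
}

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=1),
    param_distributions=param_distributions,
    n_iter=10,        # evaluate only 10 sampled combinations
    cv=5,             # 5-fold cross-validation per combination
    random_state=1,   # makes the sampled combinations reproducible
)
search.fit(X, Y)

print("Best parameters:", search.best_params_)
print(f"Best mean CV accuracy: {search.best_score_:.3f}")
print("Combinations evaluated:", len(search.cv_results_["params"]))
```

With `n_iter=10` the search fits far fewer models than an exhaustive grid over the same ranges, at the cost of possibly missing the best combination.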
#### **Importance of Proper Tuning**
Proper tuning can significantly improve a model’s performance by matching its complexity and learning capacity to the data. This applies across algorithms, including neural networks, support vector machines, and XGBoost.
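One way to see how a single hyperparameter controls model complexity is a validation curve. The sketch below is an illustration under assumptions not in the original notebook: it uses a `DecisionTreeClassifier` and varies `max_depth` from 1 to 10, comparing training and cross-validated accuracy.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

X, Y = make_classification(n_samples=200, n_classes=2, n_features=10,
                           n_redundant=0, random_state=1)

depths = np.arange(1, 11)

# For each max_depth, fit on the training folds and score on both
# the training folds and the held-out validation fold (5-fold CV).
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=1), X, Y,
    param_name="max_depth", param_range=depths, cv=5,
)

# Average over the folds and print one row per depth
for depth, tr, va in zip(depths, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"max_depth={depth:2d}  train={tr:.3f}  validation={va:.3f}")
```

A growing gap between the training and validation columns is the usual sign that the model has become more complex than the data supports.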

In [None]:
# Hyperparameter Tuning with Random Forest Classifier

# Importing necessary libraries
from sklearn.datasets import make_classification
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Step 1: Generate a synthetic dataset
X, Y = make_classification(n_samples=200, n_classes=2, n_features=10, n_redundant=0, random_state=1)
print("Dataset shapes:", X.shape, Y.shape)

# Step 2: Split the dataset into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=1)

# Step 3: Initialize and train the Random Forest classifier
rf = RandomForestClassifier(max_features=5, n_estimators=100)
rf.fit(X_train, Y_train)

# Step 4: Evaluate the model
accuracy = rf.score(X_test, Y_test)
print(f"Initial model accuracy: {accuracy:.3f}")

# Hyperparameter Tuning
print("Hyperparameter Tuning")

# Define the hyperparameter grid
max_features_range = np.arange(1, 6, 1)
n_estimators_range = np.arange(10, 210, 10)
param_grid = dict(max_features=max_features_range, n_estimators=n_estimators_range)

# Initialize GridSearchCV
grid = GridSearchCV(estimator=rf, param_grid=param_grid, cv=5)
grid.fit(X_train, Y_train)

# Step 5: Print the best parameters and their mean cross-validated accuracy
print(f"The best parameters are {grid.best_params_} with a score of {grid.best_score_:.2f}")

# Step 6: Compile grid results into a DataFrame
grid_results = pd.concat([pd.DataFrame(grid.cv_results_["params"]),
                          pd.DataFrame(grid.cv_results_["mean_test_score"], columns=["Accuracy"])], axis=1)

# Step 7: Prepare data for visualization
# Pivot into a matrix: rows are max_features, columns are n_estimators,
# and each cell holds the mean cross-validated accuracy for that combination
grid_pivot = grid_results.pivot(index='max_features', columns='n_estimators', values='Accuracy')

# Step 8: Plotting accuracy as a contour plot
x = grid_pivot.columns.values
y = grid_pivot.index.values
z = grid_pivot.values

layout = go.Layout(
    xaxis=go.layout.XAxis(title=go.layout.xaxis.Title(text='n_estimators')),
    yaxis=go.layout.YAxis(title=go.layout.yaxis.Title(text='max_features'))
)

fig = go.Figure(data=[go.Contour(z=z, x=x, y=y)], layout=layout)
fig.update_layout(title='Hyperparameter Tuning Contour Plot', autosize=False,
                  width=500, height=500,
                  margin=dict(l=65, r=50, b=65, t=90))
fig.show()

# Step 9: Plotting accuracy as a 3D surface plot
fig = go.Figure(data=[go.Surface(z=z, y=y, x=x)], layout=layout)
fig.update_layout(title='Hyperparameter Tuning Surface Plot',
                  scene=dict(xaxis_title='n_estimators',
                             yaxis_title='max_features',
                             zaxis_title='Accuracy'),
                  autosize=False,
                  width=800, height=800,
                  margin=dict(l=65, r=50, b=65, t=90))
fig.show()


Dataset shapes: (200, 10) (200,)
Initial model accuracy: 0.825
Hyperparameter Tuning
The best parameters are {'max_features': 2, 'n_estimators': 110} with a score of 0.91
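
The 0.91 reported above is the mean cross-validated accuracy on the training split, whereas the initial 0.825 was measured on the held-out test split, so the two numbers are not directly comparable. A minimal sketch of a like-for-like check is to score the tuned model on the same test split. The reduced grid and the fixed `random_state` on the classifier are assumptions made here to keep the example fast and reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, Y = make_classification(n_samples=200, n_classes=2, n_features=10,
                           n_redundant=0, random_state=1)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=1)

# Reduced grid (an assumption for speed; the notebook's grid is larger)
param_grid = {"max_features": [1, 2, 3, 4, 5], "n_estimators": [50, 100, 150]}

grid = GridSearchCV(RandomForestClassifier(random_state=1), param_grid=param_grid, cv=5)
grid.fit(X_train, Y_train)

# With the default refit=True, best_estimator_ has been retrained on all of
# X_train using the best parameters, so it can be scored on the test split.
test_accuracy = grid.best_estimator_.score(X_test, Y_test)

print("Best parameters:", grid.best_params_)
print(f"Mean CV accuracy of best combination: {grid.best_score_:.3f}")
print(f"Held-out test accuracy of tuned model: {test_accuracy:.3f}")
```

Because the test split has only 40 samples, small differences in test accuracy between the initial and tuned models should be interpreted with caution.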
