
# Introduction to Hyperparameter Tuning

**Parameters and Hyperparameters**

- What are Parameters?

  - Values learned by a machine learning model during training

  - Adjusted to minimize the loss function and optimize predictions

  - Examples:

    - Coefficients in Linear Regression

    - Weights and biases in Neural Networks

- What are Hyperparameters?

  - Settings defined before training that influence how the model learns from data

  - Not learned from the data; instead, they control the training process

  - Examples:

    - Tree Depth

    - Learning Rate

    - Number of Estimators
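
The distinction above can be seen directly in scikit-learn. The following is a minimal sketch, assuming a tiny noise-free toy dataset (`y = 3x + 2`, invented here for illustration): `fit_intercept` is a hyperparameter set in the constructor, while `coef_` and `intercept_` only exist after `fit()` has learned them from the data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data (an assumption for illustration): y = 3x + 2 with no noise
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 2.0

# Hyperparameter: chosen by us before training, passed to the constructor
model = LinearRegression(fit_intercept=True)
print("Hyperparameters:", model.get_params())

# Parameters: learned from the data during fit()
model.fit(X, y)
print("Learned coefficient:", model.coef_)
print("Learned intercept:", model.intercept_)
```

Because the toy data is exactly linear, the learned coefficient and intercept should recover the slope and offset used to generate it.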

**Importance of Tuning Hyperparameters**

- Why Tune Hyperparameters?

  - Improve Model Performance

    - Well-chosen hyperparameters help models generalize better, reducing overfitting and underfitting

  - Enhance Efficiency

    - Proper tuning can reduce training time and computational resources

  - Adapt to Problem-Specific Needs

    - Tailoring hyperparameters helps the model fit the dataset's characteristics
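
To make the overfitting and underfitting point concrete, here is a rough sketch that varies a single hyperparameter, `max_depth`, on a decision tree. The synthetic dataset, its noise level, and the three depth values are assumptions chosen for illustration, not part of the original exercise.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, deliberately noisy data: flip_y assigns the class of 20% of samples at random
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

depth_results = {}
for depth in (1, 4, None):  # None lets the tree grow until its leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    train_acc = tree.score(X_train, y_train)
    test_acc = tree.score(X_test, y_test)
    depth_results[depth] = (train_acc, test_acc)
    print(f"max_depth={depth}: train={train_acc:.3f}, test={test_acc:.3f}")
```

Typically a very shallow tree scores poorly on both splits (underfitting), while an unrestricted tree fits the training split almost perfectly but shows a large gap to the test split (overfitting). The exact numbers depend on the data and the seed.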

**Common Hyperparameters in Popular Models**

- Decision Trees and Random Forests

  - Max Depth: Limits the depth of trees to avoid overfitting

  - Min Samples Split: Minimum samples required to split an internal node

  - Number of Estimators: Total number of trees in a Random Forest

- Gradient Boosting Models

  - Learning Rate: Scales the contribution of each tree to the ensemble

  - Subsample: Fraction of training data used to train each tree

  - Max Depth: Limits the complexity of individual trees

- Neural Networks

  - Learning Rate: Step size for weight updates

  - Number of Layers: Determines the depth of the network

  - Batch Size: Number of samples per gradient update
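
As a quick reference, the hyperparameters listed above correspond to constructor arguments in scikit-learn. The sketch below only builds the estimators and does not train them. The specific values are arbitrary placeholders, and using `MLPClassifier` to stand in for "Neural Networks" is an assumption; deep learning frameworks expose the same ideas under different names.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Random Forest: tree depth, split threshold, number of trees
rf = RandomForestClassifier(
    n_estimators=100, max_depth=5, min_samples_split=4, random_state=42
)

# Gradient Boosting: learning rate, row subsampling, tree depth
gb = GradientBoostingClassifier(
    learning_rate=0.1, subsample=0.8, max_depth=3, random_state=42
)

# Neural network: two hidden layers, initial learning rate, batch size
mlp = MLPClassifier(
    hidden_layer_sizes=(32, 16), learning_rate_init=0.001, batch_size=32,
    random_state=42,
)

# Print only the hyperparameters discussed in the list above
for model, names in [
    (rf, ["n_estimators", "max_depth", "min_samples_split"]),
    (gb, ["learning_rate", "subsample", "max_depth"]),
    (mlp, ["hidden_layer_sizes", "learning_rate_init", "batch_size"]),
]:
    params = model.get_params()
    print(type(model).__name__, {name: params[name] for name in names})
```

In `MLPClassifier`, the number of layers is not a separate argument: it is implied by the length of the `hidden_layer_sizes` tuple.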

**Objective**

- Train a model with default hyperparameters, evaluate its performance, and manually adjust a few hyperparameters to observe their impact on the results.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
import pandas as pd

# Load dataset
data = load_iris()
X, y = data.data, data.target

# Split dataset into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Display dataset info
print(f"Feature Names: \n{data.feature_names}")
print(f"Class names: \n{data.target_names}")

# Train Random Forest with default hyperparameters
rf_default = RandomForestClassifier(random_state=42)
rf_default.fit(X_train, y_train)

# Predict and evaluate
y_pred_default = rf_default.predict(X_test)
accuracy_default = accuracy_score(y_test, y_pred_default)

print(f"Default Model Accuracy: {accuracy_default:.4f}")
print("Classification Report:", classification_report(y_test, y_pred_default))

# Train Random Forest with manually adjusted hyperparameters
rf_tuned = RandomForestClassifier(
    n_estimators=200,
    max_depth=5,
    random_state=42
)
rf_tuned.fit(X_train, y_train)

# Predict and Evaluate
y_pred_tuned = rf_tuned.predict(X_test)
accuracy_tuned = accuracy_score(y_test, y_pred_tuned)

print(f"Tuned Model Accuracy: {accuracy_tuned:.4f}")
print("Classification Report", classification_report(y_test, y_pred_tuned))

Feature Names: 
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
Class names: 
['setosa' 'versicolor' 'virginica']
Default Model Accuracy: 1.0000
Classification Report:               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

Tuned Model Accuracy: 1.0000
Classification Report               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      
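
Both models score 1.0000 on the 30-sample test split, so this single split cannot show any difference between the two settings. One way to get a more sensitive comparison is k-fold cross-validation over the full dataset. The following is a sketch, not the notebook's original method: the use of `cross_val_score`, the choice of 5 folds, and the extra `max_depth=2` candidate are all assumptions added for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Candidate settings; the third one is a hypothetical extra for comparison
candidates = {
    "default": {},
    "n_estimators=200, max_depth=5": {"n_estimators": 200, "max_depth": 5},
    "max_depth=2": {"max_depth": 2},
}

cv_results = {}
for label, params in candidates.items():
    model = RandomForestClassifier(random_state=42, **params)
    # 5-fold cross-validation; the default scorer for classifiers is accuracy
    scores = cross_val_score(model, X, y, cv=5)
    cv_results[label] = (scores.mean(), scores.std())
    print(f"{label}: mean accuracy={scores.mean():.4f} (std={scores.std():.4f})")
```

Averaging over several folds makes small differences between settings easier to see, although on a dataset as small and easy as Iris the gaps may still be within the fold-to-fold spread.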