# Model Complexity Tuning

This notebook shows how to choose the complexity of a decision tree classifier, here its maximum depth, so that the model neither underfits nor overfits.

## 🚀 Introduction to Model Complexity Tuning

When building machine learning models, choosing the right complexity (like the depth of a decision tree) is crucial. If the model is too simple, it won't capture important patterns (underfitting). If it's too complex, it may memorize the training data and not perform well on new data (overfitting).
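To make the two failure modes concrete, here is a minimal sketch that compares a depth-1 tree with an unrestricted tree. It assumes a synthetic dataset from `make_classification` with deliberate label noise rather than the data used later in this notebook; the sample sizes and the `flip_y` value are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data, not the notebook's dataset.
# flip_y=0.2 assigns a random class to about 20% of samples (label noise).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=4, flip_y=0.2, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

scores = {}
for label, depth in [("shallow", 1), ("deep", None)]:
    # max_depth=None lets the tree grow until every leaf is pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    scores[label] = (tree.score(X_train, y_train), tree.score(X_val, y_val))
    print(f"{label}: Train={scores[label][0]:.3f}, Val={scores[label][1]:.3f}")
```

The unrestricted tree fits the training split perfectly, noise included, so its training score says little about how it will do on new data. The depth-1 tree cannot even fit the training split well, which is the signature of underfitting.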

## 🎯 Strategy for Tuning Model Complexity

- **Start Simple:** Begin with a simple model (low complexity)
- **Gradually Increase:** Make the model more complex step by step
- **Monitor Performance:** Check how well the model does on training and validation data
- **Find the Sweet Spot:** Select the complexity that gives the best validation performance
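
The four steps above can be sketched with scikit-learn's `validation_curve`, which sweeps one hyperparameter and returns training and validation scores for each value. This is one possible implementation, not the only one; the depth range of 1 to 10 and the 5-fold setting are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
depths = list(range(1, 11))  # start simple, increase step by step

# Each row holds the scores for one depth, each column one CV fold
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(random_state=42),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    cv=5,
)

# Monitor performance: average over the folds
mean_train = train_scores.mean(axis=1)
mean_val = val_scores.mean(axis=1)
for depth, tr, va in zip(depths, mean_train, mean_val):
    print(f"Depth {depth}: Train={tr:.3f}, Val={va:.3f}")

# Find the sweet spot: the depth with the highest mean validation score
best_depth = depths[int(np.argmax(mean_val))]
print(f"Best depth: {best_depth}")
```

Because `np.argmax` returns the first maximum, ties are resolved in favor of the shallowest tree, which is usually the preferable choice.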

## 💻 Interactive Model Tuning

The Python code below measures how accuracy on the training and validation sets changes with tree depth. Run it to see the effect of complexity on performance.

In [None]:
# Load example dataset and split into training and validation sets
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data = load_iris()
X, y = data.data, data.target
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=42)


In [None]:
# Interactive model complexity analysis
from sklearn.tree import DecisionTreeClassifier
import numpy as np

def analyze_complexity(max_depths):
    """Fit one tree per depth and return accuracy on the training and validation sets."""
    train_scores = []
    val_scores = []

    for depth in max_depths:
        model = DecisionTreeClassifier(max_depth=depth, random_state=42)

        # Fit on the training split only, then score both splits.
        # The validation split is never used for fitting.
        model.fit(X_train, y_train)
        train_score = model.score(X_train, y_train)
        val_score = model.score(X_val, y_val)
        
        train_scores.append(train_score)
        val_scores.append(val_score)
        
        print(f"Depth {depth}: Train={train_score:.3f}, Val={val_score:.3f}")
    
    return train_scores, val_scores

# Test different complexities
depths = range(1, 15)
train_scores, val_scores = analyze_complexity(depths)

# Find optimal complexity (np.argmax returns the first maximum,
# so ties go to the shallowest tree)
optimal_depth = depths[np.argmax(val_scores)]
print(f"\n🎯 Optimal Depth: {optimal_depth}")


## 🔗 Open in Colab

You can run this notebook interactively in Google Colab by clicking the link below:

[🚀 Open in Colab](https://colab.research.google.com/github/Roopesht/codeexamples/blob/main/genai/python_easy/3/complexity_tuning.ipynb)

## 🎯 Insights on Model Tuning

- **Training score** tends to improve as complexity increases, because a deeper tree can fit the training data more closely.
- **Validation score** typically improves at first, then plateaus or degrades once the tree starts fitting noise.
- The **best complexity** is where the validation score peaks.
- A large gap between training and validation scores indicates overfitting.

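Comparing training and validation scores in this way is the core of hyperparameter tuning, and the same procedure applies to any setting that controls model complexity, not only tree depth.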
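
As a sketch of how these insights carry over to a standard tuning tool, the example below uses `GridSearchCV` with `return_train_score=True` to report the training score, validation score, and gap for each depth. The separate test split and the depth grid are assumptions added for illustration; they are not part of the code above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# Assumption: hold out a test split that plays no part in choosing the depth
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": list(range(1, 11))},
    cv=5,
    return_train_score=True,  # needed to compare training and validation scores
)
search.fit(X_train, y_train)

results = search.cv_results_
for depth, tr, va in zip(
    results["param_max_depth"],
    results["mean_train_score"],
    results["mean_test_score"],  # mean score on the held-out CV folds
):
    print(f"Depth {depth}: Train={tr:.3f}, Val={va:.3f}, Gap={tr - va:.3f}")

best_depth = search.best_params_["max_depth"]
# With the default refit=True, the best model is refit on all of X_train
test_accuracy = search.score(X_test, y_test)
print(f"Best depth: {best_depth}, test accuracy: {test_accuracy:.3f}")
```

Scoring the selected model on data that was not used for tuning gives a less optimistic estimate than the validation score, since the depth was chosen to maximize that score.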