Q7. Build a Decision Tree (sklearn)
1.	Use sklearn.tree.DecisionTreeClassifier on the Iris dataset.
2.	Train trees with max_depth = 1, 2, 3.
3.	Report training and test accuracy for each depth.
4.	Discuss signs of underfitting vs overfitting.


In [2]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split into train/test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print(f"Loaded Iris dataset. Train size: {X_train.shape[0]}, Test size: {X_test.shape[0]}")

Loaded Iris dataset. Train size: 105, Test size: 45
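
As a quick sanity check on the split, the sketch below counts how many samples of each class ended up in each partition. It re-creates the split with the same arguments so it can run on its own, and it assumes NumPy is importable (scikit-learn depends on it). With stratify=y, each of the three classes should appear equally often in both parts.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Re-create the same stratified 70/30 split used above.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Count how many samples of each class landed in each partition.
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print("Train class counts:", train_counts)
print("Test class counts: ", test_counts)
```

Balanced classes matter for the results that follow: with three equally sized classes, always predicting one class scores 1/3, and getting two classes right scores 2/3.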


In [3]:
depths = [1, 2, 3]
results = []

for depth in depths:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=42)
    clf.fit(X_train, y_train)
    
    # Predictions
    y_train_pred = clf.predict(X_train)
    y_test_pred = clf.predict(X_test)
    
    # Accuracy
    train_acc = accuracy_score(y_train, y_train_pred)
    test_acc = accuracy_score(y_test, y_test_pred)
    
    results.append((depth, train_acc, test_acc))

# Print results
print("Depth | Train Accuracy | Test Accuracy")
for depth, train_acc, test_acc in results:
    print(f"{depth:<5} | {train_acc:<14.3f} | {test_acc:.3f}")


Depth | Train Accuracy | Test Accuracy
1     | 0.667          | 0.667
2     | 0.971          | 0.889
3     | 0.981          | 0.978
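
To put these accuracies in perspective, the sketch below converts them back into counts of correctly classified samples. The accuracy values are copied from the printed table and the set sizes from the split message; the helper name to_count is made up for this illustration.

```python
# Set sizes from the split message; accuracies copied from the printed table.
n_train, n_test = 105, 45
reported = {1: (0.667, 0.667), 2: (0.971, 0.889), 3: (0.981, 0.978)}

def to_count(accuracy, n):
    """Recover the number of correct predictions from a rounded accuracy."""
    return round(accuracy * n)

counts = {}
for depth, (train_acc, test_acc) in reported.items():
    train_ok = to_count(train_acc, n_train)
    test_ok = to_count(test_acc, n_test)
    counts[depth] = (train_ok, test_ok)
    gap = train_acc - test_acc
    print(f"depth={depth}: train {train_ok}/{n_train}, "
          f"test {test_ok}/{n_test}, gap {gap:.3f}")
```

One test sample is worth 1/45, or about 0.022 accuracy, so small differences between depths on this split should be read with caution.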


4. Discussion: Underfitting vs. Overfitting

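Depth = 1 (Underfitting)

A depth-1 tree makes a single split, so it has only two leaves and can predict at most two of the three Iris classes.

Training and test accuracy are both low and equal (0.667), which is consistent with getting two of the three balanced classes right and missing the third entirely.

Low accuracy on both sets, with no gap between them, is the signature of underfitting: the model is too simple for the data.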
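
The low depth-1 accuracy can be checked directly. The sketch below refits a depth-1 tree on the same split (re-created here so the block runs on its own) and inspects how many leaves the tree has and which classes it ever predicts on the test set. The variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same stratified split as in the cells above.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# A depth-1 tree (a "stump") makes exactly one split.
stump = DecisionTreeClassifier(max_depth=1, random_state=42)
stump.fit(X_train, y_train)

n_leaves = stump.get_n_leaves()
predicted_classes = np.unique(stump.predict(X_test))
print("Leaves:", n_leaves)
print("Classes predicted on the test set:", predicted_classes)
```

With only two leaves, at least one class can never be predicted, which caps accuracy at about 2/3 on three balanced classes.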

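Depth = 2 (Better Fit, Noticeable Gap)

Training accuracy rises to 0.971 and test accuracy to 0.889.

The tree now has enough leaves to separate all three classes, so it is no longer underfitting.

The train-test gap (about 0.08) is the largest of the three depths, but on a 45-sample test set that amounts to only a few flowers, so it may reflect the particular split rather than overfitting.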

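Depth = 3 (Best Fit Here)

Training accuracy is nearly perfect (0.981).

Test accuracy is almost identical (0.978), so there is no sign of overfitting at this depth.

If depth kept increasing, training accuracy would approach 1.0 while test accuracy would likely stop improving or drop. A widening train-test gap of that kind is the classic sign of overfitting.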
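
What happens at larger depths can be checked with a minimal sketch that extends the same experiment, including max_depth=None (grow until every leaf is pure). The depth list is an arbitrary choice for illustration, and on a dataset as small and clean as Iris the gap may stay modest.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same stratified split as in the cells above.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Map each depth to its (train accuracy, test accuracy) pair.
gaps = {}
for depth in [1, 2, 3, 5, None]:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=42)
    clf.fit(X_train, y_train)
    train_acc = clf.score(X_train, y_train)  # mean accuracy on training data
    test_acc = clf.score(X_test, y_test)     # mean accuracy on held-out data
    gaps[depth] = (train_acc, test_acc)
    print(f"max_depth={depth!s:<4} | train {train_acc:.3f} | "
          f"test {test_acc:.3f} | gap {train_acc - test_acc:+.3f}")
```

A single 45-sample test split is noisy; repeating the comparison with cross-validation (for example sklearn.model_selection.cross_val_score) would give a steadier picture of where overfitting begins.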