# Ensembles: Random Forest and Gradient Boosting

This notebook combines decision trees into ensembles that serve as **strong practical benchmarks**.

**We will compare:**
- A single decision tree vs Random Forest and Gradient Boosting
- Bagging vs boosting, with intuition drawn from the results
- All models on the same train/test split, scored with the same metrics

**Goal:** build a realistic baseline for structured data.

## 1. Setup and Data

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (10, 6)

from sklearn.datasets import make_classification

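# Synthetic binary classification problem with 20 features:
# 10 informative, 5 redundant (linear combinations of the informative ones),
# and the remaining 5 pure noise.
X, y = make_classification(
    n_samples=2000,
    n_features=20,
    n_informative=10,
    n_redundant=5,
    random_state=42,  # fixed seed so the dataset is reproducible
)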

df = pd.DataFrame(X, columns=[f"feature_{i+1}" for i in range(X.shape[1])])
df["target"] = y

print("Shape:", df.shape)
# Check class balance before choosing metrics
print(df["target"].value_counts())
df.head()
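
The comparison described in the introduction (same split, same metrics) could be set up roughly as follows. This is a minimal sketch, not the notebook's own evaluation code: the 80/20 stratified split, the default hyperparameters, and the helper name `evaluate` are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same synthetic data as in the setup cell.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10, n_redundant=5, random_state=42
)

# Assumption: an 80/20 stratified split; the split actually used may differ.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)


def evaluate(model):
    """Hypothetical helper: fit on the shared train split, score on the shared test split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }


# Default hyperparameters throughout (an assumption); only the seed is fixed.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(random_state=42),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}

results = {name: evaluate(model) for name, model in models.items()}
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Because every model sees identical train and test rows, differences in the printed scores reflect the models themselves and not the sampling of the split.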