# Decision Tree Classifier

A **Decision Tree** is a supervised learning algorithm that can be used for both **classification** and **regression** tasks.

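It works by recursively splitting the dataset into subsets using threshold tests on feature values, choosing each split so that the resulting subsets are purer in their labels. The result is a tree in which internal nodes are tests and leaves are predictions.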
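
As a rough illustration of what one split involves, the sketch below scores every candidate threshold on a single feature by information gain and keeps the best one. The helper names `entropy` and `best_split` are hypothetical, and the income and label values are those of the sample dataset in this notebook. scikit-learn's own implementation is more elaborate.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of an array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def best_split(feature, labels):
    """Return (threshold, information_gain) for the best single split.

    Hypothetical helper, for illustration only.
    """
    feature = np.asarray(feature, dtype=float)
    labels = np.asarray(labels)
    parent = entropy(labels)
    values = np.unique(feature)                  # sorted distinct values
    candidates = (values[:-1] + values[1:]) / 2  # midpoints between neighbours
    best_threshold, best_gain = None, 0.0
    for t in candidates:
        left, right = labels[feature <= t], labels[feature > t]
        # Entropy of the two children, weighted by their share of the rows
        children = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = parent - children
        if gain > best_gain:
            best_threshold, best_gain = float(t), gain
    return best_threshold, best_gain

income = [50000, 60000, 80000, 120000, 70000, 150000, 40000, 90000]
buys = [0, 0, 1, 1, 0, 1, 0, 1]

threshold, gain = best_split(income, buys)
print(f"Best threshold: {threshold}, information gain: {gain:.3f}")
```

On these eight rows, income alone separates the two classes, so the best threshold is 75000 with an information gain of 1 bit.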

## 1. Importing Libraries

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score, classification_report
import matplotlib.pyplot as plt

## 2. Sample Dataset

In [None]:
data = {
    "age": [25, 30, 45, 35, 40, 50, 23, 33],
    "income": [50000, 60000, 80000, 120000, 70000, 150000, 40000, 90000],
    "buys": [0, 0, 1, 1, 0, 1, 0, 1]
}
df = pd.DataFrame(data)
df

## 3. Train-Test Split

In [None]:
X = df[["age", "income"]]
y = df["buys"]

# Hold out 30% of the rows (3 of 8) for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

## 4. Training Decision Tree

In [None]:
# Split on information gain (entropy) and cap the depth at 3 to limit overfitting
model = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=42)
model.fit(X_train, y_train)

## 5. Predictions & Evaluation

In [None]:
y_pred = model.predict(X_test)

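print("Accuracy:", accuracy_score(y_test, y_pred))
# zero_division=0 avoids warnings when a class is never predicted on this tiny test set
print("Classification Report:\n", classification_report(y_test, y_pred, zero_division=0))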
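
With only three test rows, a single accuracy number hides which kind of error was made. One possible complement, sketched here with the same data and split settings as above, is scikit-learn's `confusion_matrix`. Passing `labels=[0, 1]` is an assumption made so that the matrix stays 2x2 even if one class is missing from the test set.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Same data and settings as the cells above, repeated so the sketch runs on its own
df = pd.DataFrame({
    "age": [25, 30, 45, 35, 40, 50, 23, 33],
    "income": [50000, 60000, 80000, 120000, 70000, 150000, 40000, 90000],
    "buys": [0, 0, 1, 1, 0, 1, 0, 1],
})
X, y = df[["age", "income"]], df["buys"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes (order: 0, 1)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)
```

The off-diagonal cells count the misclassified rows, so a perfect prediction leaves them at zero.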

## 6. Visualizing the Tree

In [None]:
plt.figure(figsize=(8, 6))
# Label nodes with feature and class names; filled=True colours each node by its majority class
plot_tree(model, feature_names=["age", "income"], class_names=["No", "Yes"], filled=True)
plt.show()
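
The same fitted tree can also be read as plain-text rules, which is handy when a figure is not available. The sketch below uses `sklearn.tree.export_text` and rebuilds the data and model from the earlier cells so that it runs on its own.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Same data and settings as the cells above, repeated so the sketch runs on its own
df = pd.DataFrame({
    "age": [25, 30, 45, 35, 40, 50, 23, 33],
    "income": [50000, 60000, 80000, 120000, 70000, 150000, 40000, 90000],
    "buys": [0, 0, 1, 1, 0, 1, 0, 1],
})
X, y = df[["age", "income"]], df["buys"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=42)
model.fit(X_train, y_train)

# One line per node: threshold tests on the way down, predicted class at each leaf
rules = export_text(model, feature_names=["age", "income"])
print(rules)
```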

## 7. Key Notes
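- Decision Trees choose each split by maximizing **information gain** (the reduction in entropy) or by minimizing **Gini impurity**; in scikit-learn this is set with the `criterion` parameter.
- They are easy to interpret but can **overfit**, especially when grown deep on a small dataset.
- Limiting tree depth (`max_depth`) reduces overfitting.
- Ensembles of trees, such as **Random Forests** and **Boosting**, often perform better than a single tree.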
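
To see the overfitting note in practice, one option is to compare training and test accuracy for a depth-limited tree and an unrestricted one. The sketch below does this on synthetic data from `make_classification` with some label noise (`flip_y`). The dataset, its size, and the depth values are assumptions chosen for illustration and are not part of the original example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, noisy data (an assumption for illustration; not the toy dataset above)
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           flip_y=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

scores = {}
for depth in (3, None):  # None lets the tree grow until every leaf is pure
    tree = DecisionTreeClassifier(criterion="entropy", max_depth=depth, random_state=42)
    tree.fit(X_train, y_train)
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))
    print(f"max_depth={depth}: train={scores[depth][0]:.2f}, test={scores[depth][1]:.2f}")
```

A large gap between training and test accuracy for the unrestricted tree is the usual sign of overfitting. The exact numbers depend on the data and the seed.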