# Decision Trees

## Introduction and Formulation

A decision tree is a non-parametric **supervised** learning method used for **classification and regression**. The objective is to predict the target by learning simple decision rules inferred from the data features.

<figure>
    <center><img src="img/decision_tree.png" width="300" height="300">
    <figcaption>Fig: Decision Tree</figcaption></center>
</figure>

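Starting from the root node, the strategy is to choose the attribute that **maximizes the Information Gain** and to partition the data on that attribute. The same procedure is then repeated on each child node. Information gain is built on the entropy of a node, where $P(x_i)$ is the proportion of samples belonging to class $i$ out of $n$ classes: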
$$
(\text{Entropy}) \quad H(x) = -\sum_{i=1}^n P(x_i)\log_bP(x_i) 
$$
<figure>
    <center><img src="img/entropy.png" width="300" height="300">
    <figcaption>Fig: Binary Entropy function</figcaption></center>
</figure>
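
As a quick numerical check of the entropy formula, here is a minimal sketch using NumPy. The helper name `entropy` and the example distributions are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def entropy(probs, base=2):
    """Shannon entropy of a discrete distribution given as a list of probabilities."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]  # by convention 0 * log(0) = 0, so zero-probability outcomes are dropped
    # sum of p * log(1/p) is the same as -sum of p * log(p)
    return float(np.sum(p * np.log(1.0 / p)) / np.log(base))

print(entropy([0.5, 0.5]))  # fair coin: 1 bit, the maximum of the binary entropy curve
print(entropy([1.0, 0.0]))  # certain outcome: 0 bits
print(entropy([0.9, 0.1]))  # skewed coin: roughly 0.469 bits
```

With `base=2` the result is measured in bits; a pure node (a single class) has entropy 0, and a node with evenly mixed classes has the highest entropy.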

$$
(\text{Information Gain}) \quad \text{IG}(S,D) = H(S) - \sum_{V \in D} \frac{|V|}{|S|}H(V)
$$
That is, the entropy of the set $S$ before the split minus the size-weighted average entropy of the subsets $V$ produced by the split $D$. The resulting decision boundaries look like this:

<figure>
    <center><img src="img/decision_tree_data.png" width="600" height="300">
    <figcaption>Fig: Decision Tree Boundaries</figcaption></center>
</figure>
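
The information gain of a candidate split can be computed directly from the class labels. The sketch below assumes a binary split on a threshold; the function names and the toy data are made up for illustration.

```python
import numpy as np

def label_entropy(labels):
    """Entropy (in bits) of the empirical class distribution of `labels`."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

def information_gain(labels, mask):
    """Parent entropy minus the size-weighted entropy of the two child subsets."""
    labels = np.asarray(labels)
    mask = np.asarray(mask, dtype=bool)
    n = len(labels)
    children = [labels[mask], labels[~mask]]
    after = sum(len(c) / n * label_entropy(c) for c in children if len(c) > 0)
    return label_entropy(labels) - after

# Toy data (hypothetical): one numeric feature, two balanced classes
feature = np.array([1.0, 1.5, 2.0, 4.0, 4.5, 5.0])
labels = np.array([0, 0, 0, 1, 1, 1])

gain_good = information_gain(labels, feature <= 3.0)  # separates the classes perfectly
gain_weak = information_gain(labels, feature <= 1.2)  # isolates a single sample
print(gain_good, gain_weak)
```

The threshold at 3.0 yields two pure children and therefore the full parent entropy as gain, while the threshold at 1.2 leaves one child almost as mixed as the parent. A tree builder would pick the first split.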

## Important Parameters

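**Max Depth**: The maximum number of levels between the root and any leaf, which also bounds the total number of nodes. It controls how closely the model fits the training data: a very deep tree can memorize the training set and overfit, so a moderate depth is usually the safer choice.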
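
One way to see the effect of the depth limit is to train trees with several `max_depth` values and compare training and test accuracy. This is a minimal sketch, assuming the Iris dataset and split parameters used in the Implementation section; the depth values tried here are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.333, stratify=y, random_state=666
)

results = []
for depth in (1, 2, 3, None):  # None lets the tree grow until no further split is possible
    tree = DecisionTreeClassifier(max_depth=depth, random_state=666)
    tree.fit(X_train, y_train)
    results.append((depth, tree.get_depth(), tree.get_n_leaves()))
    print(
        f"max_depth={depth}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}, "
        f"train acc={tree.score(X_train, y_train):.3f}, "
        f"test acc={tree.score(X_test, y_test):.3f}"
    )
```

A growing gap between training and test accuracy as the depth increases is the usual sign of overfitting.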

## Implementation

We are going to work with the [Iris](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html) dataset.

In [10]:
#Import usual libraries
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 

#Import required libraries and functions
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

In [11]:
#Load the dataset 
iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df["target"] = iris.target
df


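| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | target |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 |
| ... | ... | ... | ... | ... | ... |
| 145 | 6.7 | 3.0 | 5.2 | 2.3 | 2 |
| 146 | 6.3 | 2.5 | 5.0 | 1.9 | 2 |
| 147 | 6.5 | 3.0 | 5.2 | 2.0 | 2 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 | 2 |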


### Simple Use

In [17]:
#Divide the data into stratified train and test sets (about one third held out)
X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.333, stratify=y, random_state=666
)
#Create the model



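| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 115 | 6.4 | 3.2 | 5.3 | 2.3 |
| 101 | 5.8 | 2.7 | 5.1 | 1.9 |
| 50 | 7.0 | 3.2 | 4.7 | 1.4 |
| 135 | 7.7 | 3.0 | 6.1 | 2.3 |
| 140 | 6.7 | 3.1 | 5.6 | 2.4 |
| ... | ... | ... | ... | ... |
| 119 | 6.0 | 2.2 | 5.0 | 1.5 |
| 20 | 5.4 | 3.4 | 1.7 | 0.2 |
| 91 | 6.1 | 3.0 | 4.6 | 1.4 |
| 148 | 6.2 | 3.4 | 5.4 | 2.3 |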
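
The cell above stops at the model-creation comment. The following is a minimal sketch of how that step might continue, not the original notebook's code: `criterion="entropy"`, `max_depth=3`, and the use of `accuracy_score` are assumptions chosen to match the concepts introduced earlier. The split is rebuilt so the block runs on its own.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Rebuild the same stratified split so this block is self-contained
X, y = load_iris(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.333, stratify=y, random_state=666
)

# criterion="entropy" chooses splits by information gain (scikit-learn defaults to "gini");
# max_depth=3 is an illustrative value, not one taken from the original notebook
model = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=666)
model.fit(X_train, y_train)

# Evaluate on the held-out third of the data
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Test accuracy:", test_accuracy)
```

Fixing `random_state` on both the split and the classifier keeps the result reproducible between runs.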
