# Decision Trees

## Theory

Decision trees are among the simpler machine learning methods. Once trained, a model looks like a series of if-then tests arranged into a tree: each internal node tests one feature, and each leaf holds a prediction.
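
To make the "series of if-then tests" concrete, here is a hand-written sketch of what a small tree amounts to once it is trained. The function name `classify_iris` and the thresholds (2.45 cm and 1.75 cm) are illustrative assumptions, not values read from a fitted model.

```python
def classify_iris(petal_length_cm: float, petal_width_cm: float) -> str:
    """Hand-written stand-in for a depth-2 decision tree (illustrative thresholds)."""
    # Root node: a single test on one feature splits the data in two.
    if petal_length_cm <= 2.45:
        return "setosa"  # leaf
    # Second level: another test, here on a different feature.
    if petal_width_cm <= 1.75:
        return "versicolor"  # leaf
    return "virginica"  # leaf


print(classify_iris(1.4, 0.2))
print(classify_iris(5.0, 1.5))
```

Training a tree is then a matter of choosing which feature and threshold to test at each node.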

There are several algorithms to train a decision tree:
- CART (Classification and Regression Trees)
- ID3 (Iterative Dichotomiser 3)

### CART
At each node, CART chooses the variable (and a threshold on it) that best divides the data, meaning the split that leaves the resulting subsets least mixed. Two common measures of how mixed a set is:
- Gini impurity
- Entropy
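
The two measures and the split search can be sketched in plain Python. This is a simplified illustration, not sklearn's implementation: the helper names `gini`, `entropy`, and `best_split` are hypothetical, and the search covers a single numeric feature, whereas a full CART implementation would repeat it for every feature and recurse on each subset.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2). Zero when all labels are the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: -sum(p_k * log2(p_k)). Zero when all labels are the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    n = len(labels)
    best_threshold, best_score = None, float("inf")
    candidates = sorted(set(values))
    # Try the midpoint between each pair of adjacent distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Weight each side's impurity by its share of the samples.
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


mixed = ["a", "a", "b", "b"]
print(gini(mixed), entropy(mixed))
print(best_split([1.0, 2.0, 3.0, 4.0], mixed))
```

Both measures are zero for a pure set and largest when the classes are evenly mixed, so a lower weighted score means a cleaner split.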

## Practice

Train a decision tree classifier with sklearn on the iris dataset, using only the petal length and petal width features (columns 2 and 3).

In [2]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(X, y)

DecisionTreeClassifier(max_depth=2)
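
Once fitted, the classifier can be queried for class probabilities and a predicted class. The sketch below repeats the training step so that it runs on its own; the sample measurements and the `random_state=42` argument (added only for reproducibility) are assumptions, not part of the cell above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width
y = iris.target

# random_state is an assumption added here to make the result reproducible.
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X, y)

sample = [[5.0, 1.5]]  # hypothetical flower: petal length 5 cm, petal width 1.5 cm

# Class probabilities are the class proportions in the leaf the sample falls into.
proba = tree_clf.predict_proba(sample)[0]
predicted = iris.target_names[tree_clf.predict(sample)[0]]

print(dict(zip(iris.target_names, proba.round(3))))
print(predicted)
```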

In [4]:
from sklearn.tree import export_graphviz

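# Write the .dot file where the dot command below expects it;
# the artifacts/ directory must already exist.
export_graphviz(
    tree_clf,
    out_file="artifacts/iris_tree.dot",
    feature_names=iris.feature_names[2:],
    class_names=iris.target_names,
    rounded=True,
    filled=True,
)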

In [10]:
! dot -Tpng artifacts/iris_tree.dot -o artifacts/iris_tree.png

![decision tree viz](artifacts/iris_tree.png)

## References
- [Hands-On Machine Learning, Chapter 6. Decision Trees](https://learning.oreilly.com/library/view/hands-on-machine-learning/9781492032632/ch06.html)
- [Programming Collective Intelligence, Chapter 7. Modeling with Decision Trees](https://learning.oreilly.com/library/view/programming-collective-intelligence/9780596529321/ch07.html)