# Decision Trees

"Like SVMs, **Decision Trees** are versatile Machine Learning algorithms that can perform both classification and regression tasks, and even multioutput tasks."

#### Training and Visualizing a Decision Tree

Let's build and train a simple Decision Tree classifier using Scikit-learn's `DecisionTreeClassifier` class.

In [2]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
x = iris.data[:, 2:]
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2)
tree_clf.fit(x,y)

DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=2,
                       max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort=False,
                       random_state=None, splitter='best')

We can easily visualize the Decision Tree. First, we use the `export_graphviz()` function to write the tree to a file called `iris_tree.dot`. Then we can use the `dot` command-line tool from the Graphviz package to convert that file to a PDF or PNG:

`$ dot -Tpng iris_tree.dot -o iris_tree.png`

In [3]:
from sklearn.tree import export_graphviz
from IPython.display import Image
import pydotplus

export_graphviz(
    tree_clf,
    out_file='iris_tree.dot',
    feature_names=iris.feature_names[2:],
    class_names=iris.target_names,
    rounded=True,
    filled=True
)
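
If Graphviz is not installed, a plain-text dump of the same tree can serve as a quick check. The following is a minimal sketch using `sklearn.tree.export_text`; the variable names `clf` and `report` and the fixed `random_state=42` are assumptions made for reproducibility, not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width only
y = iris.target

# random_state is an assumption, fixed only so the sketch is reproducible
clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)

# Render the fitted tree as indented text, one line per split or leaf
report = export_text(clf, feature_names=iris.feature_names[2:])
print(report)
```

Each indented line is either a threshold test on one petal feature or a leaf with its predicted class index.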

In [4]:
from io import StringIO  # sklearn.externals.six has been removed from Scikit-learn

# Load the DOT source that export_graphviz() wrote in the previous cell
with open('iris_tree.dot') as f:
    dot_data = StringIO(f.read())

In [5]:
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
Image(graph.create_png())

The prediction process of a Decision Tree is simple: it works like a flowchart-style questionnaire. It first asks, for example, whether the petal width is below a certain threshold. If so, it predicts one class; if not, it may ask one or more further questions before reaching a prediction.
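
The questionnaire can be replayed by hand from the fitted tree's arrays. This is a rough sketch rather than how Scikit-learn implements prediction internally; the helper `walk` is hypothetical, and `random_state=42` is an assumption added for reproducibility.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width only
y = iris.target

# random_state is an assumption, fixed only so the sketch is reproducible
clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)
tree = clf.tree_
feature_names = iris.feature_names[2:]


def walk(sample):
    """Hypothetical helper: follow one sample from the root to a leaf."""
    node = 0  # the root is always node 0
    while tree.children_left[node] != -1:  # -1 marks a leaf
        feature = tree.feature[node]
        threshold = tree.threshold[node]
        go_left = sample[feature] <= threshold
        answer = "yes" if go_left else "no"
        print(f"{feature_names[feature]} <= {threshold:.2f}? {answer}")
        node = tree.children_left[node] if go_left else tree.children_right[node]
    # The leaf predicts its majority class
    return clf.classes_[tree.value[node][0].argmax()]


predicted = walk([5.0, 1.5])
print("Predicted class:", iris.target_names[predicted])
```

Each printed line corresponds to one question in the flowchart, and the returned class should agree with `clf.predict()` for the same sample.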

"One of the many qualities of Decision Trees is that they require very little data preparation. In particular, they don't require feature scaling or centering at all."
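
One way to check the scaling claim empirically is to train the same tree on raw and on standardized features and compare the predictions. This is a minimal sketch; the use of `StandardScaler`, the variable names, and `random_state=42` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width only
y = iris.target
X_scaled = StandardScaler().fit_transform(X)

# Same hyperparameters and seed; only the feature scale differs
raw_clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)
scaled_clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X_scaled, y)

# Compare the predictions of both trees on the training set
same_predictions = np.array_equal(raw_clf.predict(X), scaled_clf.predict(X_scaled))
print("Identical predictions:", same_predictions)
```

Because each split only compares a single feature against a threshold, rescaling a feature moves the threshold but should not change which side of it an instance falls on.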

"Scikit-Learn uses the CART algorithm, which produces only *binary trees*: nonleaf nodes always have two children (i.e., questions only have yes/no answers). However, other algorithms such as ID3 can produce Decision Trees with nodes that have more than two children."
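
The binary structure can be inspected through the fitted estimator's `tree_` attribute. The sketch below assumes the same two-feature iris setup as above; the variable names and `random_state=42` are illustrative choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width only
y = iris.target

# random_state is an assumption, fixed only so the sketch is reproducible
clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)
tree = clf.tree_

# A child index of -1 means "no child on this side"
has_no_left = tree.children_left == -1
has_no_right = tree.children_right == -1

# In a binary tree built this way, a node has either zero or two children
all_binary = np.array_equal(has_no_left, has_no_right)

n_leaves = int(has_no_left.sum())
n_internal = int((~has_no_left).sum())
print("Every node has 0 or 2 children:", all_binary)
print("Internal nodes:", n_internal, "Leaves:", n_leaves)
```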

The good news about Decision Trees is that they are considered **white box models**: it is clear why each prediction was made, unlike with Random Forests or Neural Networks.

#### Estimating Class Probabilities

"A Decision Tree can also estimate the probability that an instance belongs to a particular class k: first it traverses the tree to find the leaf node for this instance, and then it returns the ratio of training instances of class k in this node. For example, suppose you have found a flower whose petals are 5 cm long and 1.5 cm wide. The corresponding leaf node is the depth-2 left node, so the Decision Tree should output the following probabilities: 0% for Iris-Setosa (0/54), 90.7% for Iris-Versicolor (49/54), and 9.3% for Iris-Virginica (5/54)."

In [6]:
tree_clf.predict_proba([[5, 1.5]])

array([[0.        , 0.90740741, 0.09259259]])
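
The same probabilities can be recovered by hand from the leaf the instance lands in. This is a minimal sketch, assuming the two-feature iris setup used above; `random_state=42` and the variable names are illustrative. The leaf's stored class values are normalized explicitly because, depending on the Scikit-learn version, `tree_.value` may hold either raw counts or fractions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and petal width only
y = iris.target

# random_state is an assumption, fixed only so the sketch is reproducible
clf = DecisionTreeClassifier(max_depth=2, random_state=42).fit(X, y)

sample = np.array([[5.0, 1.5]])  # petals 5 cm long and 1.5 cm wide

# apply() returns the index of the leaf each sample ends up in
leaf_id = clf.apply(sample)[0]

# Per-class values stored in that leaf (counts or fractions by version)
leaf_value = clf.tree_.value[leaf_id][0]
manual_proba = leaf_value / leaf_value.sum()

print("Leaf node index:", leaf_id)
print("Manual probabilities:", manual_proba)
print("predict_proba:       ", clf.predict_proba(sample)[0])
```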