# Application: Decision Trees Part 1

#### References

1. **Decision Trees:** https://scikit-learn.org/stable/modules/tree.html

## Decision Trees Part 1

Decision Trees (DTs) are a non-parametric supervised learning method used for classification and regression. The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.
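To make "simple decision rules" concrete, the sketch below hand-writes the kind of nested threshold test that a fitted tree encodes. The function name, the two features and the thresholds are all hypothetical and chosen only for illustration; a real tree learns its splits from the training data.

```python
def toy_tree_predict(petal_length, petal_width):
    """Hand-written stand-in for a depth-2 decision tree (hypothetical thresholds)."""
    if petal_length < 2.5:   # root node: first decision rule
        return "class A"
    if petal_width < 1.7:    # second-level rule, only reached if the root test fails
        return "class B"
    return "class C"


print(toy_tree_predict(1.4, 0.2))
print(toy_tree_predict(5.0, 2.0))
```

Each path from the root to a `return` plays the role of a leaf: a sample is classified by following the tests until it reaches one.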

### Decision tree classifier in ```sklearn```

The following code snippet, taken from the ```sklearn``` website, shows how to create a DT classifier:

In [7]:
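from sklearn import tree
X = [[0, 0], [1, 1]]  # training samples, shape (n_samples, n_features)
Y = [0, 1]            # class label of each training sample
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)   # fit returns the fitted estimator itself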


As with many supervised learning algorithms, once the model has been fitted it can be used to make predictions for new samples. Here, ```predict_proba``` returns the predicted probability of each class for the sample:

In [8]:
clf.predict_proba([[2., 2.]])


array([[ 0.,  1.]])
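The difference between predicting a class label and predicting class probabilities can be sketched as follows. This is a minimal example that reuses the toy data above; `random_state=0` is an assumption added here only to make the run reproducible.

```python
from sklearn import tree

X = [[0, 0], [1, 1]]
Y = [0, 1]
clf = tree.DecisionTreeClassifier(random_state=0).fit(X, Y)

sample = [[2., 2.]]
label = clf.predict(sample)        # hard class label, one entry per sample
proba = clf.predict_proba(sample)  # one column per class, ordered as in clf.classes_

print("classes:", clf.classes_)
print("predicted label:", label)
print("predicted probabilities:", proba)
```

For a decision tree, the probability of a class is the fraction of training samples of that class in the leaf the sample falls into.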

---

**Remark**

```DecisionTreeClassifier``` is capable of both binary classification (where the labels are ```[-1, 1]```) and multiclass classification (where the labels are ```[0, …, K-1]```).

---
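As an illustration of the remark above, the sketch below fits one classifier on a binary problem with labels `-1` and `1` and another on a three-class problem with labels `0, 1, 2`. The toy data is made up for this example, and `random_state=0` is an assumption used only for reproducibility.

```python
from sklearn import tree

# Binary problem with labels -1 and 1 (toy data, made up for illustration).
X_bin = [[0, 0], [1, 1]]
y_bin = [-1, 1]
clf_bin = tree.DecisionTreeClassifier(random_state=0).fit(X_bin, y_bin)

# Multiclass problem with labels 0, 1, 2.
X_multi = [[0, 0], [1, 1], [2, 2]]
y_multi = [0, 1, 2]
clf_multi = tree.DecisionTreeClassifier(random_state=0).fit(X_multi, y_multi)

# classes_ holds the distinct labels seen during fit, in sorted order.
print(clf_bin.classes_, clf_bin.predict([[2, 2]]))
print(clf_multi.classes_, clf_multi.predict_proba([[2, 2]]))
```

The same estimator class handles both cases; the number of columns returned by `predict_proba` follows the number of classes found in the training labels.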

Using the Iris dataset, we can construct a tree as follows:

In [4]:
from sklearn.datasets import load_iris
from sklearn import tree
iris = load_iris()
clf = tree.DecisionTreeClassifier()
clf = clf.fit(iris.data, iris.target)
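Before visualizing the tree, it can help to inspect the fitted model directly. The sketch below prints the tree's depth, number of leaves, training accuracy and feature importances; `random_state=0` is an assumption added for reproducibility, and the printed values may differ between `sklearn` versions.

```python
from sklearn.datasets import load_iris
from sklearn import tree

iris = load_iris()
clf = tree.DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

print("depth:", clf.get_depth())
print("number of leaves:", clf.get_n_leaves())
# Accuracy on the data the tree was trained on (an optimistic estimate).
print("training accuracy:", clf.score(iris.data, iris.target))
# Importances are normalized to sum to 1 across features.
print("feature importances:", dict(zip(iris.feature_names, clf.feature_importances_)))
```

Note that accuracy measured on the training set says little about performance on unseen data; a held-out test set would be needed for that.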

Once trained, we can export the tree in ```Graphviz``` format using the ```export_graphviz``` exporter. 

In [6]:
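import graphviz  # Python wrapper; also requires the Graphviz binaries
dot_data = tree.export_graphviz(clf, out_file=None)  # out_file=None returns the DOT source as a string
graph = graphviz.Source(dot_data)
graph.render("iris")  # renders the tree to iris.pdf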

ModuleNotFoundError: No module named 'graphviz'
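If the `graphviz` package is not available, as in the error above, one possible fallback is `tree.export_text`, which prints the learned rules as plain text and needs no extra dependency. This is a sketch rather than part of the original walkthrough: `max_depth=2` and `random_state=0` are assumptions added here to keep the output short and reproducible.

```python
from sklearn.datasets import load_iris
from sklearn import tree

iris = load_iris()
# A shallow tree keeps the printed rule set small (assumption for this example).
clf = tree.DecisionTreeClassifier(max_depth=2, random_state=0)
clf = clf.fit(iris.data, iris.target)

# Build a text report of the decision rules, one line per node.
rules = tree.export_text(clf, feature_names=iris.feature_names)
print(rules)
```

Each indented line is a threshold test on one feature, and lines starting with `class:` mark the leaves.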

---

**Remark**

If you use the ```conda``` package manager, the Graphviz binaries and the Python package can be installed with:

```
conda install python-graphviz
```

Alternatively, binaries for ```graphviz``` can be downloaded from the Graphviz project homepage, and the Python wrapper installed from PyPI with:

```
pip install graphviz
```

---