# Setup

Default setup carried over from previous chapters:
- Making sure the code works in both Python 2 and 3
- Loading common imports
- Enabling inline plotting for Jupyter
- Setting up the save-figure function

In [1]:
# python 2 & 3 support
from __future__ import division, print_function, unicode_literals

# common imports
import numpy as np
import os

# setting random seed
np.random.seed(42)

# matplotlib inline plotting
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
# plotting setups
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['xtick.labelsize'] = 12
plt.rcParams['ytick.labelsize'] = 12


# save figure function
PROJECT_ROOT_DIR = 'figures'
CHAPTER_ID = 'decision_trees'
FIG_PATH = os.path.join(PROJECT_ROOT_DIR, CHAPTER_ID)

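def image_path(fig_id):
    # create the figure directory on first use so files can be written to this path
    if not os.path.isdir(FIG_PATH):
        os.makedirs(FIG_PATH)
    return os.path.join(FIG_PATH, fig_id)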


def save_fig(fig_id, tight_layout = True):
    if not os.path.isdir(FIG_PATH):
        os.makedirs(FIG_PATH)
    fig_path = os.path.join(FIG_PATH, fig_id + '.png') # save as png file
    print('Saving figure', fig_id)
    if tight_layout:
        plt.tight_layout()
    plt.savefig(fig_path, format = 'png', dpi = 300)

# Training and Visualizing a Decision Tree

Like SVMs, _Decision Trees_ are versatile Machine Learning algorithms that can perform classification, regression, and even multioutput tasks. They are powerful algorithms, capable of fitting complex datasets. Decision Trees are also the fundamental components of Random Forests, which are among the most powerful Machine Learning algorithms available today.

We can start with a basic one and look at how it makes predictions, using `DecisionTreeClassifier` on the iris dataset:

In [2]:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:] # petal length and width (columns 2 and 3)
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth = 2, random_state = 42)
tree_clf.fit(X,y)

DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=2,
            max_features=None, max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, presort=False, random_state=42,
            splitter='best')
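
To see how the fitted tree makes predictions, here is a minimal sketch. It refits the same model so the block runs on its own. The sample flower (petal length 5 cm, petal width 1.5 cm) is an illustrative assumption, not a value from the dataset.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X = iris.data[:, 2:]  # petal length and width
y = iris.target

tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X, y)

# hypothetical flower: petal length 5 cm, petal width 1.5 cm
sample = [[5.0, 1.5]]

# class probabilities are the class ratios of the training
# instances in the leaf that the sample falls into
proba = tree_clf.predict_proba(sample)
# the predicted class is the one with the highest probability
pred = tree_clf.predict(sample)

print(proba)
print(iris.target_names[pred])
```

Because the tree is limited to `max_depth = 2`, every sample that lands in the same leaf gets exactly the same probability estimates.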

We can export the trained model to Graphviz's DOT format with `export_graphviz` and then visualize the result:

In [3]:
from sklearn.tree import export_graphviz

export_graphviz(tree_clf, out_file = image_path('iris_tree.dot'),
                feature_names = iris.feature_names[2:],
                class_names = iris.target_names,
                rounded = True,
                filled = True
               )

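We can then use the `dot` command-line tool from the Graphviz package to convert the `.dot` file into a PNG image (run it from the directory that contains `iris_tree.dot`):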
```
$ dot -Tpng iris_tree.dot -o iris_tree.png
```
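
If the `dot` tool is not at hand, the exported graph can still be inspected as plain text. The sketch below assumes a scikit-learn version in which `export_graphviz` returns the DOT source as a string when `out_file = None`. It refits the same tree so the block runs on its own, and passes `class_names` as a plain list as a precaution.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(iris.data[:, 2:], iris.target)  # petal length and width

# with out_file=None, the DOT source is returned instead of written to disk
dot_source = export_graphviz(tree_clf, out_file=None,
                             feature_names=iris.feature_names[2:],
                             class_names=list(iris.target_names),
                             rounded=True,
                             filled=True)

# show the beginning of the graph description
print(dot_source[:300])
```

The returned string holds the same graph description that the cell above writes to `iris_tree.dot`, so it can be saved and passed to `dot` later.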