### Decision Tree

In this chapter you will learn how to build a decision tree. A decision tree is a flowchart-like model that helps you make decisions based on previous experience.

Decision trees are a good first algorithm to learn for two reasons:
* They are approachable for beginners.
* They are a “white box” algorithm: you can follow exactly how the model reaches each decision, which helps beginners understand the “how” of machine learning.

Beyond this, decision trees are useful because:

1. They are generally fast to train compared with many other algorithms.
2. Their complexity is a by-product (a side effect) of the data’s attributes and dimensions.
3. They are non-parametric, meaning they do not depend on assumptions about the data’s probability distribution.
4. They can handle high-dimensional data, often with good accuracy.

### How do Decision Trees Work?
Decision trees work by splitting data through a series of binary decisions. Starting at the root, you answer one decision at a time and follow the matching branch down the tree until you reach a leaf node, which returns the predicted classification.
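
As a rough illustration of that traversal, the sketch below walks a tiny hand-written tree. It is not learned from any data: the name `toy_tree`, the `predict` helper, and both thresholds are hypothetical and chosen only to show the mechanics.

```python
# A hand-written tree: internal nodes ask a yes/no question, leaves hold a label.
# The features and thresholds are made up for illustration.
toy_tree = {
    "feature": "Rank", "threshold": 6.5,
    "left": {"label": "NO"},
    "right": {
        "feature": "Age", "threshold": 39.5,
        "left": {"label": "YES"},
        "right": {"label": "NO"},
    },
}

def predict(node, sample):
    """Follow binary decisions from the root until a leaf node is reached."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

print(predict(toy_tree, {"Rank": 9, "Age": 35}))  # YES
print(predict(toy_tree, {"Rank": 4, "Age": 52}))  # NO
```

A trained classifier does the same walk, except that the questions and thresholds are chosen from the training data instead of being written by hand.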

In the following example, a person tries to decide whether or not to go to a comedy show.

Luckily, our example person has kept a record of every comedy show in town, including some information about the comedian and whether he/she went or not.

Age,Experience,Rank,Nationality,Go
36,10,9,UK,NO
42,12,4,USA,NO 
23,4,6,N,NO
52,4,4,USA,NO
43,21,8,USA,YES
44,14,5,UK,NO
66,3,7,N,YES
35,14,9,UK,YES
52,13,7,N,YES
35,5,9,N,YES
24,3,5,USA,NO
18,3,7,UK,YES
45,9,9,UK,YES

Now, based on this data set, Python can create a decision tree that can be used to decide whether any new shows are worth attending.

### How Does it Work?

First, read the dataset with pandas:

In [1]:
import pandas

df = pandas.read_csv("data_for_decision_tree.csv")

print(df)

    Age  Experience  Rank Nationality   Go
0    36          10     9          UK   NO
1    42          12     4         USA   NO
2    23           4     6           N   NO
3    52           4     4         USA   NO
4    43          21     8         USA  YES
5    44          14     5          UK   NO
6    66           3     7           N  YES
7    35          14     9          UK  YES
8    52          13     7           N  YES
9    35           5     9           N  YES
10   24           3     5         USA   NO
11   18           3     7          UK  YES
12   45           9     9          UK  YES


To make a decision tree, all data has to be numerical.

We have to convert the non-numerical columns 'Nationality' and 'Go' into numerical values.

Pandas has a map() method that takes a dictionary describing how to convert the values. For example,

{'UK': 0, 'USA': 1, 'N': 2}

means: convert the value 'UK' to 0, 'USA' to 1, and 'N' to 2.

### Example

Change string values into numerical values:

In [2]:
d = {'UK': 0, 'USA': 1, 'N': 2} # Define a dictionary: 'UK' maps to 0, 'USA' to 1, 'N' to 2
df['Nationality'] = df['Nationality'].map(d) # Replace each nationality with its number
d = {'YES': 1, 'NO': 0} # 'YES' maps to 1, 'NO' to 0
df['Go'] = df['Go'].map(d) # Replace each answer with its number

print(df)

    Age  Experience  Rank  Nationality  Go
0    36          10     9            0   0
1    42          12     4            1   0
2    23           4     6            2   0
3    52           4     4            1   0
4    43          21     8            1   1
5    44          14     5            0   0
6    66           3     7            2   1
7    35          14     9            0   1
8    52          13     7            2   1
9    35           5     9            2   1
10   24           3     5            1   0
11   18           3     7            0   1
12   45           9     9            0   1
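
One thing to watch with map(): a value that has no entry in the dictionary is not left alone, it becomes NaN. The small sketch below shows this on a throwaway Series; the extra value "DE" and the variable names are made up for illustration and are not part of the data set.

```python
import pandas as pd

# "DE" is a made-up value with no entry in the mapping dictionary.
nationality = pd.Series(["UK", "USA", "N", "DE"])
codes = nationality.map({"UK": 0, "USA": 1, "N": 2})

print(codes)               # the unmapped "DE" shows up as NaN
print(codes.isna().sum())  # number of values that had no dictionary entry
```

Checking `isna().sum()` after a mapping step is a cheap way to catch misspelled or unexpected categories before training.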


"Curly Braces" are used in Python to define a dictionary. A dictionary is a data structure that maps one value to another - kind of like how an English dictionary maps a word to its definition. They are not used to denote code blocks as they are in many "C-like" languages.

Now, we have to separate the feature columns from the target column.

The feature columns are the columns that we try to predict from, and the target column is the column with the values we try to predict.

X is the feature columns, y is the target column:

In [3]:
features = ['Age', 'Experience', 'Rank', 'Nationality']

X = df[features] # We can think of these columns as the independent variables
y = df['Go'] # We can think of this column as the dependent variable

#print(X)
#print(y)

Now we can create the decision tree, fit it to our data, and display it. Start by importing the modules we need.

In [4]:
# Select a matplotlib backend that can open a plot window
# (TkAgg requires Tk to be installed):
import matplotlib
matplotlib.use('TkAgg')

from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt


dtree = DecisionTreeClassifier() # Decision tree classifiers are supervised machine learning models:
                                 # they are trained on pre-labelled data and then used
                                 # to make predictions.
# Decision tree classifiers work like flowcharts. Each internal node is a decision point that splits into
# two child nodes. A child is either a leaf, which holds a final classification, or another decision node
# that splits again. Following the decisions from the root always ends at a leaf.
dtree = dtree.fit(X, y)

tree.plot_tree(dtree, feature_names=features)

plt.show() 
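
If no plot window is available, the same tree can be printed as indented text rules. The sketch below is standalone and uses scikit-learn's `export_text`. It rebuilds the 13 rows inline, and the names `demo_df` and `demo_tree` as well as `random_state=0` are choices made for this illustration, not part of the notebook above.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Same 13 rows as above, with Nationality and Go already mapped to numbers.
rows = [
    (36, 10, 9, 0, 0), (42, 12, 4, 1, 0), (23, 4, 6, 2, 0), (52, 4, 4, 1, 0),
    (43, 21, 8, 1, 1), (44, 14, 5, 0, 0), (66, 3, 7, 2, 1), (35, 14, 9, 0, 1),
    (52, 13, 7, 2, 1), (35, 5, 9, 2, 1), (24, 3, 5, 1, 0), (18, 3, 7, 0, 1),
    (45, 9, 9, 0, 1),
]
features = ["Age", "Experience", "Rank", "Nationality"]
demo_df = pd.DataFrame(rows, columns=features + ["Go"])

# random_state=0 is an arbitrary choice that keeps the printed rules repeatable.
demo_tree = DecisionTreeClassifier(random_state=0)
demo_tree.fit(demo_df[features], demo_df["Go"])

# One line per decision or leaf, indented by depth.
rules = export_text(demo_tree, feature_names=features)
print(rules)
```

Each `class:` line in the output is a leaf, and the conditions above it are the decisions that lead there.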

### Result Explained

The decision tree uses your earlier decisions to estimate whether you would want to go see a comedian or not.

To read the different aspects of the decision tree, refer to the "week 6 decision tree explanation.docx" file on Yulearn week 6.
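
Each box in the plotted tree also reports a `gini` value. As a minimal sketch, assuming the classifier's default `criterion='gini'`, the figure for the root node can be reproduced by hand from the 13 `Go` labels. The helper name `gini` is hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

# The Go column after mapping: 7 shows attended (1), 6 skipped (0).
go = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1]
print(round(gini(go), 3))  # 0.497
```

A value of 0 means every sample in the node has the same label, which is why pure leaf nodes show `gini = 0.0`.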

In [5]:
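# Each sample lists its values in the same order as `features`:
# Age, Experience, Rank, Nationality
print(dtree.predict([[50, 12, 7, 1]]))

print(dtree.predict([[40, 10, 6.3, 0]]))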

print("[1] means 'GO'")
print("[0] means 'NO GO'")

[1]
[0]
[1] means 'GO'
[0] means 'NO GO'




### Different Results

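You may see that the Decision Tree gives you different results if you run it enough times, even if you feed it the same data.

That is not because a fitted tree's predictions are random: once trained, a tree always returns the same answer for the same input. The variation comes from training. scikit-learn shuffles the order in which it considers features at each split, so when two splits are equally good, a different one may be chosen on each run. With a data set this small, that can be enough to change the prediction. Passing a fixed random_state to DecisionTreeClassifier makes the result repeatable.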
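
One way to get repeatable results is to pass a fixed `random_state` when creating the classifier. The sketch below assumes the run-to-run variation comes from scikit-learn's random tie-breaking between equally good splits. It is standalone: the rows are rebuilt inline, and `demo_df`, `sample`, and the seed value 42 are arbitrary choices for this illustration.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Same 13 rows as above, with Nationality and Go already mapped to numbers.
rows = [
    (36, 10, 9, 0, 0), (42, 12, 4, 1, 0), (23, 4, 6, 2, 0), (52, 4, 4, 1, 0),
    (43, 21, 8, 1, 1), (44, 14, 5, 0, 0), (66, 3, 7, 2, 1), (35, 14, 9, 0, 1),
    (52, 13, 7, 2, 1), (35, 5, 9, 2, 1), (24, 3, 5, 1, 0), (18, 3, 7, 0, 1),
    (45, 9, 9, 0, 1),
]
features = ["Age", "Experience", "Rank", "Nationality"]
demo_df = pd.DataFrame(rows, columns=features + ["Go"])
X, y = demo_df[features], demo_df["Go"]

# A new show to classify, as a DataFrame so the column names match training.
sample = pd.DataFrame([[40, 10, 6.3, 0]], columns=features)

# Train five separate trees with the same seed and collect their predictions.
predictions = [
    int(DecisionTreeClassifier(random_state=42).fit(X, y).predict(sample)[0])
    for _ in range(5)
]
print(predictions)  # five identical values
```

Removing `random_state=42` restores the default behaviour, where repeated fits are allowed to differ.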