In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Review of Decision Trees - yet another classifier model

**Decision trees can be described by the following:**
- Considered to be one of the most mature and traditional algorithms in predictive analytics
- Used to solve classification problems through visual and explicit representations of decisions and decision making
- Essentially a map where you can follow the path according to your decisions at every step, and at the end you get your prediction
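
To make the "map" idea concrete, here is a minimal sketch of a tree written by hand as nested conditions. The features (`outlook`, `windy`) and the labels are hypothetical and chosen only for illustration; a real tree would learn its questions from data.

```python
def predict_activity(outlook, windy):
    """Follow a hand-written, hypothetical decision tree from root to leaf."""
    # Root node: the first question asked of every sample
    if outlook == "sunny":
        return "play"  # leaf node: a final prediction
    if outlook == "rainy":
        # Second decision, reached only on the rainy branch
        if windy:
            return "stay in"  # leaf node
        return "play"  # leaf node
    # Any other outlook (for example, overcast) falls through to this leaf
    return "play"


print(predict_activity("rainy", windy=True))
```

Each call walks one path through the tree, and the sequence of answers along that path is also the explanation for the prediction.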

## Why and when do we need Decision Trees?

- When features are categorical: when we can classify data into known groups
- When we want to model a set of sequential, hierarchical decisions that lead to some final result
- When we need to explain (to a boss) the reasoning behind a specific decision

## What are the components of Decision Trees?
**Here is what makes up a decision tree:**
- Root Node
- Leaf Node

We can obtain these root and leaf nodes via:
- Conditional probability
- Entropy
- Information Gain

# The steps and definitions behind building a Decision Tree

## Entropy
Entropy is a measure of the uncertainty of a random variable. The higher the entropy, the less certain we are about the value the variable will take on any given trial.

$H(Coin) = \sum -p(outcome) * \log_2(p(outcome))$

Where the coin has two outcomes, for example: p(H) = .5 and p(T) = .5

In [2]:
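def entropy(p):
    # Shannon entropy in bits of a list of outcome probabilities.
    # Zero-probability outcomes are skipped, since 0 * log2(0) is taken as 0.
    H = np.array([-i*np.log2(i) for i in p if i > 0]).sum()
    return H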
    
p = [.5, .5]
print(entropy(p))

1.0


A fair coin, where each side has probability .5, is the most uncertain coin we can flip, so its entropy is at the maximum of 1.0

In [3]:
p = [.9, .1]
print(entropy(p))

0.4689955935892812


Here we can see that an unfair coin has a much lower entropy, meaning we are more certain of the outcome
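
To see how entropy changes between these two cases, the sketch below sweeps the probability of heads from 0 to 1. The helper `binary_entropy` is a hypothetical name introduced here, not part of the notebook; it uses only the standard library and skips zero-probability outcomes by the usual convention that 0 * log2(0) counts as 0.

```python
import math


def binary_entropy(p_heads):
    """Entropy in bits of a coin that lands heads with probability p_heads."""
    probs = [p_heads, 1.0 - p_heads]
    # log2(1/p) equals -log2(p); outcomes with p == 0 are skipped
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


# Entropy should rise toward p(H) = 0.5 and fall again on the other side
for p_heads in [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]:
    print(f"p(H) = {p_heads:.1f} -> H = {binary_entropy(p_heads):.3f}")
```

The values are symmetric around p(H) = 0.5, and a coin that always lands on the same side has zero entropy because its outcome is never in doubt.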

## Conditional Probability
Conditional probability can be described as the answer to this question:  
>Given that some friends like Chocolate, what is the probability that they like Strawberry as well?

The formula that we can use to find that probability is this one:
  
  
$ P( Strawberry \mid Chocolate ) = \frac{P( Chocolate \cap Strawberry )}{P( Chocolate )} $
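
As a rough illustration of the formula, the sketch below applies it to a small made-up survey of eight friends. The data is invented for this example only, and the variable names are assumptions rather than anything defined in the notebook.

```python
# Made-up survey of eight friends: (likes_chocolate, likes_strawberry)
friends = [
    (True, True), (True, False), (True, True), (True, False),
    (False, True), (False, False), (True, True), (False, False),
]

n = len(friends)
# P(Chocolate): share of friends who like chocolate
p_chocolate = sum(choc for choc, _ in friends) / n
# P(Chocolate and Strawberry): share of friends who like both
p_both = sum(choc and straw for choc, straw in friends) / n
# P(Strawberry | Chocolate) from the formula
p_strawberry_given_chocolate = p_both / p_chocolate

# Cross-check: look only at chocolate lovers and count strawberry fans among them
chocolate_lovers = [straw for choc, straw in friends if choc]
direct_estimate = sum(chocolate_lovers) / len(chocolate_lovers)

print(p_strawberry_given_chocolate, direct_estimate)
```

Both routes give the same number, which is the point of the formula: conditioning on Chocolate amounts to restricting attention to the friends who like chocolate.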