# Decision Tree

The decision tree is one of the most popular machine learning algorithms and belongs to the supervised learning family. It is a structure for decision-making in which each decision leads to a set of consequences or to further decisions. Decision trees can be used for both classification and regression.

## Case Study: App Recommendation Engine

Say we are about to build a recommendation engine for the App Store or Google Play. The task is to recommend to each user the app they are most likely to download, based on previous data. Below is the data we have.

| Gender | Occupation | App |
|:-----:|---------|-------|
| F | Study | PUBG |
| F | Work | Slack |
| M | Work | Tinder |
| F | Work | Slack |
| M | Study | PUBG |
| M | Study | PUBG |

First, let's consider these questions.

- For a woman who works at an office, what app do we recommend?
- For a man who works at a factory, what app do we recommend?
- For a girl who goes to high school, what app do we recommend?

### The machine asks a slightly different question

- Between gender and occupation, which feature seems more decisive for predicting which app a user will download?
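One informal way to look at this question is to group the rows of the table by each feature and count the apps in every group. The sketch below does that in plain Python; the helper name `apps_by_feature` is made up for this illustration and is not part of any library.

```python
from collections import Counter, defaultdict

# The six rows of the table above: (gender, occupation, app).
rows = [
    ("F", "Study", "PUBG"),
    ("F", "Work", "Slack"),
    ("M", "Work", "Tinder"),
    ("F", "Work", "Slack"),
    ("M", "Study", "PUBG"),
    ("M", "Study", "PUBG"),
]


def apps_by_feature(rows, column):
    """Count the downloaded apps for each value of one feature column."""
    groups = defaultdict(Counter)
    for row in rows:
        # row[column] is the feature value, row[-1] is the app.
        groups[row[column]][row[-1]] += 1
    return {value: dict(counts) for value, counts in groups.items()}


by_gender = apps_by_feature(rows, 0)
by_occupation = apps_by_feature(rows, 1)
print("gender:", by_gender)
print("occupation:", by_occupation)
```

Every user in the `Study` group downloaded PUBG, while both gender groups contain a mix of apps, so occupation looks like the more decisive feature for this small dataset.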

### How about data with numerical/continuous features?

Consider a student admission dataset with two features: `Grades` and `Test`. We can ask a similar question, framed in a different way.

- After plotting the data, which would separate the classes better: a horizontal line or a vertical line?

## Entropy

To let the computer (the model) answer these questions, we introduce **entropy**. The concept is borrowed from physics, where entropy measures how much freedom particles have to move around. Consider the image below, which shows particles in different states.

![](../python-for-=d/assets/img/entropy-particles.png)

Ice has the lowest entropy, water has medium entropy, and gas has the highest entropy because its particles can move around the most.

### Entropy in Probability

Imagine you have 3 buckets. Each bucket contains 4 balls in a different configuration of colors.

- Bucket A - 4 red balls
- Bucket B - 3 red balls, 1 blue ball
- Bucket C - 2 red balls, 2 blue balls

Here, entropy measures how much the balls can be rearranged when we put them in a line. Which bucket has the highest entropy, which has medium entropy, and which has the lowest?
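One way to make "allowed to move around" concrete is to count how many visibly different lines each bucket can form. This is only an intuition aid, not the entropy formula itself, and the function name `distinct_arrangements` is a hypothetical helper for this sketch.

```python
from math import comb

# (red, blue) ball counts for the three buckets described above.
buckets = {"A": (4, 0), "B": (3, 1), "C": (2, 2)}


def distinct_arrangements(red, blue):
    """Number of distinguishable orderings of the balls in a line.

    Balls of the same color are interchangeable, so we only choose
    which positions the blue balls occupy.
    """
    return comb(red + blue, blue)


arrangements = {
    name: distinct_arrangements(red, blue)
    for name, (red, blue) in buckets.items()
}
print(arrangements)  # {'A': 1, 'B': 4, 'C': 6}
```

Bucket A can only be lined up one way, bucket B four ways, and bucket C six ways, which matches the ordering from lowest to highest entropy.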

### Entropy in Knowledge

Using the same buckets, now suppose you draw one ball from each bucket. Which bucket lets you predict the color with the most certainty? Entropy and knowledge are opposites: the bucket you know the most about has the lowest entropy, and the bucket you know the least about has the highest entropy. Which bucket is which?

## Entropy Formula

$$
entropy = - \sum_{i=1}^{n}{p_i} \cdot \log_2 p_i
$$
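Here $p_i$ is the proportion of items that belong to class $i$, and $n$ is the number of classes. As a minimal sketch, the formula can be applied to the three buckets of red and blue balls from the earlier section. The function takes raw counts and skips empty classes, following the usual convention that $0 \cdot \log_2 0$ counts as 0.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    # Skip empty classes: by convention 0 * log2(0) contributes 0.
    probabilities = [count / total for count in counts if count > 0]
    return sum(-p * log2(p) for p in probabilities)


bucket_a = entropy([4, 0])  # 4 red, 0 blue
bucket_b = entropy([3, 1])  # 3 red, 1 blue
bucket_c = entropy([2, 2])  # 2 red, 2 blue

print(f"A: {bucket_a:.4f}")  # A: 0.0000
print(f"B: {bucket_b:.4f}")  # B: 0.8113
print(f"C: {bucket_c:.4f}")  # C: 1.0000
```

The bucket with a single color has zero entropy, and the evenly mixed bucket reaches the maximum of 1 bit for two classes.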

## Information Gain

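In a decision tree, information gain is the difference between the entropy of a parent node and the weighted average entropy of its child nodes, where each child is weighted by the fraction of the parent's samples it receives. When building the tree, the algorithm evaluates the candidate splits and prefers the one with the highest information gain.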
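As a worked sketch, information gain can be computed for the two features of the app recommendation table from the case study. The helper names `entropy` and `information_gain` are written for this illustration; they are not library functions.

```python
from collections import Counter
from math import log2

# The six rows of the case-study table: (gender, occupation, app).
rows = [
    ("F", "Study", "PUBG"),
    ("F", "Work", "Slack"),
    ("M", "Work", "Tinder"),
    ("F", "Work", "Slack"),
    ("M", "Study", "PUBG"),
    ("M", "Study", "PUBG"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(
        -(count / total) * log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, column):
    """Parent entropy minus the weighted average entropy of the children."""
    parent_labels = [row[-1] for row in rows]
    # Split the labels into one child group per feature value.
    children = {}
    for row in rows:
        children.setdefault(row[column], []).append(row[-1])
    weighted_children = sum(
        len(labels) / len(rows) * entropy(labels)
        for labels in children.values()
    )
    return entropy(parent_labels) - weighted_children


gain_gender = information_gain(rows, 0)
gain_occupation = information_gain(rows, 1)
print(f"gender:     {gain_gender:.3f}")      # gender:     0.541
print(f"occupation: {gain_occupation:.3f}")  # occupation: 1.000
```

Splitting on occupation yields the larger gain, so a tree built on this data would ask about occupation first.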

## Hyperparameters for Decision Trees

To create decision trees that generalize well to new data, we can tune several aspects of the trees. These aspects are called "hyperparameters". The following are some of the most important hyperparameters for decision trees:

### Maximum depth
The maximum depth of a decision tree is simply the length of the longest path from the root to a leaf. A binary tree of maximum depth `k` can have at most $2^k$ leaves.

### Minimum number of samples per leaf
This is the minimum number of samples that must end up in a leaf node. When splitting a node, we could run into a split that puts 99 samples in one child and only 1 in the other. Such a split teaches the model very little, wastes resources and time, and tends to fit noise. To avoid it, we can set a minimum number of samples allowed in each leaf.

### Minimum number of samples per split
This is similar to the minimum number of samples per leaf, but it applies to the node being split: an internal node must contain at least this many samples before it may be split at all.
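The two minimum-sample rules can be sketched as a simple check on a proposed split. This is a simplified illustration, not how any particular library implements it. The function `split_allowed` is hypothetical, its parameter names mirror scikit-learn's `min_samples_split` and `min_samples_leaf`, and the default values are assumptions for this example.

```python
def split_allowed(n_left, n_right, min_samples_split=2, min_samples_leaf=1):
    """Return True if a split into n_left and n_right samples is permitted."""
    n_node = n_left + n_right
    # The node itself must be large enough to be considered for splitting.
    if n_node < min_samples_split:
        return False
    # Each resulting child must keep at least min_samples_leaf samples.
    return n_left >= min_samples_leaf and n_right >= min_samples_leaf


print(split_allowed(99, 1, min_samples_leaf=5))   # False: one child is too small
print(split_allowed(60, 40, min_samples_leaf=5))  # True
print(split_allowed(3, 3, min_samples_split=10))  # False: node is too small to split
```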

### Maximum number of features
Often we have too many features to build a tree efficiently. At every split we would have to evaluate the data against each feature, which is very expensive. One solution is to limit the number of features considered at each split.
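A minimal sketch of this idea: at each split, draw a random subset of the features and only evaluate those. The function `candidate_features` is hypothetical, and the `Age` and `Country` features are invented for the example; they do not appear in the data above.

```python
import random


def candidate_features(feature_names, max_features, rng):
    """Pick at most max_features feature names to evaluate at one split."""
    k = min(max_features, len(feature_names))
    return rng.sample(feature_names, k)


rng = random.Random(0)  # seeded so the example is repeatable
# "Age" and "Country" are made-up features for illustration only.
features = ["Gender", "Occupation", "Age", "Country"]
subset = candidate_features(features, 2, rng)
print(subset)
```

Each split would then compute information gain only for the features in `subset`, which reduces the work per split at the cost of possibly missing the best feature.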

---

## Decision Trees in Scikit-Learn