# Decision Tree and Random Forest Basics

## Decision Trees

## What is a Decision Tree?

A decision tree is a flowchart-like structure in which each internal node represents a test on a feature (e.g. whether a coin flip comes up heads or tails), each leaf node represents a class label (decision taken after computing all features) and branches represent conjunctions of features that lead to those class labels. 

The paths from root to leaf represent classification rules.

The diagram below illustrates the basic flow of a decision tree that decides between two labels: Rain (Yes) and No Rain (No).

![Capture.PNG](attachment:Capture.PNG)

## Decision trees have four very important attributes:

1. Nodes: Split on the value of a certain attribute (outlook, humidity and windy features in our case)
2. Edges: Outcome of a split, leading to the next node
3. Root: The node that performs the first split (outlook feature in our case)
4. Leaves: Terminal nodes that predict the outcome (Yes and No nodes in our case)
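
To make these four parts concrete, here is a minimal sketch of a tree stored as nested dictionaries. The branch values (`sunny`, `overcast`, `rainy`, `high`, `normal`) and the helper names `predict` and `rules` are assumptions made for illustration and may not match the diagram above.

```python
# Hypothetical tree: branch values are assumed for illustration only.
TREE = {
    "feature": "outlook",              # root: the node that performs the first split
    "branches": {                      # edges: one per outcome of the split
        "sunny": {
            "feature": "humidity",     # internal node
            "branches": {"high": "No", "normal": "Yes"},
        },
        "overcast": "Yes",             # leaf: a plain label
        "rainy": {
            "feature": "windy",
            "branches": {True: "No", False: "Yes"},
        },
    },
}


def predict(node, sample):
    """Follow edges from the root until a leaf (a plain label) is reached."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["feature"]]]
    return node


def rules(node, path=()):
    """Yield one (conditions, label) pair per root-to-leaf path."""
    if not isinstance(node, dict):
        yield path, node
        return
    for value, child in node["branches"].items():
        yield from rules(child, path + ((node["feature"], value),))


sample = {"outlook": "sunny", "humidity": "normal", "windy": False}
print(predict(TREE, sample))
for conditions, label in rules(TREE):
    print(" AND ".join(f"{f}={v}" for f, v in conditions), "->", label)
```

Each printed line is one classification rule, that is, one path from the root to a leaf.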

## Entropy and Information Gain are the mathematical methods of choosing the best split


### What is Entropy?
- Entropy measures how mixed (impure) the class labels in a set of samples are. A decision tree uses it to decide how to split the data, so it directly affects how the tree draws its boundaries.
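
As a rough sketch, entropy for a list of class labels can be computed with the standard Shannon formula $H(S) = -\sum_i p_i \log_2 p_i$, where $p_i$ is the fraction of samples in class $i$. The function name `entropy` and the example labels are chosen here for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(count / total) * log2(count / total)
               for count in Counter(labels).values())


print(entropy(["Yes", "Yes", "Yes", "Yes"]))  # pure node: no uncertainty
print(entropy(["Yes", "No", "Yes", "No"]))    # evenly mixed node: maximum for two classes
```

A pure node has entropy 0, and an evenly mixed two-class node has entropy 1 bit.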

### What is Information Gain and why does it matter in a Decision Tree?
- Information gain (IG) measures how much “information” a feature gives us about the class: it is the reduction in entropy obtained by splitting the data on that feature.
- Information gain is the key criterion that Decision Tree algorithms use to construct a Decision Tree.
- A Decision Tree algorithm tries to maximize information gain at each split.
- The attribute with the highest information gain is tested/split first.
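
The points above can be sketched in a few lines: information gain is the entropy of the parent set minus the size-weighted entropy of the subsets produced by a split. The toy rows, labels, and function names below are made up for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(count / total) * log2(count / total)
               for count in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Parent entropy minus the weighted entropy of the groups after splitting."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[feature]].append(label)
    weighted = sum(len(group) / len(labels) * entropy(group)
                   for group in groups.values())
    return entropy(labels) - weighted


# Made-up data in which "outlook" separates the classes perfectly
# and "windy" tells us nothing about the class.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["No", "No", "Yes", "Yes"]

gains = {feature: information_gain(rows, labels, feature)
         for feature in ("outlook", "windy")}
best = max(gains, key=gains.get)
print(gains, "-> split first on", best)
```

Because `outlook` has the highest gain in this toy data, it would be chosen for the first split.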

## Amazing article on Entropy and Information Gain: [Entropy and Information Gain](https://medium.com/coinmonks/what-is-entropy-and-why-information-gain-is-matter-4e85d46d2f01)

## Random Forests


A decision tree is the building block of a random forest and is an intuitive model. We can think of a decision tree as a series of yes/no questions asked about our data eventually leading to a predicted class (or continuous value in the case of regression). This is an interpretable model because it makes classifications much like we do: we ask a sequence of queries about the available data we have until we arrive at a decision (in an ideal world).


The random forest combines hundreds or thousands of decision trees, trains each one on a slightly different set of the observations, and splits nodes in each tree considering only a limited number of the features. The final predictions of the random forest are made by averaging the predictions of the individual trees (or, for classification, by taking the class with the most votes).

## Random Forest Concept

The random forest is a model made up of many decision trees. Rather than simply averaging the predictions of trees (which we could call a “forest”), this model uses two key concepts that give it the name random:

1. Random sampling of training data points when building trees
2. Random subsets of features considered when splitting nodes


#### Random sampling of training observations
When training, each tree in a random forest learns from a random sample of the data points. The samples are drawn with replacement, known as bootstrapping, which means that some samples will be used multiple times in a single tree and others not at all. The idea is that by training each tree on different samples, each tree might have high variance with respect to its particular set of training data, but the forest as a whole will have lower variance without a corresponding increase in bias.

At test time, predictions are made by averaging the predictions of each decision tree. This procedure of training each individual learner on different bootstrapped subsets of the data and then averaging the predictions is known as bagging, short for bootstrap aggregating.
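
The following is a minimal, standard-library sketch of bagging under simplifying assumptions: a toy nearest-neighbour predictor stands in for a decision tree, and the data points and function names are made up. It only illustrates the two steps (bootstrap, then aggregate), not how a real random forest is implemented.

```python
import random
from statistics import mean


def bootstrap_sample(data, rng):
    """Draw len(data) points from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_nearest_neighbour(sample):
    """Toy high-variance learner standing in for a decision tree (assumption):
    it predicts the target of the training point whose x is closest."""
    def predict(x):
        return min(sample, key=lambda point: abs(point[0] - x))[1]
    return predict


def bagged_predict(data, x, n_learners=100, seed=0):
    """Train each learner on its own bootstrap sample, then average."""
    rng = random.Random(seed)
    learners = [fit_nearest_neighbour(bootstrap_sample(data, rng))
                for _ in range(n_learners)]
    return mean(learner(x) for learner in learners)


# Made-up (x, y) training points.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 4.0), (4.0, 8.0), (5.0, 6.0), (6.0, 7.0)]

sample = bootstrap_sample(data, random.Random(0))
print(len(set(sample)), "distinct points out of", len(sample))
print(bagged_predict(data, x=3.5))
```

Each learner sees a different resampled data set, so its individual prediction varies, while the average over all learners is more stable.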

#### Random Subsets of features for splitting nodes
The other main concept in the random forest is that only a subset of all the features is considered for splitting each node in each decision tree. Generally this is set to sqrt(n_features) for classification, meaning that if there are 16 features, only 4 random features will be considered for splitting each node in each tree. (The random forest can also be trained considering all the features at every node, as is common in regression. These options can be controlled in the Scikit-Learn Random Forest implementation.)
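
As a hedged illustration, the sketch below sets this option through the `max_features` parameter of Scikit-Learn's `RandomForestClassifier`. It assumes scikit-learn is installed; the synthetic 16-feature data set comes from `make_classification`, and the other parameter values are arbitrary choices for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 16 features (made up for illustration).
X, y = make_classification(n_samples=200, n_features=16, random_state=0)

forest = RandomForestClassifier(
    n_estimators=50,       # number of trees (arbitrary choice here)
    max_features="sqrt",   # consider sqrt(n_features) features at each split
    bootstrap=True,        # train each tree on a bootstrap sample
    random_state=0,
)
forest.fit(X, y)

# Each fitted tree records how many features it considers per split.
features_per_split = forest.estimators_[0].max_features_
print(features_per_split)
```

Passing `max_features=None` instead would make every split consider all features.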

## What exactly does Random Forest do?

Random Forest is used as a way to improve the performance of a single decision tree. The primary weakness of a decision tree is that it does not tend to have the best predictive accuracy, which is partially due to its high variance (which means that different splits of the training data can lead to very different trees).

## Random Forest in Simple Terms

Random Forest is an ensemble of randomized decision trees. Each decision tree gives a vote for the prediction of the target variable. Random forest chooses the prediction that gets the most votes.
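
The voting step can be sketched as follows. The list of per-tree predictions is made up, and `majority_vote` is a name chosen for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical predictions from five trees for a single sample.
tree_votes = ["Yes", "No", "Yes", "Yes", "No"]
print(majority_vote(tree_votes))  # "Yes" wins with 3 of 5 votes
```

In this sketch a tie is resolved in favour of the class that appears first in the list.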

An ensemble learning model aggregates multiple machine learning models to give better performance. In a random forest we use multiple randomized decision trees for better accuracy.

Random Forest is an ensemble bagging algorithm designed to achieve low prediction error. It reduces the variance of the individual decision trees by training them on random samples of the data and then either averaging their predictions or picking the class that gets the most votes.

Bagging is a method for generating multiple versions of a predictor and using them to get an aggregated predictor.

## Amazing articles on Random Forest

- [Random Forest Article 1](https://medium.com/datadriveninvestor/decision-tree-and-random-forest-e174686dd9eb)
- [Random Forest Article 2](https://towardsdatascience.com/an-implementation-and-explanation-of-the-random-forest-in-python-77bf308a9b76)
- [Random Forest Article 3](https://towardsdatascience.com/enchanted-random-forest-b08d418cb411)