A Random Forest outputs a list of the features that are important in predicting the target variable. Random forests:

* Perform both regression and classification tasks
* Develop decision trees based on random selection of data and variables
* Use averaging to improve the predictive accuracy and control over-fitting
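
As a rough illustration of the feature list mentioned above, the sketch below fits a forest and reads its `feature_importances_` attribute. The synthetic dataset from `make_classification`, the parameter values, and the variable names are assumptions chosen for the example, not part of the original notes.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 5 features, 2 informative.
X, y = make_classification(
    n_samples=200,
    n_features=5,
    n_informative=2,
    n_redundant=0,
    shuffle=False,  # keeps the informative features in columns 0 and 1
    random_state=0,
)

forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X, y)

# One non-negative score per feature; the scores sum to 1.
importances = forest.feature_importances_

# Sort feature indices from most to least important.
ranking = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
for index, score in ranking:
    print(f"feature {index}: {score:.3f}")
```

Higher scores mean the feature contributed more to the splits across the trees. The exact values depend on the data and on `random_state`.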

### How does it work?

Assume we have training data D = {(X1, Y1), ..., (Xn, Yn)}:

For b = 1 to B (b indexes a single tree, B is the number of trees in the random forest):
    
    1) Draw a bootstrap sample X* of size n (sampled with replacement) from the training data
    
    2) Grow a random-forest tree Tb on the bootstrapped data by recursively repeating the following steps for each terminal node until the minimum node size $n_{min}$ is reached:
        i. Select m variables at random from the p variables
        ii. Pick the best variable/split-point among the m
        iii. Split the node into two daughter nodes
Output the ensemble of trees {Tb}
    
    1) In a forest with T trees, each tree is indexed by $t \in \{1, ..., T\}$
    2) All the trees are trained independently
    3) During testing, each test point v is simultaneously pushed through all trees (starting at the root) until it reaches the corresponding leaves; the leaf predictions are then combined across the trees, for example by averaging
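
The steps above can be sketched in a few lines. This is a simplified illustration rather than how any particular library implements it. The helper names `fit_forest` and `predict_forest` are hypothetical, scikit-learn's `DecisionTreeClassifier` stands in for the tree-growing step (its `max_features` option selects m variables at random at each split, and `min_samples_leaf` plays the role of the minimum node size), and the trees are combined by a plain majority vote.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def fit_forest(X, y, n_trees=10, m="sqrt", min_node_size=1, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X), np.asarray(y)
    n = X.shape[0]
    trees = []
    for b in range(n_trees):
        # Step 1: bootstrap sample of size n, drawn with replacement.
        rows = rng.integers(0, n, size=n)
        # Step 2: grow a tree that considers only m randomly chosen
        # variables at each split and stops at the minimum node size.
        tree = DecisionTreeClassifier(
            max_features=m,
            min_samples_leaf=min_node_size,
            random_state=int(rng.integers(0, 2**31 - 1)),
        )
        trees.append(tree.fit(X[rows], y[rows]))
    return trees


def predict_forest(trees, X):
    """Push every point through every tree and take a majority vote."""
    votes = np.stack([tree.predict(X) for tree in trees])  # (n_trees, n_points)
    predictions = []
    for column in votes.T:  # all votes cast for one test point
        labels, counts = np.unique(column, return_counts=True)
        predictions.append(labels[np.argmax(counts)])
    return np.array(predictions)


X = [[1, 0, 2], [3, 6, 1], [0, 2, 4], [8, 9, 0], [5, 5, 1]]
Y = [0, 1, 0, 1, 0]
trees = fit_forest(X, Y, n_trees=25)
print(predict_forest(trees, X))
```

Each tree is fitted without reference to the others, so the loop could be run in parallel. For regression, the vote would be replaced by the mean of the trees' outputs.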

### Forests of randomized trees
* Forest classifiers have to be fitted with two arrays:
    
    1) A sparse or dense array X of size [n_samples, n_features] for the training samples
    
    2) An array Y of size [n_samples] holding the target values (class labels) for the training samples

In [3]:
from sklearn.ensemble import RandomForestClassifier
X = [[1, 0, 2], [3, 6, 1], [0, 2, 4], [8, 9, 0], [5, 5, 1]]
Y = [0, 1, 0, 1, 0]
# n_estimators: number of trees in the forest
clf = RandomForestClassifier(n_estimators=10)
clf = clf.fit(X, Y)

In [4]:
clf

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
            verbose=0, warm_start=False)
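
Once fitted, the classifier can label new samples. The sketch below refits the same toy data and calls `predict` and `predict_proba`. The two points in `new_points` are made up for illustration, and `random_state=0` is an addition used only so that repeated runs give the same forest.

```python
from sklearn.ensemble import RandomForestClassifier

X = [[1, 0, 2], [3, 6, 1], [0, 2, 4], [8, 9, 0], [5, 5, 1]]
Y = [0, 1, 0, 1, 0]

# random_state is an addition here, used only to make reruns repeatable.
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, Y)

new_points = [[2, 1, 3], [7, 8, 0]]  # hypothetical unseen samples

labels = clf.predict(new_points)               # one class label per sample
probabilities = clf.predict_proba(new_points)  # one row per sample, one column per class

print(labels)
print(probabilities)
print(len(clf.estimators_))  # number of fitted trees, equal to n_estimators
```

The columns of `predict_proba` follow the order of `clf.classes_`, and each row sums to 1.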

### Reference:
* https://www.youtube.com/watch?v=3kYujfDgmNk
* Random Forest Classifier: http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html