# Validating: Random Forests

||continuous|categorical|
|---|---|---|
|**supervised**|**regression**|**classification**|
|unsupervised|dimension reduction|clustering|

In the previous notebook we saw Random Forest at work. In this one we look at how well it is working.

In [11]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from sklearn.multiclass import OneVsRestClassifier
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits

# use seaborn plotting defaults
import seaborn as sns; sns.set()

# import warnings
# warnings.filterwarnings('ignore')
from sklearn import metrics


### Precision versus Recall

* **Precision** (also called positive predictive value) is the fraction of relevant instances among the retrieved instances.
* **Recall** (also known as sensitivity) is the fraction of relevant instances that have been retrieved out of the total number of relevant instances.



Both precision and recall are therefore based on an understanding and measure of relevance.

***For Example*** : 
- Suppose a computer program for recognizing dogs in photographs identifies eight dogs in a picture containing 12 dogs and some cats. 
    - Of the eight dogs identified, five actually are dogs (true positives), while the rest are cats (false positives). 
    - The program's precision is 5/8 while its recall is 5/12. 
    
    
    
    
- Suppose a search engine returns 30 pages, only 20 of which are relevant, while failing to return 40 additional relevant pages.
    - Its precision is 20/30 = 2/3 while its recall is 20/60 = 1/3.
    - So, in this case, precision is "how useful the search results are", and recall is "how complete the results are".
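
The arithmetic in both examples can be checked with a few lines of Python. This is a minimal sketch: the helper name `precision_recall` is made up for illustration, and the counts are taken directly from the two examples (dogs: 5 true positives, 3 false positives, 7 missed dogs; search: 20 relevant pages returned, 10 irrelevant pages returned, 40 relevant pages missed).

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# Dog example: 8 detections, 5 of them correct, 12 dogs in the picture
dog_precision, dog_recall = precision_recall(5, 3, 7)

# Search example: 30 pages returned, 20 relevant, 40 relevant pages missed
search_precision, search_recall = precision_recall(20, 10, 40)

print(dog_precision, dog_recall)
print(search_precision, search_recall)
```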



<div id="container"> <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/26/Precisionrecall.svg/440px-Precisionrecall.svg.png" class="center" alt="Precision and recall" style="width: 300px;"/> </div>



In statistical terms, take the null hypothesis to be that ***all items are irrelevant***:
- the hypothesis is accepted or rejected based on the number of items selected compared with the sample size
- the absence of type I and type II errors corresponds, respectively, to maximum precision (no false positives) and maximum recall (no false negatives)
- The pattern recognition example above
    - eight dogs identified in a picture containing 12 dogs and some cats,
        - five actually are dogs (true positives), while the rest are cats (false positives)
    - contained 8 − 5 = 3 **Type I errors** and 12 − 5 = 7 **Type II errors**.
    
    
- **Precision** can be seen as a measure of ***exactness or quality***, whereas **Recall** is a measure of ***completeness or quantity.*** 

In simple terms, high precision means that an algorithm returned substantially more relevant results than irrelevant ones, while high recall means that an algorithm returned most of the relevant results. [Source](https://en.wikipedia.org/wiki/Precision_and_recall)
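
The Type I and Type II error counts for the dog example can be read off a confusion matrix. The sketch below rebuilds the example as label arrays; the text only says "some cats", so the total of 10 cats is an assumption made purely so the arrays can be constructed (any number of at least 3 would give the same precision and recall).

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

n_dogs = 12
n_cats = 10  # assumption: the example only says "some cats"

# 1 = dog, 0 = cat
y_true_dogs = np.array([1] * n_dogs + [0] * n_cats)

# 5 dogs found and 7 dogs missed, then 3 cats flagged as dogs and 7 cats left alone
y_pred_dogs = np.array([1] * 5 + [0] * 7 + [1] * 3 + [0] * 7)

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true_dogs, y_pred_dogs).ravel()

print("Type I errors (false positives):", fp)
print("Type II errors (false negatives):", fn)
print("precision:", precision_score(y_true_dogs, y_pred_dogs))
print("recall:", recall_score(y_true_dogs, y_pred_dogs))
```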



**$F1\_Score$** (also F-score or F-measure) is a measure of a test's accuracy that combines precision and recall into a single number, their harmonic mean:

$$F1\_Score = 2 \times \frac{precision \times recall}{precision + recall}$$
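
As a quick check of the formula, the sketch below applies it to the dog and search-engine examples from earlier in this notebook. The helper name `f1_score_from` is hypothetical, and `fractions.Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Dog example: precision 5/8, recall 5/12
dog_f1 = f1_score_from(Fraction(5, 8), Fraction(5, 12))

# Search example: precision 2/3, recall 1/3
search_f1 = f1_score_from(Fraction(2, 3), Fraction(1, 3))

print(dog_f1, search_f1)  # 1/2 4/9
```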

### Another Example: Let's complicate the model a little more

Reload the Digits dataset.

In [4]:
# Learn to predict each class against all the others (one-vs-rest)
from sklearn import svm

digits = load_digits()

X = digits.data
y = digits.target

Binarize the labels and use SVM

In [5]:
# Binarize the output
y = label_binarize(y, classes=range(0,10))
n_classes = y.shape[1]
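
To see what binarizing does, here is a small sketch on a toy label vector (the three-class labels are invented for illustration). Each label becomes a row with a single 1 in the column of its class, so the digits targets turn into a matrix with one column per digit.

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# Toy labels drawn from three classes: 0, 1 and 2
toy_labels = np.array([0, 2, 1, 2])

# One row per sample, one column per class
toy_binarized = label_binarize(toy_labels, classes=[0, 1, 2])

print(toy_binarized)
print(toy_binarized.shape)
```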

Do a train-test split and train the classifier. Here we want to see how well each class can be separated from all the other classes, so we use [OneVsRestClassifier](http://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html#sklearn.multiclass.OneVsRestClassifier).

In [9]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.7,
                                                    random_state=0)

random_state = np.random.RandomState(0)
classifier = OneVsRestClassifier(svm.SVC(kernel='linear', probability=True,
                                 random_state=random_state))
fit_model = classifier.fit(X_train, y_train)
y_score = fit_model.decision_function(X_test)
y_pred_svm = fit_model.predict(X_test)
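
The decision scores computed above are not used further in this notebook, although `roc_curve` and `auc` are imported at the top. One possible use, sketched here as an assumption rather than as part of the original analysis, is a per-class ROC AUC. The block repeats the setup so it runs on its own; names such as `roc_auc_per_class` are illustrative.

```python
from sklearn import svm
from sklearn.datasets import load_digits
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

# Same setup as above: binarized digit labels and a 30/70 train/test split
digits_data = load_digits()
y_bin = label_binarize(digits_data.target, classes=list(range(10)))
X_tr, X_te, y_tr, y_te = train_test_split(
    digits_data.data, y_bin, test_size=0.7, random_state=0
)

ovr = OneVsRestClassifier(svm.SVC(kernel="linear", random_state=0))
scores = ovr.fit(X_tr, y_tr).decision_function(X_te)

# One ROC curve, and the area under it, for each digit class
roc_auc_per_class = {}
for k in range(y_bin.shape[1]):
    fpr, tpr, _ = roc_curve(y_te[:, k], scores[:, k])
    roc_auc_per_class[k] = auc(fpr, tpr)

for k, value in roc_auc_per_class.items():
    print(f"digit {k}: AUC = {value:.3f}")
```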

**Classification report**

In [12]:
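# Note: classification_report expects (y_true, y_pred). With predictions passed
# first, as here, the precision and recall columns are swapped and "support"
# counts predicted labels rather than true ones.
print(metrics.classification_report(y_pred_svm, y_test))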

              precision    recall  f1-score   support

           0       0.98      0.97      0.97       130
           1       0.90      0.86      0.88       129
           2       1.00      0.98      0.99       122
           3       0.91      0.80      0.85       148
           4       0.94      0.97      0.95       117
           5       0.91      1.00      0.95       127
           6       0.94      0.98      0.96       122
           7       0.97      0.93      0.95       121
           8       0.77      0.70      0.73       140
           9       0.89      0.95      0.92       117

   micro avg       0.92      0.91      0.91      1273
   macro avg       0.92      0.91      0.92      1273
weighted avg       0.92      0.91      0.91      1273
 samples avg       0.92      0.88      0.89      1273
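
When reading this report, keep in mind that scikit-learn's metric functions take the ground truth first, as `(y_true, y_pred)`. The call above passes the predictions first, which swaps the roles of precision and recall. The toy sketch below (labels invented for illustration) shows the effect with `precision_score` and `recall_score`.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Toy binary labels: 2 true positives, 1 false positive, 2 false negatives
y_true_toy = np.array([1, 1, 1, 1, 0, 0])
y_pred_toy = np.array([1, 1, 0, 0, 1, 0])

# Conventional argument order: ground truth first
precision_ok = precision_score(y_true_toy, y_pred_toy)
recall_ok = recall_score(y_true_toy, y_pred_toy)

# Swapped argument order: the two metrics trade places
precision_swapped = precision_score(y_pred_toy, y_true_toy)
recall_swapped = recall_score(y_pred_toy, y_true_toy)

print(precision_ok, recall_ok)
print(precision_swapped, recall_swapped)
```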





<small><i>Reference [GitHub](https://github.com/jakevdp/sklearn_pycon2015/).</i></small>