## Performance Measures

### Measuring Accuracy Using Cross-Validation
* Create a clone of the classifier for each CV fold: `from sklearn.base import clone ... clone_clf = clone(sgd_clf)`
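
A minimal sketch of the per-fold cloning idea, assuming a toy dataset from `make_classification` and an `SGDClassifier` standing in for `sgd_clf` (both are illustrative assumptions, not the data used in the notes). Each fold trains a fresh, unfitted copy, so the original estimator stays untouched.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import StratifiedKFold

# Toy binary dataset (assumption: stands in for the real training set).
X_train, y_train = make_classification(n_samples=300, n_features=10, random_state=42)

sgd_clf = SGDClassifier(random_state=42)
skfolds = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

fold_accuracies = []
for train_index, test_index in skfolds.split(X_train, y_train):
    # clone() copies the hyperparameters but none of the fitted state.
    clone_clf = clone(sgd_clf)
    clone_clf.fit(X_train[train_index], y_train[train_index])
    y_pred = clone_clf.predict(X_train[test_index])
    fold_accuracies.append((y_pred == y_train[test_index]).mean())

print(fold_accuracies)
```

This is roughly what `cross_val_score(..., scoring="accuracy")` does, written out by hand so each step can be customized.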

### Confusion Matrix
* `confusion_matrix` in `sklearn` puts the negative class in the first row and the positive class in the second row: `[[TN, FP], [FN, TP]]`

```
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

y_train_pred = cross_val_predict(clf, X_train, y_train, cv=3)
confusion_matrix(y_train, y_train_pred)
```
* For multi-class, rows represent actual classes, while columns represent predicted classes.
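
To make the `[[TN, FP], [FN, TP]]` layout concrete, here is a hand-rolled count for 0/1 labels. The helper `binary_confusion_matrix` and the label lists are hypothetical illustrations, not part of `sklearn`.

```python
def binary_confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for 0/1 labels.

    Rows are actual classes, columns are predicted classes.
    """
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


# Made-up labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 0, 1, 0]

cm = binary_confusion_matrix(y_true, y_pred)
(tn, fp), (fn, tp) = cm
print(cm)  # [[4, 1], [1, 2]]
```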

### Precision and Recall 
* `from sklearn.metrics import precision_score, recall_score, f1_score`
* precision = $\dfrac{TP}{TP+FP}$
* recall = $\dfrac{TP}{TP+FN}$
* $F_1$ score: the harmonic mean of precision and recall. It favors classifiers that have similar precision and recall.

$$F_1 = \dfrac{2}{\frac{1}{precision}+\frac{1}{recall}} = 2\times \dfrac{precision\times recall}{precision+recall}$$
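
A short worked example of the formulas above, using made-up counts. The helper `precision_recall_f1` is hypothetical. Both classifiers have the same arithmetic mean of precision and recall (0.8), but the lopsided one gets a lower $F_1$.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts.

    Assumes tp + fp > 0 and tp + fn > 0 (no zero-division handling).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Similar precision and recall: 0.8 and 0.8.
balanced = precision_recall_f1(tp=80, fp=20, fn=20)
# Same arithmetic mean, but precision 1.0 and recall 0.6.
lopsided = precision_recall_f1(tp=60, fp=0, fn=40)

print(balanced)
print(lopsided)
```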

### Precision-Recall Tradeoff
```
from sklearn.metrics import precision_recall_curve

precisions, recalls, thresholds = precision_recall_curve(y_train, y_scores)
```
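
The tradeoff itself can be seen by sweeping the decision threshold by hand. This is a sketch with made-up scores and a hypothetical helper `precision_recall_at`, not the algorithm `precision_recall_curve` uses internally. In this example, raising the threshold raises precision and lowers recall.

```python
def precision_recall_at(y_true, y_scores, threshold):
    """Precision and recall when predicting positive for score >= threshold."""
    tp = sum(1 for t, s in zip(y_true, y_scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, y_scores) if s >= threshold and t == 0)
    fn = sum(1 for t, s in zip(y_true, y_scores) if s < threshold and t == 1)
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn)
    return precision, recall


# Made-up labels and decision scores for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_scores = [-3.0, -1.5, -1.0, 0.5, 1.0, 2.0, 2.5, 4.0]

results = {thr: precision_recall_at(y_true, y_scores, thr) for thr in (-2.0, 0.0, 3.0)}
for thr, (precision, recall) in results.items():
    print(f"threshold={thr:+.1f}  precision={precision:.2f}  recall={recall:.2f}")
```

Note that precision does not always rise monotonically with the threshold on real data, whereas recall can only fall or stay flat.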

### ROC Curve
* The ROC curve plots the *true positive rate (recall)* against the *false positive rate (= FP/(FP+TN))*; in other words, it plots *sensitivity* versus *1 - specificity*.
* <span style="color:blue"> *pp. 91* </span>

    ```
    from sklearn.metrics import roc_curve, roc_auc_score

    fpr, tpr, thresholds = roc_curve(y_train, y_scores)
    roc_auc_score(y_train, y_scores)
    ```

* For these plots, we need `y_scores`, which can be obtained from either `decision_function()` or `predict_proba()`.

* <span style="color:blue"> *pp.88, 92*</span>

    ```
    y_scores = cross_val_predict(clf, X_train, y_train, cv=3, method = "decision_function")
    ```

    OR

    ```
    y_probas = cross_val_predict(clf, X_train, y_train, cv=3, method = "predict_proba")
    y_scores = y_probas[:,1]  # score = proba of positive class
    ```
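
To tie the ROC axes back to sensitivity and specificity, the sketch below computes a few ROC points by hand. The helper `sensitivity_specificity` and the data are hypothetical; `roc_curve` computes the same points for every distinct threshold.

```python
def sensitivity_specificity(y_true, y_scores, threshold):
    """Return (sensitivity, specificity) when predicting positive for score >= threshold."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_scores):
        if label == 1:
            if score >= threshold:
                tp += 1
            else:
                fn += 1
        else:
            if score >= threshold:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels and scores for illustration only.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_scores = [-3.0, -1.5, -1.0, 0.5, 1.0, 2.0, 2.5, 4.0]

roc_points = []
for threshold in (3.0, 0.0, -2.0):
    sensitivity, specificity = sensitivity_specificity(y_true, y_scores, threshold)
    # x axis: FPR = 1 - specificity, y axis: TPR = sensitivity (recall)
    roc_points.append((1 - specificity, sensitivity))

print(roc_points)  # [(0.0, 0.25), (0.5, 0.75), (0.75, 1.0)]
```

Lowering the threshold moves the point up and to the right: more true positives are caught, at the cost of more false positives.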

## Multiclass Classification
* Some algorithms (Random Forest, Naive Bayes) are capable of handling multi-class tasks directly, others (SVM, Linear Classifiers) are strictly binary.
* For $N$ classes, *one-versus-all* strategy requires $N$ classifiers, *one-versus-one* (distinguish every pair of potential classes) requires $N(N-1)/2$ classifiers.
    * For a new instance, run all the `clfs`. With OvA, select the class whose classifier outputs the highest score; with OvO, select the class that wins the most pairwise duels.
    * The **advantage** of OvO is that each `clf` only needs to be trained on the part of the training set for the two classes that it must distinguish. For algorithms that scale poorly with the size of the training set (e.g., SVM), OvO is preferred.
    * For most binary clf algorithms, OvA is preferred.
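* Scikit-Learn detects when a binary classification algorithm is used for a multiclass task and automatically runs OvA (except for SVM classifiers, for which it uses OvO). To force OvO: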
    
    ```
    from sklearn.multiclass import OneVsOneClassifier
    
    clf = ...
    ovo_clf = OneVsOneClassifier(clf)
    ```
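
The classifier counts quoted above can be checked by enumerating the pairs directly. This is a small illustration with a hypothetical helper `n_classifiers`, using 10 classes (as for the digits 0 to 9).

```python
from itertools import combinations


def n_classifiers(n_classes):
    """Return (OvA count, OvO count) for a problem with n_classes classes."""
    ova = n_classes  # one "class k vs. the rest" classifier per class
    ovo_pairs = list(combinations(range(n_classes), 2))  # one classifier per pair
    return ova, len(ovo_pairs)


ova_count, ovo_count = n_classifiers(10)
print(ova_count, ovo_count)  # 10 45
```

With $N = 10$, OvO trains $10 \times 9 / 2 = 45$ classifiers, but each one sees only the instances of its two classes.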


## Error Analysis
* Image representation of confusion matrix `plt.matshow(conf_mx, cmap=plt.cm.gray)`
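* Plot the *error matrix*: divide each row by the number of instances in that class (to compare error rates instead of absolute counts), then fill the diagonal with zeros to keep only the errors.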
    ```
    row_sums = conf_mx.sum(axis=1, keepdims=True)
    norm_conf_mx = conf_mx / row_sums
    np.fill_diagonal(norm_conf_mx, 0)
    ```
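
A self-contained run of the normalization above on a made-up 3-class confusion matrix (the numbers are purely illustrative). It also locates the largest remaining entry, that is, the most frequent confusion relative to class size.

```python
import numpy as np

# Made-up confusion matrix: rows are actual classes, columns are predicted classes.
conf_mx = np.array([[50, 2, 3],
                    [4, 40, 6],
                    [1, 9, 90]])

row_sums = conf_mx.sum(axis=1, keepdims=True)  # number of instances per actual class
norm_conf_mx = conf_mx / row_sums              # per-class rates instead of counts
np.fill_diagonal(norm_conf_mx, 0)              # drop correct predictions, keep errors

# (actual class, predicted class) of the highest error rate
worst = np.unravel_index(norm_conf_mx.argmax(), norm_conf_mx.shape)
print(worst, norm_conf_mx[worst])
```

Here class 1 is misclassified as class 2 in 6 out of 50 cases (12%), which is a higher rate than the 9 out of 100 errors in the last row, even though 9 is the larger raw count.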

## Multilabel Classification
## Multioutput Classification

## Exercise
* The [notebook](https://github.com/ageron/handson-ml/blob/master/03_classification.ipynb) contains an implementation of the *Titanic* Kaggle project (Exercise 3). It is a good example of using `Pipeline`, `FeatureUnion`, `Imputer`, etc.
* **Exercise 4** builds a spam classifier, a good place to learn basic text mining/feature transformation, nltk, etc. It shows how to build a feature transformer class that converts emails to word counters and vectors.
    

## Resources
* Detailed implementation [notebook](https://github.com/ageron/handson-ml/blob/master/03_classification.ipynb) by [Aurélien Géron](https://github.com/ageron)