# Agenda
---

- Become familiar with tools in sklearn for evaluating a model
- Learn what a pipeline is in sklearn and how to use it

## Evaluating Models
---

### Basic evaluation methods
- Basic accuracy calculation for a model

In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
from sklearn import svm, datasets
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.naive_bayes import GaussianNB

# Other datasets in sklearn have similar "load" functions
iris = datasets.load_iris()
X, y = iris.data, iris.target

# Hold out 30% of the data as a test set; the remaining 70% is used for training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

In [3]:
# Fit the classifier on the training set, then score it (mean accuracy) on the held-out test set
nb = GaussianNB()
nb.fit(X_train, y_train)
nb.score(X_test, y_test)

0.91111111111111109

### Classification reports
---

In [4]:
from sklearn.metrics import classification_report

y_pred = nb.predict(X_test)
print(classification_report(y_test, y_pred, target_names=iris.target_names))

             precision    recall  f1-score   support

     setosa       1.00      1.00      1.00        15
 versicolor       0.81      0.93      0.87        14
  virginica       0.93      0.81      0.87        16

avg / total       0.92      0.91      0.91        45
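
The per-class numbers in a report like the one above can be traced back to a confusion matrix. The sketch below is one way to check that, not part of the original notebook: it refits the same kind of model and derives precision and recall from the matrix by hand. The fixed `random_state=0` is an assumption added here for reproducibility, so the values will not match the report above exactly.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()

# random_state=0 is an assumption (not in the original split) to make the run repeatable
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0)

nb = GaussianNB().fit(X_train, y_train)
y_pred = nb.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# The diagonal holds the correctly classified samples of each class
true_positives = np.diag(cm)

# Precision: correct predictions of a class / everything predicted as that class (column sums)
precision = true_positives / cm.sum(axis=0)

# Recall: correct predictions of a class / everything that truly is that class (row sums)
recall = true_positives / cm.sum(axis=1)

print(precision, recall)
```

Passing `average=None` to `precision_score` and `recall_score` returns the same per-class arrays, which is a convenient cross-check.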



**_Precision, Recall, Accuracy & F-measure explanation_**

<img src="images/Precisionrecall.png" width="300" height="300"/>

<img src="images/precision.svg" size="300" />
<img src="images/recall.svg" size="300" />

**Accuracy**
- Accuracy is the fraction of data points classified correctly: (number of correct classifications) / (total number of data points)
<img src="images/Accuracy.svg" size="300" />
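
As a small illustration of the definition above, the following sketch computes accuracy by hand and compares it with `accuracy_score`. The label arrays are made up for this example and do not come from the iris run.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical labels, chosen only to illustrate the calculation
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_hat = np.array([0, 0, 1, 2, 2, 2, 1, 1])

# Number of correct classifications divided by the total number of data points
manual_accuracy = np.sum(y_true == y_hat) / len(y_true)

# sklearn's helper should agree with the manual calculation
sklearn_accuracy = accuracy_score(y_true, y_hat)

print(manual_accuracy, sklearn_accuracy)  # 6 of the 8 labels match, so both are 0.75
```

For classifiers, the estimator's own `score` method reports this same mean accuracy on the data passed to it.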

**F-Score**
- A combined measure that assesses the precision/recall tradeoff
- Harmonic mean of precision and recall

<img src="images/f_measure.svg" size="300" />
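
A minimal sketch of the harmonic-mean calculation, assuming the balanced F1 variant (equal weight on precision and recall). The helper `f_measure` and the binary labels are hypothetical names and data introduced only for this illustration.

```python
from sklearn.metrics import f1_score

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1); hypothetical helper for illustration."""
    return 2 * precision * recall / (precision + recall)

# Made-up binary labels: 2 true positives, 1 false positive, 2 false negatives
y_true = [1, 1, 1, 1, 0, 0]
y_hat = [1, 1, 0, 0, 1, 0]

precision = 2 / 3  # TP / (TP + FP)
recall = 2 / 4     # TP / (TP + FN)

manual_f1 = f_measure(precision, recall)
sklearn_f1 = f1_score(y_true, y_hat)

print(manual_f1, sklearn_f1)  # both are 4/7, roughly 0.571
```

The harmonic mean is pulled toward the lower of the two values: here it is about 0.571, slightly below the arithmetic mean of about 0.583. Applied to the rounded versicolor row of the report above, `f_measure(0.81, 0.93)` gives about 0.866, consistent with the reported 0.87.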