# Compare ML Algorithms

To compare the performance of several ML algorithms fairly, every algorithm must be evaluated in the same, consistent way.

We will see how to create a test harness that compares multiple ML algorithms in Python with scikit-learn. You can use this test harness as a template for your own ML problems and add more, and different, algorithms to compare.

The goal is to learn:
1. How to formulate an experiment to directly compare ML algorithms
2. How to build a reusable template for evaluating the performance of multiple algorithms on one dataset
3. How to report and visualize the results when comparing algorithm performance

## Choose "the best" ML algorithm

You should look at the estimated accuracy of your ML algorithms in several different ways before choosing the one or two algorithms to finalize. One way to do this is to use visualization methods that show the average accuracy, the variance and other properties of the distribution of model accuracies.

We will discover how you can do that in Python with scikit-learn.

## Consistent comparison of ML algorithms

In the example below, six different classification algorithms are compared on a single dataset:

* Logistic Regression
* Linear Discriminant Analysis
* k-Nearest Neighbors
* Classification and Regression Trees
* Naive Bayes
* Support Vector Machines



The dataset is the Pima Indians diabetes dataset. It is a binary classification problem (2 classes) with 8 numeric input variables on different scales. Each algorithm is evaluated with 10-fold cross-validation. Importantly, the procedure is configured with the same random seed for every algorithm, so that the data is split in exactly the same way each time and every algorithm is evaluated in precisely the same way. Each algorithm is also given a short name, which is useful for summarizing the results afterward.
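The effect of a shared seed can be checked directly. The sketch below is a minimal illustration on a hypothetical toy array (not the diabetes data). The helper `fold_indices` is a name introduced here, and `shuffle=True` is assumed because the seed only matters when the folds are shuffled.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical toy data: 10 samples, 2 features (stands in for the real inputs)
X_demo = np.arange(20).reshape(10, 2)

def fold_indices(seed):
    """Return the test indices of each fold for a given random seed."""
    kfold = KFold(n_splits=5, shuffle=True, random_state=seed)
    return [test_idx.tolist() for _, test_idx in kfold.split(X_demo)]

first_run = fold_indices(7)
second_run = fold_indices(7)

# The same seed reproduces the same folds, so every model sees identical splits
print(first_run == second_run)
```

With a different seed the samples would generally be assigned to different folds, and part of any difference in scores between two algorithms would then come from the split rather than from the algorithms themselves.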

## 0. Import the data

In [0]:
import pandas as pd

url = 'https://raw.githubusercontent.com/dbonacorsi/AML_basic_AA1920/master/datasets/pima-indians-diabetes.data.csv'

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = pd.read_csv(url, names=names)
data

In [0]:
array = data.values
X = array[:,0:8]
Y = array[:,8]

In [0]:
from matplotlib import pyplot

# cross-validation utilities
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

# the six classifiers to compare
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

The core of the comparison is in the next cell:

In [0]:
# Compare Algorithms

# prepare models as (short name, estimator) pairs
models = []
models.append(('LR',   LogisticRegression(solver='lbfgs', max_iter=500)))  # explicit solver and a higher max_iter avoid warnings
models.append(('LDA',  LinearDiscriminantAnalysis()))
models.append(('KNN',  KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB',   GaussianNB()))
models.append(('SVM',  SVC()))  # pass gamma='scale' to avoid a warning on older scikit-learn versions

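# evaluate each model in turn
results = []
names = []
scoring = 'accuracy'
for name, model in models:
    # shuffle=True is needed for random_state to have any effect;
    # recent scikit-learn versions raise an error if a seed is set without it
    kfold = KFold(n_splits=10, shuffle=True, random_state=7)
    cv_results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    # short name: mean accuracy (standard deviation) over the 10 folds
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)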


Running the example prints, for each algorithm, its short name followed by the mean and the standard deviation of its accuracy across the cross-validation folds.
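When many algorithms are compared, a sorted table can be easier to read than a list of printed lines. The following is a minimal sketch of that idea. It assumes synthetic data from `make_classification` as a stand-in for the diabetes dataset, only two of the six models, and the names `demo_models` and `summary`, which are introduced here for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in data (assumption: this is not the diabetes dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=7)

demo_models = [('LR', LogisticRegression(max_iter=500)), ('NB', GaussianNB())]
kfold = KFold(n_splits=10, shuffle=True, random_state=7)

# One row per algorithm: mean and standard deviation of the fold accuracies
rows = []
for name, model in demo_models:
    scores = cross_val_score(model, X_demo, y_demo, cv=kfold, scoring='accuracy')
    rows.append({'model': name,
                 'mean_accuracy': scores.mean(),
                 'std_accuracy': scores.std()})

# Best mean accuracy first
summary = (pd.DataFrame(rows)
           .sort_values('mean_accuracy', ascending=False)
           .reset_index(drop=True))
print(summary)
```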

In [0]:
# boxplot algorithm comparison: one box per algorithm, built from its per-fold accuracies
fig = pyplot.figure()
fig.suptitle('Algorithms comparison')
ax = fig.add_subplot(111)
ax.boxplot(results)
ax.set_xticklabels(names)
ax.set_ylabel('Accuracy')
pyplot.show()

The example also produces a box and whisker plot showing, for each algorithm, the spread of the accuracy scores across the cross-validation folds. These results naturally raise a question: **which models are worthy of further study on this problem?**
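Because every algorithm is scored on the same folds, two algorithms can also be compared fold by fold, which is one possible way to look at that question. The sketch below is only an illustration under assumptions: it uses synthetic data from `make_classification` instead of the diabetes dataset, picks LDA and CART arbitrarily, and fixes the tree's `random_state` so the run is repeatable.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: this is not the diabetes dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=7)
kfold = KFold(n_splits=10, shuffle=True, random_state=7)

lda_scores = cross_val_score(LinearDiscriminantAnalysis(), X_demo, y_demo,
                             cv=kfold, scoring='accuracy')
cart_scores = cross_val_score(DecisionTreeClassifier(random_state=7), X_demo, y_demo,
                              cv=kfold, scoring='accuracy')

# Fold i of both arrays was scored on the same held-out samples,
# so the element-wise difference is a paired comparison
fold_diff = lda_scores - cart_scores
lda_wins = int(np.sum(fold_diff > 0))

print("mean difference (LDA - CART): %f" % fold_diff.mean())
print("folds where LDA scored higher: %d of %d" % (lda_wins, len(fold_diff)))
```

A model that wins on most folds by a margin larger than the fold-to-fold spread is a more convincing candidate than one whose mean is only slightly higher.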

## Summary

What we did:

* We discovered how to evaluate multiple ML algorithms on a dataset in Python with scikit-learn. We used the same test harness to evaluate every algorithm, and summarized the results both numerically and with a box and whisker plot. You can use this recipe as a template for evaluating multiple algorithms on your own problems.
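One way to turn the recipe into a reusable template is to wrap the evaluation loop in a function. This is a sketch rather than a definitive implementation: the function name `compare_models`, its parameters and its defaults are assumptions made here, and the demo runs on synthetic data from `make_classification` rather than on the diabetes dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def compare_models(models, X, y, n_splits=10, seed=7, scoring='accuracy'):
    """Score every (name, estimator) pair on the same shuffled folds.

    Returns a dict mapping each short name to its array of per-fold scores.
    """
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return {name: cross_val_score(model, X, y, cv=kfold, scoring=scoring)
            for name, model in models}

# Synthetic stand-in data (assumption: this is not the diabetes dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=7)

scores = compare_models([('KNN', KNeighborsClassifier()), ('SVM', SVC())],
                        X_demo, y_demo)
for name, cv_scores in scores.items():
    print("%s: %f (%f)" % (name, cv_scores.mean(), cv_scores.std()))
```

Adding an algorithm to the comparison then only requires appending another `(name, estimator)` pair to the list.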