[Blog](https://machinelearningmastery.com/evaluate-performance-machine-learning-algorithms-python-using-resampling/)


# Evaluate the Performance of Machine Learning Algorithms in Python using Resampling

### 1. Train and Test Sets
### 2. K-Fold Cross Validation
### 3. Leave One Out Cross Validation
### 4. Repeated Random Test-Train Splits


## 1. Train and Test Sets

Positives: Fast, good for large datasets

Negatives: The estimate can have high variance, because it depends on which rows happen to land in the test set

In [5]:
# Evaluate using a train and test set

import pandas as pd
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
df = pd.read_csv('pima-indians-diabetes.data.csv', names=names)
array = df.values
seed = 7 # This is the random seed

X = array[:,0:8]
Y = array[:,8]

test_size = 0.33 # let the test size be 33% and the training size be 67%
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)

model = LogisticRegression()

model.fit(X_train, Y_train)
result = model.score(X_test, Y_test)

f"Accuracy: {result*100.0:.3f}" #  :.3f limits the decimal places to 3

'Accuracy: 75.591'
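To make the variance point concrete, here is a minimal plain-Python sketch of what a single random split does to the row indices. It is not scikit-learn's implementation: the helper name `train_test_indices`, the row count of 768, and the rule of rounding the test share up are assumptions made for illustration.

```python
import math
import random

def train_test_indices(n_samples, test_size, seed):
    """Shuffle row indices and cut them into train and test index lists."""
    # Assumed rounding rule: the test share is rounded up to a whole row
    n_test = math.ceil(test_size * n_samples)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]

# 768 is the assumed number of rows in the diabetes CSV
train_idx, test_idx = train_test_indices(n_samples=768, test_size=0.33, seed=7)
print(len(train_idx), len(test_idx))  # 514 254

# A different seed puts a different set of rows in the test set
_, other_test_idx = train_test_indices(n_samples=768, test_size=0.33, seed=8)
shared = len(set(test_idx) & set(other_test_idx))
print(f"rows shared by the two test sets: {shared} of {len(test_idx)}")
```

Because only part of the test set is shared between seeds, the accuracy reported by one split can differ noticeably from the next. That is the variance this technique trades for speed.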

## 2. K-Fold Cross Validation

Positives: Less variance than a single train/test split, because every row is used for testing exactly once

Negatives: You must choose an appropriate value of k

In [8]:
# Evaluate using Cross Validation
import pandas as pd
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
df = pd.read_csv('pima-indians-diabetes.data.csv', names=names)
array = df.values
seed = 7 # This is the random seed

X = array[:,0:8]
Y = array[:,8]

num_instances = len(X)

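# Consecutive folds without shuffling. Recent scikit-learn versions raise a
# ValueError if random_state is set while shuffle is False, so it is omitted.
kfold = model_selection.KFold(n_splits=10)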

model = LogisticRegression()
results = model_selection.cross_val_score(model, X, Y, cv=kfold)

f"Accuracy: {results.mean()*100.0:.3f} ({results.std()*100.0:.3f})"

'Accuracy: 76.951 (4.841)'
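The mechanics behind the folds can be sketched in plain Python. This is an illustrative sketch of consecutive, unshuffled folds rather than the library's own code. The helper name `kfold_indices` and the row count of 768 are assumptions.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for consecutive, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional row each
        size = base + 1 if fold < extra else base
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop

# 768 is the assumed number of rows in the diabetes CSV
folds = list(kfold_indices(n_samples=768, n_splits=10))
fold_sizes = [len(test) for _, test in folds]
print(fold_sizes)  # [77, 77, 77, 77, 77, 77, 77, 77, 76, 76]

# Every row is tested exactly once across the ten folds
tested = sorted(i for _, test in folds for i in test)
print(tested == list(range(768)))  # True
```

The model is trained ten times, each time on roughly 90% of the rows, and the ten scores are then averaged into the reported mean and standard deviation.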

## 3. Leave One Out Cross Validation

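Positives: Produces a large number of performance measures (one per row), which gives a more fine-grained view of how the model behaves on your data

Negatives: More computationally expensive than k-fold cross validation, because the model is trained once per row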

In [11]:
# Evaluate using Leave One Out Cross Validation
import pandas as pd
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
df = pd.read_csv('pima-indians-diabetes.data.csv', names=names)
array = df.values
seed = 7 # This is the random seed

X = array[:,0:8]
Y = array[:,8]

num_instances = len(X)

loocv = model_selection.LeaveOneOut()

model = LogisticRegression()
results = model_selection.cross_val_score(model, X, Y, cv=loocv)

f"Accuracy: {results.mean()*100.00:.3f}, {results.std()*100.0:.3f}"



'Accuracy: 76.823, 42.196'
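The very large standard deviation is expected here: each fold tests a single row, so every fold score is either 0 or 1. The sketch below reproduces the two numbers from counts alone. It assumes the file has 768 rows and that 590 of them were classified correctly, which is the count that the reported 76.823% implies for that many rows.

```python
import math
from statistics import fmean, pstdev

n_rows = 768      # assumed number of rows in the diabetes CSV
n_correct = 590   # assumed, implied by the reported 76.823% accuracy

# One score per left-out row: 1.0 if it was predicted correctly, else 0.0
scores = [1.0] * n_correct + [0.0] * (n_rows - n_correct)

mean_acc = fmean(scores)
std_acc = pstdev(scores)  # population standard deviation
summary = f"Accuracy: {mean_acc*100.0:.3f}, {std_acc*100.0:.3f}"
print(summary)  # Accuracy: 76.823, 42.196

# For 0/1 scores the standard deviation reduces to sqrt(p * (1 - p))
closed_form = math.sqrt(mean_acc * (1.0 - mean_acc))
print(f"{closed_form*100.0:.3f}")  # 42.196
```

The spread therefore describes how often single predictions are right or wrong, not how stable the accuracy estimate is, so it should not be compared directly with the standard deviations from the other techniques.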

## 4. Repeated Random Test-Train Splits

Positives: Speed of train/test split with a reduction in variance

Negatives: No guarantee that the random splits don't reuse the same rows, which introduces redundancy into the evaluation

In [12]:
# Evaluate using Shuffle Split Cross Validation


import pandas as pd
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
df = pd.read_csv('pima-indians-diabetes.data.csv', names=names)
array = df.values
seed = 7 # This is the random seed

X = array[:,0:8]
Y = array[:,8]

test_size = 0.33 # let the test size be 33% and the training size be 67%

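num_samples = 10 # number of random train/test splits to draw

# ShuffleSplit draws num_samples independent random train/test splits
shuffle_split = model_selection.ShuffleSplit(n_splits=num_samples, test_size=test_size, random_state=seed)

model = LogisticRegression()
results = model_selection.cross_val_score(model, X, Y, cv=shuffle_split)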

f"Accuracy: {results.mean()*100.00:.3f}, {results.std()*100.0:.3f}"


'Accuracy: 76.496, 1.698'
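The redundancy mentioned above can be seen by counting how often each row ends up in a test set. This is a plain-Python sketch under stated assumptions, not scikit-learn's `ShuffleSplit`: the helper name `shuffle_split_indices`, the row count of 768, and the round-up rule for the test size are all illustrative.

```python
import math
import random
from collections import Counter

def shuffle_split_indices(n_samples, n_splits, test_size, seed):
    """Yield (train, test) index lists from independent random shuffles."""
    rng = random.Random(seed)
    n_test = math.ceil(test_size * n_samples)  # assumed rounding rule
    for _ in range(n_splits):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]

# Count how many of the 10 test sets each row appears in
test_counts = Counter()
for _, test in shuffle_split_indices(n_samples=768, n_splits=10, test_size=0.33, seed=7):
    test_counts.update(test)

never_tested = 768 - len(test_counts)
most_tested = max(test_counts.values())
print(f"rows never tested: {never_tested}")
print(f"most test appearances for one row: {most_tested}")
```

Ten test sets of 254 rows draw 2540 test rows from only 768, so some rows are necessarily tested several times, while others may never be tested at all. K-fold avoids both problems by construction.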

## What Techniques to Use When

* Generally, k-fold cross validation is the gold standard for evaluating the performance of a machine learning algorithm on unseen data, with k set to 3, 5, or 10.

* Using a train/test split is good for speed when using a slow algorithm, and it produces performance estimates with lower bias when the dataset is large.

* Techniques like leave-one-out cross validation and repeated random splits can be useful intermediates when trying to balance variance in the estimated performance, model training speed and dataset size.
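The training-speed side of this trade-off comes down to how many times the model has to be fitted. The sketch below counts the fits for each technique. The function name `model_fits`, the technique labels, and the row count of 768 are assumptions made for illustration.

```python
def model_fits(technique, n_samples, n_splits=10):
    """Return how many times the model is trained under each technique."""
    counts = {
        "train_test_split": 1,               # one fit on the training portion
        "k_fold": n_splits,                  # one fit per fold
        "leave_one_out": n_samples,          # one fit per row
        "repeated_random_splits": n_splits,  # one fit per random split
    }
    return counts[technique]

n_rows = 768  # assumed number of rows in the diabetes CSV
fit_counts = {
    name: model_fits(name, n_rows)
    for name in ("train_test_split", "k_fold", "leave_one_out", "repeated_random_splits")
}
for name, fits in fit_counts.items():
    print(f"{name}: {fits} fit(s)")
```

With a slow algorithm, the jump from 10 fits to one fit per row is usually what rules out leave-one-out on anything but small datasets.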


The best advice is to experiment and find a technique for your problem that is fast and produces reasonable estimates of performance that you can use to make decisions. If in doubt, use 10-fold cross validation.