## Leave-One-Out Cross-Validation for Evaluating Machine Learning Algorithms

In [1]:
from pandas import read_csv
from numpy import mean
from numpy import std
from numpy import absolute
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import LeaveOneOut
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score

In [2]:
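# synthetic classification dataset: 100 samples, 2 features, 3 clusters (make_blobs defaults)
X, y = make_blobs(n_samples = 100, random_state = 1)
# LOOCV: every sample is held out exactly once as the test set
cv = LeaveOneOut()
# collect the true label and the prediction for each held-out sample
y_true, y_pred = list(), list()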

In [3]:
for train_ix, test_ix in cv.split(X):
    # split data
    X_train, X_test = X[train_ix, :], X[test_ix, :]
    y_train, y_test = y[train_ix], y[test_ix]
    # fit model
    model = RandomForestClassifier(random_state=1)
    model.fit(X_train, y_train)
    # evaluate model
    yhat = model.predict(X_test)
    # store
    y_true.append(y_test[0])
    y_pred.append(yhat[0])

In [4]:
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy: %.3f" % accuracy)

Accuracy: 0.990


In [5]:
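# cross_val_score clones the estimator for every fold, so an unfitted model is sufficient
model = RandomForestClassifier(random_state = 1)
scores = cross_val_score(model, X, y, scoring = 'accuracy', cv = cv, n_jobs = -1)
print('Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))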

Accuracy: 0.990 (0.099)


The mean classification accuracy across all folds matches the manual estimate computed above.
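
The value in parentheses (0.099) deserves a short explanation. Each LOOCV fold tests a single sample, so every per-fold accuracy is either 0 or 1, and the standard deviation of such scores is fixed by the mean alone: sqrt(p * (1 - p)). The sketch below reproduces the printed numbers without scikit-learn, assuming 99 correct folds and 1 incorrect fold, which is what a mean of 0.990 over 100 folds implies. The name `fold_scores` is illustrative only.

```python
from math import sqrt
from statistics import fmean, pstdev

# Per-fold LOOCV accuracies: one test sample per fold, so each score is 0.0 or 1.0.
# Assumption: 99 correct folds and 1 wrong fold, as implied by a mean of 0.990 over 100 folds.
fold_scores = [1.0] * 99 + [0.0]

mean_acc = fmean(fold_scores)
# pstdev is the population standard deviation, the same quantity numpy.std computes by default
spread = pstdev(fold_scores)

print('Accuracy: %.3f (%.3f)' % (mean_acc, spread))
# For 0/1 scores the spread is fully determined by the mean
print('sqrt(p * (1 - p)): %.3f' % sqrt(mean_acc * (1 - mean_acc)))
```

Because the spread carries no information beyond the mean here, it should not be read as a measure of how stable the model is across folds.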

## LOOCV for Regression (on housing data)

In [6]:
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/housing.csv"
data = read_csv(url, header = None)
data = data.values

In [7]:
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)

(506, 13) (506,)


In [8]:
cv = LeaveOneOut()
model = RandomForestRegressor(random_state = 1)
scores = cross_val_score(model, X, y, scoring = 'neg_mean_absolute_error', cv = cv, n_jobs = -1)
scores = absolute(scores)
print('MAE: %.3f (%.3f)' % (mean(scores), std(scores)))

MAE: 2.180 (2.346)


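Evaluated with LOOCV, the model has a mean absolute error of about 2.180 on held-out rows. The target is expressed in thousands of dollars, so predictions on new data are expected to be off by roughly $2,180 on average. The standard deviation of 2.346 is larger than the mean, which indicates that the size of the error varies considerably from one row to the next.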
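
To make the reported figure concrete: with LOOCV each fold contains a single test row, so the per-fold `neg_mean_absolute_error` is simply the negated absolute error for that row, and the final MAE is the average of those errors over all rows. The following is a minimal, dependency-free sketch of that procedure. It is not scikit-learn's implementation; the function `loocv_mae`, the mean-predicting baseline, and the toy targets are all hypothetical stand-ins for the random forest and the housing data.

```python
from statistics import fmean

def loocv_mae(targets):
    """Leave-one-out MAE for a hypothetical baseline that predicts the training mean."""
    abs_errors = []
    for i, held_out in enumerate(targets):
        # train on every target except the held-out one
        train = targets[:i] + targets[i + 1:]
        prediction = fmean(train)
        # one test sample per fold, so the fold's MAE is a single absolute error
        abs_errors.append(abs(held_out - prediction))
    return fmean(abs_errors), abs_errors

# toy target values (made up for illustration, not taken from the housing data)
toy_y = [1.0, 2.0, 3.0, 4.0]
mae, fold_errors = loocv_mae(toy_y)
print('MAE: %.3f over %d folds' % (mae, len(fold_errors)))
```

The number of folds equals the number of rows, which is why the housing run above fits 506 separate models.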