# Example 1: Leave-One-Out Cross-Validation in Python (With Examples)

In [None]:
#To evaluate the performance of a model on a dataset, we need to measure how well the predictions made by the model 
#match the observed data.

#One commonly used method for doing this is known as leave-one-out cross-validation (LOOCV), which uses the following approach:

#1. Split a dataset into a training set and a testing set, using all but one observation as part of the training set.

#2. Build a model using only data from the training set.

#3. Use the model to predict the response value of the one observation left out of the model and calculate the squared error of that prediction.

#4. Repeat this process n times, once for each observation. The test MSE is the average of the n squared errors.

#This notebook provides a step-by-step example of how to perform LOOCV for a given model in Python.
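
#The four steps above can also be written out by hand. The following is a minimal sketch, not part of the original example:
#the small dataset is made up for illustration, and the final cross-check assumes cross_val_score with a LeaveOneOut splitter.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

# made-up data, for illustration only
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])

squared_errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    # steps 1-2: fit on every observation except the one held out
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    # step 3: predict the held-out observation and record its squared error
    prediction = model.predict(X[test_idx])[0]
    squared_errors.append((y[test_idx][0] - prediction) ** 2)

# step 4: average the n squared errors to get the LOOCV estimate of test MSE
loocv_mse = float(np.mean(squared_errors))

# cross-check against scikit-learn's helper (its scores are negated MSE values)
sklearn_mse = float(-cross_val_score(LinearRegression(), X, y,
                                     scoring='neg_mean_squared_error',
                                     cv=LeaveOneOut()).mean())
print(loocv_mse, sklearn_mse)
```

#The manual loop and the library call should agree, which is why the rest of this example relies on cross_val_score.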

In [1]:
#Step 1: Load Necessary Libraries
#First, we’ll load the necessary functions and libraries for this example:



from sklearn.model_selection import train_test_split
from sklearn.model_selection import LeaveOneOut          #Load LeaveOneOut from sklearn
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from numpy import mean
from numpy import absolute
from numpy import sqrt
import pandas as pd

In [2]:
#Step 2: Create the Data
#Next, we create a pandas DataFrame that contains two predictor variables, x1 and x2, and a single response variable y.


df = pd.DataFrame({'y': [6, 8, 12, 14, 14, 15, 17, 22, 24, 23],
                   'x1': [2, 5, 4, 3, 4, 6, 7, 5, 8, 9],
                   'x2': [14, 12, 12, 13, 7, 8, 7, 4, 6, 5]})

In [3]:
#Step 3: Perform Leave-One-Out Cross-Validation
#Next, we fit a multiple linear regression model to the dataset and perform LOOCV to evaluate the model performance.



#define predictor and response variables
X = df[['x1', 'x2']]
y = df['y']

#define cross-validation method to use
cv = LeaveOneOut()

#build multiple linear regression model
model = LinearRegression()

#use LOOCV to evaluate model
scores = cross_val_score(model, X, y, scoring='neg_mean_absolute_error',
                         cv=cv, n_jobs=-1)

#view mean absolute error (MAE); the scores are negated errors, so take absolute values first
mean(absolute(scores))


3.1461548083469744

In [None]:
#From the output we can see that the mean absolute error (MAE) was 3.146. 
#That is, the average absolute error between the model prediction and the actual observed data is 3.146.

#In general, the lower the MAE, the more closely a model is able to predict the actual observations.
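
#The scoring name starts with neg_ because scikit-learn scorers follow a "higher is better" convention, so errors are returned negated.
#A minimal sketch of what that means, reusing the data from Step 2 as plain arrays; cross_val_predict is used here only as an
#independent way to collect the held-out predictions and is not part of the original example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict, cross_val_score

# same values as the DataFrame defined in Step 2 (columns x1, x2 and response y)
X = np.array([[2, 14], [5, 12], [4, 12], [3, 13], [4, 7],
              [6, 8], [7, 7], [5, 4], [8, 6], [9, 5]])
y = np.array([6, 8, 12, 14, 14, 15, 17, 22, 24, 23])

# one negated absolute error per left-out observation
scores = cross_val_score(LinearRegression(), X, y,
                         scoring='neg_mean_absolute_error', cv=LeaveOneOut())

# the prediction made for each observation while it was held out
held_out_predictions = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())

# MAE computed directly from the held-out predictions
manual_mae = float(np.mean(np.abs(y - held_out_predictions)))
print(manual_mae, float(-scores.mean()))
```

#Both printed numbers should match, so flipping the sign (or taking the absolute value) of the scores recovers the usual MAE.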

#Another commonly used metric to evaluate model performance is the root mean squared error (RMSE). 
#The following code shows how to calculate this metric using LOOCV:

In [4]:
#define predictor and response variables
X = df[['x1', 'x2']]
y = df['y']

#define cross-validation method to use
cv = LeaveOneOut()

#build multiple linear regression model
model = LinearRegression()

#use LOOCV to evaluate model
scores = cross_val_score(model, X, y, scoring='neg_mean_squared_error',
                         cv=cv, n_jobs=-1)

#view RMSE; the scores are negated MSE values, so take absolute values before averaging
sqrt(mean(absolute(scores)))

3.6194564763855688

In [None]:
#From the output we can see that the root mean squared error (RMSE) was 3.619. 
#The lower the RMSE, the more closely a model is able to predict the actual observations.

#In practice we typically fit several different models and compare 
#the LOOCV RMSE or MAE of each one, then choose the model with the lowest test error.
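
#As a rough sketch of such a comparison, the snippet below scores three candidate linear models on the Step 2 data.
#The candidate feature sets are an assumption chosen for illustration; any set of models could be compared the same way.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

df = pd.DataFrame({'y': [6, 8, 12, 14, 14, 15, 17, 22, 24, 23],
                   'x1': [2, 5, 4, 3, 4, 6, 7, 5, 8, 9],
                   'x2': [14, 12, 12, 13, 7, 8, 7, 4, 6, 5]})

# hypothetical candidate models, each defined by the predictors it uses
candidates = {'x1 only': ['x1'], 'x2 only': ['x2'], 'x1 + x2': ['x1', 'x2']}

rmse = {}
for name, columns in candidates.items():
    # negate the scores to turn them back into ordinary squared errors
    mse_per_fold = -cross_val_score(LinearRegression(), df[columns], df['y'],
                                    scoring='neg_mean_squared_error',
                                    cv=LeaveOneOut())
    rmse[name] = float(np.sqrt(mse_per_fold.mean()))

# the candidate with the lowest LOOCV RMSE
best = min(rmse, key=rmse.get)
print(rmse, best)
```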

# Example 2: Using cross_val_score in sklearn

In [None]:

#cross_val_score is a common function to use during the testing and 
#validation phase of your machine learning model development. 

#cross_val_score runs cross-validation (K-Fold by default) in sklearn and returns one score per fold.

#Here we explain what it is, what you can use it for, and how to implement it in Python.

In [None]:
#Cross_val_score in sklearn, what is it?
#Cross_val_score is a function in the scikit-learn package which trains 
#and tests a model over multiple folds of your dataset. 

#This cross validation method gives you a better understanding of model performance over the 
#whole dataset instead of just a single train/test split.

In [None]:
#The process that cross_val_score uses is typical for cross validation and follows these steps:

#1. The number of folds is defined; by default this is 5.
#2. The dataset is split into that many folds, and each fold serves once as the testing data.
#3. A model is trained on the remaining folds and tested on the held-out fold, once per fold.
#4. Each fold returns a metric for its test data.
#5. The mean and standard deviation of these metrics can then be calculated to summarise the whole process.
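
#The five steps above can be sketched as an explicit loop. This is an illustration under stated assumptions rather than
#the library's internal code: it uses the Diabetes dataset, an unshuffled KFold, and the model's own score method (R^2).

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = datasets.load_diabetes(return_X_y=True)

fold_scores = []
# steps 1-2: define 5 folds and split the data accordingly
for train_idx, test_idx in KFold(n_splits=5).split(X):
    # step 3: train on four folds, test on the remaining one
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    # step 4: record the metric for this fold (R^2 for a regressor)
    fold_scores.append(model.score(X[test_idx], y[test_idx]))

# step 5: summarise the per-fold metrics
fold_scores = np.array(fold_scores)
print(fold_scores.mean(), fold_scores.std())

# the same procedure in a single call
reference = cross_val_score(LinearRegression(), X, y, cv=5)
```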



In [None]:
#What is cross_val_score used for?
#Cross_val_score is used as a simple cross validation technique to detect over-fitting and estimate how well a model generalises.

#The typical process of model development is to train a model on one portion of the data and then test it on another. 
#But how do we know that this single test dataset is representative? 
#This is why we use cross_val_score and cross validation more generally, 
#to train and test our model on multiple folds such that we can be sure 
#our model generalises well across the whole dataset and not just a single portion.

#If the metrics for all folds in cross_val_score are similar, 
#this suggests that the model generalises across the dataset. 
#If there are significant differences between them, 
#the model may be over-fitting or sensitive to particular folds, which should be investigated further.

In [None]:
#How many folds should I use in cross_val_score?
#By default cross_val_score uses a 5-fold strategy; this can be changed with the cv parameter.

In [None]:
#But how many folds should you choose?

#There are unfortunately no hard and fast rules when it comes to how many folds you should choose. 

#A general rule of thumb is to use as many folds as possible while still leaving 
#each split with enough observations to train on and to be tested on.
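
#One practical way to explore this choice is to try a few values of cv and look at the spread of the scores.
#The sketch below is an illustration only: the values 3, 5 and 10 and the shuffled KFold splitter are assumptions, not recommendations.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = datasets.load_diabetes(return_X_y=True)
model = LinearRegression()

# cv given as an integer: that many consecutive, unshuffled folds
scores_by_k = {k: cross_val_score(model, X, y, cv=k) for k in (3, 5, 10)}

# cv given as a splitter object: here the rows are shuffled before splitting
shuffled_scores = cross_val_score(
    model, X, y, cv=KFold(n_splits=10, shuffle=True, random_state=0))

for k, scores in scores_by_k.items():
    print(k, len(scores), round(scores.mean(), 3), round(scores.std(), 3))
```

#More folds give each model more training data but leave fewer observations in each test fold, so the per-fold scores tend to vary more.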

In [None]:
#Can I train my model using cross_val_score?

#A common question developers have is whether cross_val_score can also function as a way of training the final model. 
#Unfortunately this is not the case. Cross_val_score is a way of assessing a model and its parameters, 
#and cannot be used for final training. The final model should be trained on all available data and tested 
#using a set of data that has been held back from the start.
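
#A minimal sketch of that separation, assuming a simple 80/20 hold-out created with train_test_split (the split size and
#variable names are illustrative). cross_val_score works on clones of the estimator, so the object passed in stays unfitted.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = datasets.load_diabetes(return_X_y=True)

# hold back a test set from the start
X_dev, X_holdout, y_dev, y_holdout = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LinearRegression()

# assess the model with cross-validation on the development data only
cv_scores = cross_val_score(model, X_dev, y_dev, cv=5)

# the original estimator has not been trained by cross_val_score
fitted_after_cv = hasattr(model, 'coef_')

# final training is a separate, explicit step
model.fit(X_dev, y_dev)
holdout_r2 = model.score(X_holdout, y_holdout)
print(cv_scores.mean(), fitted_after_cv, holdout_r2)
```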

In [None]:
#Can I use cross_val_score for classification and regression?

#cross_val_score is a function which can be used for both classification and regression models. 
#The only major difference between the two is that, when cv is left at its default or given as an integer, 
#cross_val_score uses StratifiedKFold for classifiers (with binary or multiclass targets) and plain KFold otherwise.
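
#The sketch below illustrates this default on the Iris dataset, whose rows are sorted by class. The dataset and the
#LogisticRegression settings are assumptions made for the illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

X, y = datasets.load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000)

# integer cv with a classifier: folds are stratified by class
default_scores = cross_val_score(clf, X, y, cv=5)

# the same splitter requested explicitly
stratified_scores = cross_val_score(clf, X, y, cv=StratifiedKFold(n_splits=5))

# plain KFold ignores the class labels when forming the folds
plain_scores = cross_val_score(clf, X, y, cv=KFold(n_splits=5))

print(default_scores.mean(), stratified_scores.mean(), plain_scores.mean())
```

#Because the Iris rows are ordered by class, the unshuffled plain KFold produces test folds dominated by a single class,
#which is the situation stratification is meant to avoid.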

In [None]:
#Which metrics can I use in cross_val_score?
#By default cross_val_score uses the chosen model's default scoring metric (its score method), 
#but this can be overridden with your metric of choice in the scoring parameter.

#The common metrics provided by sklearn can be passed as a string to this parameter. Some typical choices are:

#'accuracy'
#'balanced_accuracy'
#'roc_auc'
#'f1'
#'neg_mean_absolute_error'
#'neg_root_mean_squared_error'
#'r2'
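
#A quick way to confirm that a string is a valid metric name is to look it up with sklearn.metrics.get_scorer.
#This is a small sketch rather than a required step; the misspelled name below is deliberately hypothetical.

```python
from sklearn.metrics import get_scorer

metric_names = ['accuracy', 'balanced_accuracy', 'roc_auc', 'f1',
                'neg_mean_absolute_error', 'neg_root_mean_squared_error', 'r2']

# each valid name resolves to a scorer object usable as the scoring argument
scorers = {name: get_scorer(name) for name in metric_names}

# an unknown name is rejected with a ValueError
try:
    get_scorer('not_a_real_metric')  # hypothetical, invalid name
    unknown_rejected = False
except ValueError:
    unknown_rejected = True

print(len(scorers), unknown_rejected)
```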

In [None]:
#How to implement cross_val_score in Python?

#This function is simple to implement in Python, but first let’s look at 
#how it fits into a typical machine learning development workflow:

#1. Create a dataset
#2. Run hyper-parameter tuning
#3. Create model object with desired parameters
#4. Run cross_val_score to test model performance
#5. Train final model on full dataset
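
#The workflow above might look roughly like the sketch below. It swaps in a Ridge model and a small alpha grid as an
#assumption, because plain LinearRegression has no regularization hyper-parameter to tune; the grid values are illustrative.

```python
from sklearn import datasets
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score

# 1. create a dataset
X, y = datasets.load_diabetes(return_X_y=True)

# 2. run hyper-parameter tuning (illustrative grid of alpha values)
alphas = [0.01, 0.1, 1.0, 10.0]
search = GridSearchCV(Ridge(), {'alpha': alphas}, cv=5,
                      scoring='neg_root_mean_squared_error')
search.fit(X, y)

# 3. create the model object with the desired parameters
model = Ridge(**search.best_params_)

# 4. run cross_val_score to test model performance
scores = cross_val_score(model, X, y, cv=5,
                         scoring='neg_root_mean_squared_error')

# 5. train the final model on the full dataset
final_model = model.fit(X, y)
print(search.best_params_, scores.mean(), scores.std())
```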

In [None]:
#Therefore, in order to use cross_val_score we need to first have an 
#idea of the model we want to use and a prepared dataset to test it on.

#Let’s look at how this process would look in Python using 
#a Linear Regression model and the Diabetes dataset from sklearn:

In [5]:
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

#load toy dataset (Diabetes dataset) from sklearn
X, y = datasets.load_diabetes(return_X_y=True)

#build linear regression model
model = LinearRegression()

#use 5-Fold CV i.e. cv=5 to obtain performance scores
scores = cross_val_score(model, X, y, cv=5, scoring='neg_root_mean_squared_error')

print("Mean score of %0.2f with a standard deviation of %0.2f" % (scores.mean(), scores.std()))

Mean score of -54.69 with a standard deviation of 1.37


In [None]:
#Function parameters for cross_val_score
#There are a number of parameters that you should be aware of when using cross_val_score. They are:

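#estimator - The model object to use to fit the data
#X - The data to fit the model on
#y - The target of the model
#scoring - The metric used to score each fold (a string name or a scorer callable)
#cv - The number of splits to use, i.e. k, or a cross-validation splitter object such as LeaveOneOut()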

In [None]:
#Summary of the cross_val_score function
#Cross_val_score is a function which runs cross validation on a 
#dataset to test whether the model can generalise over the whole dataset. 

#The function returns a NumPy array with one score per split, and the average of these scores 
#can be calculated to provide a single metric value for the dataset. 
#It is a technique worth adding to your workflow to check that your models perform well on unseen data.
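
#To make the return value concrete, here is a short sketch that passes the main parameters by keyword and inspects the result.
#The choice of 'r2' and cv=5 is only for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = datasets.load_diabetes(return_X_y=True)

# every main parameter passed explicitly by name
scores = cross_val_score(estimator=LinearRegression(), X=X, y=y,
                         scoring='r2', cv=5)

# one score per split, returned as a NumPy array
print(type(scores).__name__, scores.shape)
print(scores.mean(), scores.std())
```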

# Example 3: Using cross_validate in sklearn

In [None]:
from sklearn import datasets
from sklearn.model_selection import cross_validate #load cross_validate from sklearn
from sklearn.linear_model import LinearRegression

#load toy dataset
X, y = datasets.load_diabetes(return_X_y=True)

#define/select performance metric for your model
metrics = ['neg_mean_absolute_error', 'r2']

#build linear regression model
model = LinearRegression()

#use 5-Fold CV i.e. cv=5 to obtain performance scores
scores = cross_validate(model, X, y, cv=5, scoring=metrics)


#Separate performance metrics
mae_scores = scores['test_neg_mean_absolute_error']
r2_scores = scores['test_r2']

#Display mean and standard deviation of performance metrics
print("Mean mae of %0.2f with a standard deviation of %0.2f" % (mae_scores.mean(), mae_scores.std()))
print("Mean r2 of %0.2f with a standard deviation of %0.2f" % (r2_scores.mean(), r2_scores.std()))

# cross_validate vs. cross_val_score

In [None]:
#cross_validate allows you to use more than one performance metric at a time, 
#while cross_val_score allows only one performance metric at a time.

In [1]:
from sklearn import datasets
from sklearn.model_selection import cross_validate, cross_val_score
from sklearn.linear_model import LinearRegression

X, y = datasets.load_diabetes(return_X_y=True)
model = LinearRegression()

# Running cross_validate with multi metric
metrics = ['neg_mean_absolute_error', 'r2']
scores = cross_validate(model, X, y, cv=5, scoring=metrics)

mae_scores = scores['test_neg_mean_absolute_error']
r2_scores = scores['test_r2']

print("Mean mae of %0.2f with a standard deviation of %0.2f" % (mae_scores.mean(), mae_scores.std()))
print("Mean r2 of %0.2f with a standard deviation of %0.2f" % (r2_scores.mean(), r2_scores.std()))

# Running cross_val_score with single metric
scores = cross_val_score(model, X, y, cv=5, scoring='neg_root_mean_squared_error')

print("Mean score of %0.2f with a standard deviation of %0.2f" % (scores.mean(), scores.std()))


Mean mae of -44.28 with a standard deviation of 2.10
Mean r2 of 0.48 with a standard deviation of 0.05
Mean score of -54.69 with a standard deviation of 1.37
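
#Another difference is the shape of the result: cross_validate returns a dictionary that also holds timing information and,
#optionally, training scores. A minimal sketch, assuming the same Diabetes data and metrics as above:

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

X, y = datasets.load_diabetes(return_X_y=True)

results = cross_validate(LinearRegression(), X, y, cv=5,
                         scoring=['neg_mean_absolute_error', 'r2'],
                         return_train_score=True)

# one array of length 5 per key: fit/score times plus test and train scores
result_keys = sorted(results)
print(result_keys)
```

#Comparing the train_ and test_ entries for the same metric is a simple way to spot over-fitting.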


# Example 4: Using Grid Search + K-Fold Cross Validation to Build a Logistic Regression Model

In [7]:
# grid search logistic regression model on the sonar dataset
from pandas import read_csv
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV


# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/sonar.csv'
dataframe = read_csv(url, header=None)


# split into input and output elements
data = dataframe.values
X, y = data[:, :-1], data[:, -1]


# Establish instance of logistic Regression classifier
model = LogisticRegression()

# define model evaluation
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)

# define hyperparameter search space
space = dict()
space['solver'] = ['newton-cg', 'lbfgs', 'liblinear']
space['penalty'] = ['none', 'l1', 'l2', 'elasticnet'] #Regularization options/parameters
space['C'] = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10, 100]


#The main hyperparameters we may tune in logistic regression are solver, penalty, and 
#the regularization strength C (see the sklearn documentation). 
#solver is the algorithm used to solve the optimization problem. 
#The choices include {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, default='lbfgs'.
#Not every solver supports every penalty, so some combinations in this grid fail to fit.

#C is the trade-off parameter of logistic regression that determines the strength of the regularization: 
#higher values of C correspond to less regularization (the penalty parameter selects the regularization function).
#C is the inverse of the regularization strength (lambda).


# define search (via GridSearch)
search = GridSearchCV(model, space, scoring='accuracy', n_jobs=-1, cv=cv)

# execute search - and use best hyperparameters to fit logistic regression model on data
result = search.fit(X, y)

# summarize and display result
print('Best Score: %s' % result.best_score_)

#Display best hyperparameters
print('Best Hyperparameters: %s' % result.best_params_)

Best Score: 0.7828571428571429
Best Hyperparameters: {'C': 1, 'penalty': 'l2', 'solver': 'newton-cg'}


1440 fits failed out of a total of 2880.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
240 fits failed with the following error:
Traceback (most recent call last):
  File "C:\Users\abiy3759\Anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 680, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\abiy3759\Anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py", line 1461, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "C:\Users\abiy3759\Anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py", line 464, in _check_solver
    raise ValueError("penalty='none' is not supported for the liblinear solver")
ValueError: penalty='none' is not sup
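
#The warning above arises because the grid crosses every solver with every penalty. The grid holds 3 x 4 x 8 = 96 combinations,
#each fitted on 30 folds (2880 fits), and half of those combinations are unsupported pairs such as liblinear with penalty='none'.
#One way to avoid the failed fits is to pass a list of dictionaries so that each solver is paired only with penalties it supports.
#The sketch below is an assumption-laden illustration: it uses a synthetic dataset from make_classification instead of the
#sonar data, a shorter C grid, fewer folds, and only the 'l1' and 'l2' penalties.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold

# synthetic stand-in for the sonar data, for illustration only
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

C_values = [1e-2, 1e-1, 1, 10]

# each dictionary is its own sub-grid, so only compatible pairs are tried
space = [
    {'solver': ['newton-cg', 'lbfgs'], 'penalty': ['l2'], 'C': C_values},
    {'solver': ['liblinear'], 'penalty': ['l1', 'l2'], 'C': C_values},
]

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=1)

# error_score='raise' makes any remaining invalid combination fail loudly
search = GridSearchCV(LogisticRegression(max_iter=1000), space,
                      scoring='accuracy', cv=cv, error_score='raise')
result = search.fit(X, y)

print('Best Score: %s' % result.best_score_)
print('Best Hyperparameters: %s' % result.best_params_)
```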