# Evaluate XGBoost Models
The goal of predictive modeling is to build a model that is accurate on unseen data.
You can estimate that accuracy with statistical techniques that use the training dataset
carefully, so that the resulting score reflects how the model will perform on new data.
In this tutorial you will discover how to evaluate the performance of your gradient
boosting models with XGBoost in Python.


### Evaluate Models With Train and Test Sets


In [1]:
# train-test split evaluation of xgboost model
from numpy import loadtxt
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X and y
X = dataset[:,0:8]
Y = dataset[:,8]
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=7)
# fit model on training data
model = XGBClassifier()
model.fit(X_train, y_train)
# make predictions for test data
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
# evaluate predictions
accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))


Accuracy: 77.95%


### Evaluate Models With k-Fold Cross Validation
Cross validation is an approach that you can use to estimate the performance of a machine
learning algorithm with less variance than a single train-test split. It works by splitting the
dataset into k parts (e.g. k=5 or k=10). Each part is called a fold. The algorithm is trained
on k-1 folds and tested on the one fold that was held back. This is repeated so that each fold
of the dataset gets a turn as the held-back test set. After running cross validation you end
up with k different performance scores that you can summarize using a mean and a standard
deviation.
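To make the splitting step concrete, here is a minimal sketch in plain Python of how a dataset's row indices could be divided into k contiguous folds. The helper `k_fold_indices` is a hypothetical name used only for illustration; it is not part of XGBoost or scikit-learn, and in practice you would use the scikit-learn splitter shown in the next example.

```python
# Minimal sketch of k-fold splitting without any library (illustrative only).
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder so the first n_samples % k folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        # The current fold is held back for testing.
        test_idx = list(range(start, stop))
        # Every other row is used for training.
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop

# Assumed toy setting: 10 rows split into 5 folds.
folds = list(k_fold_indices(n_samples=10, k=5))
for i, (train_idx, test_idx) in enumerate(folds):
    print("fold %d: train=%s test=%s" % (i, train_idx, test_idx))
```

Each row index appears in exactly one test fold, so every row is used for testing once and for training k-1 times.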


In [2]:
# k-fold cross validation evaluation of xgboost model
from numpy import loadtxt
from xgboost import XGBClassifier
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X and y
X = dataset[:,0:8]
Y = dataset[:,8]
# CV model
model = XGBClassifier()
# 10 folds in dataset order; random_state only applies when shuffle=True, so it is omitted
kfold = KFold(n_splits=10)
results = cross_val_score(model, X, Y, cv=kfold)
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))


Accuracy: 76.69% (7.11%)


If your classification problem has many classes, or the classes are imbalanced (there are
a lot more instances of one class than another), it can be a good idea to create stratified
folds when performing cross validation. Stratification enforces the same distribution of
classes in each fold as in the whole training dataset. The scikit-learn library provides this
capability in the StratifiedKFold class. Below is the same example modified to use stratified
cross validation to evaluate an XGBoost model.


In [3]:
# stratified k-fold cross validation evaluation of xgboost model
from numpy import loadtxt
from xgboost import XGBClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")
# split data into X and y
X = dataset[:,0:8]
Y = dataset[:,8]
# CV model
model = XGBClassifier()
# 10 stratified folds; random_state only applies when shuffle=True, so it is omitted
kfold = StratifiedKFold(n_splits=10)
results = cross_val_score(model, X, Y, cv=kfold)
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))


Accuracy: 76.95% (5.88%)
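To see what stratification changes, the sketch below compares the share of positive labels in each test fold for KFold and StratifiedKFold. The label vector is a made-up, deliberately imbalanced example (80 negatives followed by 20 positives), not the Pima Indians data, and `positive_fraction_per_fold` is a hypothetical helper written for this illustration.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Hypothetical imbalanced labels: 80 negatives followed by 20 positives.
y = np.array([0] * 80 + [1] * 20)
X = np.zeros((len(y), 1))  # placeholder features; only the labels matter here

def positive_fraction_per_fold(splitter):
    """Return the share of positive labels in each test fold."""
    return [float(y[test_idx].mean()) for _, test_idx in splitter.split(X, y)]

plain_fracs = positive_fraction_per_fold(KFold(n_splits=5))
strat_fracs = positive_fraction_per_fold(StratifiedKFold(n_splits=5))
print("KFold:          ", plain_fracs)
print("StratifiedKFold:", strat_fracs)
```

Because the labels are sorted by class, the unshuffled KFold test folds contain either no positives or only positives, while each StratifiedKFold test fold keeps the overall 20% positive rate.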
