# Save Gradient Boosting Models with XGBoost

XGBoost can be used to create some of the most performant models for tabular data using the gradient boosting algorithm.

Once a model is trained, it is good practice to save it to file so that you can later use it to make predictions on test and validation datasets and on entirely new data.

In this post you will discover how to save your XGBoost models to file using the standard Python pickle API.

## Serialize Your XGBoost Model with Pickle

Pickle is the standard way of serializing objects in Python.

You can use the Python [pickle](https://docs.python.org/2/library/pickle.html) API to serialize your trained machine learning models and save the serialized format to a file, for example:

<pre>
# save model to file
with open("pima.pickle.dat", "wb") as f:
    pickle.dump(model, f)
</pre>

Later you can load this file to deserialize your model and use it to make new predictions, for example:

<pre>
# load model from file
with open("pima.pickle.dat", "rb") as f:
    loaded_model = pickle.load(f)
</pre>
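The save and load steps can be wrapped in two small helper functions. The sketch below is one possible way to do that, not part of the XGBoost API: `save_model` and `load_model` are hypothetical names, a plain dict stands in for a fitted model so the snippet runs without any training data, and a temporary directory is used so no file is left behind.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Pickle any picklable object (for example a fitted model) to path."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a previously pickled object back from path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Assumption: a plain dict stands in for a fitted XGBoost model here.
stand_in_model = {"n_estimators": 100, "max_depth": 3}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "pima.pickle.dat")
    save_model(stand_in_model, model_path)
    restored = load_model(model_path)

# The restored object is an equal copy, not the same object in memory.
print(restored == stand_in_model)
```

Opening the file in a `with` block ensures the file handle is closed as soon as the dump or load finishes, even if an error is raised.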

The example below demonstrates how you can train an XGBoost model on the Pima Indians onset of diabetes [dataset](https://archive.ics.uci.edu/ml/datasets/Pima+Indians+Diabetes), save the model to file, and later load it to make predictions.

The full code listing is provided below for completeness.

<pre>
# Train XGBoost model, save to file using pickle, load and make predictions
from numpy import loadtxt
import xgboost
import pickle
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# load data
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=",")

# split data into X (first 8 columns) and y (last column)
X = dataset[:,0:8]
Y = dataset[:,8]

# split data into train and test sets
seed = 7
test_size = 0.33
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)

# fit model on training data
model = xgboost.XGBClassifier()
model.fit(X_train, y_train)

# save model to file
with open("pima.pickle.dat", "wb") as f:
    pickle.dump(model, f)

# some time later...

# load model from file
with open("pima.pickle.dat", "rb") as f:
    loaded_model = pickle.load(f)

# make predictions for test data
y_pred = loaded_model.predict(X_test)
predictions = [round(value) for value in y_pred]

# evaluate predictions
accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
</pre>

<pre>
Accuracy: 77.95%
</pre>


Running this example saves your trained XGBoost model to the **pima.pickle.dat** pickle file in the current working directory.

After loading the model and making predictions on the test dataset, the accuracy of the model is printed.