# Saving and Loading a Pipeline

This short guide shows how to serialize a pipeline to a file and later load it
to make predictions.

For simplicity, some steps are not explained here. Full details
about them can be found in the previous parts of the tutorial.

We will:

1. Load and fit a pipeline to a dataset.
2. Save the pipeline to a file.
3. Load the pipeline as a new object.
4. Make predictions using the new pipeline object.

## Fit the pipeline

The first step is to load the dataset and fit the pipeline to it.

In [1]:
from utils import load_census

dataset = load_census()

In [2]:
X_train, X_test, y_train, y_test = dataset.get_splits(1)

In [3]:
from mlblocks import MLPipeline

primitives = [
    'mlprimitives.custom.preprocessing.ClassEncoder',
    'mlprimitives.custom.feature_extraction.CategoricalEncoder',
    'sklearn.impute.SimpleImputer',
    'xgboost.XGBClassifier',
    'mlprimitives.custom.preprocessing.ClassDecoder'
]
pipeline = MLPipeline(primitives)

In [4]:
pipeline.fit(X_train, y_train)



## Save the Pipeline

Once the pipeline is fitted and ready to make predictions, we can store it in a file.
We will do so using [pickle](https://docs.python.org/3/library/pickle.html).

In [5]:
import pickle

with open('pipeline.pkl', 'wb') as f:
    pickle.dump(pipeline, f)

## Load the Pipeline

The saved pipeline can then be moved to another system, where we can load it back into
memory using pickle again.

In [6]:
with open('pipeline.pkl', 'rb') as f:
    loaded_pipeline = pickle.load(f)
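
The save and load steps can also be wrapped in two small helpers. The following is only a sketch: `save_pickle` and `load_pickle` are hypothetical names that are not part of MLBlocks, and writing to a temporary file before renaming is an extra precaution added here so that an interrupted save does not leave a truncated `pipeline.pkl` behind. A plain dictionary stands in for the fitted pipeline so that the example runs without any dependencies.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path):
    """Pickle ``obj`` to ``path``, replacing the target only after a full write."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    try:
        with os.fdopen(fd, 'wb') as f:
            pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
        # Move the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file and re-raise the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_pickle(path):
    """Load and return the object pickled at ``path``."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Stand-in object (assumption); in this guide it would be the fitted ``pipeline``.
stand_in = {'primitives': ['xgboost.XGBClassifier'], 'fitted': True}
demo_path = os.path.join(tempfile.mkdtemp(), 'pipeline.pkl')

save_pickle(stand_in, demo_path)
restored = load_pickle(demo_path)
```

With the real objects, the calls would be `save_pickle(pipeline, 'pipeline.pkl')` and `loaded_pipeline = load_pickle('pipeline.pkl')`.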

**IMPORTANT**: All the dependencies must also be installed on the system that loads the pipeline. This includes **MLBlocks**, **MLPrimitives**, and any other libraries required by the pipeline primitives.
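
One way to check this before unpickling is to ask Python whether each required module can be found. The sketch below uses only the standard library. `missing_modules` is a hypothetical helper, and the module names in `required` are assumed from the primitives used in this guide, so adjust them to match your own pipeline.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names in ``module_names`` that cannot be found for import."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised for some broken or partially installed packages.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Assumed top-level module names for the pipeline built in this guide.
required = ['mlblocks', 'mlprimitives', 'sklearn', 'xgboost']

missing = missing_modules(required)
if missing:
    print('Install before loading the pipeline:', ', '.join(missing))
```

This only confirms that the modules are present. It does not check that their versions match the ones used when the pipeline was saved.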

## Make Predictions

Once the pipeline is loaded, it is ready to make predictions again.

In [7]:
# Predict with the pipeline restored from the file, not the original object
pred = loaded_pipeline.predict(X_test)

In [8]:
pred[0:5]

array([' >50K', ' <=50K', ' >50K', ' <=50K', ' <=50K'], dtype=object)
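
As a sanity check, the predictions of the loaded pipeline can be compared with those of the original one while both are still in memory. The sketch below assumes that both `predict` calls return sequences of labels. `predictions_match` is a hypothetical helper, and short lists stand in for the real prediction arrays so that the example runs on its own.

```python
def predictions_match(expected, actual):
    """Return True when both sequences have the same length and values."""
    expected, actual = list(expected), list(actual)
    return len(expected) == len(actual) and all(
        e == a for e, a in zip(expected, actual)
    )


# Stand-in values (assumption); in this guide they would come from
# pipeline.predict(X_test) and loaded_pipeline.predict(X_test).
original_pred = [' >50K', ' <=50K', ' >50K']
loaded_pred = [' >50K', ' <=50K', ' >50K']

same = predictions_match(original_pred, loaded_pred)
```

If `same` is `False` for the real pipelines, a likely cause is a difference in library versions between the saving and the loading environments.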