# ZenML Quickstart Guide

<a href="https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/quickstart/notebooks/quickstart.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This quickstart gives you first hands-on experience with ZenML and a brief overview of its main features.

Throughout this quickstart, we will:
- Train a model, evaluate it, deploy it, and embed it in an inference pipeline,
- Automatically version, track, and cache data, models, and other artifacts,
- Track model hyperparameters and metrics in an experiment tracking tool,
- Measure and visualize train-test skew, training-serving skew, and data drift.

**New to MLOps?** Then you might want to start with our [**ZenBytes**](https://github.com/zenml-io/zenbytes) lesson series instead, where we cover each MLOps concept in much more detail. This quickstart assumes you are already familiar with basic MLOps issues and just want to learn how to approach them with ZenML.

## Install Libraries

In [None]:
%pip install zenml matplotlib  # install ZenML and matplotlib
!zenml integration install dash sklearn mlflow evidently facets -y  # install ZenML integrations
!zenml init  # Initialize a ZenML repository
!zenml profile create quickstart  # create a new ZenML profile
!zenml profile set quickstart  # use the new ZenML profile for all runs
%pip install pyparsing==2.4.2  # required for Colab

import IPython

# automatically restart kernel
IPython.Application.instance().kernel.do_shutdown(restart=True)

Please wait for the installation to complete before running subsequent cells. At the end of the installation, the notebook kernel will automatically restart.

## Using Google Colab

If you follow this quickstart in Google's Colab, you will need an [ngrok account](https://dashboard.ngrok.com/signup) to view some of the visualizations later. Please set up an account, then set your user token below:

In [None]:
NGROK_TOKEN = ""  # TODO: set your ngrok token if you are working on Colab

In [None]:
# COLAB ONLY setup
try:
    import google.colab

    IN_COLAB = True

    # install ngrok and set auth token
    !pip install pyngrok
    !ngrok authtoken {NGROK_TOKEN}

except ModuleNotFoundError as err:
    IN_COLAB = False

## Create an MLOps Stack

ZenML decouples your code from the infrastructure and tooling you use.
This enables you to quickly take your code from experimentation to production.
Furthermore, using ZenML prevents vendor lock-in by allowing you to switch out any part of your MLOps stack easily.
See the [ZenML Integrations](https://docs.zenml.io/features/integrations) page for a list of all tools we currently support.

Throughout this quickstart, we will use the following MLOps stack: [MLflow](https://mlflow.org/) for experiment tracking and model deployment, [Facets](https://pair-code.github.io/facets/) for visualizing train-test skew and training-serving skew, and [Evidently](https://evidentlyai.com/) for data drift detection.

![Quickstart MLOps Stack Overview](../_assets/stack_overview.png)

Before we start, we need to register all stack components that require configuration into our ZenML MLOps stack:

In [None]:
# Register the MLflow experiment tracker
!zenml experiment-tracker register mlflow_tracker --flavor=mlflow

# Register the MLflow model deployer
!zenml model-deployer register mlflow_deployer --flavor=mlflow

# Add the MLflow components into our default stack
!zenml stack update default -d mlflow_deployer -e mlflow_tracker

# Visualize the current ZenML stack
!zenml stack describe

## Define ML Pipelines
Let us now use ZenML to write two ML pipelines for continuous training and serving.

The training pipeline will:
- Load the [Digits dataset](https://paperswithcode.com/dataset/digits),
- Train a model on the training data (and track hyperparameters using [MLflow](https://mlflow.org/)),
- Test the model on the test data,
- Deploy the model (with [MLflow](https://mlflow.org/)) if the test accuracy is higher than a certain threshold,
- Compare train and test data for later skew visualization (with [Facets](https://pair-code.github.io/facets/)).

The inference pipeline will:
- Load inference data,
- Load the most recently deployed model,
- Run model inference on the inference data,
- Compare train and inference data for later skew visualization (with [Facets](https://pair-code.github.io/facets/)),
- Check for data drift (with [Evidently](https://evidentlyai.com/)).

You can see a visualization of the two pipelines below:

![Overview of Quickstart Pipelines](../_assets/quickstart_pipelines.png)

Let's now define those pipelines with ZenML. To do so, we simply write a Python function that defines how the data will move through the different steps and decorate it with ZenML's `@pipeline` decorator. Under the hood, ZenML will build a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph) that determines the order in which the steps need to be executed. 

In [None]:
from zenml.pipelines import pipeline


@pipeline(enable_cache=False)
def training_pipeline(
    training_data_loader,
    skew_comparison,
    trainer,
    evaluator,
    deployment_trigger,
    model_deployer,
):
    """Train, evaluate, and deploy a model."""
    X_train, X_test, y_train, y_test = training_data_loader()
    skew_comparison(X_train, X_test)
    model = trainer(X_train=X_train, y_train=y_train)
    test_acc = evaluator(X_test=X_test, y_test=y_test, model=model)
    deployment_decision = deployment_trigger(test_acc)
    model_deployer(deployment_decision, model)


@pipeline
def inference_pipeline(
    inference_data_loader,
    prediction_service_loader,
    predictor,
    training_data_loader,
    skew_comparison,
    drift_detector,
):
    """Inference pipeline with data drift detection."""
    inference_data = inference_data_loader()
    model_deployment_service = prediction_service_loader()
    predictor(model_deployment_service, inference_data)
    training_data, _, _, _ = training_data_loader()
    reference, comparison = skew_comparison(training_data, inference_data)
    drift_detector(reference, comparison)
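
To make the DAG idea concrete, the dependencies implied by `training_pipeline` above can be written down by hand and ordered with Python's standard-library `graphlib` (Python 3.9+). This is only an illustrative sketch: the `dependencies` dictionary is transcribed manually from the function body, and it is not how ZenML builds or executes its graph internally.

```python
from graphlib import TopologicalSorter

# Hand-written for illustration: each step maps to the steps whose outputs it consumes.
dependencies = {
    "training_data_loader": set(),
    "skew_comparison": {"training_data_loader"},
    "trainer": {"training_data_loader"},
    "evaluator": {"training_data_loader", "trainer"},
    "deployment_trigger": {"evaluator"},
    "model_deployer": {"deployment_trigger", "trainer"},
}

# static_order() yields every step only after all of its dependencies.
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)
```

Any printed order is valid as long as each step comes after the steps it depends on; for example, `skew_comparison` may appear at any position after the data loader.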

## Implement Pipeline Steps

Next, we need to implement the steps that make up these pipelines. 
Again, we can do this by writing simple Python functions and decorating them with ZenML's `@step` decorator.

In total, we will need ten steps:
- Training data loader
- Inference data loader
- Model trainer
- Model evaluator
- Deployment trigger
- Model deployer
- Prediction service loader
- Predictor
- Skew comparison
- Drift detection

### Data Loaders
Let's start with data loading. We load the Digits dataset for training and, for simplicity, use some random noise samples for inference.

In [None]:
import numpy as np
import pandas as pd
from zenml.integrations.sklearn.helpers.digits import get_digits
from zenml.steps import step, Output


@step
def training_data_loader() -> Output(
    X_train=np.ndarray, X_test=np.ndarray, y_train=np.ndarray, y_test=np.ndarray
):
    """Loads the digits dataset as normal numpy arrays."""
    X_train, X_test, y_train, y_test = get_digits()
    return X_train, X_test, y_train, y_test

In [None]:
@step
def inference_data_loader() -> np.ndarray:
    """Load some (random) inference data."""
    return np.random.rand(1, 64)  # flattened 8x8 random noise image

### Model Trainer
To train our model, we define a step that builds an [sklearn SVC](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html) model and fits it on the given training data. Additionally, we log all model hyperparameters and metrics to [MLflow](https://mlflow.org/).

Note that we do not need to save the model within the step explicitly; ZenML takes care of this for us automatically. Under the hood, ZenML persists all step inputs and outputs in an [Artifact Store](https://docs.zenml.io/core-concepts#artifact-store) and their metadata in a [Metadata Store](https://docs.zenml.io/core-concepts#metadata-store). This also means that all of our data and models are automatically versioned and tracked.

In [None]:
import mlflow
from sklearn.base import ClassifierMixin
from sklearn.svm import SVC
from zenml.integrations.mlflow.mlflow_step_decorator import enable_mlflow


@enable_mlflow  # setup MLflow
@step(enable_cache=False)
def svc_trainer_mlflow(
    X_train: np.ndarray,
    y_train: np.ndarray,
) -> ClassifierMixin:
    """Train a sklearn SVC classifier and log to MLflow."""
    mlflow.sklearn.autolog()  # log all model hparams and metrics to MLflow
    model = SVC(gamma=0.001)
    model.fit(X_train, y_train)
    return model

### Model Evaluator and Deployment Trigger

Since our model is an [sklearn model](https://scikit-learn.org/stable/developers/develop.html), we can simply call `model.score` to compute its test accuracy.

We then use the output of this step to trigger deployment only for models that achieve a test accuracy above 90%.

In [None]:
@step
def evaluator(
    X_test: np.ndarray,
    y_test: np.ndarray,
    model: ClassifierMixin,
) -> float:
    """Calculate the accuracy on the test set"""
    test_acc = model.score(X_test, y_test)
    print(f"Test accuracy: {test_acc}")
    return test_acc

In [None]:
@step
def deployment_trigger(test_acc: float) -> bool:
    """Only deploy if the test accuracy > 90%."""
    return test_acc > 0.9
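
The trigger uses a strict comparison, so a model scoring exactly 90% is not deployed. The plain-Python sketch below, without any ZenML decorator, makes the threshold explicit; `should_deploy` and `min_accuracy` are hypothetical names introduced only for illustration.

```python
def should_deploy(test_acc: float, min_accuracy: float = 0.9) -> bool:
    """Return True only if the accuracy is strictly above the threshold."""
    return test_acc > min_accuracy


print(should_deploy(0.95))  # above the threshold
print(should_deploy(0.90))  # exactly on the threshold, so not deployed
```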

### Model Deployment and Drift Detection

ZenML provides default steps for MLflow model deployment and Evidently drift detection, which we can simply import:

In [None]:
from zenml.integrations.mlflow.steps import mlflow_model_deployer_step

In [None]:
from zenml.integrations.evidently.steps import (
    EvidentlyProfileConfig,
    EvidentlyProfileStep,
)

evidently_profile_config = EvidentlyProfileConfig(
    column_mapping=None, profile_sections=["datadrift"]
)

drift_detector = EvidentlyProfileStep(config=evidently_profile_config)

### Skew Comparison

For the Facets skew comparison, we need to write a step that outputs two pandas DataFrames that should be compared for skew. Since the data loaders load all data in numpy array format, we simply need to write a function that converts from numpy to pandas here:

In [None]:
@step
def skew_comparison(
    reference_input: np.ndarray,
    comparison_input: np.ndarray,
) -> Output(reference=pd.DataFrame, comparison=pd.DataFrame):
    """Convert data from numpy to pandas for skew comparison."""
    columns = [str(x) for x in range(reference_input.shape[1])]
    return pd.DataFrame(reference_input, columns=columns), pd.DataFrame(
        comparison_input, columns=columns
    )

### Prediction Service Loader and Predictor

Lastly, we need to write the inference pipeline steps for loading a deployed model and computing its prediction on the inference data.

To load the deployed model, we ask the model deployer of our active MLOps stack for a running model server that was deployed by the given training pipeline and deployment step names (more on this later):

In [None]:
from zenml.services import BaseService
from zenml.repository import Repository


@step(enable_cache=False)
def prediction_service_loader() -> BaseService:
    """Load the model service deployed by our training_pipeline."""
    repo = Repository(skip_repository_check=True)
    model_deployer = repo.active_stack.model_deployer
    # find running servers deployed by the given pipeline and step
    services = model_deployer.find_model_server(
        pipeline_name="training_pipeline",
        pipeline_step_name="mlflow_model_deployer_step",
        running=True,
    )
    if not services:
        raise RuntimeError(
            "No running model server found; run the training pipeline first."
        )
    return services[0]

To run inference with the deployed model, we simply call its `predict()` method to get logits and compute the `argmax` to obtain the final prediction:

In [None]:
@step
def predictor(
    service: BaseService,
    data: np.ndarray,
) -> Output(predictions=list):
    """Run an inference request against a prediction service"""
    service.start(timeout=10)  # should be a NOP if already started
    prediction = service.predict(data)
    prediction = prediction.argmax(axis=-1)
    print(f"Prediction is: {[prediction.tolist()]}")
    return [prediction.tolist()]
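
For readers unfamiliar with `argmax`, the sketch below shows the idea in plain Python on a made-up score vector: the predicted class is the index of the largest score. The `scores` values and the `argmax` helper are illustrative assumptions, not output from the deployed model.

```python
def argmax(scores):
    """Return the index of the largest value in a non-empty sequence."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up scores for the ten digit classes 0-9.
scores = [0.01, 0.02, 0.05, 0.70, 0.03, 0.04, 0.05, 0.04, 0.03, 0.03]
predicted_digit = argmax(scores)
print(predicted_digit)
```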

## Run ZenML Pipelines

Running pipelines is as simple as calling the `run()` method on an instance of the defined pipeline.

**Note:** If you get an error `NoSuchProcess: process no longer exists (pid=95685)` when running the following cell, simply run the cell again.

In [None]:
training_pipeline(
    training_data_loader=training_data_loader(),
    skew_comparison=skew_comparison(),
    trainer=svc_trainer_mlflow(),
    evaluator=evaluator(),
    deployment_trigger=deployment_trigger(),
    model_deployer=mlflow_model_deployer_step(),
).run()

After the training pipeline run has finished, the trained model will have been deployed using MLflow and is ready for prediction requests. We can use `zenml served-models list` to get an overview of all currently deployed models:

In [None]:
!zenml served-models list

When we run the inference pipeline, the `prediction_service_loader` step will fetch a reference to the deployed model so further steps can send prediction requests to it. If we were to rerun the training pipeline, we would automatically overwrite the deployed model, so we can deploy new models to production without having to interrupt or modify the inference pipeline.

In [None]:
inference_pipeline(
    inference_data_loader=inference_data_loader(),
    prediction_service_loader=prediction_service_loader(),
    predictor=predictor(),
    training_data_loader=training_data_loader(),
    skew_comparison=skew_comparison(),
    drift_detector=drift_detector,
).run()

## Pipeline Visualization & Caching

After we have run a pipeline, we can visualize it using ZenML's Dash integration:

In [None]:
from zenml.integrations.dash.visualizers.pipeline_run_lineage_visualizer import (
    PipelineRunLineageVisualizer,
)

First off, let's load the ZenML repository, which is where all our pipelines live. We can then fetch a specific pipeline by name using the `get_pipeline()` method. The pipeline's name defaults to the function name, if not specified. Afterward, we can get a specific pipeline run using the pipeline's `get_run()` method. Alternatively, we can simply access the latest run of a pipeline using `runs[-1]`.

In [None]:
from zenml.repository import Repository

repo = Repository()

# get latest training pipeline run
train_run = repo.get_pipeline(pipeline_name="training_pipeline").runs[-1]

# get latest inference pipeline run
inference_run = repo.get_pipeline(pipeline_name="inference_pipeline").runs[-1]

To visualize a run, we can simply use `PipelineRunLineageVisualizer().visualize()` now:

In [None]:
def visualize_pipeline(run):
    if IN_COLAB:
        from pyngrok import ngrok

        public_url = ngrok.connect(8050)
        print(f"\x1b[31mIn Colab, use this URL instead: {public_url}!\x1b[0m")

    PipelineRunLineageVisualizer().visualize(run)

In [None]:
visualize_pipeline(train_run)

You should now see an interactive visualization in your browser. The squares represent your artifacts and the circles your pipeline steps. Also, note that the different nodes are color-coded. Right now, they should all be blue, meaning all steps were executed correctly. 
If a pipeline were to fail or run for too long, you could find the responsible step marked red or yellow.
Lastly, if you rerun any of the pipelines with a different `run_name`, you will notice that several nodes will change from blue to green, which indicates they were still cached from the last run and did not have to be rerun. This means we can switch out intermediate steps of our pipeline without rerunning any of the previous steps, which can save us a lot of time and resources in real production environments!

<img src="../_assets/train_pipeline.png" alt="Training Pipeline Visualization" width="50%"/>
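
The caching behavior described above can be pictured with a small plain-Python sketch: a step is skipped when the same step has already run with identical inputs. This is a conceptual illustration under simplifying assumptions (JSON-serializable inputs, an in-memory dictionary as the cache); `cache_key` and `run_step` are hypothetical helpers and do not reflect ZenML's actual caching implementation.

```python
import hashlib
import json

_cache = {}  # maps cache key -> previously computed step output


def cache_key(step_name, inputs):
    """Build a deterministic key from the step name and its inputs."""
    payload = json.dumps({"step": step_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(step_name, func, inputs):
    """Run func(**inputs) unless an identical invocation is already cached.

    Returns a tuple (output, was_cached).
    """
    key = cache_key(step_name, inputs)
    if key in _cache:
        return _cache[key], True
    output = func(**inputs)
    _cache[key] = output
    return output, False


def double(x):
    return 2 * x


first = run_step("double", double, {"x": 21})   # executed
second = run_step("double", double, {"x": 21})  # served from the cache
third = run_step("double", double, {"x": 5})    # new inputs, executed again
print(first, second, third)
```

Because the key depends on the inputs, changing a later step leaves the keys of earlier steps untouched, which is why those earlier steps would not need to rerun.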

Let's also visualize our inference pipeline:

In [None]:
visualize_pipeline(inference_run)

<img src="../_assets/inference_pipeline.png" alt="Inference Pipeline Visualization" width="50%"/>

## Visualize Data Skew and Data Drift

ZenML provides a variety of visualization tools in addition to the pipeline visualizer shown above. E.g., using the `FacetStatisticsVisualizer` we can visualize differences between data distributions to check for train-test or training-serving skew, and using the `EvidentlyVisualizer` we can visualize data drift:

In [None]:
from zenml.integrations.facets.visualizers.facet_statistics_visualizer import (
    FacetStatisticsVisualizer,
)
from zenml.integrations.evidently.visualizers import EvidentlyVisualizer

In [None]:
train_test_skew_step = train_run.get_step(name="skew_comparison")
FacetStatisticsVisualizer().visualize(train_test_skew_step, magic=True)

**Note**: In some environments, the `FacetStatisticsVisualizer` needs to be run twice to get up and running. If you executed the previous cell and nothing happened, just run the cell again.

As we see, the train and test data distributions look very similar, meaning we have minimal train-test skew.

<img src="../_assets/train_test_skew.png" alt="Train-Test Skew Visualization with Facets" width="50%"/>

In [None]:
training_serving_skew_step = inference_run.get_step(name="skew_comparison")
FacetStatisticsVisualizer().visualize(training_serving_skew_step, magic=True)

When comparing training and serving data, we can see considerable skew. This is, of course, expected since we mocked our inference data loader using random data. For the same reason, your visualizations will probably look different from the example below.

<img src="../_assets/training_serving_skew.png" alt="Training-Serving Skew Visualization with Facets" width="50%"/>

In [None]:
drift_detection_step = inference_run.get_step(name="drift_detector")
EvidentlyVisualizer().visualize(drift_detection_step)

Accordingly, Evidently will also detect data drift for several features:

<img src="../_assets/data_drift.png" alt="Evidently Data Drift Visualization" width="50%"/>
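
To give a feel for what a drift check measures, the sketch below computes the two-sample Kolmogorov-Smirnov statistic for a single feature in plain Python: the largest gap between the empirical distribution functions of the reference and comparison samples. This is an illustrative assumption about one common drift statistic; the tests Evidently actually runs may differ, and `ks_statistic` is a hypothetical helper.

```python
from bisect import bisect_right


def ks_statistic(reference, comparison):
    """Largest absolute gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    if not reference or not comparison:
        raise ValueError("Both samples must be non-empty.")
    ref = sorted(reference)
    cmp_sorted = sorted(comparison)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in ref + cmp_sorted:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cmp = bisect_right(cmp_sorted, x) / len(cmp_sorted)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_cmp))
    return largest_gap


print(ks_statistic([1, 2, 3, 4], [1, 2, 3, 4]))  # identical samples
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # partially shifted samples
print(ks_statistic([0, 1, 2], [10, 11, 12]))     # completely separated samples
```

A value near 0 suggests the feature is distributed similarly in both datasets, while a value near 1 suggests strong drift.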

## MLflow Experiment Tracking

Lastly, remember how we added MLflow experiment tracking to our `svc_trainer_mlflow` step before?
Those two simple lines of code (the `@enable_mlflow` decorator and the `mlflow.sklearn.autolog()` call) automatically configured and initialized MLflow and logged all hyperparameters and metrics there.

Let's start up the MLflow UI and check it out!

![MLflow UI](../_assets/mlflow_ui.png)

In [None]:
from zenml.integrations.mlflow.mlflow_utils import get_tracking_uri


def open_mlflow_ui(port=4997):
    if IN_COLAB:
        from pyngrok import ngrok

        public_url = ngrok.connect(port)
        print(f"\x1b[31mIn Colab, use this URL instead: {public_url}!\x1b[0m")

    !mlflow ui --backend-store-uri="{get_tracking_uri()}" --port={port}


open_mlflow_ui()

## Congratulations!

You just built your first ML Pipelines! You not only trained a model, you also deployed it, served it, and learned how to monitor and visualize everything that's going on. Did you notice how easy it was to bring all of the different components together using ZenML's abstractions? And that is just the tip of the iceberg of what ZenML can do; check out the [**Integrations**](https://docs.zenml.io/features/integrations) page for a list of all the cool MLOps tools that ZenML supports!

## Where to go next

* If you have questions or feedback... 
  * Join our [**Slack Community**](https://zenml.io/slack-invite) and become part of the ZenML family!
* If this quickstart was a bit too quick for you... 
  * Check out [**ZenBytes**](https://github.com/zenml-io/zenbytes), our lesson series on practical MLOps, where we cover each MLOps concept in much more detail.
* If you want to learn more about using or extending ZenML...
  * Check out our [**Docs**](https://docs.zenml.io/) or read through our code on [**GitHub**](https://github.com/zenml-io/zenml).
* If you want to quickly learn how to use a specific tool with ZenML...
  * Check out our collection of [**Examples**](https://github.com/zenml-io/zenml/tree/main/examples).
* If you want to see some advanced ZenML use cases... 
  * Check out [**ZenFiles**](https://github.com/zenml-io/zenfiles), our collection of production-grade ML use-cases.