# Intro to MLOps using ZenML

## 🌍 Overview

This repository is a minimalistic MLOps project intended as a starting point to learn how to put ML workflows in production.

Within this notebook we will show you how simple it is to switch from running code locally to running it remotely. You will then be able to explore all the metadata of your run in the ZenML Dashboard.

<img src=".assets/Overview.png" width="50%" alt="Quickstart Overview">

Follow along with this notebook to understand how you can use ZenML to productionize your ML workflows!

## Run on Colab

You can use Google Colab to run this notebook; no local installation is
required!

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](
https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/quickstart/quickstart.ipynb)

# 👶 Step 0. Install Requirements

Let's install ZenML and all requirements to get started.

In [1]:
!pip install "zenml[server]"


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.1.2[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython -m pip install --upgrade pip[0m


In [None]:
from zenml.environment import Environment

# If we are running in Google Colab, clone the additional files this example needs
if Environment.in_google_colab():
    # Pull required modules from this example
    !git clone -b main https://github.com/zenml-io/zenml
    !cp -r zenml/examples/quickstart/* .
    !rm -rf zenml

!pip install -r requirements.txt

In [2]:
# Restart Kernel to ensure all libraries are properly loaded
import IPython
IPython.Application.instance().kernel.do_shutdown(restart=True)

{'status': 'ok', 'restart': True}


Please wait for the installation to complete before running subsequent cells. At
the end of the installation, the notebook kernel will restart.

## ☁️ Step 1: Connect to your ZenML Server
To run this quickstart you need to connect to a ZenML Server. You can deploy it [yourself on your own infrastructure](https://docs.zenml.io/getting-started/deploying-zenml) or try it out for free, no credit card required, with our [ZenML Pro managed service](https://zenml.io/pro).

In [None]:
zenml_server_url = "INSERT_YOUR_SERVER_URL_HERE"  # in the form "https://URL_TO_SERVER"

!zenml connect --url $zenml_server_url
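
Before running the cell above, it may help to check that the placeholder has been replaced and that the value has the `https://...` form noted in the comment. The following is a minimal sketch using only the standard library; `looks_like_server_url` is a hypothetical helper name, not part of ZenML, and it checks the shape of the URL only, not whether a server is reachable.

```python
from urllib.parse import urlparse


def looks_like_server_url(url: str) -> bool:
    """Return True if `url` has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# The unedited placeholder from the cell above fails the check.
print(looks_like_server_url("INSERT_YOUR_SERVER_URL_HERE"))
# A hypothetical, well-formed URL passes.
print(looks_like_server_url("https://zenml.example.com"))
```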

In [None]:
# Initialize ZenML and define the root for imports and docker builds
!zenml init

## 🥇 Step 2: Build and run your first pipeline

In this quickstart we'll be working with a small dataset of sentences in Old English paired with more modern formulations. The task is a text-to-text transformation.

When you're getting started with a machine learning problem, you'll want to break your code down into distinct functions that load your data, bring it into the correct shape, and finally produce a model. This is the experimentation phase, where we try to massage our data into the right format and feed it into model training.

<img src=".assets/Experiment.png" width="20%" alt="Experimentation phase and pipeline construction">

In [None]:
from zenml import step
from datasets import Dataset
from typing_extensions import Annotated


@step
def load_data() -> Annotated[Dataset, "raw_dataset"]:
    """Load and prepare the dataset."""

    def read_data(file_path):
        inputs = []
        targets = []

        with open(file_path, "r", encoding="utf-8") as file:
            for line in file:
                old, modern = line.strip().split("|")
                inputs.append(f"Translate Old English to Modern English: {old}")
                targets.append(modern)

        return {"input": inputs, "target": targets}

    # Assuming your file is named 'translations.txt'
    data = read_data("translations.txt")
    return Dataset.from_dict(data)
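
The return annotation `Annotated[Dataset, "raw_dataset"]` attaches a name to the step's output, which ZenML uses as the artifact name. As a rough illustration of how such metadata can be read from a function signature, here is a standard-library sketch. It is not ZenML's implementation; `load_numbers` and `output_name` are hypothetical names, and a plain `list` stands in for `Dataset`.

```python
from typing import Annotated, get_args, get_type_hints


def load_numbers() -> Annotated[list, "raw_numbers"]:
    """Toy stand-in for a step; returns a small list."""
    return [1, 2, 3]


def output_name(func) -> str:
    """Read the first metadata item attached to the return annotation."""
    hints = get_type_hints(func, include_extras=True)
    _, *metadata = get_args(hints["return"])
    return metadata[0]


print(output_name(load_numbers))  # prints "raw_numbers"
```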

ZenML is built in a way that allows you to experiment with your data and build
your pipelines one step at a time. If you want to see how this function
works, you can just call it directly. Here we take a look at the first row
of the dataset.

In [2]:
dataset = load_data()
dataset[0]

{'input': "Translate Old English to Modern English: Shall I compare thee to a summer's day?",
 'target': 'Should I compare you to a summer day?'}

Everything looks as we'd expect and the input/output pair looks to be in the right format 🥳.
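
The `read_data` helper assumes every line contains exactly one `|` separator; a blank or malformed line would raise a `ValueError` during unpacking. A slightly more defensive variant might skip such lines. This is a sketch, not the quickstart's code: `parse_pairs` is a hypothetical name, and the sample reuses the pair shown above plus two made-up malformed lines.

```python
PREFIX = "Translate Old English to Modern English: "


def parse_pairs(lines):
    """Build input/target lists from 'old|modern' lines, skipping bad ones."""
    inputs, targets = [], []
    for line in lines:
        parts = line.strip().split("|")
        if len(parts) != 2 or not all(parts):
            continue  # blank line, missing separator, or empty side
        old, modern = parts
        inputs.append(PREFIX + old)
        targets.append(modern)
    return {"input": inputs, "target": targets}


sample = [
    "Shall I compare thee to a summer's day?|Should I compare you to a summer day?\n",
    "\n",                   # blank line: skipped
    "no separator here\n",  # malformed: skipped
]
data = parse_pairs(sample)
print(len(data["input"]))  # prints 1
```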

For the sake of this quickstart, we have prepared a few steps in the `steps` directory. We'll now connect these into a pipeline. To do this, simply plug multiple steps together through their inputs and outputs, then add the `@pipeline` decorator to the function that connects them.

In [3]:
from zenml import Model, pipeline
from zenml.client import Client

from steps import load_data, tokenize_data, train_model, evaluate_model, model_tester
from steps.model_trainer import T5_Model

# Initialize the ZenML client to fetch objects from the ZenML Server
client = Client()

# Start with the default stack, which runs locally
client.activate_stack("default")

model_name = "YeOldeEnglishTranslator"
model = Model(
    name=model_name,
    description="Model to translate from old to modern english",
    tags=["quickstart", "llm", "t5"],
)

@pipeline(enable_cache=True, model=model)
def english_translation_pipeline(
    model_type: T5_Model,
    per_device_train_batch_size: int,
    gradient_accumulation_steps: int,
    dataloader_num_workers: int,
    num_train_epochs: int = 5,
):
    """Define a pipeline that connects the steps."""
    dataset, test_dataset = load_data()
    tokenized_dataset, tokenizer = tokenize_data(dataset, model_type)
    model = train_model(
        tokenized_dataset,
        model_type,
        num_train_epochs,
        per_device_train_batch_size,
        gradient_accumulation_steps,
        dataloader_num_workers,
    )
    evaluate_model(model, tokenized_dataset)
    model_tester(model, tokenizer, test_dataset)

The ZenML global configuration version (0.64.0) is higher than the version of ZenML currently being used (0.63.0). Read more about this issue and how to solve it here: https://docs.zenml.io/reference/global-settings#version-mismatch-downgrading


We're ready to run the pipeline now, which we can do just as with the step - by calling the
pipeline function itself:

In [4]:
# Run the pipeline and configure some parameters at runtime
pipeline_run = english_translation_pipeline(
    model_type="t5-small",
    num_train_epochs=5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    dataloader_num_workers=4
)

Initiating a new run for the pipeline: english_translation_pipeline.
New model version 134 was created.
Dashboard URL for Model Version with name 134 : https://cloud.zenml.io/organizations/fc992c14-d960-4db7-812e-8f070c99c6f0/tenants/8a462fb6-b1fe-48df-9677-edc76bc8352d/model-versions/60713be3-f6af-48f2-9276-a886a44e4584
Executing a new run.
Using user: alexej@zenml.io
Using stack: default
  artifact_store: default
  orchestrator: default
Dashboard URL: https://cloud.zenml.io/organizations/fc992c14-d960-4db7-812e-8f070c99c6f0/tenants/8a462fb6-b1fe-48df-9677-edc76bc8352d/runs/66a234a9-1348-47d1-b8fd-faa1c4c4c556
Using cached version of load_data.
Step load_data has started.
Step tokenize_data has start

Map:   0%|          | 0/507 [00:00<?, ? examples/s]

Saving the dataset (0/1 shards):   0%|          | 0/507 [00:00<?, ? examples/s]

Step tokenize_data has finished in 7.900s.
Step tokenize_data completed successfully.
Caching disabled explicitly for train_model.
Step train_model has started.


| Step | Training Loss |
|------|---------------|
| 10   | 11.7584       |
| 20   | 10.4138       |
| 30   | 7.8939        |
| 40   | 6.8164        |
| 50   | 5.4769        |
| 60   | 4.3322        |
| 70   | 3.3108        |
| 80   | 2.3906        |
| 90   | 2.0382        |
| 100  | 1.9076        |


Step train_model has finished in 10m14s.
Step train_model completed successfully.
Step evaluate_model has started.
Average loss on the dataset: 0.47016171691939235
Step evaluate_model has finished in 35.797s.
Step evaluate_model completed successfully.
Step model_tester has started.
Generated Old English: Forsooth, I say unto thee
Model Translation:
Generated Old English: Hark! What light through yonder window breaks?
Model Translation: Warum stüünnt die Fenster?
Generated Old English: Wherefore art thou Romeo?
Model Translation:
Generated Old English: A horse! A horse! My kingdom for a horse!
Model Translation: (, oder Un: Das- Die Ein Les
Generated Old English: Out, damned

As you can see, the pipeline ran successfully. It also printed some example translations, which show that the model cannot yet solve the task well. Still, we have validated that the pipeline works end to end.

<img src=".assets/DAG.png" width="50%" alt="Dashboard view">

Above you can see the view of the pipeline run in the ZenML Dashboard. You can find the URL for this in the logs above.

We can now access the trained model and its tokenizer from the ZenML Model Control Plane.

In [5]:
# Load the trained model and its tokenizer from the Model Control Plane
model_version = client.get_model_version(model_name)
model = model_version.get_model_artifact("model").load()
tokenizer = model_version.get_artifact("tokenizer").load()

With this in hand we can now play around with the model directly and try out some examples ourselves:

In [6]:
import torch

test_text = "I do desire we may be better strangers"

input_ids = tokenizer(
    test_text,
    return_tensors="pt",
    max_length=128,
    truncation=True,
    padding="max_length",
).input_ids

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_length=128,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        do_sample=True,  # enable sampling so top_k, top_p and temperature take effect
        top_k=50,
        top_p=0.95,
        temperature=0.7,
    )

decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(decoded_output)
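
Note that the training inputs built in `load_data` carry the prefix `Translate Old English to Modern English: `, while `test_text` above is passed without it. T5-style models are usually sensitive to such task prefixes, so it may be worth formatting inference inputs the same way. A minimal sketch (the helper name `build_prompt` is hypothetical):

```python
TASK_PREFIX = "Translate Old English to Modern English: "


def build_prompt(text: str) -> str:
    """Format raw text the same way the training inputs were formatted."""
    return f"{TASK_PREFIX}{text.strip()}"


prompt = build_prompt("I do desire we may be better strangers")
print(prompt)
```

The resulting `prompt` could then be passed to `tokenizer(...)` in place of `test_text`.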




## Let's recap what we've done so far

We created a modular pipeline built from individual steps and showed that it runs locally.

As expected, the model does not yet solve its task. To train a model that solves it well, we would have to train a larger model for longer. For this, we'll need to move away from our local environment.

# ⌚ Step 3: Scale it up in the cloud

The last section confirmed that the pipeline works. Let's now run the pipeline in the environment of your choice.

To try this step, you will need access to a cloud environment (AWS, GCP, or Azure). ZenML wraps around all the major cloud providers and orchestration tools and lets you easily deploy your code onto them.

To do this, head over to the `Stack` section of your ZenML Dashboard. Here you'll be able to either connect to an existing environment or deploy a new one. Choose one of the options presented to you there and come back when you have a stack ready to go. Then proceed to the appropriate section below. **Do not** run all three. Also make sure that you are connected to a remote ZenML server (see Step 1 above).

<img src=".assets/StackCreate.png" width="20%" alt="Stack creation in the ZenML Dashboard">

## GCP

In [None]:
!zenml integration install gcp -y

from zenml.client import Client
from zenml.config import DockerSettings, ResourceSettings
from zenml.integrations.gcp.flavors.vertex_orchestrator_flavor import VertexOrchestratorSettings

# Set the name of your stack here
stack_name = "INSERT_STACK_NAME_HERE"

Client().activate_stack(stack_name)

configured_english_translation_pipeline = english_translation_pipeline.with_options(
    settings={
        "docker": DockerSettings(
            parent_image="pytorch/pytorch:2.4.0-cuda11.8-cudnn9-runtime",
            requirements=["zenml==0.63.0","pyarrow","datasets","transformers[torch]","sentencepiece"],
            environment={"ZENML_DISABLE_STEP_LOGS_STORAGE": True}
        ),
        "resources": ResourceSettings(memory="32GB"),
    }
)

## AWS

In [None]:
#!pip install sagemaker s3 s3fs
from zenml.client import Client
from zenml.config import DockerSettings, ResourceSettings
from zenml.integrations.aws.flavors.sagemaker_orchestrator_flavor import SagemakerOrchestratorSettings

# Set the name of your stack here
stack_name = "INSERT_STACK_NAME_HERE"

Client().activate_stack(stack_name)

configured_english_translation_pipeline = english_translation_pipeline.with_options(
    settings={
        "docker": DockerSettings(
            parent_image="pytorch/pytorch:2.4.0-cuda11.8-cudnn9-runtime",
            requirements=["zenml==0.63.0","pyarrow","datasets","transformers[torch]","sentencepiece"],
            environment={"ZENML_DISABLE_STEP_LOGS_STORAGE": True}
        ),
        "resources": ResourceSettings(memory="32GB"),
        "orchestrator.sagemaker": SagemakerOrchestratorSettings(instance_type="ml.p2.xlarge")
    }
)

## Azure

In [None]:
!zenml integration install azure -y

from zenml.client import Client
from zenml.config import DockerSettings, ResourceSettings
from zenml.integrations.skypilot.flavors.skypilot_orchestrator_base_vm_config import SkypilotBaseOrchestratorSettings

# Set the name of your stack here
stack_name = "INSERT_STACK_NAME_HERE"

Client().activate_stack(stack_name)

configured_english_translation_pipeline = english_translation_pipeline.with_options(
    settings={
        "docker": DockerSettings(
            parent_image="pytorch/pytorch:2.4.0-cuda11.8-cudnn9-runtime",
            requirements=["zenml==0.63.0","pyarrow","datasets","transformers[torch]","sentencepiece"],
            environment={"ZENML_DISABLE_STEP_LOGS_STORAGE": True}
        ),
        # Assumes the stack's orchestrator uses the SkyPilot Azure flavor ("vm_azure")
        "orchestrator.vm_azure": SkypilotBaseOrchestratorSettings(accelerators='V100', memory="32+", cpus="8+")
    }
)

## 🚀 Ready to launch

We have now configured ZenML to use your very own cloud infrastructure.

<img src=".assets/SwitchStack.png" width="20%" alt="Stack switching with ZenML">

For the next pipeline run, we'll be training the same T5 model (`t5-small`) on your own infrastructure.

Note: The whole process may take a bit longer the first time around, as your pipeline code needs to be built into Docker containers to run in the orchestration environment of your stack. Subsequent runs of the pipeline, even with different parameters, will be faster thanks to Docker caching.

In [None]:
pipeline_run = configured_english_translation_pipeline(
    model_type="t5-small", 
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    dataloader_num_workers=0
)

You did it! You built a pipeline locally, verified that all its parts work well together, and are now running it in a production environment.

<img src=".assets/Production.png" width="20%" alt="Pipeline running on your infrastructure.">

## Now it's up to you

You can now start worrying about making the model actually work well, as the model results are still not acceptable.

Here are some things that you could do:

* Iterate on the training data and its tokenization
* Switch out the model itself. Instead of `model_type="t5-small"` you could use `model_type="t5-large"`, for example
* Train for longer by increasing `num_train_epochs`. To speed this up you can also add accelerators to your orchestrators. Learn more about this in the section below.
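
To get a feel for how these training parameters interact, the sketch below estimates the number of optimizer updates for a given configuration. It assumes a single device and the 507 examples shown in the tokenization progress bar of the local run; `estimate_updates` is a hypothetical helper, and the exact count reported by the training library may differ slightly depending on how it rounds partial batches.

```python
import math


def estimate_updates(num_examples, per_device_batch_size,
                     gradient_accumulation_steps, num_epochs, num_devices=1):
    """Approximate the optimizer updates performed over a training run."""
    effective_batch = (
        per_device_batch_size * gradient_accumulation_steps * num_devices
    )
    updates_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, updates_per_epoch * num_epochs


# Local run from Step 2: batch size 2, accumulation 4, 5 epochs
print(estimate_updates(507, 2, 4, 5))  # (8, 320)
# Remote run from Step 3: batch size 4, accumulation 4, 3 epochs
print(estimate_updates(507, 4, 4, 3))  # (16, 96)
```

Under these assumptions the remote configuration performs fewer updates than the local one, so increasing `num_train_epochs` is a natural first thing to try.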

No matter which avenue you choose to make the model work, we would love to see how you did it, so please reach out and share your solution with us either in our [**Slack Community**](https://zenml.io/slack) or by email at hello@zenml.io.

## Adding Accelerators

Each of the cloud providers allows users to add accelerators to their serverless offerings. Here's what you need to add to the pipeline settings to unlock GPUs. Keep in mind that you might have to increase your quotas with the cloud provider.

### GCP

```
english_translation_pipeline.with_options(settings=
```

## Further exploration

This was just the tip of the iceberg of what ZenML can do; check out the [**docs**](https://docs.zenml.io/) to learn more
about the capabilities of ZenML. For example, you might want to:

- [Deploy ZenML](https://docs.zenml.io/user-guide/production-guide/connect-deployed-zenml) to collaborate with your colleagues.
- Run the same pipeline on a [cloud MLOps stack in production](https://docs.zenml.io/user-guide/production-guide/cloud-stack).
- Track your metrics in an experiment tracker like [MLflow](https://docs.zenml.io/stacks-and-components/component-guide/experiment-trackers/mlflow).

## What next?

* If you have questions or feedback... join our [**Slack Community**](https://zenml.io/slack) and become part of the ZenML family!
* If you want to quickly get started with ZenML, check out [ZenML Pro](https://zenml.io/pro).