# ZenML Quickstart: Bridging Local Development and Cloud Deployment

This repository demonstrates how ZenML streamlines the transition of machine learning workflows from local environments to cloud-scale operations.

## Key advantages:

* Deploy to major cloud providers with minimal code changes
* Connect directly to your existing infrastructure
* Bridge the gap between ML and Ops teams
* Gain deep insights into pipeline metadata via the ZenML Dashboard

Unlike traditional MLOps tools, ZenML offers unparalleled flexibility and control. It integrates seamlessly with your infrastructure, allowing both ML and Ops teams to collaborate effectively without compromising on their specific requirements.

The notebook guides you through adapting local code for cloud deployment, showcasing ZenML's ability to enhance workflow efficiency while maintaining reproducibility and auditability in production.

Ready to unify your ML development and operations? Let's begin. The diagram below 
describes what we'll show you in this example.

<img src=".assets/Overview.png" width="80%" alt="Pipelines Overview">

1) We have done some of the experimenting for you already and created a simple finetuning pipeline for a text-to-text task.

2) We will run this pipeline on your machine and verify that everything works as expected.

3) Now we'll connect ZenML to your infrastructure and configure everything.

4) Finally, we are ready to run our code remotely.

Follow along with this notebook to understand how you can use ZenML to productionize your ML workflows!

## Run on Colab

You can use Google Colab to run this notebook; no local installation is
required!

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](
https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/quickstart/quickstart.ipynb)

# 👶 Step 0. Install Requirements

Let's install ZenML and all requirements to get started.

In [None]:
# Choose a cloud provider - at the end of this notebook you will run a pipeline on this cloud provider
CLOUD_PROVIDER = None  # Set this to "GCP", "AWS" or "AZURE" as needed


def in_google_colab() -> bool:
    """Checks whether this notebook is run in Google Colab."""
    try:
        import google.colab  # noqa

        return True

    except ModuleNotFoundError:
        return False


if in_google_colab():
    # Pull required modules from this example
    !git clone -b main https://github.com/zenml-io/zenml
    !cp -r zenml/examples/quickstart/* .
    !rm -rf zenml


# Install the requirements that match the chosen cloud provider.
# CLOUD_PROVIDER may still be None here, so normalize it before comparing;
# calling .lower() on None would raise an AttributeError.
provider = (CLOUD_PROVIDER or "").lower()

if provider == "gcp":
    !pip install -r requirements_gcp.txt

elif provider == "aws":
    !pip install -r requirements_aws.txt

elif provider == "azure":
    !pip install -r requirements_azure.txt

else:  # In this case the second half of the notebook won't work for you
    !pip install -r requirements.txt

In [None]:
# Restart Kernel to ensure all libraries are properly loaded
import IPython

IPython.Application.instance().kernel.do_shutdown(restart=True)


Please wait for the installation to complete before running subsequent cells. At
the end of the installation, the notebook kernel will restart.

## ☁️ Step 1: Connect to your ZenML Server
To run this quickstart you need to connect to a ZenML Server. You can deploy it [yourself on your own infrastructure](https://docs.zenml.io/getting-started/deploying-zenml) or try it out for free, with no credit card required, in our [ZenML Pro managed service](https://zenml.io/pro).

In [None]:
zenml_server_url = (
    None  # INSERT URL TO SERVER HERE in the form "https://URL_TO_SERVER"
)

assert zenml_server_url

!zenml login $zenml_server_url

In [None]:
# Disable wandb
import os

os.environ["WANDB_DISABLED"] = "true"

# Initialize ZenML and define the root for imports and docker builds
!zenml init

!zenml stack set default

## 🥇 Step 2: Build and run your first pipeline

In this quickstart we'll be working with a small dataset of sentences in old English paired with more modern formulations. The task is a text-to-text transformation.

When you're getting started with a machine learning problem you'll want to break down your code into distinct functions that load your data, bring it into the correct shape and finally produce a model. This is the experimentation phase where we try to massage our data into the right format and feed it into our model training.

<img src=".assets/Experiment.png" width="30%" alt="Experimentation phase">

In [None]:
import requests
from datasets import Dataset
from typing_extensions import Annotated

from zenml import step

PROMPT = ""  # In case you want to also use a prompt you can set it here


def read_data_from_url(url):
    """Reads data from url.

    Assumes the individual data points are linebreak separated
    and input, targets are separated by a `|` pipe.
    """
    inputs = []
    targets = []

    response = requests.get(url)
    response.raise_for_status()  # Raise an exception for bad responses

    for line in response.text.splitlines():
        old, modern = line.strip().split("|")
        inputs.append(f"{PROMPT}{old}")
        targets.append(modern)

    return {"input": inputs, "target": targets}


@step
def load_data(
    data_url: str,
) -> Annotated[Dataset, "full_dataset"]:
    """Load and prepare the dataset."""

    # Fetch and process the data
    data = read_data_from_url(data_url)

    # Convert to Dataset
    return Dataset.from_dict(data)

ZenML is built in a way that allows you to experiment with your data and build
your pipelines one step at a time. If you want to see how a step works, you can
call it directly like any other function. Here we take a look at one
input/target pair from the dataset.

In [None]:
data_source = "https://storage.googleapis.com/zenml-public-bucket/quickstart-files/translations.txt"

dataset = load_data(data_url=data_source)
print(f"Input: {dataset['input'][1]} - Target: {dataset['target'][1]}")

Everything looks as we'd expect and the input/output pair looks to be in the right format 🥳.
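
To make the expected file format concrete, here is a minimal offline sketch of the same parsing logic that works on an in-memory string instead of a URL. The helper name `parse_pairs` and the sample sentences are hypothetical and only for illustration; the sketch also differs from `read_data_from_url` in two small ways, which are assumptions about what you might want rather than how the quickstart behaves: it skips blank lines and splits on the first `|` only.

```python
from typing import Dict, List


def parse_pairs(text: str, prompt: str = "") -> Dict[str, List[str]]:
    """Parse newline-separated `input|target` records into two parallel lists."""
    inputs: List[str] = []
    targets: List[str] = []

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline

        # Split on the first pipe only, so a `|` inside the target survives.
        # A line without any pipe still raises a ValueError here.
        old, modern = line.split("|", 1)
        inputs.append(f"{prompt}{old}")
        targets.append(modern)

    return {"input": inputs, "target": targets}


# Hypothetical sample records, invented for illustration only.
sample = (
    "Thou art kind|You are kind\n"
    "\n"
    "Whither goest thou?|Where are you going?\n"
)

pairs = parse_pairs(sample, prompt="Translate: ")
print(pairs["input"][1], "->", pairs["target"][1])
```

The returned dictionary has the same shape as the one passed to `Dataset.from_dict` in the `load_data` step above.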

For the sake of this quickstart we have prepared a few steps in the `steps` directory. We'll now connect these together into a pipeline. To do this, simply plug multiple steps together through their inputs and outputs. Then add the `@pipeline` decorator to the function that connects the steps.

In [None]:
import materializers
from steps import (
    evaluate_model,
    load_data,
    split_dataset,
    test_model,
    tokenize_data,
    train_model,
)
from steps.model_trainer import T5_Model

from zenml import Model, pipeline
from zenml.client import Client

assert materializers

# Initialize the ZenML client to fetch objects from the ZenML Server
client = Client()

Client().activate_stack(
    "default"
)  # We will start by using the default stack which is local

model_name = "YeOldeEnglishTranslator"
model = Model(
    name="YeOldeEnglishTranslator",
    description="Model to translate from old to modern english",
    tags=["quickstart", "llm", "t5"],
)


@pipeline(model=model)
def english_translation_pipeline(
    data_url: str,
    model_type: T5_Model,
    per_device_train_batch_size: int,
    gradient_accumulation_steps: int,
    dataloader_num_workers: int,
    num_train_epochs: int = 5,
):
    """Define a pipeline that connects the steps."""
    full_dataset = load_data(data_url)
    tokenized_dataset, tokenizer = tokenize_data(
        dataset=full_dataset, model_type=model_type
    )
    tokenized_train_dataset, tokenized_eval_dataset, tokenized_test_dataset = (
        split_dataset(
            tokenized_dataset,
            train_size=0.7,
            test_size=0.1,
            eval_size=0.2,
            subset_size=0.1,  # We use a subset of the dataset to speed things up
            random_state=42,
        )
    )
    model = train_model(
        tokenized_dataset=tokenized_train_dataset,
        model_type=model_type,
        num_train_epochs=num_train_epochs,
        per_device_train_batch_size=per_device_train_batch_size,
        gradient_accumulation_steps=gradient_accumulation_steps,
        dataloader_num_workers=dataloader_num_workers,
    )
    evaluate_model(model=model, tokenized_dataset=tokenized_eval_dataset)
    test_model(
        model=model,
        tokenized_test_dataset=tokenized_test_dataset,
        tokenizer=tokenizer,
    )

We're ready to run the pipeline now, which we can do just as with the step - by calling the
pipeline function itself:

In [None]:
# Run the pipeline and configure some parameters at runtime
pipeline_run = english_translation_pipeline(
    data_url="https://storage.googleapis.com/zenml-public-bucket/quickstart-files/translations.txt",
    model_type="t5-small",
    num_train_epochs=1,  # to make this demo fast, we start at 1 epoch
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    dataloader_num_workers=4,
)
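
A quick note on the batch parameters used in this run. The sketch below assumes that `train_model` forwards `per_device_train_batch_size` and `gradient_accumulation_steps` to a Hugging Face style trainer, which the parameter names suggest but which is not shown in this notebook. Under that assumption, the number of examples that contribute to each optimizer update is the product of the two values and the number of devices. The helper `effective_batch_size` is hypothetical.

```python
def effective_batch_size(
    per_device_batch_size: int,
    accumulation_steps: int,
    num_devices: int = 1,
) -> int:
    """Number of examples that contribute to one optimizer update."""
    return per_device_batch_size * accumulation_steps * num_devices


# Values from the local run above, assuming a single device.
local_effective = effective_batch_size(
    per_device_batch_size=2, accumulation_steps=4
)
print(local_effective)  # 8
```

Gradient accumulation trades speed for memory: each forward pass only holds 2 examples in memory, while the optimizer still sees gradients averaged over 8.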

As you can see, the pipeline has run successfully. Here is a sneak peek of the dashboard view into this pipeline. The URL for this view can be found in the logs.

<img src=".assets/DAG.png" width="70%" alt="Dashboard view">

This isn't all that the ZenML Dashboard has to offer. If you navigate over to the ZenML Model Control Plane, you'll also find the produced model along with a lot of important metadata.

<img src=".assets/ExamplePrompt.png" width="70%" alt="Model Control Plane view">

Here you'll also see a collection of example Input-Output pairs. As you can see, the model is currently not performing its task well.

We can now access the trained model and its tokenizer from the ZenML Model Control Plane. This will allow us to interact with the model directly.

In [None]:
import torch

# load the model object
model = client.get_model_version(model_name).get_model_artifact("model").load()
tokenizer = (
    client.get_model_version(model_name).get_artifact("tokenizer").load()
)

test_text = "I do desire we may be better strangers"  # Insert your own test sentence here

input_ids = tokenizer(
    test_text,
    return_tensors="pt",
    max_length=128,
    truncation=True,
    padding="max_length",
).input_ids

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_length=128,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        do_sample=True,  # sampling must be enabled for top_k, top_p and temperature to take effect
        top_k=50,
        top_p=0.95,
        temperature=0.7,
    )

decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(decoded_output)

## Let's recap what we've done so far

We created a modular pipeline out of individual steps and showed that it runs locally.

As expected, the model does not yet solve its task. To train a model that can solve our task well, we would have to train a larger model for longer. For this, we'll need to move away from our local environment.

# ⌚ Step 3: Scale it up in the cloud

The last section confirmed that the pipeline works. Let's now run the pipeline in the environment of your choice.

To try this step, you will need access to a cloud environment (AWS, GCP, or Azure). ZenML wraps around all the major cloud providers and orchestration tools and lets you easily deploy your code onto them.

To do this, let's head over to the `Stack` section of your ZenML Dashboard. Here you'll be able to either connect to an existing environment or deploy a new one. Choose one of the options presented to you there and come back when you have a stack ready to go.

<img src=".assets/StackCreate.png" width="70%" alt="Stack creation ZenML Dashboard">

Then proceed to the section below. Also be sure that you are running with a remote ZenML server (see Step 1 above).

In [None]:
from zenml.environment import Environment

# Set the cloud provider here
CLOUD_PROVIDER = None  # Set this to "GCP", "AWS" or "AZURE"
assert CLOUD_PROVIDER

# Set the name of the stack that you created within zenml
stack_name = None  # Set this
assert stack_name  # Set your stack, follow instruction above

from zenml import pipeline
from zenml.client import Client
from zenml.config import DockerSettings

settings = {}

# Common imports and setup
if CLOUD_PROVIDER.lower() == "gcp":
    parent_image = (
        "zenmldocker/zenml-public-pipelines:quickstart-0.72.0-py3.11-gcp"
    )
    skip_build = True

elif CLOUD_PROVIDER.lower() == "aws":
    from zenml.integrations.aws.flavors.sagemaker_orchestrator_flavor import (
        SagemakerOrchestratorSettings,
    )

    parent_image = "339712793861.dkr.ecr.eu-central-1.amazonaws.com/zenml-public-pipelines:quickstart-0.72.0-py3.11-aws"
    skip_build = True  # if you switch this to False, you need to remove the parent image

    settings["orchestrator.sagemaker"] = SagemakerOrchestratorSettings(
        instance_type="ml.m5.4xlarge"
    )

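elif CLOUD_PROVIDER.lower() == "azure":
    parent_image = (
        "zenmldocker/zenml-public-pipelines:quickstart-0.72.0-py3.11-azure"
    )
    skip_build = True

else:
    # Fail early; otherwise parent_image would be undefined further down
    raise ValueError(
        f"Unsupported CLOUD_PROVIDER {CLOUD_PROVIDER!r}; use 'GCP', 'AWS' or 'AZURE'"
    )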

Client().activate_stack(stack_name)

data_source = "https://storage.googleapis.com/zenml-public-bucket/quickstart-files/translations.txt"

# We've prebuilt a docker image for this quickstart to speed things up, feel free to delete the DockerSettings to build from scratch
settings["docker"] = DockerSettings(
    parent_image=parent_image, skip_build=skip_build
)

If you are in Google Colab, you might need to rerun the cell above a second time after the runtime has restarted.
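
The provider-specific image selection in the cell above can also be expressed as a lookup table, which makes an unsupported or unset provider fail with a clear message. This is only a sketch of an alternative structure, not how the quickstart is implemented: the names `PARENT_IMAGES` and `parent_image_for` are hypothetical, while the image tags are copied from the cell above.

```python
from typing import Optional

# Prebuilt quickstart images, copied from the configuration cell above.
PARENT_IMAGES = {
    "gcp": "zenmldocker/zenml-public-pipelines:quickstart-0.72.0-py3.11-gcp",
    "aws": "339712793861.dkr.ecr.eu-central-1.amazonaws.com/zenml-public-pipelines:quickstart-0.72.0-py3.11-aws",
    "azure": "zenmldocker/zenml-public-pipelines:quickstart-0.72.0-py3.11-azure",
}


def parent_image_for(provider: Optional[str]) -> str:
    """Return the prebuilt image for a provider name, ignoring case."""
    key = (provider or "").strip().lower()
    try:
        return PARENT_IMAGES[key]
    except KeyError:
        supported = ", ".join(sorted(PARENT_IMAGES))
        raise ValueError(
            f"Unsupported cloud provider {provider!r}; expected one of: {supported}"
        ) from None


print(parent_image_for("AZURE"))
```

The returned string could then be passed as `parent_image` to `DockerSettings`, exactly as in the cell above.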

## 🚀 Ready to launch

We have now configured a ZenML stack that represents your very own cloud infrastructure. For the next pipeline run, we'll be training the same T5 model (`t5-small`) on your own infrastructure.

Note: The whole process may take a bit longer the first time around, as your pipeline code needs to be built into Docker containers to be run in the orchestration environment of your stack. Any subsequent run of the pipeline, even with different parameters set, will be faster thanks to Docker caching.

In [None]:
# In the case that we are within a colab environment we want to remove
# these folders
if Environment.in_google_colab():
    !rm -rf results
    !rm -rf sample_data

In [None]:
from pipelines import (
    english_translation_pipeline,
)

from zenml import Model

model_name = "YeOldeEnglishTranslator"
model = Model(
    name="YeOldeEnglishTranslator",
)

pipeline_run = english_translation_pipeline.with_options(
    settings=settings, model=model
)(
    data_url="https://storage.googleapis.com/zenml-public-bucket/quickstart-files/translations.txt",
    model_type="t5-small",
    num_train_epochs=2,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    dataloader_num_workers=0,  # Some cloud environments don't support multiple dataloader workers
)

You did it! You built a pipeline locally, verified that all its parts work well together, and are now running it in a production environment.

<img src=".assets/Production.png" width="20%" alt="Pipeline running on your infrastructure.">

Depending on the backend you chose, you can also go inspect your run in the orchestrator of your choice. Here is an example on GCP Vertex:

<img src=".assets/CloudDAGs.png" width="100%" alt="Pipeline running on Cloud orchestrator.">

## Adding Accelerators
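Each of the cloud providers allows users to add accelerators to their serverless offerings. Here's what you need to add to the pipeline settings in order to unlock GPUs. Keep in mind that you might have to increase your quotas with the cloud provider. Note that `with_options` returns a configured copy of the pipeline, so the calls below only prepare the settings; call the returned pipeline with its parameters, as in the launch cell above, to actually start a run.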

In [None]:
from zenml.config import ResourceSettings

if CLOUD_PROVIDER.lower() == "gcp":
    from zenml.integrations.gcp.flavors.vertex_orchestrator_flavor import (
        VertexOrchestratorSettings,
    )

    # find out about your options here: https://docs.zenml.io/stack-components/orchestrators/vertex#additional-configuration

    english_translation_pipeline.with_options(
        settings={
            "orchestrator.vertex": VertexOrchestratorSettings(
                node_selector_constraint=(
                    "cloud.google.com/gke-accelerator",
                    "NVIDIA_TESLA_P4",
                )
            ),
            "resources": ResourceSettings(memory="32GB", gpu_count=1),
        }
    )
if CLOUD_PROVIDER.lower() == "aws":
    from zenml.integrations.aws.flavors.sagemaker_orchestrator_flavor import (
        SagemakerOrchestratorSettings,
    )

    # find out your options here: https://docs.zenml.io/stack-components/orchestrators/sagemaker#configuration-at-pipeline-or-step-level

    english_translation_pipeline.with_options(
        settings={
            "orchestrator.sagemaker": SagemakerOrchestratorSettings(
                instance_type="ml.p2.xlarge"
            )
        }
    )
if CLOUD_PROVIDER.lower() == "azure":
    from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings

    # find out your options here: https://docs.zenml.io/stack-components/orchestrators/azureml#settings
    # The quickest way is probably to configure a compute-instance in azure ml. This instance should contain
    # a gpu. Then specify the name of the compute instance here.

    compute_name = None  # Insert the name of your compute instance here

    english_translation_pipeline.with_options(
        settings={
            "orchestrator.azureml": AzureMLOrchestratorSettings(
                mode="compute-instance", compute_name=compute_name
            )
        }
    )

## Now it's up to you

You can now start worrying about making the model actually work well on our toy example or any other dataset you like.

Here are some things that you could do:

* Iterate on the training data and its tokenization
* You can switch out the model itself. Instead of `model_type="t5-small"` you could use `model_type="t5-large"`, for example
* You can train for longer by increasing `num_train_epochs`. In order to speed this up you can also add accelerators to your orchestrators. Learn more about this in the Adding Accelerators section above.

No matter what avenue you choose to actually make the model work, we would love to see how you did it, so please reach out and share your solution with us either in our [**Slack Community**](https://zenml.io/slack) or by email at hello@zenml.io.

## Further exploration

This was just the tip of the iceberg of what ZenML can do; check out the [**docs**](https://docs.zenml.io/) to learn more
about the capabilities of ZenML. For example, you might want to:

- [Deploy ZenML](https://docs.zenml.io/user-guide/production-guide/connect-deployed-zenml) to collaborate with your colleagues.
- Run the same pipeline on a [cloud MLOps stack in production](https://docs.zenml.io/user-guide/production-guide/cloud-stack).
- Track your metrics in an experiment tracker like [MLflow](https://docs.zenml.io/stacks-and-components/component-guide/experiment-trackers/mlflow).

## What next?

* If you have questions or feedback... join our [**Slack Community**](https://zenml.io/slack) and become part of the ZenML family!
* If you want to quickly get started with ZenML, check out [ZenML Pro](https://zenml.io/pro).