# ZenML Quickstart: Bridging Local Development and Cloud Deployment

This repository demonstrates how ZenML streamlines the transition of machine learning workflows from local environments to cloud-scale operations.

Key advantages:

* Deploy to major cloud providers with minimal code changes
* Connect directly to your existing infrastructure
* Bridge the gap between ML and Ops teams
* Gain deep insights into pipeline metadata via the ZenML Dashboard

Unlike traditional MLOps tools, ZenML offers unparalleled flexibility and control. It integrates seamlessly with your infrastructure, allowing both ML and Ops teams to collaborate effectively without compromising on their specific requirements.

The notebook guides you through adapting local code for cloud deployment, showcasing ZenML's ability to enhance workflow efficiency while maintaining reproducibility and auditability in production.

Ready to unify your ML development and operations? Let's begin. The diagram below 
describes what we'll show you in this example.

<img src=".assets/Overview.png" width="80%" alt="Pipelines Overview">

1) We have done some of the experimenting for you already and created a simple finetuning pipeline for a text-to-text
   task.
2) We will run this pipeline on your machine and verify that everything works as expected.
3) Now we'll connect ZenML to your infrastructure and configure everything.
4) Finally, we are ready to run our code remotely.

Follow along with this notebook to understand how you can use ZenML to productionize your ML workflows!

## Run on Colab

You can run this notebook in Google Colab; no local installation is
required!

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](
https://colab.research.google.com/github/zenml-io/zenml/blob/main/examples/quickstart/quickstart.ipynb)

# 👶 Step 0: Install Requirements

Let's install ZenML and all requirements to get started.

In [None]:
!pip install uv
!pip install zenml

In [None]:
from zenml.environment import Environment

# In case we are in a google colab, clone all additional relevant files
if Environment.in_google_colab():
    # Pull required modules from this example
    !git clone -b main https://github.com/zenml-io/zenml
    !cp -r zenml/examples/quickstart/* .
    !rm -rf zenml

!pip install -r requirements.txt

In [None]:
# Restart Kernel to ensure all libraries are properly loaded
import IPython
IPython.Application.instance().kernel.do_shutdown(restart=True)


Please wait for the installation to complete before running subsequent cells. At
the end of the installation, the notebook kernel will restart.

## ☁️ Step 1: Connect to your ZenML Server
To run this quickstart you need to connect to a ZenML Server. You can deploy it [yourself on your own infrastructure](https://docs.zenml.io/getting-started/deploying-zenml) or try it out for free, with no credit card required, in our [ZenML Pro managed service](https://zenml.io/pro).

In [None]:
zenml_server_url = "INSERT_YOUR_SERVER_URL_HERE"  # in the form "https://URL_TO_SERVER"

!zenml connect --url $zenml_server_url

In [None]:
# Initialize ZenML and define the root for imports and docker builds
!zenml init
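
A common slip in this step is running `zenml connect` while the URL is still the placeholder, or without a scheme. The following is a minimal sketch of a sanity check you could run before connecting. The helper name `looks_like_server_url` is hypothetical and not part of ZenML, and accepting plain `http` in addition to `https` is an assumption made for self-hosted test servers.

```python
from urllib.parse import urlparse

# Placeholder value used in the connection cell of this notebook
PLACEHOLDER = "INSERT_YOUR_SERVER_URL_HERE"


def looks_like_server_url(url: str) -> bool:
    """Return True if `url` has the form "https://URL_TO_SERVER".

    Hypothetical helper for illustration; it only checks the shape of the
    string and does not contact the server.
    """
    if url == PLACEHOLDER:
        return False  # the placeholder was never replaced
    parsed = urlparse(url)
    # Assumption: plain http is also accepted (e.g. for a local test server)
    return parsed.scheme in {"http", "https"} and bool(parsed.netloc)


print(looks_like_server_url(PLACEHOLDER))
print(looks_like_server_url("https://zenml.example.com"))
```

If the check returns `False`, fix `zenml_server_url` before running the connection cell.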

## 🥇 Step 2: Build and run your first pipeline

In this quickstart we'll be working with a small dataset of sentences in old English paired with more modern formulations. The task is a text-to-text transformation.

When you're getting started with a machine learning problem you'll want to break down your code into distinct functions that load your data, bring it into the correct shape and finally produce a model. This is the experimentation phase where we try to massage our data into the right format and feed it into our model training.

<img src=".assets/Experiment.png" width="30%" alt="Experimentation phase">

In [None]:
import requests
from datasets import Dataset
from typing import Tuple
from typing_extensions import Annotated

from zenml import step
from zenml.integrations.huggingface.materializers import HFDatasetMaterializer

PROMPT = ""  # In case you want to also use a prompt you can set it here

DEFAULT_TRAIN_DATA = "https://storage.googleapis.com/zenml-public-bucket/quickstart-files/translations.txt"
DEFAULT_TEST_DATA = "https://storage.googleapis.com/zenml-public-bucket/quickstart-files/test-translations.txt"


def read_data_from_url(url):
    inputs = []
    targets = []

    response = requests.get(url)
    response.raise_for_status()  # Raise an exception for bad responses

    for line in response.text.splitlines():
        old, modern = line.strip().split("|")
        inputs.append(f"{PROMPT}{old}")
        targets.append(modern)

    return {"input": inputs, "target": targets}


@step(output_materializers=HFDatasetMaterializer)
def load_data(
    train_url: str = DEFAULT_TRAIN_DATA,
    test_url: str = DEFAULT_TEST_DATA,
) -> Tuple[
    Annotated[Dataset, "dataset"],
    Annotated[Dataset, "test_dataset"],
]:
    """Load and prepare the dataset."""
    
    # Fetch and process the data
    data = read_data_from_url(train_url)
    test_data = read_data_from_url(test_url)

    return Dataset.from_dict(data), Dataset.from_dict(test_data)

ZenML is built in a way that allows you to experiment with your data and build
your pipelines one step at a time. If you want to see how this step works, you
can call it directly like any other function. Here we take a look at the first
row of your training dataset.

In [None]:
train_dataset, test_dataset = load_data()
print(f"Input: {train_dataset['input'][0]} - Target: {train_dataset['target'][0]}")

Everything looks as we'd expect and the input/output pair looks to be in the right format 🥳.
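
To make the expected file format concrete, here is a minimal, standard-library-only sketch of the parsing that `read_data_from_url` performs on each `old|modern` line. It differs from the function above in two ways that are assumptions of this sketch: it skips blank lines and lines without a `|` separator instead of raising, and it splits only at the first `|`. The name `parse_translation_lines` and the sample sentences are made up for illustration and are not taken from the dataset.

```python
from typing import Dict, List


def parse_translation_lines(text: str, prompt: str = "") -> Dict[str, List[str]]:
    """Split "old|modern" lines into parallel input/target lists."""
    inputs: List[str] = []
    targets: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        old, separator, modern = line.partition("|")
        if not separator:
            continue  # skip lines that have no "|" separator
        inputs.append(f"{prompt}{old}")
        targets.append(modern)
    return {"input": inputs, "target": targets}


# Made-up sample in the same "old|modern" format (not from the real dataset)
sample = "Thou art kind|You are kind\n\nWhither goest thou?|Where are you going?\n"
parsed = parse_translation_lines(sample)
print(parsed["input"][0], "->", parsed["target"][0])
```

The returned dictionary has the same `input` and `target` keys that `Dataset.from_dict` receives in the step above.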

For the sake of this quickstart we have prepared a few steps in the `steps` directory. We'll now connect these into a pipeline. To do this, plug the steps together through their inputs and outputs, then add the `@pipeline` decorator to the function that connects them.

In [None]:
from zenml import Model, pipeline
from zenml.client import Client

from steps import load_data, tokenize_data, train_model, evaluate_model, model_tester
from steps.model_trainer import T5_Model

# Initialize the ZenML client to fetch objects from the ZenML Server
client = Client()

# Start with the default stack, which runs locally
client.activate_stack("default")

model_name = "YeOldeEnglishTranslator"
model = Model(
    name=model_name,
    description="Model to translate from old to modern English",
    tags=["quickstart", "llm", "t5"],
)

@pipeline(enable_cache=True, model=model)
def english_translation_pipeline(
    model_type: T5_Model,
    per_device_train_batch_size: int,
    gradient_accumulation_steps: int,
    dataloader_num_workers: int,
    num_train_epochs: int = 5,
):
    """Define a pipeline that connects the steps."""
    dataset, test_dataset = load_data()
    tokenized_dataset, tokenizer = tokenize_data(dataset, model_type)
    model = train_model(
        tokenized_dataset,
        model_type,
        num_train_epochs,
        per_device_train_batch_size,
        gradient_accumulation_steps,
        dataloader_num_workers,
    )
    evaluate_model(model, tokenized_dataset)
    model_tester(model, tokenizer, test_dataset)

We're ready to run the pipeline now, which we can do just as with the step - by calling the
pipeline function itself:

In [None]:
# Run the pipeline and configure some parameters at runtime
pipeline_run = english_translation_pipeline(
    model_type="t5-small",
    num_train_epochs=5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    dataloader_num_workers=4
)

As you can see, the pipeline ran successfully. It also printed out some examples, and from these it seems the model is not yet able to solve the task well. Still, we have validated that the pipeline works.
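
One detail of the run parameters is worth spelling out: the batch size and gradient accumulation settings combine. The sketch below assumes that `train_model` forwards `per_device_train_batch_size` and `gradient_accumulation_steps` to a Hugging Face `Trainer`, which their names suggest but the notebook does not show. Under that assumption, the number of examples contributing to each optimizer update is the product of the two values and the number of devices. The function name `effective_batch_size` is hypothetical.

```python
def effective_batch_size(
    per_device_train_batch_size: int,
    gradient_accumulation_steps: int,
    num_devices: int = 1,
) -> int:
    """Number of examples that contribute to one optimizer update."""
    return per_device_train_batch_size * gradient_accumulation_steps * num_devices


# Values from the local run above, assuming a single device
local_effective = effective_batch_size(
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
)
print(local_effective)  # 8
```

Raising either value increases the effective batch size, but only `per_device_train_batch_size` increases the memory needed per forward pass.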

<img src=".assets/DAG.png" width="70%" alt="Dashboard view">

Above you can see what the pipeline looks like in the ZenML Dashboard. You can find the URL for this view in the logs above.

We can now access the trained model and its tokenizer from the ZenML Model Control Plane.

In [None]:
# load the model object
model = client.get_model_version(model_name).get_model_artifact('model').load()
tokenizer = client.get_model_version(model_name).get_artifact('tokenizer').load()

With this in hand we can now play around with the model directly and try out some examples ourselves:

In [None]:
import torch

test_text = "I do desire we may be better strangers"

input_ids = tokenizer(
    test_text,
    return_tensors="pt",
    max_length=128,
    truncation=True,
    padding="max_length",
).input_ids

with torch.no_grad():
    outputs = model.generate(
        input_ids,
        max_length=128,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        top_k=50,
        top_p=0.95,
        temperature=0.7,
    )

decoded_output = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(decoded_output)

## Let's recap what we've done so far

We created a modular pipeline out of individual steps and showed that it runs locally.

As expected, the model does not yet solve its task. To train a model that solves the task well, we would have to train a larger model for longer. For this, we'll need to move away from our local environment.

# ⌚ Step 3: Scale it up in the cloud

The last section confirmed that the pipeline works. Let's now run the pipeline in the environment of your choice.

To try this step, you will need access to a cloud environment (AWS, GCP, or Azure). ZenML wraps around all the major cloud providers and orchestration tools and lets you easily deploy your code onto them.

To do this, let's head over to the `Stack` section of your ZenML Dashboard. Here you'll be able to either connect to an existing environment or deploy a new one. Choose one of the options presented to you there and come back when you have a stack ready to go.

<img src=".assets/StackCreate.png" width="70%" alt="Stack creation ZenML Dashboard">

Then proceed to the section below. Also be sure that you are running with a remote ZenML server (see Step 1 above).

In [None]:
# Set the cloud provider here
CLOUD_PROVIDER = "GCP"  # Change this to "AWS" or "Azure" as needed

from zenml.client import Client
from zenml.config import DockerSettings, ResourceSettings

# Common imports and setup
if CLOUD_PROVIDER == "GCP":
    !zenml integration install gcp -y
elif CLOUD_PROVIDER == "AWS":
    !pip install sagemaker s3 s3fs
elif CLOUD_PROVIDER == "Azure":
    !zenml integration install azure -y

# Common pipeline configuration
configured_english_translation_pipeline = english_translation_pipeline.with_options(
    settings={
        "docker": DockerSettings(
            parent_image="zenmldocker/zenml-public-pipelines:quickstart-0.64.0-py3.11",
            requirements=[""],  # Add any additional docker requirements here
        ),
        "resources": ResourceSettings(memory="32GB"),
    }
)

## 🚀 Ready to launch

We have now configured a ZenML stack that represents your very own cloud infrastructure.

<img src=".assets/SwitchStack.png" width="20%" alt="Stack switching with ZenML">

For the next pipeline run, we'll be training the same T5 model (`t5-small`) on your own infrastructure.

Note: The first run may take a bit longer, because your pipeline code needs to be built into Docker containers that run in the orchestration environment of your stack. Subsequent runs of the pipeline, even with different parameters, will be faster thanks to Docker caching.

In [None]:
# Set the name of your stack here
stack_name = "INSERT_STACK_NAME_HERE"

Client().activate_stack(stack_name)

pipeline_run = configured_english_translation_pipeline(
    model_type="t5-small", 
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    dataloader_num_workers=0
)

You did it! You built a pipeline locally, verified that all its parts work well together, and are now running it in a production environment.

<img src=".assets/Production.png" width="20%" alt="Pipeline running on your infrastructure.">

## Adding Accelerators

Each of the cloud providers lets you add accelerators to its serverless offering. Here's what you need to add to the pipeline settings to unlock GPUs. Keep in mind that you might have to increase your quotas with the cloud provider.

### GCP

For GCP Vertex you can use the `VertexOrchestratorSettings` to specify configuration options; click [here](https://docs.zenml.io/stack-components/orchestrators/vertex#additional-configuration) to learn more. Additionally, you can use the `ResourceSettings` for more general infrastructure configuration.

```python
from zenml.config import ResourceSettings
from zenml.integrations.gcp.flavors.vertex_orchestrator_flavor import VertexOrchestratorSettings

english_translation_pipeline.with_options(
    settings={
        "orchestrator.vertex": VertexOrchestratorSettings(
            node_selector_constraint=("cloud.google.com/gke-accelerator", "NVIDIA_TESLA_P4")
        ),
        "resources": ResourceSettings(memory="32GB", gpu=1),
    }
)
```

### AWS

For AWS Sagemaker you can use the `SagemakerOrchestratorSettings` to specify configuration options; click [here](https://docs.zenml.io/stack-components/orchestrators/sagemaker#configuration-at-pipeline-or-step-level) to learn more.

```python
from zenml.integrations.aws.flavors.sagemaker_orchestrator_flavor import SagemakerOrchestratorSettings

english_translation_pipeline.with_options(
    settings={
        "orchestrator.sagemaker": SagemakerOrchestratorSettings(instance_type="ml.p2.xlarge")
    }
)
```

### Azure

For Azure Skypilot you can use the `SkypilotAzureOrchestratorSettings` to specify configuration options; click [here](https://docs.zenml.io/stack-components/orchestrators/skypilot-vm#additional-configuration) to learn more.

```python
from zenml.integrations.skypilot_azure.flavors.skypilot_orchestrator_azure_vm_flavor import SkypilotAzureOrchestratorSettings

english_translation_pipeline.with_options(
    settings={
        "orchestrator.vm_azure": SkypilotAzureOrchestratorSettings(instance_type="Standard_NC6")
    }
)
```
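
The three snippets above differ only in the settings key and in the option that selects the accelerator. As a rough summary, the sketch below collects those values in a plain lookup table keyed by the same `CLOUD_PROVIDER` strings used earlier in this notebook. It deliberately uses plain dictionaries rather than the ZenML settings classes so that it runs without any integration installed. The names `ACCELERATOR_OPTIONS` and `accelerator_options` are hypothetical.

```python
from typing import Any, Dict

# Settings key and accelerator option per provider, copied from the snippets above.
# Plain dicts stand in for the ZenML settings classes (an assumption of this sketch).
ACCELERATOR_OPTIONS: Dict[str, Dict[str, Any]] = {
    "GCP": {
        "settings_key": "orchestrator.vertex",
        "options": {
            "node_selector_constraint": (
                "cloud.google.com/gke-accelerator",
                "NVIDIA_TESLA_P4",
            )
        },
    },
    "AWS": {
        "settings_key": "orchestrator.sagemaker",
        "options": {"instance_type": "ml.p2.xlarge"},
    },
    "Azure": {
        "settings_key": "orchestrator.vm_azure",
        "options": {"instance_type": "Standard_NC6"},
    },
}


def accelerator_options(cloud_provider: str) -> Dict[str, Any]:
    """Look up the settings key and accelerator option for a provider."""
    try:
        return ACCELERATOR_OPTIONS[cloud_provider]
    except KeyError:
        known = ", ".join(sorted(ACCELERATOR_OPTIONS))
        raise ValueError(
            f"Unknown provider {cloud_provider!r}; expected one of: {known}"
        ) from None


entry = accelerator_options("AWS")
print(entry["settings_key"], entry["options"])
```

In a real run you would pass the matching settings class under the looked-up key to `with_options(settings=...)`, as the provider-specific snippets show.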

## Now it's up to you

You can now start worrying about making the model actually work well on our toy example or any other dataset you like.

Here are some things that you could do:

* Iterate on the training data and its tokenization
* Switch out the model itself. Instead of `model_type="t5-small"` you could use `model_type="t5-large"`, for example
* Train for longer by increasing `num_train_epochs`. To speed this up you can also add accelerators to your orchestrators, as described in the Adding Accelerators section above.

No matter which avenue you choose to make the model work, we would love to see how you did it, so please reach out and share your solution with us either in our [**Slack Community**](https://zenml.io/slack) or by email at hello@zenml.io.

## Further exploration

This was just the tip of the iceberg of what ZenML can do; check out the [**docs**](https://docs.zenml.io/) to learn more
about the capabilities of ZenML. For example, you might want to:

- [Deploy ZenML](https://docs.zenml.io/user-guide/production-guide/connect-deployed-zenml) to collaborate with your colleagues.
- Run the same pipeline on a [cloud MLOps stack in production](https://docs.zenml.io/user-guide/production-guide/cloud-stack).
- Track your metrics in an experiment tracker like [MLflow](https://docs.zenml.io/stacks-and-components/component-guide/experiment-trackers/mlflow).

## What next?

* If you have questions or feedback... join our [**Slack Community**](https://zenml.io/slack) and become part of the ZenML family!
* If you want to quickly get started with ZenML, check out [ZenML Pro](https://zenml.io/pro).