# ZenML: Open-source MLOps Framework for reproducible ML pipelines

![ZenML logo](_assets/Logo/zenml.svg)

![Sam](_assets/sam.png)

In [None]:
from absl import logging as absl_logging
import warnings

warnings.filterwarnings("ignore")
%load_ext autoreload
%autoreload 2
absl_logging.set_verbosity(-10000)

Let's begin by initializing ZenML in our directory. For simplicity, we start with a local stack and transition to other stacks later. The following block does this from within the notebook.

# Initialize ZenML

In [None]:
!rm -rf .zen
!zenml init
!zenml stack set default

We will start by looking at the definition of the pipeline that we want to build. This gives an overview of what we want to achieve and how we plan to get there. After that, we'll dive into the details of some of the more interesting steps.

# Basics: Steps, Pipelines, Post-Execution, Lineage, Caching

Get familiar with the basics

## Create your first pipeline

In [None]:
from zenml.steps import step
from zenml.pipelines import pipeline


@step
def a() -> int:
    return 2


@step
def b() -> int:
    return 3


@step
def add(a: int, b: int) -> int:
    print(f"Adding {a} + {b}")
    return a + b


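@pipeline
def my_pipeline(first, second, add):
    # Wire the steps together: the outputs of the first two steps feed `add`.
    f = first()
    s = second()
    add(f, s)


# Instantiate the pipeline with concrete step instances and run it.
my_pipeline(
    first=a(),
    second=b(),
    add=add(),
).run()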

In [None]:
a.entrypoint()  # run the function directly
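
To illustrate why a decorated step can still be called as a plain function, here is a minimal sketch of a decorator that wraps a function while keeping the original callable reachable. `ToyStep` and `toy_step` are hypothetical names invented for this illustration; this is not ZenML's implementation, only one way such an `entrypoint` attribute could work.

```python
from typing import Any, Callable


class ToyStep:
    """Hypothetical stand-in for a decorated step (not ZenML's class)."""

    def __init__(self, func: Callable[..., Any]) -> None:
        # Keep a reference to the undecorated function.
        self.entrypoint = func
        self.name = func.__name__


def toy_step(func: Callable[..., Any]) -> ToyStep:
    """Wrap a function while keeping the original callable reachable."""
    return ToyStep(func)


@toy_step
def two() -> int:
    return 2


# Calling the stored function bypasses any pipeline machinery.
result = two.entrypoint()
print(result)
```

Calling the function this way is handy for quick checks, since nothing is tracked or stored.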

## Post-execution workflow

After a pipeline has run, its results can be inspected by walking down a hierarchy: pipelines -> runs -> steps -> outputs.

In [None]:
from zenml.repository import Repository

repo = Repository()

In [None]:
p = repo.get_pipeline("my_pipeline")
p

In [None]:
run = p.runs[-1]  # the latest run
run

In [None]:
steps = run.steps
s = steps[-1]
s

In [None]:
s.output.read()
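
The cells above walk the hierarchy one level at a time. As a rough sketch of the same traversal in one place, the example below models the hierarchy with plain dataclasses. `ToyPipelineView`, `ToyRunView`, `ToyStepView`, and `final_output` are hypothetical stand-ins invented for illustration, not ZenML classes, and the sketch assumes runs and steps are ordered oldest to newest.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ToyStepView:
    name: str
    output: Any  # the value the step returned


@dataclass
class ToyRunView:
    name: str
    steps: List[ToyStepView] = field(default_factory=list)


@dataclass
class ToyPipelineView:
    name: str
    runs: List[ToyRunView] = field(default_factory=list)


def final_output(pipeline: ToyPipelineView) -> Any:
    """Return the output of the last step of the most recent run."""
    latest_run = pipeline.runs[-1]    # assumption: runs are ordered oldest to newest
    last_step = latest_run.steps[-1]  # assumption: steps are in execution order
    return last_step.output


toy = ToyPipelineView(
    name="my_pipeline",
    runs=[
        ToyRunView(
            name="run_1",
            steps=[
                ToyStepView("first", 2),
                ToyStepView("second", 3),
                ToyStepView("add", 5),
            ],
        )
    ],
)

print(final_output(toy))
```

The real objects expose more than this toy model does; the sketch only mirrors the pipelines -> runs -> steps -> outputs navigation.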

## See lineage

In [None]:
!zenml integration install dash -f

In [None]:
from zenml.integrations.dash.visualizers.pipeline_run_lineage_visualizer import (
    PipelineRunLineageVisualizer,
)

PipelineRunLineageVisualizer().visualize(run)

## Caching in action

In [None]:
@step
def c() -> int:
    return 11

In [None]:
my_pipeline(
    first=a(),
    second=c(),
    add=add(),
).run()
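
To build intuition for what caching does when only one step is swapped, here is a conceptual sketch in plain Python. It assumes a step result can be reused when the step's code and its inputs are unchanged; `cache_key`, `run_step`, `run_pipeline`, and the `toy_*` functions are hypothetical names for this illustration, and ZenML's actual cache logic may take more factors into account.

```python
import hashlib
from typing import Any, Callable, Dict, List, Tuple

_cache: Dict[str, Any] = {}  # maps cache key -> stored step output
executed: List[str] = []     # names of steps that actually ran


def cache_key(func: Callable[..., Any], args: Tuple[Any, ...]) -> str:
    """Derive a key from the function's name, compiled code, and inputs."""
    code = func.__code__
    payload = repr((func.__name__, code.co_code, code.co_consts, args))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(func: Callable[..., Any], *args: Any) -> Any:
    """Run a step only if no result is stored for the same code and inputs."""
    key = cache_key(func, args)
    if key not in _cache:
        executed.append(func.__name__)
        _cache[key] = func(*args)
    return _cache[key]


def toy_a() -> int:
    return 2


def toy_b() -> int:
    return 3


def toy_c() -> int:
    return 11


def toy_add(x: int, y: int) -> int:
    return x + y


def run_pipeline(first: Callable[[], int], second: Callable[[], int]) -> int:
    return run_step(toy_add, run_step(first), run_step(second))


first_result = run_pipeline(toy_a, toy_b)   # everything runs
executed.clear()
second_result = run_pipeline(toy_a, toy_c)  # toy_a is served from the cache
print(second_result)
print(executed)
```

In this sketch, the second run re-executes only the swapped step and the step that consumes its output, because the key for `toy_a` is unchanged while the inputs to `toy_add` differ.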

In [None]:
from zenml.integrations.dash.visualizers.pipeline_run_lineage_visualizer import (
    PipelineRunLineageVisualizer,
)

latest_run = repo.get_pipeline("my_pipeline").runs[-1]
PipelineRunLineageVisualizer().visualize(latest_run)