# ZenML: Open-source MLOps Framework for reproducible ML pipelines

![Test](_assets/Logo/zenml.svg)

<div class="alert alert-block alert-danger">
    <b>Note:</b> This lesson is still in progress, and some commands may not work as described. Please expect an update by 20th April 2022.
</div>

In [54]:
from absl import logging as absl_logging
import warnings
warnings.filterwarnings('ignore')
%load_ext autoreload
%autoreload 2
absl_logging.set_verbosity(-10000)

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


Let's begin by initializing ZenML in our directory. For simplicity, we start with a local stack and then transition to other stacks. The following cell initializes the repository.

# Initialize ZenML

In [2]:
!rm -rf .zen
!zenml init

ZenML repository initialized at /home/htahir1/workspace/zenbytes.

# Install integrations

ZenML manages integrations natively to avoid dependency conflicts, so use the following command to install the integrations required for this lesson.

![All](_assets/integrations_all.png "All")

In [24]:
!zenml integration install kubeflow seldon aws -f

Unable to find integration 's3'.
Installing integrations...
Collecting s3fs==2022.3.0
  Downloading s3fs-2022.3.0-py3-none-any.whl (26 kB)
Collecting boto3==1.21.21
  Downloading boto3-1.21.21-py3-none-any.whl (132 kB)
Collecting fsspec==2022.3.0
  Downloading fsspec-2022.3.0-py3-none-any.whl (136 kB)
Collecting aiobotocore~=2.2.0
  Downloading aiobotocore-2.2.0.tar.gz (59 kB)

Collecting aioitertools>=0.5.1
  Downloading aioitertools-0.9.0-py3-none-any.whl (22 kB)
  Downloading aioitertools-0.8.0-py3-none-any.whl (21 kB)
Building wheels for collected packages: aiobotocore
  Building wheel for aiobotocore (setup.py) ... done
  Created wheel for aiobotocore: filename=aiobotocore-2.2.0-py3-none-any.whl size=57109 sha256=e184e4ec4948eaa376adcc255f8ef8110876fdf2a31884cbcecaea1f2697ff78
  Stored in directory: /home/htahir1/.cache/pip/wheels/db/12/8e/d44b7a03257689abfdaa25a0df327834efdcab15f7cc2bbba9
Successfully built aiobotocore
Installing collected packages: multidict, jmespath, fsspec, frozenlist, async-timeout, aioitertools, yarl, botocore, aiosignal, s3transfer, aiohttp, boto3, aiobotocore, s3fs
Successful

# The Concept of MLOps Stacks

A ZenML stack is the combination of a Metadata Store, an Artifact Store, and an Orchestrator that is used for all pipeline runs. When you get started with ZenML, you begin with a default local stack.

In [11]:
!zenml stack list

Running with active profile: 'default' (local)
┏━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━┓
┃ ACTIVE │ STACK NAME │ ARTIFACT_STORE │ METADATA_STORE │ ORCHESTRATOR ┃
┠────────┼────────────┼────────────────┼────────────────┼──────────────┨
┃   👉   │ default    │ default        │ default        │ default      ┃
┗━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━┛
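
To make the idea of a stack as a bundle of named components concrete, here is a minimal plain-Python sketch. `StackSketch` is a hypothetical stand-in that we made up for illustration; it is not ZenML's own class and does not reflect how ZenML stores stacks internally.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass(frozen=True)
class StackSketch:
    """Illustrative stand-in for a stack; NOT ZenML's real class."""

    name: str
    metadata_store: str
    artifact_store: str
    orchestrator: str
    # Optional here because the default local stack listed above has none.
    container_registry: Optional[str] = None

    def components(self) -> Dict[str, str]:
        """Return the configured components, skipping unset optional ones."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if f.name != "name" and getattr(self, f.name) is not None
        }


# Mirrors the single row of the `zenml stack list` table above.
local = StackSketch("default", "default", "default", "default")
print(local.components())
```

Switching stacks then amounts to pointing the same pipeline code at a different bundle of components.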


## The Local Stack

You can picture the local stack as follows. The diagram shows how a generic pipeline interacts with it.

![LocalStack](_assets/localstack.png "LocalStack")

## The Kubeflow Pipelines stack

We will now use the Kubeflow integration to extend the concept of stacks.

We want to transition to a Kubeflow stack that looks roughly like this. Note that Kubeflow Pipelines also needs a container registry where the Docker images for each step are stored.

![KubeflowStack](_assets/aws_stack_redesigned.png "KubeflowStack")

But we have good news! You barely have to do anything to transition.

# Transitioning to Production with Kubeflow on AWS

There are two steps to follow in order to continue.

- Set up the necessary cloud resources on the provider of your choice
- Configure ZenML with a new stack to be able to communicate with these resources

## Set up using the cloud guide

In order to continue, it is best to follow the updated cloud guide for ZenML found [here](https://docs.zenml.io/features/guide-aws-gcp-azure). Please return after finishing the `pre-requisites` section.

We recommend AWS as your cloud provider for following along with this lesson. If you choose GCP or Azure instead, it should not be hard to adapt the commands below accordingly.

## Create your AWS Kubeflow Stack

Now we can configure a new stack that points to your newly created cloud resources.

As mentioned in the main README, Kubernetes and Docker are prerequisites for this part of the guide. Please make sure you have them installed. You also need to install the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) to move forward.

In [30]:
# Replace the following with your own configuration; the values below are examples.

AWS_EKS_CLUSTER="zenhacks-cluster"
AWS_REGION="us-east-1"
ECR_REGISTRY_NAME="715803424590.dkr.ecr.us-east-1.amazonaws.com"
S3_BUCKET_NAME="s3://zenbytes-bucket"
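
Before registering anything, it can help to sanity-check these four values. The sketch below is our own helper, not part of ZenML or the AWS CLI: it checks that the registry follows the `<account>.dkr.ecr.<region>.amazonaws.com` pattern, that its region matches `AWS_REGION`, and that the bucket path starts with `s3://`. It then derives the EKS cluster ARN, assuming the cluster lives in the same AWS account as the registry.

```python
import re

# Standard-partition ECR registry host: <12-digit account>.dkr.ecr.<region>.amazonaws.com
ECR_PATTERN = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)


def kube_context_from_config(cluster, region, registry, bucket):
    """Validate the config values and return the expected EKS cluster ARN.

    Assumption: the EKS cluster and the ECR registry share one AWS account.
    """
    match = ECR_PATTERN.match(registry)
    if match is None:
        raise ValueError(f"Not an ECR registry host: {registry!r}")
    if match["region"] != region:
        raise ValueError(
            f"Registry region {match['region']!r} differs from {region!r}"
        )
    if not bucket.startswith("s3://"):
        raise ValueError(f"Bucket path must start with 's3://': {bucket!r}")
    return f"arn:aws:eks:{region}:{match['account']}:cluster/{cluster}"


# Example values copied from the configuration cell above.
context = kube_context_from_config(
    "zenhacks-cluster",
    "us-east-1",
    "715803424590.dkr.ecr.us-east-1.amazonaws.com",
    "s3://zenbytes-bucket",
)
print(context)
```

With the example values, the result has the same form as the `--kubernetes-context` argument used later in this lesson.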

In [44]:
# Register the container registry
# First, point Docker to the registry
!aws ecr get-login-password --region {AWS_REGION} | docker login --username AWS --password-stdin {ECR_REGISTRY_NAME}
!zenml container-registry register ecr_registry --type=default --uri={ECR_REGISTRY_NAME}

# Register the orchestrator (Kubeflow on AWS)
# If your kubectl does not already point to the EKS cluster, run this first
!aws eks --region {AWS_REGION} update-kubeconfig --name {AWS_EKS_CLUSTER}
!zenml orchestrator register eks_orchestrator --type=kubeflow

# Register metadata store and artifact store
!zenml metadata-store register kubeflow_metadata_store --type=kubeflow
!zenml artifact-store register s3_store --type=s3 --path={S3_BUCKET_NAME}


# Register a secret manager
!zenml secrets-manager register local_secret_manager --type=local

# Register the aws_kubeflow_stack
!zenml stack register aws_kubeflow_stack -m kubeflow_metadata_store -a s3_store -o eks_orchestrator -c ecr_registry -x local_secret_manager

Login Succeeded
Running with active profile: 'default' (local)
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│                                                                              │
│ /home/htahir1/.virtualenvs/zenbytes_2/bin/zenml:8 in <module>                │
│                                                                              │
│   5 from zenml.cli.cli import cli                                            │
│   6 if __name__ == '__main__':                                               │
│   7 │   sys.argv[0] =

Updated context arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster in /home/htahir1/.kube/config
Running with active profile: 'default' (local)
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│                                                                              │
│ /home/htahir1/.virtualenvs/zenbytes_2/bin/zenml:8 in <module>                │
│                                                                              │
│   5 from zenml.cli.cli import cli                                            │
│   6 if __name__ == '__main__':

Running with active profile: 'default' (local)
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│                                                                              │
│ /home/htahir1/.virtualenvs/zenbytes_2/bin/zenml:8 in <module>                │
│                                                                              │
│   5 from zenml.cli.cli import cli                                            │
│   6 if __name__ == '__main__':                                               │
│   7 │   sys.argv[0] = re.sub(r

Running with active profile: 'default' (local)
╭───────────────────── Traceback (most recent call last) ──────────────────────╮
│                                                                              │
│ /home/htahir1/.virtualenvs/zenbytes_2/bin/zenml:8 in <module>                │
│                                                                              │
│   5 from zenml.cli.cli import cli                                            │
│   6 if __name__ == '__main__':                                               │
│   7 │   sys.argv[0] = re.sub(r

Running with active profile: 'default' (local)
Registered stack component with type 'secrets_manager' and name 'local_secret_manager'.
Successfully registered secrets manager `local_secret_manager`.
Running with active profile: 'default' (local)
Registering stack 'aws_kubeflow_stack'...
Registered stack with name 'aws_kubeflow_stack'.
Stack 'aws_kubeflow_stack' successfully registered!

## Transition to Production (Run on the Cloud)

Once the stack is configured, all that is left to do is to set it active and run a pipeline. Note that the code itself DOES NOT need to change, only the active stack.

ZenML will detect that the stack has changed. Instead of running your pipeline locally, it will build a Docker image with your requirements, push it to the container registry, and deploy the pipeline with that image on Kubeflow Pipelines. This process is usually painful to set up by hand; ZenML simplifies it while keeping it fully customizable.

For now, try it out! It might take a few minutes to build and push the image, but after that you will see your pipeline in the cloud!

<div class="alert alert-block alert-info">
    <b>Note:</b> Running pipelines defined within a Jupyter notebook cell is currently
    not supported. To get around this, you can run the train pipeline within this repo.
</div>

In [None]:
!zenml stack set aws_kubeflow_stack

# Let's train within Kubeflow Pipelines - this will deploy the pipeline in a one-off manner
!python run.py --deploy --kubernetes-context=arn:aws:eks:us-east-1:715803424590:cluster/zenhacks-cluster --namespace=kubeflow --base-url=http://abb84c444c7804aa98fc8c097896479d-377673393.us-east-1.elb.amazonaws.com

Running with active profile: 'default' (local)
Active stack set to: 'aws_kubeflow_stack'
Creating run for pipeline: `continuous_deployment_pipeline`
Cache disabled for pipeline `continuous_deployment_pipeline`
Registered stack component with type 'artifact_store' and name 's3_store'.
Registered stack component with type 'container_registry' and name 'ecr_registry'.
Registered stack component with type 'metadata_store' and name 'kubeflow_metadata_store'.
Registered stack component with type 'orchestrator' and name 'eks_orchestrator'.
Registered stack component with type 'secrets_manager' and name 'local_secret_manager'.
Registered stack with name 'aws_kubeflow_stack'.

To see the pipeline run, port-forward the Kubeflow Pipelines UI to [http://localhost:8080/](http://localhost:8080/). You might want to do this in a separate shell:

```
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
```

In [None]:
# Do this only if the port forward from `zenml stack up` did not work. 
!kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
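
To check whether the port forward is actually up before opening the browser, a small TCP probe is enough. This is a generic sketch using only the Python standard library; `is_port_open` is our own helper name, and the probe only tells you that something is listening on the port, not that it is the Kubeflow Pipelines UI.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Port 8080 matches the local side of the port-forward command above.
if is_port_open("localhost", 8080):
    print("Something is listening on http://localhost:8080/")
else:
    print("Nothing is listening on port 8080; is the port forward running?")
```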

And that's it: you have successfully transitioned from local to production by simply switching your ZenML stack. This is just scratching the surface!

Next up, more about stacks, running pipelines on a schedule, and much more coming soon!