# Table of Contents
- [Installation](#Installation)
- [Setup](#Setup)
- [Run.py](#Run-py)
- [Argo UI](#Argo-UI)

# <a name="Installation"></a> Installation

## Prerequisite 


**(Skip this step if already installed)**

Before using [dflow](https://github.com/deepmodeling/dflow), we need to install two tools:
- Docker (Official installation instruction: https://docs.docker.com/desktop/mac/install/)
- Minikube (Official installation instruction: https://minikube.sigs.k8s.io/docs/start/)

## Install pydflow

In [1]:
!pip install pydflow



**Once installed, restart the Jupyter notebook kernel for the installation to take effect.**

# <a name="Setup"></a> Setup

## Minikube

Dflow runs on Kubernetes (k8s), so we need to start minikube:

In [2]:
!minikube start

😄  minikube v1.26.0 on Ubuntu 20.04 (amd64)
✨  Using the docker driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🚜  Pulling base image ...
🏃  Updating the running docker "minikube" container ...
❗  This container is having trouble accessing https://k8s.gcr.io
💡  To pull new external images, you may need to configure a proxy: https://minikube.sigs.k8s.io/docs/reference/networking/proxy/
🐳  Preparing Kubernetes v1.24.1 on Docker 20.10.17 ...
    ▪ kubelet.cgroup-driver=systemd
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
💡  kubectl not found. If you need it, try: 'minikube kubectl -- get pods -A'
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default


## Argo-workflows

Dflow is built on [argo-workflows](https://github.com/argoproj/argo-workflows), so we need to set up the Argo engine in k8s:

1. To get started quickly, we can use the quick start manifest which will install Argo Workflows as well as some commonly used components:

In [2]:
!alias kubectl="minikube kubectl --"

In [3]:
!minikube kubectl -- create ns argo
!minikube kubectl -- apply --namespace argo -f argo.yaml

Error from server (AlreadyExists): namespaces "argo" already exists
customresourcedefinition.apiextensions.k8s.io/clusterworkflowtemplates.argoproj.io unchanged
customresourcedefinition.apiextensions.k8s.io/cronworkflows.argoproj.io unchanged
customresourcedefinition.apiextensions.k8s.io/workfloweventbindings.argoproj.io unchanged
customresourcedefinition.apiextensions.k8s.io/workflows.argoproj.io unchanged
customresourcedefinition.apiextensions.k8s.io/workflowtaskresults.argoproj.io unchanged
customresourcedefinition.apiextensions.k8s.io/workflowtasksets.argoproj.io unchanged
customresourcedefinition.apiextensions.k8s.io/workflowtemplates.argoproj.io unchanged
serviceaccount/argo unchanged
serviceaccount/argo-server unchanged
serviceaccount/github.com unchanged
role.rbac.authorization.k8s.io/agent unchanged
role.rbac.authorization.k8s.io/argo-role unchanged
role.rbac.authorization.k8s.io/argo-server-role unchanged
role.rbac.authorization.k8s.io/executor unchanged
role.rbac.authorizati

2. To monitor the setup progress, we can look at the pod status

In [4]:
!minikube kubectl -- get pod -n argo

NAME                                   READY   STATUS    RESTARTS        AGE
argo-server-7f779db785-lfxbv           1/1     Running   6 (4m53s ago)   158m
minio-64889fc698-kzmjd                 1/1     Running   2 (5m49s ago)   158m
postgres-6b5944c545-cdw7z              1/1     Running   3 (5m29s ago)   158m
workflow-controller-74f9c77d7d-gffkl   1/1     Running   5 (4m50s ago)   158m


**NOTE!!!!**: This process might take a while, depending on your internet speed. Wait and keep rerunning the cell above. Once the `STATUS` of all pods is `Running`, you can proceed to the next step.
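Instead of reading the table by eye, the readiness check can be sketched as a small parser over the text printed by `get pod`. This is only an illustration: the helper name `all_pods_running` is made up here, and the `ContainerCreating` row in the second sample is an invented example of a pod that is not ready yet.

```python
def all_pods_running(kubectl_output: str) -> bool:
    """Return True if every pod row in `kubectl get pod` output has STATUS Running."""
    lines = [line for line in kubectl_output.strip().splitlines() if line.strip()]
    if len(lines) < 2:
        return False  # no output, or a header without any pod rows
    header = lines[0].split()
    if "STATUS" not in header:
        return False  # not the table layout we expect
    # STATUS comes before RESTARTS, so the spaces inside "6 (4m53s ago)"
    # do not shift the column we read.
    status_col = header.index("STATUS")
    return all(line.split()[status_col] == "Running" for line in lines[1:])


# Two rows copied from the output above.
sample_ready = """\
NAME                                   READY   STATUS    RESTARTS        AGE
argo-server-7f779db785-lfxbv           1/1     Running   6 (4m53s ago)   158m
minio-64889fc698-kzmjd                 1/1     Running   2 (5m49s ago)   158m
"""

# Illustrative only: one pod that is still starting.
sample_pending = """\
NAME                                   READY   STATUS              RESTARTS   AGE
argo-server-7f779db785-lfxbv           1/1     Running             0          20s
postgres-6b5944c545-cdw7z              0/1     ContainerCreating   0          20s
"""

print(all_pods_running(sample_ready))    # True
print(all_pods_running(sample_pending))  # False
```

In a notebook, the text to check could come from capturing the cell output (for example `out = !minikube kubectl -- get pod -n argo` followed by `"\n".join(out)`).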

**IMPORTANT!!!!**

3. Open a port-forward so you can access the UI:

    The port-forward has to keep running for the UI to stay reachable, so we run this command in a separate terminal instead of a notebook cell:
    
```shell
minikube kubectl -- --namespace argo port-forward deployment/argo-server 60001:2746 --address 0.0.0.0
```

We can then access the Argo UI at https://your-bohrium-ip-address:60001

A security warning will be shown, but we can safely ignore it. It appears because we haven't added a certificate for this address.

# <a name="Run-py"></a> Run.py

In the previous steps, we finished installing and setting up the tools and environment that dflow needs to run. In this section, we will prepare a simple Python script using dflow.

Imagine we want to build the following workflow:

Step 1. 
1. Echo a string to msg.txt 
    
2. Echo a number and redirect it to results.txt 
    
Step 2.
1. Write the content of msg.txt twice to a new file
    
2. Read the value in results.txt, multiply it by 2, and write the result to results.txt
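Before turning to dflow, it may help to see what these two steps compute. The sketch below runs the same logic as plain Python functions on a temporary directory, without dflow, containers, or Kubernetes. The function names and the file name `duplicated.txt` are made up for this illustration.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def step1(workdir: Path, msg: str, number: int) -> None:
    """Write the message to msg.txt and the number to results.txt."""
    (workdir / "msg.txt").write_text(f"{msg}\n")
    (workdir / "results.txt").write_text(f"{number}\n")


def step2(workdir: Path) -> None:
    """Write msg.txt twice to a new file and double the number in results.txt."""
    msg = (workdir / "msg.txt").read_text()
    (workdir / "duplicated.txt").write_text(msg * 2)  # hypothetical file name
    number = int((workdir / "results.txt").read_text())
    (workdir / "results.txt").write_text(f"{number * 2}\n")


def run_locally(msg: str, number: int) -> Tuple[str, int]:
    """Run both steps in a scratch directory and return the final results."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        step1(workdir, msg, number)
        step2(workdir)
        text = (workdir / "duplicated.txt").read_text()
        value = int((workdir / "results.txt").read_text())
    return text, value


duplicated, doubled = run_locally("HelloWorld!", 1)
print(repr(duplicated))  # 'HelloWorld!\nHelloWorld!\n'
print(doubled)           # 2
```

The dflow version produces the same two results, but each step runs in its own container and the data moves between them as parameters and artifacts.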

To construct a workflow in dflow, three parts are needed:
1. Construct OP templates
2. Instantiate the OP template to Step
3. Put steps together and submit the workflow

## Construct OP template

As explained in the [readme](https://github.com/dptech-corp/dflow#122--op-template), the OP template is the fundamental component in dflow. For the workflow above, we need two OP templates:

For step 1:

In [5]:
from dflow import ShellOPTemplate
step1_templ = ShellOPTemplate(
                name="Hello",
                image="alpine:latest",
                script="echo {{inputs.parameters.msg}} > /tmp/msg.txt && echo {{inputs.parameters.number}} > /tmp/results.txt",
)

This defines the operation to be executed. Next, we need to set up the inputs and outputs for this step.

In [6]:
from dflow import InputParameter, OutputParameter, OutputArtifact
step1_templ.inputs.parameters = {
                            "msg": InputParameter(),
                            "number": InputParameter(),
}
step1_templ.outputs.parameters = {
                            "out_param": OutputParameter(value_from_path="/tmp/results.txt")
}
step1_templ.outputs.artifacts = {
                            "out_art": OutputArtifact(path="/tmp/msg.txt")
}
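The `{{inputs.parameters.msg}}` and `{{inputs.parameters.number}}` placeholders in the script are filled in with the values supplied when the step runs. The sketch below imitates that substitution with a regular expression, only to show what the resulting shell command looks like. It is a simplified stand-in, not the template engine that Argo actually uses, and `render_script` is a name made up for this example.

```python
import re

# Matches {{inputs.parameters.NAME}} and captures NAME.
PLACEHOLDER = re.compile(r"\{\{\s*inputs\.parameters\.([A-Za-z0-9_-]+)\s*\}\}")


def render_script(script: str, parameters: dict) -> str:
    """Replace each {{inputs.parameters.NAME}} with the supplied value."""

    def substitute(match):
        name = match.group(1)
        if name not in parameters:
            raise KeyError(f"no value supplied for input parameter {name!r}")
        return str(parameters[name])

    return PLACEHOLDER.sub(substitute, script)


script = (
    "echo {{inputs.parameters.msg}} > /tmp/msg.txt "
    "&& echo {{inputs.parameters.number}} > /tmp/results.txt"
)
rendered = render_script(script, {"msg": "HelloWorld!", "number": 1})
print(rendered)
# echo HelloWorld! > /tmp/msg.txt && echo 1 > /tmp/results.txt
```

Seen this way, the two declared input parameters are exactly the names that appear inside the placeholders, and the output paths are the files the rendered command writes.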

For step 2: 

In [7]:
step2_templ = ShellOPTemplate(
                name="Duplicate",
                image="alpine:latest",
                script="cat /tmp/foo.txt /tmp/foo.txt > /tmp/bar.txt && echo $(({{inputs.parameters.number}}*2)) > /tmp/results.txt",
)

This defines the operation to be executed. Notice two things:
1. We duplicated the content of `/tmp/foo.txt`, not the `/tmp/msg.txt` written in step 1. This is because OP templates are independent of each other. To give `/tmp/foo.txt` the same content as `/tmp/msg.txt`, we only need to pass the right artifact when instantiating the OP template.
2. We redirected the result of the arithmetic operation to `/tmp/results.txt`. This is not the file that appeared in step 1: each step runs in its own container, so each one writes its own `/tmp/results.txt`.

In [8]:
from dflow import InputArtifact
step2_templ.inputs.artifacts = {
                            "in_art":InputArtifact(path="/tmp/foo.txt") 
}
step2_templ.inputs.parameters = {
                            "number": InputParameter(),
}
step2_templ.outputs.artifacts = {
                            "out_art": OutputArtifact(path="/tmp/bar.txt")
}
step2_templ.outputs.parameters = {
                            "out_param": OutputParameter(value_from_path="/tmp/results.txt")
}

## Instantiate the OP template to Step

`Step` is the central building block of a workflow. A `Step` is created by instantiating an OP template. When a `Step` is initialized, the values of all input parameters and the sources of all input artifacts declared in the OP template must be specified.

In [9]:
from dflow import Step

step1 = Step(
            name="step1",
            template=step1_templ,
            parameters={"msg":"HelloWorld!", "number": 1},
)
step2 = Step(
            name="step2",
            template=step2_templ,
            parameters={"number":step1.outputs.parameters["out_param"]},
            artifacts={"in_art":step1.outputs.artifacts["out_art"]},
)

Step 1 takes in two values as parameters: "HelloWorld!" and 1. Step 2 takes the values and files from step 1 as the input parameters and artifacts.
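The rule that every declared input must be given a value or a source can be pictured as a set difference between what the template declares and what the step supplies. The helper below is a hypothetical illustration of that idea in plain Python. It does not reproduce how dflow itself validates a `Step`, and the string `"step1.out_art"` is just a placeholder for an artifact source.

```python
from typing import Dict, Iterable, List


def find_missing_inputs(
    declared_parameters: Iterable[str],
    declared_artifacts: Iterable[str],
    parameters: Dict[str, object],
    artifacts: Dict[str, object],
) -> Dict[str, List[str]]:
    """Return the declared input names that were not supplied, sorted for stable output."""
    return {
        "parameters": sorted(set(declared_parameters) - set(parameters)),
        "artifacts": sorted(set(declared_artifacts) - set(artifacts)),
    }


# Step 2 declares one parameter ("number") and one artifact ("in_art").
complete = find_missing_inputs(
    ["number"], ["in_art"], {"number": 1}, {"in_art": "step1.out_art"}
)
incomplete = find_missing_inputs(["number"], ["in_art"], {"number": 1}, {})

print(complete)    # {'parameters': [], 'artifacts': []}
print(incomplete)  # {'parameters': [], 'artifacts': ['in_art']}
```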

## Put steps together and submit a workflow

We have finished the building blocks of this workflow. Now we can put them together:

In [60]:
from dflow.workflow import config
config['host']="https://your-bohrium-ip-address:60001"

from dflow import Workflow
wf = Workflow(name="helloworld")
wf.add(step1)
wf.add(step2)

This creates a workflow with name "helloworld" and adds two steps in series.

We can then submit the workflow. One caveat: we will get a warning about certificate verification, since we haven't yet added a certificate for the address we specified for the UI. To suppress it, we can run the following:

In [61]:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

In [62]:
wf.submit()

Workflow has been submitted (ID: helloworld-5wgxn)


{'metadata': {'name': 'helloworld-5wgxn', 'generateName': 'helloworld-', 'namespace': 'argo', 'uid': '395b9225-39e3-489d-b17e-5755d3851113', 'resourceVersion': '14460', 'generation': 1, 'creationTimestamp': '2022-06-28T08:32:32Z', 'labels': {'workflows.argoproj.io/creator': 'system-serviceaccount-argo-argo-server'}, 'managedFields': [{'manager': 'argo', 'operation': 'Update', 'apiVersion': 'argoproj.io/v1alpha1', 'time': '2022-06-28T08:32:32Z', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:generateName': {}, 'f:labels': {'.': {}, 'f:workflows.argoproj.io/creator': {}}}, 'f:spec': {}, 'f:status': {}}}]}, 'spec': {'templates': [{'name': 'helloworld-steps', 'inputs': {}, 'outputs': {}, 'metadata': {}, 'steps': [[{'name': 'step1', 'template': 'hello', 'arguments': {'parameters': [{'name': 'msg', 'value': 'HelloWorld!'}, {'name': 'number', 'value': '1'}]}, 'continueOn': {}}], [{'name': 'step2', 'template': 'duplicate', 'arguments': {'parameters': [{'name': 'number', 'value': "{{=

Another caveat: if you want to rerun the workflow, you need to create a new workflow by rerunning `wf = Workflow(name="helloworld")`.

# <a name="Argo-UI"></a> Argo UI

After finishing the previous steps, we can view the workflow we just ran in the UI, at the address exposed by the port-forward from the setup section (https://your-bohrium-ip-address:60001)

We should see the following once loaded.

<img src="./imgs/argoui_main_page.png" alt="argoUI_mainpage"/>

We can see the workflow we just ran. Click it to see the following.

<img src="./imgs/workflow_overview.png" alt="workflow_overview"/>

This gives us an overview of the workflow. The first node does not do anything. The two following nodes are the steps we specified. Click on a node to see more information about that step.

We can access the inputs and outputs of step 2. The parameters of the step are shown in the UI, and we can download `out_art` by clicking the download button.

<img src="./imgs/access_one_node.png" alt="access_one_node"/>

After decompressing it, you should see a file named `bar.txt`, which is the output path we specified. Open it and you should see `HelloWorld!\nHelloWorld!`
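The same check can be done from Python instead of by hand. The sketch below assumes the downloaded artifact is a gzip-compressed tar archive, which the text above does not state explicitly, and uses `out_art.tgz` as a hypothetical file name. So that it runs without a real download, it first builds a stand-in archive with the expected content.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def read_member(archive_path: Path, member_name: str) -> str:
    """Return the text of `member_name` inside a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        extracted = archive.extractfile(member_name)
        if extracted is None:
            raise FileNotFoundError(f"{member_name} is not a regular file in {archive_path}")
        return extracted.read().decode("utf-8")


# Build a stand-in archive; with a real download, skip this part and
# point archive_path at the downloaded file instead.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "out_art.tgz"  # hypothetical download name
    payload = b"HelloWorld!\nHelloWorld!\n"
    with tarfile.open(archive_path, "w:gz") as archive:
        info = tarfile.TarInfo(name="bar.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    content = read_member(archive_path, "bar.txt")

print(repr(content))  # 'HelloWorld!\nHelloWorld!\n'
```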