## A brief hands-on illustration of Argo Workflows



### Containers, e.g. Docker (prerequisite)

[**Definition (wikipedia)**](https://en.wikipedia.org/wiki/Docker_(software)): 
Docker is a set of platform as a service (PaaS) products that use OS-level 
virtualization to deliver software in packages called containers. [...] The 
software that hosts the containers is called Docker Engine.

<center>
  <img src="Images/Docker-architecture-high-level-overview.png" alt="drawing" width="800"/>

  Docker architecture ([image copyright](https://docs.docker.com/get-started/overview/#docker-architecture))
</center>

Notes:
 * Main commands: `docker build -t <image> <context>`, `docker pull <image>`, `docker run <image>`
 * [Docker Desktop](https://www.docker.com/products/docker-desktop/): a popular 
   implementation made by 
   [Docker Inc.](https://en.wikipedia.org/wiki/Docker,_Inc.)
   Since [August 2021](https://en.wikipedia.org/wiki/Docker_(software)#Adoption), 
   Docker Desktop for Windows and macOS has **no longer been free** for enterprise 
   users.
 * [Alternative implementations](https://blog.alexellis.io/building-containers-without-docker/): 
   [Kaniko](https://github.com/GoogleContainerTools/kaniko) (unofficial Google 
   builder), 
   [podman](https://podman.io/)/[buildah](https://github.com/containers/buildah)
   (Red Hat / IBM's efforts),
   [pouch](https://github.com/alibaba/pouch) (Alibaba)...

---

### Kubernetes (prerequisite)

[**Definition** (wikipedia)](https://en.wikipedia.org/wiki/Kubernetes): Kubernetes
(commonly stylized as K8s) is an open-source container (including docker) 
orchestration system for automating software deployment, scaling, and management.

<center>
  <img src="Images/kubernetes_architecture_and_cluster_components-Medium_dot_com.png" alt="drawing" width="800"/>
  
  Kubernetes architecture and components ([image copyright](https://miro.medium.com/max/1000/1*kSRH4T8S1YmAuHbpgQ3Ylw.png))
</center>

Note: [Pod](https://kubernetes.io/docs/concepts/workloads/pods/) (K8s terminology): 
a group of one or more containers. Pods are the smallest deployable units of 
computing that you can create and manage in Kubernetes.

---

### [Minikube](https://minikube.sigs.k8s.io/docs/) (prerequisite)

[**Definition**](https://github.com/kubernetes/minikube): minikube implements 
a local Kubernetes cluster on macOS, Linux, and Windows.

<center>
  <img src="Images/minikube-architecture.png" alt="drawing" width="800"/>
</center>

Notes: 
* Minikube is provided by the k8s community 
  ([Apache license](https://github.com/kubernetes/minikube/blob/master/LICENSE)).
* Minikube provides a docker engine.

---
## Installing minikube

In [8]:

# Minikube provides a local Kubernetes cluster on common desktops
!brew install minikube
!minikube start --memory=8G --cpus=4

😄  minikube v1.25.2 on Darwin 12.1
✨  Using the hyperkit driver based on existing profile
👍  Starting control plane node minikube in cluster minikube
🏃  Updating the running hyperkit "minikube" VM ...
🐳  Preparing Kubernetes v1.23.3 on Docker 20.10.12 ...
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace


In [1]:
# Interaction with the (local) k8s cluster is done through the kubectl CLI
!brew install kubernetes-cli
!kubectl config view


apiVersion: v1
clusters: null
contexts: null
current-context: ""
kind: Config
preferences: {}
users: null


In [1]:
!kubectl get pods --all-namespaces

The connection to the server localhost:8080 was refused - did you specify the right host or port?


<img src="Images/kubernetes_architecture_and_cluster_components-Medium_dot_com.png" alt="drawing" width="800"/>

---

## Argo Workflows: a brief introduction

[**Definition**](https://argoproj.github.io/argo-workflows/): Argo Workflows is 
an open source container-native workflow engine for orchestrating parallel jobs
on Kubernetes (and implemented as a Kubernetes Custom Resource Definition).

<center>
  <img src="Images/Argo_workflows_Architecture_diagram.png" alt="drawing" width="600"/>

  Argo Workflows architecture ([image copyright](https://argoproj.github.io/argo-workflows/architecture/))
</center>

### What you can do with Argo

* Use its REST API
* Use its full-featured UI
* Work with workflows
  * Create/define a workflow
  * Persist a workflow (on the Kubernetes cluster), a.k.a. "templating"
  * Run a workflow

### Main use cases

* ML (Machine Learning)
* ETL (Extract Transform Load)
* Batch/Data processing
* CI/CD

---

### Installing Argo Workflows

In [None]:
# Argo Workflows comes as a k8s CRD (Custom Resource Definition)
!kubectl create ns argo
!kubectl config set-context --current --namespace=argo
!kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-workflows/master/manifests/quick-start-postgres.yaml

In [1]:
# Argo runs as regular pods, so its state can be checked with kubectl
!kubectl get pods --all-namespaces

NAMESPACE     NAME                                  READY   STATUS    RESTARTS      AGE
argo          argo-server-78f47df69f-7pqwj          0/1     Running   2 (27s ago)   62s
argo          minio-76f795c89b-lbvkw                1/1     Running   0             62s
argo          postgres-869f7fbd7f-7fl7f             1/1     Running   0             62s
argo          workflow-controller-b99cbc8bf-mfk6g   1/1     Running   2 (25s ago)   62s
kube-system   coredns-64897985d-rb59b               1/1     Running   0             81s
kube-system   etcd-minikube                         1/1     Running   0             97s
kube-system   kube-apiserver-minikube               1/1     Running   0             96s
kube-system   kube-controller-manager-minikube      1/1     Running   0             94s
kube-system   kube-proxy-5lvkd                      1/1     Running   0             82s
kube-system   kube-scheduler-minikube               1/1     Running   0             94s
kube-system   storage-provisione

<center>
  <img src="Images/Argo_workflows_Architecture_diagram.png" alt="drawing" width="400"/>
</center>
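
The listing above shows `argo-server` with a `READY` value of `0/1`, which means the server cannot be used yet. As a minimal sketch (the helper name `not_ready_pods` is hypothetical, and it only assumes the column layout printed by `kubectl get pods`), the following snippet extracts the pods that are not fully ready from such a listing:

```python
from typing import List


def not_ready_pods(kubectl_output: str) -> List[str]:
    """Return the names of pods whose READY column is not of the form n/n."""
    lines = [line for line in kubectl_output.splitlines() if line.strip()]
    header = lines[0].split()
    name_idx = header.index("NAME")
    ready_idx = header.index("READY")
    pending = []
    for line in lines[1:]:
        cols = line.split()
        if len(cols) <= ready_idx:
            continue  # skip truncated lines
        ready, total = cols[ready_idx].split("/")
        if ready != total:
            pending.append(cols[name_idx])
    return pending


# Two lines copied from the listing above
SAMPLE = """\
NAMESPACE     NAME                                  READY   STATUS    RESTARTS      AGE
argo          argo-server-78f47df69f-7pqwj          0/1     Running   2 (27s ago)   62s
argo          minio-76f795c89b-lbvkw                1/1     Running   0             62s
"""
print(not_ready_pods(SAMPLE))  # ['argo-server-78f47df69f-7pqwj']
```

In a notebook cell the listing can be captured with `out = !kubectl get pods --all-namespaces` and passed as `"\n".join(out)`. When no post-processing is needed, `kubectl wait --for=condition=Ready pod --all -n argo` is likely the simpler option.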

---
## Interacting with argo (server)

In [2]:
# Interaction with the argo server can be done through its CLI (just as with k8s)
!brew install argo



In [10]:
!argo list

NAME                STATUS      AGE   DURATION   PRIORITY
hello-world-fq6vc   Succeeded   19m   10s        0
hello-world-k5fnn   Succeeded   20m   10s        0
hello-world-n2j9q   Succeeded   23m   20s        0
hello-world-snspb   Succeeded   1h    31s        0


In [2]:
# Interaction with argo server can also be done through UI
# Forward ad-hoc ports
kubectl -n argo port-forward deployment/argo-server 2746:2746 &

Forwarding from 127.0.0.1:2746 -> 2746


In [1]:
# Open the UI itself (`open` is the macOS command)
!open https://localhost:2746
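
Besides the CLI and the UI, the REST API mentioned earlier can be queried through the same forwarded port. The sketch below is illustrative only. It assumes that the port-forward above is active, that the quick-start manifest lets the server answer without a bearer token, and that the server uses a self-signed certificate (hence the disabled verification, acceptable for a local demo only). The helper names are hypothetical.

```python
import json
import ssl
import urllib.request
from typing import List, Tuple

# Assumption: address exposed by the port-forward command above
ARGO_SERVER = "https://localhost:2746"


def workflows_url(namespace: str, server: str = ARGO_SERVER) -> str:
    """URL of the endpoint listing the workflows of a namespace."""
    return f"{server}/api/v1/workflows/{namespace}"


def summarize(workflow_list: dict) -> List[Tuple[str, str]]:
    """Reduce a workflow list response to (name, phase) pairs."""
    items = workflow_list.get("items") or []  # "items" may be null when empty
    return [
        (wf["metadata"]["name"], wf.get("status", {}).get("phase", "Unknown"))
        for wf in items
    ]


def list_workflows(namespace: str = "argo") -> List[Tuple[str, str]]:
    """Query the server; certificate checks are disabled (local demo only)."""
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(workflows_url(namespace), context=context) as response:
        return summarize(json.load(response))


# Offline illustration on a hand-written response fragment
sample = {
    "items": [
        {"metadata": {"name": "hello-world-fq6vc"}, "status": {"phase": "Succeeded"}}
    ]
}
print(summarize(sample))  # [('hello-world-fq6vc', 'Succeeded')]
```

With a running server, `list_workflows()` should return the same kind of information as `argo list`. If the server is configured with another authentication mode, an `Authorization` header would have to be added to the request.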

## Authoring workflows

Workflows (pipelines) are a succession of container jobs/batches with hooked-up inputs/outputs.
So one first needs containers.


In [4]:
# Use Minikube's built-in docker command, refer e.g. to
eval $(minikube docker-env)
docker run -it docker/whalesay cowsay "salut les bad gones"

 _____________________ 
< salut les bad gones >
 --------------------- 
    \
     \
      \     
                    ##        .            
              ## ## ##       ==            
           ## ## ## ##      ===            
       /""""""""""""""""___/ ===        
  ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ /  ===- ~~~   
       \______ o          __/            
        \    \        __/             
          \____\______/   


Just as Kubernetes uses YAML for its configuration files, (Argo) workflow 
descriptions, a.k.a. templates, are expressed in 
[YAML](https://github.com/argoproj/argo-workflows/blob/master/examples/hello-world.yaml). 
Here is the workflow calling the `cowsay` container:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-                       # Prefix of the generated workflow name (e.g. hello-world-fq6vc)
  annotations:
    workflows.argoproj.io/description: |
      This is a simple hello world example.
spec:
  entrypoint: whalesay                             # On entry jump to the template named "whalesay"
  templates:
  - name: whalesay                                 # A template is a step/job/container-call
    container:
      image: docker/whalesay:latest                # Pull that image from docker-hub
      command: [cowsay]                            # Run it with that command and arguments
      args: ["hello world"]
```

In [None]:
# Submission can be done from the CLI
argo submit --watch https://raw.githubusercontent.com/argoproj/argo-workflows/master/examples/hello-world.yaml

You can also watch it running with the UI
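
The hello-world example has a single step, so it does not show how inputs and outputs are hooked up. The following sketch builds a two-step workflow in which the output parameter of the first step is passed as the input parameter of the second. All names (`io-wiring-`, `produce`, `consume`, `message`, `/tmp/message.txt`) are hypothetical, and the manifest is assembled as a Python dictionary and printed as JSON only for the sake of illustration.

```python
import json


def build_workflow() -> dict:
    """Two-step workflow: 'produce' writes a message, 'consume' displays it."""
    produce = {
        "name": "produce",
        "container": {
            "image": "docker/whalesay:latest",
            "command": ["sh", "-c"],
            "args": ["printf 'hello from step one' > /tmp/message.txt"],
        },
        # Expose the content of the file as an output parameter
        "outputs": {
            "parameters": [
                {"name": "message", "valueFrom": {"path": "/tmp/message.txt"}}
            ]
        },
    }
    consume = {
        "name": "consume",
        "inputs": {"parameters": [{"name": "message"}]},
        "container": {
            "image": "docker/whalesay:latest",
            "command": ["cowsay"],
            "args": ["{{inputs.parameters.message}}"],
        },
    }
    main = {
        "name": "main",
        "steps": [
            [{"name": "produce", "template": "produce"}],
            [
                {
                    "name": "consume",
                    "template": "consume",
                    # Hook the output of step one to the input of step two
                    "arguments": {
                        "parameters": [
                            {
                                "name": "message",
                                "value": "{{steps.produce.outputs.parameters.message}}",
                            }
                        ]
                    },
                }
            ],
        ],
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "io-wiring-"},
        "spec": {"entrypoint": "main", "templates": [main, produce, consume]},
    }


print(json.dumps(build_workflow(), indent=2))
```

Since JSON is essentially a subset of YAML, the printed manifest can be saved to a file and should be accepted by `argo submit --watch <file>`. Writing the same structure directly in YAML works just as well.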

Alternatively you can also 

---
## Running your local workflows
You first need to build the containers your workflow will use

In [3]:
# Use Minikube's built-in docker command, refer e.g. to
eval $(minikube docker-env)
docker build -t vcity/collect_lyon_data Docker/Collect-DockerContext/
docker build -t vcity/3duse ../Docker/3DUse-DockerContext/
docker build -t vcity/citygml2stripper ../Docker/CityGML2Stripper-DockerContext/
docker build --no-cache -t vcity/py3dtilers https://github.com/VCityTeam/py3dtilers-docker.git#:Context
docker pull refstudycentre/scratch-base:latest

Sending build context to Docker daemon  36.35kB
Step 1/8 : FROM python:3.7-buster
 [...]
Step 8/8 : CMD ['python3', 'entrypoint.py']
---> Running in 4da9408de657
Successfully built b74a1646146f
Successfully tagged vcity/collect_lyon_data:latest

Sending build context to Docker daemon  9.216kB
Step 1/20 : FROM ubuntu:18.04
 [...]
Step 20/20 : FROM python:3.7-buster
Successfully tagged vcity/collect_lyon_data:latest

 [...]


In [5]:
# Expose (a part of) your desktop/local filesystem as a volume available to k8s.
# Change your working directory to where you want your workflow inputs/outputs 
# to be located and then run
!minikube mount "$(pwd)":/data/host &

✦ ❯ 📁  Mounting host path /Users/eboix/tmp/VCity/ExpeData-Workflows_testing/ArgoWorkflows into VM as /data/host ...
▪ User ID:      docker
▪ Options:      map[]
▪ Bind Address: X.Y.Z.1:64263
✅  Successfully mounted /Users/eboix/tmp/VCity/ExpeData-Workflows_testing/ArgoWorkflows to /data/host


A workflow can now [use this directory as a volume](https://minikube.sigs.k8s.io/docs/handbook/mount/), declared as:

```json
"volumes": [
  {
    "name": "host-mount",
    "hostPath": {
      "path": "/data/host"
    }
  }
]
```
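
Declaring the volume is only half of the wiring: a template must also mount it in its container. The sketch below shows one possible complete manifest, again assembled as a Python dictionary for illustration. The mount point `/mnt/host`, the workflow name prefix, the template name and the output file name are hypothetical; only the `host-mount` volume and its `/data/host` path come from the fragment above.

```python
import json

HOST_PATH = "/data/host"  # Path inside the minikube VM (see the mount command)
MOUNT_PATH = "/mnt/host"  # Hypothetical mount point inside the container


def build_volume_workflow() -> dict:
    """Single-step workflow writing a file into the host-mounted directory."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "host-mount-demo-"},
        "spec": {
            "entrypoint": "write-file",
            # Same volume declaration as in the fragment above
            "volumes": [{"name": "host-mount", "hostPath": {"path": HOST_PATH}}],
            "templates": [
                {
                    "name": "write-file",
                    "container": {
                        "image": "docker/whalesay:latest",
                        "command": ["sh", "-c"],
                        "args": [
                            f"cowsay 'written by a workflow' > {MOUNT_PATH}/hello.txt"
                        ],
                        # The volume is referenced by its name
                        "volumeMounts": [
                            {"name": "host-mount", "mountPath": MOUNT_PATH}
                        ],
                    },
                }
            ],
        },
    }


print(json.dumps(build_volume_workflow(), indent=2))
```

Provided the `minikube mount` process is still running, submitting such a workflow should leave a `hello.txt` file in the directory from which the mount command was launched.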
