## ArgoCD to manage applications on the Kubernetes cluster

With our Kubernetes cluster up and running, we are ready to deploy applications on it!

In [1]:
export PATH=/work/.local/bin:$PATH
export PYTHONUSERBASE=/work/.local
export ANSIBLE_CONFIG=/work/MLOps/continous_X_pipeline/ansible/ansible.cfg
export ANSIBLE_ROLES_PATH=roles

First, we will deploy our birdclef “platform”, which bundles the “accessory” services that support our machine learning application.

Let’s add the birdclef-platform application now. The playbook generates a MinIO secret and prints it in its output, so look for it after running the playbook in the cells below:

In [2]:
cd /work/.ssh

In [3]:
ssh-add id_rsa_chameleon_project_g38


Identity added: id_rsa_chameleon_project_g38 (sudharshanramesh@Sudharshans-MBP.lan)


In [8]:
cd /work/MLOps/continous_X_pipeline/ansible/

In [9]:
ansible-playbook -i inventory.yml argocd/argocd_add_platform.yml


PLAY [Deploy MLflow platform via ArgoCD & Helm with MinIO secret handling] *****

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password from Kubernetes secret] ************************
changed: [node1]

TASK [Decode ArgoCD admin password] ********************************************
changed: [node1]

TASK [Log in to ArgoCD] ********************************************************
ok: [node1]

TASK [Add repository to ArgoCD] ************************************************
changed: [node1]

TASK [Detect external IP starting with 10.56] **********************************
ok: [node1]

TASK [Ensure birdclef-platform namespace exists] *******************************
ok: [node1]

TASK [Create birdclef-platform namespace if missing] ***************************
skipping: [node1]

TASK [Check if MinIO secret already exists] ************************************
ok: [node1]

TASK [Generate MinIO secret key] *****************

Let's analyse the code for [argocd_add_platform](https://github.com/exploring-curiosity/MLOps/edit/main/continous_X_pipeline/ansible/argocd/argocd_add_platform.yml):

The playbook executes the following tasks.

1. Create the directory for the mount. Since we are using a persistent Block Storage volume that already exists, this takes two commands: one to create the directory and one to mount the volume.

2. Mount the volume at that directory.

Note: As of this writing, these two tasks are commented out because we do not yet have persistent Block Storage for the services. Enabling them would also require changes to the platform Helm charts, which we have left out for now so that the complete integration works end to end.

3. Get the ArgoCD admin password from the Kubernetes secret.

4. Decode the ArgoCD admin password.

5. Log in to ArgoCD.

6. Add the repository to ArgoCD. This lets ArgoCD sync the platform whenever the Kubernetes manifest files change.

7. Check whether the birdclef-platform namespace exists.

8. Create the birdclef-platform namespace if it is missing.

9. Check whether the MinIO secret already exists, in case we are running this flow again.

10. Generate the MinIO secret key if we are running this flow for the first time.

11. Fetch the existing MinIO secret key if it already exists.

12. Decode the existing MinIO secret key.

13. Check whether the ArgoCD application exists.

14. Create the ArgoCD Helm application (which includes MinIO, MLflow, PostgreSQL, Prometheus, Label Studio, etc.) if it does not exist.

15. Update the ArgoCD Helm application if it exists.

16. Display the MinIO credentials needed to log in.
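
Steps 9 to 12 above amount to a generate-or-reuse pattern for the MinIO secret. The following is a minimal sketch of that idea in Python, not the playbook's actual code. The function name `resolve_minio_secret` and the key length are hypothetical; the only fact relied on is that Kubernetes stores Secret values base64-encoded.

```python
import base64
import secrets
from typing import Optional


def resolve_minio_secret(existing_b64: Optional[str]) -> str:
    """Return the MinIO secret key, reusing the stored value when one exists.

    existing_b64 is the base64-encoded value read from the Kubernetes Secret,
    or None when the Secret has not been created yet.
    """
    if existing_b64:
        # Re-run: Kubernetes Secret data is base64-encoded, so decode it.
        return base64.b64decode(existing_b64).decode("utf-8")
    # First run: generate a fresh random key (24 random bytes is an assumption).
    return secrets.token_urlsafe(24)


# First run: nothing stored yet, so a new key is generated.
new_key = resolve_minio_secret(None)

# Later run: the stored value is decoded and reused unchanged.
stored = base64.b64encode(new_key.encode("utf-8")).decode("ascii")
assert resolve_minio_secret(stored) == new_key
```

Reusing the stored key on later runs matters because services that already hold the old credentials would otherwise lose access to MinIO.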

After this flow has run once, ArgoCD tracks the repository, so any change to the Helm application that is pushed to git is picked up by ArgoCD directly.

Once the platform is deployed, we can open the following services (substitute A.B.C.D with the floating IP): \
MinIO object store: http://A.B.C.D:9001 \
MLflow: http://A.B.C.D:8000 \
Label Studio: http://A.B.C.D:5000 \
Prometheus: http://A.B.C.D:4000 \
Grafana: http://A.B.C.D:3000
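
To avoid substituting the floating IP by hand, the list above can be generated. This is a small illustrative helper, with the port numbers taken from the list above; the function name `platform_urls` and the example address are hypothetical (203.0.113.10 is a reserved documentation address, not a real deployment).

```python
from typing import Dict

# Service ports as listed above.
PLATFORM_PORTS: Dict[str, int] = {
    "MinIO": 9001,
    "MLflow": 8000,
    "Label Studio": 5000,
    "Prometheus": 4000,
    "Grafana": 3000,
}


def platform_urls(floating_ip: str) -> Dict[str, str]:
    """Map each platform service name to its URL on the given floating IP."""
    return {
        name: f"http://{floating_ip}:{port}"
        for name, port in PLATFORM_PORTS.items()
    }


# Placeholder address; replace it with your own floating IP.
for name, url in platform_urls("203.0.113.10").items():
    print(f"{name}: {url}")
```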

Next, we need to deploy the Bird Classification application. Before we do, we need to build a container image. We will run a one-time workflow in Argo Workflows to build the initial container images for the “staging”, “canary”, and “production” environments:

In [2]:
cd /work/MLOps/continous_X_pipeline/ansible

In [25]:
ansible-playbook -i inventory.yml argocd/workflow_build_init.yml


PLAY [Run Argo Workflow from GitHub Repo] **************************************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Clone or update birdclef-iac repo] ***************************************
changed: [node1]

PLAY [Run Argo Workflow from GitHub Repo] **************************************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Clone or update birdclef-iac repo] ***************************************
ok: [node1]

TASK [Submit Argo Workflow] ****************************************************
changed: [node1]

TASK [Extract Workflow Name] ***************************************************
ok: [node1]

TASK [Wait for workflow to complete (success or fail)] *************************
changed: [node1]

TASK [Get final workflow result] ***********************************************
changed: [node1]

TASK [Display workflow phase] *********************************

The [workflow_build_init](https://github.com/exploring-curiosity/MLOps/blob/main/continous_X_pipeline/ansible/argocd/workflow_build_init.yml) playbook submits the [build-initial.yaml](https://github.com/exploring-curiosity/MLOps/blob/main/continous_X_pipeline/workflows/build-initial.yaml) workflow, which builds the initial container images for staging, canary, and production using the [FastAPI wrapper](https://github.com/harishbalajib/BirdClassification) for the model.
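
The playbook output above shows a task that waits for the workflow to complete, whether it succeeds or fails. A minimal sketch of such a wait loop follows, assuming the workflow phase is read by some caller-supplied function. `wait_for_workflow` and `get_phase` are hypothetical names, and the interval and timeout defaults are assumptions. The phase names are the ones Argo Workflows reports for a workflow.

```python
import time
from typing import Callable

# Phases after which an Argo Workflow will not change state any more.
TERMINAL_PHASES = {"Succeeded", "Failed", "Error"}


def wait_for_workflow(
    get_phase: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_phase until the workflow reaches a terminal phase.

    get_phase is supplied by the caller (for example a wrapper around the
    cluster API or CLI); it is injected here so the loop can be tested.
    """
    waited = 0.0
    while True:
        phase = get_phase()
        if phase in TERMINAL_PHASES:
            return phase
        if waited >= timeout_s:
            raise TimeoutError(f"workflow still in phase {phase!r} after {waited}s")
        sleep(interval_s)
        waited += interval_s


# Simulated run: the workflow goes Pending -> Running -> Succeeded.
phases = iter(["Pending", "Running", "Succeeded"])
result = wait_for_workflow(lambda: next(phases), interval_s=0.0)
assert result == "Succeeded"
```

Returning the terminal phase instead of raising on failure mirrors the playbook's approach of waiting for either outcome and then displaying the result.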

In [26]:
cd /work/MLOps/continous_X_pipeline/ansible


In [27]:
ansible-playbook -i inventory.yml argocd/argocd_add_staging.yml


PLAY [Deploy Bird Classification Staging via ArgoCD & Helm] ********************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password] ***********************************************
changed: [node1]

TASK [Decode ArgoCD password] **************************************************
changed: [node1]

TASK [Login to ArgoCD] *********************************************************
ok: [node1]

TASK [Detect external IP starting with 10.56] **********************************
ok: [node1]

TASK [Create birdclef-staging namespace if missing] ****************************
ok: [node1]

TASK [Check if ArgoCD app exists] **********************************************
ok: [node1]

TASK [Create ArgoCD Helm app if not exists] ************************************
skipping: [node1]

TASK [Update ArgoCD Helm app if exists] ****************************************
changed: [node1]

TASK [Display ArgoCD app status] *****************

Running the [argocd_add_staging.yml](https://github.com/exploring-curiosity/MLOps/blob/main/continous_X_pipeline/ansible/argocd/argocd_add_staging.yml) playbook creates the birdclef-staging namespace if it is missing and sets up the staging application in ArgoCD, where we can monitor it. ArgoCD then applies the [staging](https://github.com/exploring-curiosity/MLOps/tree/main/continous_X_pipeline/k8s/staging) manifests, which start a container for the staging environment from the staging image we built above.

When the playbook finishes, our application should be up and running at http://A.B.C.D:8081 (where A.B.C.D is our public IP).
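
The playbook output above shows a create-or-update pattern: the "create" task is skipped when the ArgoCD application already exists, and the "update" task runs instead. The sketch below builds the corresponding `argocd` CLI command without running it. It is an illustration under assumptions, not the playbook's actual tasks: the application name, the helper name `argocd_app_command`, and the exact set of flags are hypothetical choices, while `argocd app create` and `argocd app set` are the CLI's subcommands for creating and updating an application.

```python
from typing import Iterable, List


def argocd_app_command(
    app: str,
    existing_apps: Iterable[str],
    repo_url: str,
    chart_path: str,
    namespace: str,
) -> List[str]:
    """Build the argocd command that creates the app, or updates it if present."""
    # "set" updates an existing application; "create" registers a new one.
    verb = "set" if app in set(existing_apps) else "create"
    return [
        "argocd", "app", verb, app,
        "--repo", repo_url,
        "--path", chart_path,
        # In-cluster API server address, used when ArgoCD deploys to its own cluster.
        "--dest-server", "https://kubernetes.default.svc",
        "--dest-namespace", namespace,
    ]


# Application name is an assumption; repo and path follow the links above.
cmd = argocd_app_command(
    app="birdclef-staging",
    existing_apps=["birdclef-platform", "birdclef-staging"],
    repo_url="https://github.com/exploring-curiosity/MLOps",
    chart_path="continous_X_pipeline/k8s/staging",
    namespace="birdclef-staging",
)
print(" ".join(cmd))
```

Choosing the verb from the current state is what makes the playbook safe to run repeatedly.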


In [28]:
cd /work/MLOps/continous_X_pipeline/ansible

In [29]:
ansible-playbook -i inventory.yml argocd/argocd_add_canary.yml


PLAY [Deploy Bird Classification Canary via ArgoCD & Helm] *********************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password] ***********************************************
changed: [node1]

TASK [Decode ArgoCD password] **************************************************
changed: [node1]

TASK [Login to ArgoCD] *********************************************************
ok: [node1]

TASK [Detect external IP starting with 10.56] **********************************
ok: [node1]

TASK [Create birdclef-canary namespace if missing] *****************************
ok: [node1]

TASK [Check if ArgoCD app exists] **********************************************
ok: [node1]

TASK [Create ArgoCD Helm app if not exists] ************************************
skipping: [node1]

TASK [Update ArgoCD Helm app if exists] ****************************************
changed: [node1]

TASK [Display ArgoCD app status] *****************

Running the [argocd_add_canary.yml](https://github.com/exploring-curiosity/MLOps/blob/main/continous_X_pipeline/ansible/argocd/argocd_add_canary.yml) playbook creates the birdclef-canary namespace if it is missing and sets up the canary application in ArgoCD, where we can monitor it. ArgoCD then applies the [canary](https://github.com/exploring-curiosity/MLOps/tree/main/continous_X_pipeline/k8s/canary) manifests, which start a container for the canary environment from the canary image we built above.

When the playbook finishes, our application should be up and running at http://A.B.C.D:8080 (where A.B.C.D is our public IP).
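
Each deployment playbook above includes a task named "Detect external IP starting with 10.56". How the playbook implements it is not shown here, so the following is only a sketch of the idea: pick the first address with that prefix from the addresses found on the node. The function name `detect_ip` and the sample addresses are hypothetical.

```python
from typing import Iterable, Optional


def detect_ip(addresses: Iterable[str], prefix: str = "10.56.") -> Optional[str]:
    """Return the first address that starts with prefix, or None if none match.

    The trailing dot in the default prefix keeps the match on whole octets.
    """
    for address in addresses:
        if address.startswith(prefix):
            return address
    return None


# Sample addresses for illustration only.
node_addresses = ["127.0.0.1", "192.168.1.11", "10.56.2.17"]
assert detect_ip(node_addresses) == "10.56.2.17"
```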

In [30]:
cd /work/MLOps/continous_X_pipeline/ansible

In [31]:
ansible-playbook -i inventory.yml argocd/argocd_add_prod.yml


PLAY [Deploy Bird Classification Production via ArgoCD & Helm] *****************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password] ***********************************************
changed: [node1]

TASK [Decode ArgoCD password] **************************************************
changed: [node1]

TASK [Login to ArgoCD] *********************************************************
ok: [node1]

TASK [Detect external IP starting with 10.56] **********************************
ok: [node1]

TASK [Create birdclef-production namespace if missing] *************************
ok: [node1]

TASK [Check if ArgoCD app exists] **********************************************
ok: [node1]

TASK [Create ArgoCD Helm app if not exists] ************************************
skipping: [node1]

TASK [Update ArgoCD Helm app if exists] ****************************************
changed: [node1]

TASK [Display ArgoCD app status] *****************

Running the [argocd_add_prod.yml](https://github.com/exploring-curiosity/MLOps/blob/main/continous_X_pipeline/ansible/argocd/argocd_add_prod.yml) playbook creates the birdclef-production namespace if it is missing and sets up the production application in ArgoCD, where we can monitor it. ArgoCD then applies the [production](https://github.com/exploring-curiosity/MLOps/tree/main/continous_X_pipeline/k8s/production) manifests, which start a container for the production environment from the production image we built above.

When the playbook finishes, our application should be up and running at http://A.B.C.D (where A.B.C.D is our public IP).

In [32]:
cd /work/MLOps/continous_X_pipeline/ansible

Now, we will manage our application lifecycle with Argo Workflows. We will look at these workflows in more depth in the next sections.

In [33]:
ansible-playbook -i inventory.yml argocd/workflow_templates_apply.yml


PLAY [Clone repo and apply specific Argo WorkflowTemplates] ********************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Clone or update birdclef-iac repo] ***************************************
ok: [node1]

TASK [Apply selected WorkflowTemplates to Argo namespace] **********************
changed: [node1] => (item=build-container-image.yaml)
changed: [node1] => (item=deploy-container-image.yaml)
changed: [node1] => (item=promote-model.yaml)
changed: [node1] => (item=train-model.yaml)

TASK [Verify applied WorkflowTemplates] ****************************************
changed: [node1]

TASK [Show WorkflowTemplates] **************************************************
ok: [node1] => 
  wft_list.stdout: |-
    NAME                     AGE
    build-container-image    3h9m
    deploy-container-image   3h9m
    promote-model            3h9m
    train-model              3h9m

PLAY RECAP *************************************************
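
The output above shows the playbook applying four WorkflowTemplates one item at a time. A minimal sketch of that loop, building one `kubectl apply` command per template file without running it, is given below. The file names come from the output above; the `argo` namespace, the `workflows` directory, and the helper name `apply_commands` are assumptions for illustration.

```python
from typing import List

# Template files as listed in the playbook output above.
TEMPLATES = [
    "build-container-image.yaml",
    "deploy-container-image.yaml",
    "promote-model.yaml",
    "train-model.yaml",
]


def apply_commands(template_dir: str, namespace: str = "argo") -> List[List[str]]:
    """Build one kubectl apply command per WorkflowTemplate file."""
    return [
        ["kubectl", "apply", "-n", namespace, "-f", f"{template_dir}/{name}"]
        for name in TEMPLATES
    ]


# Directory name is an assumption.
for command in apply_commands("workflows"):
    print(" ".join(command))
```

Because `kubectl apply` is declarative, rerunning the same commands after editing a template updates it in place.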