## ArgoCD for Kubernetes Application Management

Now that our Kubernetes cluster is set up, we can use ArgoCD to deploy and manage applications on it!

In [None]:
export PATH=/work/.local/bin:$PATH
export PYTHONUSERBASE=/work/.local
export ANSIBLE_CONFIG=/work/ML-SysOps_Project-main\ 2/continous_X_pipeline/ansible/ansible.cfg
export ANSIBLE_ROLES_PATH=roles

First, we will deploy our birdclef “platform”. This has all the “accessory” services we need to support our machine learning application.

Let’s add the birdclef-platform application now. In the output of the following cell, look for the MinIO secret, which will be generated and then printed in the output:

In [4]:
cd /work/ML-SysOps_Project-main\ 2/continous_X_pipeline/ansible/

In [5]:
ansible-playbook -i inventory.yml argocd/argocd_add_platform.yml


PLAY [Deploy MLflow platform via ArgoCD & Helm with MinIO secret handling] *****

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password from Kubernetes secret] ************************
changed: [node1]

TASK [Decode ArgoCD admin password] ********************************************
changed: [node1]

TASK [Log in to ArgoCD] ********************************************************
ok: [node1]

TASK [Add repository to ArgoCD] ************************************************
fatal: [node1]: FAILED! => changed=true 
  cmd:
  - argocd
  - repo
  - add
  - https://github.com/ho1447/ML-SysOps_Project.git
  - --port-forward
  - --port-forward-namespace=argocd
  delta: '0:00:00.300193'
  end: '2025-05-14 10:14:00.629133'
  msg: non-zero return code
  rc: 20
  start: '2025-05-14 10:14:00.328940'
  stderr: '{"level":"fatal","msg":"rpc error: code = Unknown desc = error testing repository connectivity: Get \"https://github.co


### Analysis of the `argocd_add_platform.yml` Playbook:

This playbook performs the following tasks:

1. **Create Directory for Mount**: Creates a directory and mounts it to the existing persistent Block Storage (currently commented out as the storage isn't set up yet).
2. **Mount the Directory**: Mounts the directory for use (currently disabled).
3. **Retrieve ArgoCD Admin Password**: Gets the ArgoCD admin password from the Kubernetes secret.
4. **Decode ArgoCD Admin Password**: Decodes the ArgoCD password for login.
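5. **Login to ArgoCD**: Logs in to ArgoCD with the `argocd` CLI, using the decoded admin password.
6. **Add Repository to ArgoCD**: Registers the Git repository with ArgoCD so that the platform can be synced with any updates to the Kubernetes manifests.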
7. **Ensure Namespace Exists**: Verifies the `Modular-Speech-platform` namespace exists or creates it if missing.
8. **Check and Generate MinIO Secret**: Checks if the MinIO secret exists, and generates it if this is the first run.
9. **Fetch and Decode MinIO Secret**: Fetches and decodes the MinIO secret if already created.
10. **Check for ArgoCD Application**: Verifies if an ArgoCD application exists.
11. **Create or Update ArgoCD Helm Application**: Creates or updates applications (e.g., MinIO, MLFlow, PostgreSQL) via Helm.
12. **Display MinIO Credentials**: Displays MinIO credentials for login.
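
Steps 3 and 4 can be sketched in shell as follows. This is a minimal illustration, not the playbook's actual tasks: it assumes a default ArgoCD install, where the initial admin password is stored in the `argocd-initial-admin-secret` secret in the `argocd` namespace, and the helper names `decode_secret_value` and `argocd_admin_password` are hypothetical.

```shell
# Decode a base64-encoded value (Kubernetes stores secret data base64-encoded).
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Fetch and decode the ArgoCD admin password.
# Assumption: default secret name and namespace of a standard ArgoCD install.
argocd_admin_password() {
  encoded=$(kubectl -n argocd get secret argocd-initial-admin-secret \
    -o jsonpath='{.data.password}') || return 1
  decode_secret_value "$encoded"
}

# Example with a made-up value (not a real credential):
decode_secret_value "aHVudGVyMg=="; echo   # prints: hunter2
```

On a node with cluster access, `argocd_admin_password` would print the password that the login step then uses.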

After the first run, any changes to the Helm application through Git will be automatically reflected in ArgoCD.


Once the platform is deployed, you can access the following services by substituting `A.B.C.D` with your floating IP:

* **MinIO Object Store**: [http://A.B.C.D:9001](http://A.B.C.D:9001)
* **MLFlow**: [http://A.B.C.D:8000](http://A.B.C.D:8000)
* **Label Studio**: [http://A.B.C.D:5000](http://A.B.C.D:5000)
* **Prometheus**: [http://A.B.C.D:4000](http://A.B.C.D:4000)
* **Grafana**: [http://A.B.C.D:3000](http://A.B.C.D:3000)
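
To check quickly which of these services are reachable, something like the following sketch may help. The ports are the ones listed above; the helper names `platform_urls` and `check_platform` are hypothetical, and `192.0.2.10` is a documentation-only placeholder for your floating IP.

```shell
# Print "name url" for each platform service, given a floating IP.
platform_urls() {
  ip=$1
  printf 'minio http://%s:9001\n' "$ip"
  printf 'mlflow http://%s:8000\n' "$ip"
  printf 'label-studio http://%s:5000\n' "$ip"
  printf 'prometheus http://%s:4000\n' "$ip"
  printf 'grafana http://%s:3000\n' "$ip"
}

# Report which services answer over HTTP.
# Requires curl and network access to the cluster; not called here.
check_platform() {
  platform_urls "$1" | while read -r name url; do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "$name: reachable"
    else
      echo "$name: not reachable"
    fi
  done
}

platform_urls 192.0.2.10
```

Running `check_platform A.B.C.D` with your own floating IP gives a one-line status per service. A service that is still starting may be reported as not reachable for a short while.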


Next, we will deploy the **Modular Speech application**. Before deploying, we need to build the container images for the different environments. To do this, we’ll run a one-time workflow in **Argo Workflows** to build the initial container images for the following environments:

* **Staging**
* **Canary**
* **Production**

This will ensure that each environment has its own respective image, ready for deployment.


In [7]:
cd /work/ML-SysOps_Project-main\ 2/continous_X_pipeline/ansible

In [8]:
ansible-playbook -i inventory.yml argocd/workflow_build_init.yml


PLAY [Run Argo Workflow from GitHub Repo] **************************************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Clone or update Modular-Speech-iac repo] *********************************
changed: [node1]

TASK [Submit Argo Workflow] ****************************************************
changed: [node1]

TASK [Extract Workflow Name] ***************************************************
ok: [node1]

TASK [Wait for workflow to complete (success or fail)] *************************
changed: [node1]

TASK [Get final workflow result] ***********************************************
changed: [node1]

TASK [Display workflow phase] ****************************


The `workflow_build_init.yml` playbook submits the `build-initial.yaml` workflow, which builds the initial container images for staging, canary, and production using the FastAPI wrapper for the model.
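
The playbook output above shows the workflow being submitted and then waited on until it succeeds or fails. A rough sketch of that waiting step is given below. It relies on the `.status.phase` field that Argo Workflows sets on each Workflow (`Pending`, `Running`, `Succeeded`, `Failed`, or `Error`). The `argo` namespace default and the helper names are assumptions for illustration, not taken from the playbook.

```shell
# Succeed if the given phase is one of Argo Workflows' terminal phases.
is_terminal_phase() {
  case "$1" in
    Succeeded|Failed|Error) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll a Workflow until it reaches a terminal phase, then print that phase.
# Arguments: workflow name, namespace (default "argo", an assumption),
# poll interval in seconds (default 10).
# Exit status is 0 only when the final phase is Succeeded.
wait_for_workflow() {
  name=$1; ns=${2:-argo}; interval=${3:-10}
  while :; do
    phase=$(kubectl -n "$ns" get workflow "$name" \
      -o jsonpath='{.status.phase}') || return 1
    if is_terminal_phase "$phase"; then
      echo "$phase"
      [ "$phase" = "Succeeded" ]
      return
    fi
    sleep "$interval"
  done
}
```

Called with the generated workflow name, `wait_for_workflow` blocks until the build finishes, which makes it easy to chain the deployment steps after a successful build.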

In [9]:
cd /work/ML-SysOps_Project-main\ 2/continous_X_pipeline/ansible


In [11]:
ansible-playbook -i inventory.yml argocd/argocd_add_staging.yml


PLAY [Deploy Modular-Speech Classification Staging via ArgoCD & Helm] **********

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password] ***********************************************
changed: [node1]

TASK [Decode ArgoCD password] **************************************************
changed: [node1]

TASK [Login to ArgoCD] *********************************************************
ok: [node1]

TASK [Detect external IP starting with 10.56] **********************************
ok: [node1]

TASK [Create Modular-Speech-staging namespace if missing] **********************
fatal: [node1]: FAILED! => changed=false 
  cmd:
  - kubectl
  - create
  - namespace
  - Modular-Speech-staging
  delta: '0:00:00.078087'
  end: '2025-05-14 09:01:53.615235'
  failed_when_result: true
  msg: non-zero return code
  rc: 1
  start: '2025-05-14 09:01:53.537148'
  stderr: 'The Namespace "Modular-Speech-staging" is invalid: metadata.name: Inv


Running the `argocd_add_staging.yml` playbook creates the staging namespace (named `Modular-Speech-staging` in the output above), which we can monitor in ArgoCD. It then applies the staging manifest, which creates a container for the staging environment from the staging image we built above.

At the end of this playbook run, our application should be up and running and available at http://A.B.C.D:8081 (where A.B.C.D is our public IP).
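
In the staging run shown above, the namespace-creation task failed with a message that the name is invalid. The message is truncated, but Kubernetes requires namespace names to be RFC 1123 DNS labels: at most 63 characters, lowercase alphanumerics or `-`, starting and ending with an alphanumeric. A name with uppercase letters, such as `Modular-Speech-staging`, is therefore rejected. The sketch below validates a name before creating the namespace. The helper names are hypothetical, and lowercasing is only one possible fix.

```shell
# Succeed if the argument is a valid RFC 1123 DNS label (Kubernetes namespace name).
is_valid_namespace() {
  name=$1
  [ "${#name}" -le 63 ] || return 1
  printf '%s' "$name" | grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?$'
}

# Lowercase a name; enough to fix names that are invalid only because of case.
to_namespace() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Create the namespace only if it is valid and does not exist yet.
# Requires kubectl and cluster access; not called here.
ensure_namespace() {
  ns=$1
  is_valid_namespace "$ns" || { echo "invalid namespace name: $ns" >&2; return 1; }
  kubectl get namespace "$ns" >/dev/null 2>&1 || kubectl create namespace "$ns"
}

to_namespace "Modular-Speech-staging"; echo   # prints: modular-speech-staging
```

If the namespace name is changed, any manifests or Helm values that reference it need to use the same lowercase name.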


In [28]:
cd /work/MLOps/continous_X_pipeline/ansible

In [29]:
ansible-playbook -i inventory.yml argocd/argocd_add_canary.yml


PLAY [Deploy Bird Classification Canary via ArgoCD & Helm] *********************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password] ***********************************************
changed: [node1]

TASK [Decode ArgoCD password] **************************************************
changed: [node1]

TASK [Login to ArgoCD] *********************************************************
ok: [node1]

TASK [Detect external IP starting with 10.56] **********************************
ok: [node1]

TASK [Create birdclef-canary namespace if missing] *****************************
ok: [node1]

TASK [Check if ArgoCD app exists] **********************************************
ok: [node1]

TASK [Create ArgoCD Helm app if not exists] ************************************
skipping: [node1]

TASK [Update ArgoCD Helm app if exists] ****************************************
changed: [node1]

TASK [Display ArgoCD app status] *****************

Running the `argocd_add_canary.yml` playbook creates the `birdclef-canary` namespace if it is missing, which we can monitor in ArgoCD. It then applies the canary manifest, which creates a container for the canary environment from the canary image we built above.

At the end of this playbook run, our application should be up and running and available at http://A.B.C.D:8080 (where A.B.C.D is our public IP).

In [13]:
cd /work/ML-SysOps_Project-main\ 2/continous_X_pipeline/ansible

In [14]:
ansible-playbook -i inventory.yml argocd/argocd_add_prod.yml


PLAY [Deploy Modular Speech Production via ArgoCD & Helm] **********************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Get ArgoCD admin password] ***********************************************
changed: [node1]

TASK [Decode ArgoCD password] **************************************************
changed: [node1]

TASK [Login to ArgoCD] *********************************************************
ok: [node1]

TASK [Detect external IP starting with 10.56] **********************************
ok: [node1]

TASK [Create Modular-Speech-production namespace if missing] *******************
fatal: [node1]: FAILED! => changed=false 
  cmd:
  - kubectl
  - create
  - namespace
  - Modular-Speech-production
  delta: '0:00:00.129734'
  end: '2025-05-14 09:08:40.582922'
  failed_when_result: true
  msg: non-zero return code
  rc: 1
  start: '2025-05-14 09:08:40.453188'
  stderr: 'The Namespace "Modular-Speech-production" is invalid: metadata.nam


Running the `argocd_add_prod.yml` playbook creates the production namespace (named `Modular-Speech-production` in the output above), which we can monitor in ArgoCD. It then applies the production manifest, which creates a container for the production environment from the production image we built above.

At the end of this playbook run, our application should be up and running and available at http://A.B.C.D (where A.B.C.D is our public IP).

In [32]:
cd /work/MLOps/continous_X_pipeline/ansible

Now, we will manage our application lifecycle with Argo Workflows. We will look at these workflows in more depth in the next sections.

In [33]:
ansible-playbook -i inventory.yml argocd/workflow_templates_apply.yml


PLAY [Clone repo and apply specific Argo WorkflowTemplates] ********************

TASK [Gathering Facts] *********************************************************
ok: [node1]

TASK [Clone or update birdclef-iac repo] ***************************************
ok: [node1]

TASK [Apply selected WorkflowTemplates to Argo namespace] **********************
changed: [node1] => (item=build-container-image.yaml)
changed: [node1] => (item=deploy-container-image.yaml)
changed: [node1] => (item=promote-model.yaml)
changed: [node1] => (item=train-model.yaml)

TASK [Verify applied WorkflowTemplates] ****************************************
changed: [node1]

TASK [Show WorkflowTemplates] **************************************************
ok: [node1] => 
  wft_list.stdout: |-
    NAME                     AGE
    build-container-image    3h9m
    deploy-container-image   3h9m
    promote-model            3h9m
    train-model              3h9m

PLAY RECAP *************************************************
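
To confirm independently that the four WorkflowTemplates listed above are present, a small check along these lines could be used. The helper name `missing_templates` is hypothetical, and the `argo` namespace in the usage comment is an assumption based on the task name in the output.

```shell
# Read WorkflowTemplate names from stdin (one per line) and print each
# expected template that is absent. Prints nothing when all are present.
missing_templates() {
  present=$(cat)
  for t in build-container-image deploy-container-image promote-model train-model; do
    printf '%s\n' "$present" | grep -qxF -- "$t" || echo "$t"
  done
}

# Against a live cluster (assumes the templates live in the "argo" namespace):
#   kubectl -n argo get workflowtemplates -o name | sed 's|.*/||' | missing_templates
```

Empty output means all four templates were applied. Any name printed is a template that still needs to be applied.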