# AKS Cookbook

## 🧪 Kubeflow on AKS lab

![visual](visual.png)

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.  
In this lab you will deploy an Azure Kubernetes Service (AKS) cluster and other Azure services (Container Registry, Managed Identity, Key Vault) with Azure CLI and Bicep. You will then install Kubeflow with its default settings using Kustomize and create a Jupyter notebook server that you can access from your browser.  
The lab is based on the [Vanilla Installation](https://azure.github.io/kubeflow-aks/main/docs/deployment-options/vanilla-installation/) guide from the kubeflow-aks project.  

▶️ Click on the `Run All` button to execute all the subsequent steps in sequence, or run each step individually by executing the cells one at a time.

### TOC

- [0️⃣ Initialize notebook variables](#0)
- [1️⃣ Verify the Azure CLI and connected Azure subscription](#1)
- [2️⃣ Create Kubeflow deployment with Bicep 🦾](#2)
- [3️⃣ Retrieve deployment outputs](#3)
- [4️⃣ Connect to the AKS cluster](#4)
- [5️⃣ Retrieve the list of AKS cluster nodes](#5)
- [6️⃣ Deploy Kubeflow without TLS using Default Password](#6)
- [7️⃣ Check the status of the pods created](#7)
- [8️⃣ Access the Kubeflow dashboard with a port forward](#8)
- [🗑️ Clean up resources](#clean)


<a id='0'></a>
### 0️⃣ Initialize notebook variables
You can use this notebook either to create the necessary resources or to work with existing ones.  
Adjust the location parameters according to your preferences and the [product availability by Azure region](https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?cdn=disable).

In [None]:
import os, time, json, requests, shutil, utils

create_resources = True # specify True if you want to create new resources, False to use existing ones

if create_resources:
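    # create new resources with the following properties
    # '__vsc_ipynb_file__' is set by the VS Code Jupyter extension; the deployment name defaults to the notebook's folder name
    deployment_name = os.path.basename(os.path.dirname(globals()['__vsc_ipynb_file__']))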
    resource_group_name = f"lab-{deployment_name}" # change the name to match your naming convention
    resource_group_location = "eastus2"
else:
    # or use the following existing resources
    resource_group_name = ""
    aks_resource_name = ""

utils.print_ok('Notebook initialized')
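
The cell above derives `deployment_name` from `__vsc_ipynb_file__`, a variable that appears to be specific to the VS Code Jupyter extension. As a minimal sketch, assuming you also want to run the notebook elsewhere, a helper could fall back to the current working directory. `resolve_deployment_name` and its `fallback_dir` parameter are hypothetical names, not part of `utils`.

```python
import os


def resolve_deployment_name(namespace, fallback_dir=None):
    """Return the name of the folder that contains the notebook.

    `namespace` is the mapping to inspect, normally `globals()`.
    """
    notebook_path = namespace.get("__vsc_ipynb_file__")
    if notebook_path:
        folder = os.path.dirname(notebook_path)
    else:
        # Outside VS Code the variable is missing: use a directory instead.
        folder = fallback_dir or os.getcwd()
    return os.path.basename(os.path.normpath(folder))


# Illustrative path only; in the notebook you would pass globals().
example_namespace = {
    "__vsc_ipynb_file__": os.path.join("labs", "kubeflow", "kubeflow.ipynb")
}
print(resolve_deployment_name(example_namespace))
```

In the notebook, `deployment_name = resolve_deployment_name(globals())` would then replace the direct dictionary lookup.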

<a id='1'></a>
### 1️⃣ Verify the Azure CLI and connected Azure subscription
The following commands verify that the Azure CLI is connected to your Azure subscription, register the resource providers the lab needs, and install or update the required Azure CLI extensions.

In [None]:
output = utils.run("az account show", "Retrieved az account", "Failed to get the current az account")
if output.success and output.json_data:
    current_user = output.json_data['user']['name']
    subscription_id = output.json_data['id']
    tenant_id = output.json_data['tenantId']

output = utils.run("az ad signed-in-user show", "Retrieved signed-in-user", "Failed to get signed-in-user")
if output.success and output.json_data:
    signed_in_user_id = output.json_data['id']
    utils.print_info(f"Signed-in User Id: {signed_in_user_id}")

output = utils.run("az provider register --namespace Microsoft.ContainerService --wait", "Microsoft.ContainerService registered in your subscription", "Failed to register Microsoft.ContainerService")
output = utils.run("az provider register --namespace Microsoft.KubernetesConfiguration --wait", "Microsoft.KubernetesConfiguration registered in your subscription", "Failed to register Microsoft.KubernetesConfiguration")
output = utils.run("az extension add --name k8s-extension", "az k8s-extension installed", "Failed to install az k8s-extension")
output = utils.run("az extension update --name k8s-extension", "az k8s-extension updated", "Failed to update az k8s-extension")
output = utils.run("az extension add --name aks-preview", "az aks-preview extension installed", "Failed to install az aks-preview extension")
output = utils.run("az extension update --name aks-preview", "az aks-preview extension updated", "Failed to update az aks-preview extension")
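
The first call above reads three fields from the JSON printed by `az account show`. The sketch below shows that extraction in isolation, without the `utils` wrapper. `parse_account` is a hypothetical helper, and the sample values are placeholders rather than real identifiers.

```python
import json


def parse_account(raw_json):
    """Extract the fields the notebook reads from `az account show` output."""
    account = json.loads(raw_json)
    return {
        "current_user": account["user"]["name"],
        "subscription_id": account["id"],
        "tenant_id": account["tenantId"],
    }


# Illustrative sample only; real output contains more fields.
sample = json.dumps({
    "id": "00000000-0000-0000-0000-000000000000",
    "tenantId": "11111111-1111-1111-1111-111111111111",
    "user": {"name": "user@example.com", "type": "user"},
})
print(parse_account(sample))
```

With the Azure CLI installed, the raw JSON could come from `subprocess.run(["az", "account", "show", "--output", "json"], capture_output=True, text=True, check=True).stdout`.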


<a id='2'></a>
### 2️⃣ Create Kubeflow deployment with Bicep 🦾
All resources deployed in this lab will be created within the designated resource group.  
The following step clones the kubeflow-aks repository and creates an AKS cluster using a Bicep deployment.

In [None]:
if not os.path.exists(".temp/kubeflow-aks"):
    utils.run("git clone --recurse-submodules https://github.com/Azure/kubeflow-aks.git .temp/kubeflow-aks", "Cloned kubeflow-aks repo", "Failed to clone kubeflow-aks repo")
if not os.path.exists(".temp/kubeflow-aks/manifests/vanilla"):
    shutil.copytree('.temp/kubeflow-aks/deployments/vanilla', '.temp/kubeflow-aks/manifests/vanilla') 

if create_resources:
    utils.create_resource_group(create_resources, resource_group_name, resource_group_location)
    bicep_parameters = {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "signedinuser": { "value": signed_in_user_id }
        }
    }    
    with open('params.json', 'w') as bicep_parameters_file:
        bicep_parameters_file.write(json.dumps(bicep_parameters))

    output = utils.run(f"az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file .temp/kubeflow-aks/main.bicep --parameters params.json", 
                 f"Deployment '{deployment_name}' succeeded", f"Deployment '{deployment_name}' failed")
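
The cell above writes a parameters file and then builds the `az deployment group create` command as a single string. As a sketch of one alternative, the same command can be assembled as an argument list, which avoids shell quoting problems if a name ever contains spaces. `write_parameters_file` and `build_deployment_command` are hypothetical helpers, and the argument values in the example are placeholders.

```python
import json
import os
import tempfile


def write_parameters_file(path, signed_in_user_id):
    """Write an ARM/Bicep parameters file with the single parameter used in this lab."""
    parameters = {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {"signedinuser": {"value": signed_in_user_id}},
    }
    with open(path, "w", encoding="utf-8") as parameters_file:
        json.dump(parameters, parameters_file, indent=2)
    return parameters


def build_deployment_command(deployment_name, resource_group_name, template_file, parameters_file):
    """Return the deployment command as an argument list suitable for subprocess.run."""
    return [
        "az", "deployment", "group", "create",
        "--name", deployment_name,
        "--resource-group", resource_group_name,
        "--template-file", template_file,
        "--parameters", parameters_file,
    ]


# Demonstration with placeholder values; nothing is deployed here.
with tempfile.TemporaryDirectory() as temp_dir:
    params_path = os.path.join(temp_dir, "params.json")
    write_parameters_file(params_path, "placeholder-user-object-id")
    command = build_deployment_command(
        "kubeflow", "lab-kubeflow", ".temp/kubeflow-aks/main.bicep", params_path
    )
    print(" ".join(command))
```

Passing `command` to `subprocess.run(command, check=True)` would run the deployment without going through a shell.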


<a id='3'></a>
### 3️⃣ Retrieve deployment outputs

In [None]:
if create_resources:
    # retrieve deployment outputs
    output = utils.run(f"az deployment group show --name {deployment_name} -g {resource_group_name}", f"Retrieved deployment: {deployment_name}", f"Failed to retrieve deployment: {deployment_name}")
    if output.success and output.json_data:
        kv_resource_name = utils.get_deployment_output(output, 'kvAppName', 'KeyVault resource name')
        aks_resource_name = utils.get_deployment_output(output, 'aksClusterName', 'AKS resource name')

output = utils.run(f"az acr list -g {resource_group_name} --only-show-errors", "Retrieved ACR name", "Failed to retrieve ACR name")
if output.success and output.json_data:
    acr_resource_name = output.json_data[0]['name']
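
Deployment outputs are returned by `az deployment group show` under `properties.outputs`, with each output wrapped in an object that has a `value` field. The sketch below shows roughly what a lookup such as `utils.get_deployment_output` has to do. The real helper's implementation is not shown in this lab, so `get_output_value` and the sample resource names are assumptions.

```python
def get_output_value(deployment, output_name):
    """Return the value of a named output from `az deployment group show` JSON."""
    outputs = (deployment.get("properties") or {}).get("outputs") or {}
    if output_name not in outputs:
        available = ", ".join(sorted(outputs)) or "none"
        raise KeyError(f"Output '{output_name}' not found (available: {available})")
    return outputs[output_name]["value"]


# Illustrative sample; the resource names are placeholders.
sample_deployment = {
    "properties": {
        "outputs": {
            "kvAppName": {"type": "String", "value": "kv-example"},
            "aksClusterName": {"type": "String", "value": "aks-example"},
        }
    }
}
print(get_output_value(sample_deployment, "aksClusterName"))
```

Raising an error that lists the available outputs makes a renamed or missing Bicep output easier to spot than a silent `None`.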



<a id='4'></a>
### 4️⃣ Connect to the AKS cluster
Configure kubectl to connect to your Kubernetes cluster using the [az aks get-credentials](https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest#az-aks-get-credentials) command. This command downloads credentials and configures the Kubernetes CLI to use them.

In [None]:
output = utils.run(f"az aks get-credentials --resource-group {resource_group_name} --name {aks_resource_name} --overwrite-existing",
             f"Credentials for AKS cluster '{aks_resource_name}' configured",
             f"Failed to configure credentials for AKS cluster '{aks_resource_name}'")

output = utils.run("kubelogin convert-kubeconfig -l azurecli", "kubelogin succeeded", "Failed to run kubelogin")


<a id='5'></a>
### 5️⃣ Retrieve the list of AKS cluster nodes
Verify the connection to your cluster using the [kubectl get](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get) command. This command returns a list of the cluster nodes.

In [None]:
! kubectl get nodes
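
If you prefer a programmatic check over reading the table, `kubectl get nodes -o json` returns a list whose items carry a `Ready` condition under `status.conditions`. The following sketch, with the hypothetical helper `not_ready_nodes` and made-up node names, lists the nodes that are not ready.

```python
def not_ready_nodes(node_list):
    """Return the names of nodes whose Ready condition is not 'True'."""
    not_ready = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        is_ready = any(
            condition.get("type") == "Ready" and condition.get("status") == "True"
            for condition in conditions
        )
        if not is_ready:
            not_ready.append(node["metadata"]["name"])
    return not_ready


# Illustrative sample shaped like `kubectl get nodes -o json`; names are made up.
sample_nodes = {
    "items": [
        {"metadata": {"name": "node-a"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "node-b"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
}
print(not_ready_nodes(sample_nodes))
```

An empty result means every node reported `Ready`, which is a reasonable gate before applying the Kubeflow manifests.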

<a id='6'></a>
### 6️⃣ Deploy Kubeflow without TLS using Default Password

In [None]:
! kubectl apply -k .temp/kubeflow-aks/manifests/vanilla
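
A first `kubectl apply -k` over a large manifest set can fail partway, typically because some custom resources are submitted before their CRDs are established, and re-running the command usually resolves it. A minimal retry sketch follows. `retry` is a hypothetical helper, and the attempt count and delay are arbitrary assumptions.

```python
import time


def retry(action, attempts=5, delay_seconds=10, sleep=time.sleep):
    """Call `action` until it returns True; return the number of attempts used."""
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            # Give the cluster time to register CRDs before trying again.
            sleep(delay_seconds)
    raise RuntimeError(f"Action still failing after {attempts} attempts")


# Demonstration with a fake action that succeeds on the third call.
calls = {"count": 0}


def fake_apply():
    calls["count"] += 1
    return calls["count"] >= 3


print(retry(fake_apply, delay_seconds=0))
```

For the real deployment, the action could be `lambda: subprocess.run(["kubectl", "apply", "-k", ".temp/kubeflow-aks/manifests/vanilla"]).returncode == 0`.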


<a id='7'></a>
### 7️⃣ Check the status of the pods created


In [None]:
! kubectl get pods -n cert-manager
! kubectl get pods -n istio-system
! kubectl get pods -n auth
! kubectl get pods -n knative-eventing
! kubectl get pods -n knative-serving
! kubectl get pods -n kubeflow
! kubectl get pods -n kubeflow-user-example-com
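
Scanning seven tables by eye is error-prone, so the check can also be summarized in code. `kubectl get pods -A -o json` returns every pod with its `status.phase`, and the sketch below keeps only pods that are neither `Running` nor `Succeeded`. `unhealthy_pods` is a hypothetical helper and the sample pod names are made up.

```python
HEALTHY_PHASES = {"Running", "Succeeded"}


def unhealthy_pods(pod_list):
    """Return (namespace, name, phase) for pods that are not Running or Succeeded."""
    results = []
    for pod in pod_list.get("items", []):
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase not in HEALTHY_PHASES:
            metadata = pod["metadata"]
            results.append((metadata["namespace"], metadata["name"], phase))
    return results


# Illustrative sample shaped like `kubectl get pods -A -o json`.
sample_pods = {
    "items": [
        {"metadata": {"namespace": "kubeflow", "name": "pod-a"},
         "status": {"phase": "Running"}},
        {"metadata": {"namespace": "istio-system", "name": "pod-b"},
         "status": {"phase": "Pending"}},
    ]
}
for namespace, name, phase in unhealthy_pods(sample_pods):
    print(f"{namespace}/{name}: {phase}")
```

Note that a pod in the `Running` phase can still have containers that are not ready, so treat this as a first-pass filter rather than a full health check.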

<a id='8'></a>
### 8️⃣ Access the Kubeflow dashboard with a port forward

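Run `kubectl port-forward` to access the Kubeflow dashboard. The command keeps running until you interrupt the cell, so the connection details are printed before it starts.



In [None]:
utils.print_ok("Kubeflow UI is available at http://localhost:8080")
utils.print_info('The default email address is user@example.com and the default password is 12341234')

# port-forward blocks until the cell is interrupted, so it runs last
! kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80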
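
The port forward takes a moment before it accepts connections. If you start it in the background instead of in a blocking cell, a small polling helper can confirm the local port is open before you switch to the browser. This is a sketch: `wait_for_port` is a hypothetical helper, and the timeout values are arbitrary assumptions.

```python
import socket
import time


def wait_for_port(host, port, timeout_seconds=30.0, interval_seconds=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval_seconds):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and try again.
            time.sleep(interval_seconds)
    return False


# Demonstration against a temporary local listener instead of a real port forward.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(wait_for_port("127.0.0.1", demo_port, timeout_seconds=2.0))
listener.close()
```

One way to use it is to start the forward with `subprocess.Popen(["kubectl", "port-forward", "svc/istio-ingressgateway", "-n", "istio-system", "8080:80"])` and then call `wait_for_port("localhost", 8080)`.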

<a id='clean'></a>
### 🗑️ Clean up resources
When you're finished with the lab, you should remove all your deployed resources from Azure to avoid extra charges and keep your Azure subscription uncluttered. Use the [clean-up-resources notebook](clean-up-resources.ipynb) for that.