## Wallaroo Dashboard Metrics Retrieval Tutorial

The following tutorial demonstrates using the Wallaroo MLOps API to retrieve Wallaroo metrics data.  These requests are compliant with Prometheus API endpoints.  

This tutorial is split into two sections:

* Inference Data Generation:  This section creates a Wallaroo pipeline and submits inference requests to generate the log files and other data.
* Wallaroo Dashboard Metrics Retrieval via the Wallaroo MLOps API:  Details the Wallaroo MLOps API metrics retrieval endpoints and provides a demonstration of retrieving metrics data.

### Prerequisites

This tutorial assumes the following:

* A Wallaroo Ops environment is installed.
* The Wallaroo SDK is installed.  These examples use the Wallaroo SDK to generate the initial inference data for the metrics requests.

## Inference Data Generation

This part of the tutorial generates the inference results used for the rest of the tutorial.

### Import libraries

The first step is to import the libraries required.

In [13]:
import json
import numpy as np
import pandas as pd

import pytz
import datetime

import requests
from requests.auth import HTTPBasicAuth

import wallaroo

### Connect to the Wallaroo Instance

A connection to Wallaroo is established via the Wallaroo client.  The Python library is included in the Wallaroo install and available through the JupyterHub interface provided with your Wallaroo environment.

This is accomplished using the `wallaroo.Client()` command, which provides a URL to grant the SDK permission to your specific Wallaroo environment.  When displayed, enter the URL into a browser and confirm permissions.  Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use `wl = wallaroo.Client()`.  For more information on Wallaroo Client settings, see the [Client Connection guide](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-client/).

In [14]:
wl = wallaroo.Client(api_endpoint="https://autoscale-uat-gcp.wallaroo.dev/", 
                     auth_type="sso")



Please log into the following URL in a web browser:

	https://autoscale-uat-gcp.wallaroo.dev/auth/realms/master/device?user_code=RLEL-LSWU

Login successful!


In [None]:
model_name = "ccfraud-model"
model_file_name = "./models/ccfraud.onnx"


The following queries are available for resource consumption.

#### Storage Resources

| Name | Query |
|---|---|
| Get the total size of the minio volume (used to keep models and orch data) | `kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="minio"}` |
| Get the total size of the plateau volume (used to keep the inference logs) | `kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="plateau-managed-disk"}` |
| Get the total models storage | `sum(max by(sha) (model_bytes))` |
| Get the total size of all the pipeline files by pipeline (without pipeline_version) | `sum by(pipeline) (pipeline_models_bytes)` |
| Get the total size of all the pipeline files by pipeline AND pipeline_version | `sum by(pipeline, pipeline_version) (pipeline_models_bytes)` |
| Get the total size of all the orchestrations by orchestration AND id | `sum by(id, orchestration) (orchestration_bytes)` |
| Get the plateau topic storage usage by topic name | `topic_bytes{topic="<topic_name>"}` |
| Get the used plateau (log) storage | `sum (topic_bytes)` |

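The storage queries above report sizes in bytes, and Prometheus returns each sample value as a string. The following is a minimal sketch of a conversion helper for reading those values. `format_bytes` is a hypothetical name introduced here for illustration and is not part of the Wallaroo SDK.

```python
def format_bytes(value) -> str:
    """Convert a byte count (string or number) to a binary-unit string."""
    size = float(value)
    # Step through the units until the value drops below 1024.
    for unit in ("B", "KiB", "MiB", "GiB"):
        if abs(size) < 1024:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} TiB"


print(format_bytes("1928"))
print(format_bytes("77858235"))
```

The helper only changes how a value is displayed; the queries themselves still return raw byte counts.
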
#### CPU and Memory Resources

| Name |  Parameterized Query | Example Query |
|---|---|---|
| Total allocated CPU resources (cluster-wide), Current point in time | `sum(kube_node_status_capacity{resource="cpu"})` | `sum(kube_node_status_capacity{resource="cpu"})` |
| Total allocated CPU resources (cluster-wide), Average over some time  | `avg_over_time(sum(kube_node_status_capacity{resource="cpu"})[rangeVector])` | `avg_over_time(sum(kube_node_status_capacity{resource="cpu"})[5m:15s])` |
| Used CPU resources (cluster wide) Current point in time | `sum(rate(container_cpu_usage_seconds_total[{time_interval}]))` | `sum(rate(container_cpu_usage_seconds_total[5m]))` |
| Used CPU resources (cluster wide) Average over some time | `avg_over_time(sum(rate(container_cpu_usage_seconds_total[{time_interval}]))[rangeVector])` | `avg_over_time(sum(rate(container_cpu_usage_seconds_total[5m]))[5m:15s])` |
| Total CPU requests by wallaroo workload | `sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})` | `sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})` | 
| Total CPU limits by wallaroo workload | `sum(kube_pod_container_resource_limits{resource="cpu",app_kubernetes_io_instance="wallaroo"})` | `sum(kube_pod_container_resource_limits{resource="cpu",app_kubernetes_io_instance="wallaroo"})` |
| Total allocated Memory resource (cluster-wide) Current point in time | `sum(kube_node_status_capacity{resource="memory"})` | `sum(kube_node_status_capacity{resource="memory"})` |
| Total allocated Memory resource Average over some time | `avg_over_time(sum(kube_node_status_capacity{resource="memory"})[rangeVector])` | `avg_over_time(sum(kube_node_status_capacity{resource="memory"})[5m:15s])` |
| Used Memory resources (cluster-wide) Current point in time | `sum(container_memory_usage_bytes)` | `sum(container_memory_usage_bytes)` | 
| Used Memory resources (cluster-wide) Average over some time. | `avg_over_time(sum(container_memory_usage_bytes)[rangeVector])` | `avg_over_time(sum(container_memory_usage_bytes)[5m:15s])` |
| Total Memory requests by wallaroo workload | `sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})` | `sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})` |
| Total Memory limits by wallaroo workload | `sum(kube_pod_container_resource_limits{resource="memory", app_kubernetes_io_instance="wallaroo"})` | `sum(kube_pod_container_resource_limits{resource="memory", app_kubernetes_io_instance="wallaroo"})` | 

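In the parameterized queries above, `[rangeVector]` stands for a Prometheus subquery range of the form `[<range>:<resolution>]`, such as `[5m:15s]`. The sketch below fills in that placeholder. `avg_over_time_query` is an illustrative helper name, not a Wallaroo or Prometheus API.

```python
def avg_over_time_query(inner: str, window: str, resolution: str = "") -> str:
    """Wrap an instant-vector expression in avg_over_time over a subquery range.

    An empty resolution produces `[window:]`, which leaves the step to the
    Prometheus server's default evaluation interval.
    """
    return f"avg_over_time({inner}[{window}:{resolution}])"


cpu_capacity = 'sum(kube_node_status_capacity{resource="cpu"})'
print(avg_over_time_query(cpu_capacity, "5m", "15s"))
print(avg_over_time_query(cpu_capacity, "7d"))
```
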
#### GPU Resources

| Name |  Parameterized Query |
|---|---|
| Total GPU in use by pods resources | `sum(kube_pod_container_resource_limits{resource=~"nvidia_com_gpu\|qualcomm_com_qaic",app_kubernetes_io_instance="wallaroo"})` |
| Total GPU allocated in the cluster | `sum(kube_node_status_capacity{resource=~"nvidia_com_gpu\|qualcomm_com_qaic"})` |

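In the table above, the `\|` inside the regular expression is a Markdown escape that keeps the pipe from ending the table cell; the query sent to the API uses a plain `|`. As a sketch, the selector can be assembled from a list of resource names. The function name `gpu_limits_query` is hypothetical, and the restriction to letters, digits, and underscores is an assumption made to keep the generated regular expression simple.

```python
import re

# Assumption: resource names contain only letters, digits, and underscores.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def gpu_limits_query(resources, instance="wallaroo"):
    """Build the 'GPU in use by pods' query for the given resource names."""
    for name in resources:
        if not _SAFE_NAME.match(name):
            raise ValueError(f"unsupported resource name: {name!r}")
    pattern = "|".join(resources)  # plain pipe, no Markdown escape
    selector = f'resource=~"{pattern}",app_kubernetes_io_instance="{instance}"'
    return "sum(kube_pod_container_resource_limits{" + selector + "})"


print(gpu_limits_query(["nvidia_com_gpu", "qualcomm_com_qaic"]))
```
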
#### 7 Day Average Queries

The following queries return average results over a 7-day period.

| Name |  Parameterized Query | Example Query |
|---|---|---|
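| Memory requests | `avg_over_time(sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})[7d:])` | `avg_over_time(sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})[7d:])` |
| CPU requests | `avg_over_time(sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})[7d:])` | `avg_over_time(sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})[7d:])` |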
| CPU utilization (percentage) for a given pipeline | `avg_over_time(sum(rate(container_cpu_usage_seconds_total{namespace="<pipeline namespace>"}[5m]))[7d:]) / sum(kube_pod_container_resource_requests{namespace="<pipeline namespace>",resource="cpu"}) * 100` | `avg_over_time(sum(rate(container_cpu_usage_seconds_total{namespace="sample-pipeline-114"}[5m]))[7d:]) / sum(kube_pod_container_resource_requests{namespace="sample-pipeline-114",resource="cpu"}) * 100` |
| Memory utilization (percentage) | `avg_over_time(sum(container_memory_usage_bytes{namespace="<pipeline namespace>"})[7d:]) / sum(kube_pod_container_resource_requests{namespace="<pipeline namespace>", resource="memory"}) * 100` | `avg_over_time(sum(container_memory_usage_bytes{namespace="sample-pipeline-114"})[7d:]) / sum(kube_pod_container_resource_requests{namespace="sample-pipeline-114", resource="memory"}) * 100` |

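The two utilization queries divide a 7-day average of usage by the current resource requests for the same namespace, then multiply by 100. The sketch below builds both query strings from a pipeline namespace. The function names are illustrative and not part of the Wallaroo SDK.

```python
def cpu_utilization_query(namespace: str, window: str = "7d") -> str:
    """CPU usage averaged over `window` as a percentage of CPU requests."""
    usage = (
        "avg_over_time(sum(rate(container_cpu_usage_seconds_total"
        f'{{namespace="{namespace}"}}[5m]))[{window}:])'
    )
    requested = (
        "sum(kube_pod_container_resource_requests"
        f'{{namespace="{namespace}",resource="cpu"}})'
    )
    return f"{usage} / {requested} * 100"


def memory_utilization_query(namespace: str, window: str = "7d") -> str:
    """Memory usage averaged over `window` as a percentage of memory requests."""
    usage = (
        "avg_over_time(sum(container_memory_usage_bytes"
        f'{{namespace="{namespace}"}})[{window}:])'
    )
    requested = (
        "sum(kube_pod_container_resource_requests"
        f'{{namespace="{namespace}", resource="memory"}})'
    )
    return f"{usage} / {requested} * 100"


print(cpu_utilization_query("sample-pipeline-114"))
print(memory_utilization_query("sample-pipeline-114"))
```
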


### Utilization Query Tests

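Each cell in this section repeats the same request pattern against the `/v1/metrics/api/v1/query` endpoint. The following is a minimal sketch of that pattern as one function. `prometheus_instant_query` and its `http_get` parameter are hypothetical names, introduced so the sketch can be exercised without a live Wallaroo instance. Stripping a trailing slash from the endpoint is an assumption about how `api_endpoint` may be written.

```python
from typing import Any, Callable, Dict, Optional


def prometheus_instant_query(
    api_endpoint: str,
    headers: Dict[str, str],
    query: str,
    http_get: Optional[Callable[..., Any]] = None,
    timeout: float = 30.0,
) -> Dict[str, Any]:
    """Send one instant query and return the `data` section of the response."""
    if http_get is None:
        # Imported lazily so a stub can stand in for requests.get in tests.
        import requests
        http_get = requests.get

    # Assumption: avoid a double slash when the endpoint ends with "/".
    url = f"{api_endpoint.rstrip('/')}/v1/metrics/api/v1/query"
    response = http_get(url, headers=headers, params={"query": query}, timeout=timeout)

    if response.status_code != 200:
        raise RuntimeError(f"Query failed: {response.status_code} {response.text}")
    body = response.json()
    if body.get("status") != "success":
        raise RuntimeError(f"Query returned status {body.get('status')!r}")
    return body["data"]
```

With a connected client, a call might look like `prometheus_instant_query(wl.api_endpoint, wl.auth.auth_header(), 'sum (topic_bytes)')`.
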
In [6]:
#| Get the total size of the minio volume (used to keep models and orch data) | `kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="minio"}` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total size of the minio volume (used to keep models and orch data)
query = 'kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="minio"}'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {'__name__': 'kubelet_volume_stats_capacity_bytes',
     'beta_kubernetes_io_arch': 'amd64',
     'beta_kubernetes_io_instance_type': 'e2-standard-8',
     'beta_kubernetes_io_os': 'linux',
     'cloud_google_com_gke_boot_disk': 'pd-balanced',
     'cloud_google_com_gke_container_runtime': 'containerd',
     'cloud_google_com_gke_cpu_scaling_level': '8',
     'cloud_google_com_gke_logging_variant': 'DEFAULT',
     'cloud_google_com_gke_max_pods_per_node': '110',
     'cloud_google_com_gke_memory_gb_scaling_level': '32',
     'cloud_google_com_gke_nodepool': 'persistent',
     'cloud_google_com_gke_os_distribution': 'cos',
     'cloud_google_com_gke_provisioning': 'standard',
     'cloud_google_com_gke_stack_type': 'IPV4',
     'cloud_google_com_machine_family': 'e2',
     'cloud_google_com_private_node': 'false',
     'failure_domain_beta_kubernetes_io_region': 'us-central1',
     'failure_domain_beta_kuber

| Name | Query |
|---|---|
| Get the total size of the plateau volume (used to keep the inference logs) | `kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="plateau-managed-disk"}` |

In [7]:
#| Get the total size of the plateau volume (used to keep the inference logs) | `kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="plateau-managed-disk"}` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total size of the plateau volume (used to keep the inference logs)
query = 'kubelet_volume_stats_capacity_bytes{persistentvolumeclaim="plateau-managed-disk"}'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {'__name__': 'kubelet_volume_stats_capacity_bytes',
     'beta_kubernetes_io_arch': 'amd64',
     'beta_kubernetes_io_instance_type': 'e2-standard-8',
     'beta_kubernetes_io_os': 'linux',
     'cloud_google_com_gke_boot_disk': 'pd-balanced',
     'cloud_google_com_gke_container_runtime': 'containerd',
     'cloud_google_com_gke_cpu_scaling_level': '8',
     'cloud_google_com_gke_logging_variant': 'DEFAULT',
     'cloud_google_com_gke_max_pods_per_node': '110',
     'cloud_google_com_gke_memory_gb_scaling_level': '32',
     'cloud_google_com_gke_nodepool': 'persistent',
     'cloud_google_com_gke_os_distribution': 'cos',
     'cloud_google_com_gke_provisioning': 'standard',
     'cloud_google_com_gke_stack_type': 'IPV4',
     'cloud_google_com_machine_family': 'e2',
     'cloud_google_com_private_node': 'false',
     'failure_domain_beta_kubernetes_io_region': 'us-central1',
     'failure_domain_beta_kuber

| Name | Query |
|---|---|
| Get the total models storage | `sum(max by(sha) (model_bytes))` |

In [8]:
#| Get the total models storage | `sum(max by(sha) (model_bytes))` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total models storage
query = 'sum(max by(sha) (model_bytes))'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {}, 'value': [1762288601.487, '476090674588']}]}}

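In an instant-vector response, each sample is a pair of a Unix timestamp in seconds and the value as a string. The sketch below unpacks a single-series response such as the one above into a timezone-aware timestamp and a float. `single_sample` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def single_sample(body):
    """Return (UTC timestamp, float value) from a one-series vector response."""
    results = body["data"]["result"]
    if len(results) != 1:
        raise ValueError(f"expected exactly one series, got {len(results)}")
    timestamp, value = results[0]["value"]
    return datetime.fromtimestamp(float(timestamp), tz=timezone.utc), float(value)


# Response copied from the total models storage query above.
body = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {}, "value": [1762288601.487, "476090674588"]}],
    },
}

when, total_bytes = single_sample(body)
print(when.isoformat(), total_bytes)
```
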
| Name | Query |
|---|---|
| Get the total size of all the pipeline files by pipeline (without pipeline_version) | `sum by(pipeline) (pipeline_models_bytes)` |


In [9]:
#| Get the total size of all the pipeline files by pipeline (without pipeline_version) | `sum by(pipeline) (pipeline_models_bytes)` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total size of all the pipeline files by pipeline (without pipeline_version)
query = 'sum by(pipeline) (pipeline_models_bytes)'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {'pipeline': 'retina-face-resnet50-pipe'},
    'value': [1762288787.385, '109105123']},
   {'metric': {'pipeline': 'logging-pipeline-850802'},
    'value': [1762288787.385, '1928']},
   {'metric': {'pipeline': 'assay-demonstration-tutorial-pk100'},
    'value': [1762288787.385, '556622']},
   {'metric': {'pipeline': 'tinyllama-openai'},
    'value': [1762288787.385, '48146516335']},
   {'metric': {'pipeline': 'auto-conversion-xgboost-classification-pipeline-112261'},
    'value': [1762288787.385, '285313']},
   {'metric': {'pipeline': 'retail-inv-openvino'},
    'value': [1762288787.385, '77858235']},
   {'metric': {'pipeline': 'redeploy-ccfraud-pipeline-142849'},
    'value': [1762288787.385, '1928']},
   {'metric': {'pipeline': 'house-price-predictor-drift'},
    'value': [1762288787.385, '225818']},
   {'metric': {'pipeline': 'byop-vllm-100cb'}, 'value': [1762288787.385, '0']},
   {'metric': {'pipeline':

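A grouped query such as `sum by(pipeline)` returns one series per label value. The following sketch turns such a response into a dictionary keyed by the label and lists the largest pipelines first. The helper name `series_by_label` is illustrative, and the sample data reuses three of the series shown above.

```python
def series_by_label(body, label):
    """Map each series' value of `label` to its sample value as a float."""
    sizes = {}
    for series in body["data"]["result"]:
        # Series without the label are grouped under a placeholder key.
        key = series["metric"].get(label, "<unlabeled>")
        sizes[key] = float(series["value"][1])
    return sizes


sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pipeline": "retina-face-resnet50-pipe"},
             "value": [1762288787.385, "109105123"]},
            {"metric": {"pipeline": "tinyllama-openai"},
             "value": [1762288787.385, "48146516335"]},
            {"metric": {"pipeline": "retail-inv-openvino"},
             "value": [1762288787.385, "77858235"]},
        ],
    },
}

sizes = series_by_label(sample, "pipeline")
largest_first = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
for name, size in largest_first:
    print(f"{name}: {size:,.0f} bytes")
```
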
| Name | Query |
|---|---|
| Get the total size of all the pipeline files by pipeline AND pipeline_version | `sum by(pipeline, pipeline_version) (pipeline_models_bytes)` |


In [10]:
#| Get the total size of all the pipeline files by pipeline AND pipeline_version | `sum by(pipeline, pipeline_version) (pipeline_models_bytes)` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total size of all the pipeline files by pipeline AND pipeline_version
query = 'sum by(pipeline, pipeline_version) (pipeline_models_bytes)'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {'pipeline': 'retina-face-resnet50-pipe',
     'pipeline_version': '772'},
    'value': [1762288980.294, '0']},
   {'metric': {'pipeline': 'logging-pipeline-850802',
     'pipeline_version': '2205'},
    'value': [1762288980.294, '0']},
   {'metric': {'pipeline': 'assay-demonstration-tutorial-pk100',
     'pipeline_version': '1854'},
    'value': [1762288980.294, '0']},
   {'metric': {'pipeline': 'tinyllama-openai', 'pipeline_version': '2139'},
    'value': [1762288980.294, '0']},
   {'metric': {'pipeline': 'auto-conversion-xgboost-classification-pipeline-112261',
     'pipeline_version': '1619'},
    'value': [1762288980.294, '285313']},
   {'metric': {'pipeline': 'retail-inv-openvino', 'pipeline_version': '2040'},
    'value': [1762288980.294, '0']},
   {'metric': {'pipeline': 'redeploy-ccfraud-pipeline-142849',
     'pipeline_version': '2184'},
    'value': [1762288980.294, '1928']},
   {'metric': {'pipe

| Name | Query |
|---|---|
| Get the total size of all the orchestrations by orchestration AND id | `sum by(id, orchestration) (orchestration_bytes)` |


In [11]:
#| Get the total size of all the orchestrations by orchestration AND id | `sum by(id, orchestration) (orchestration_bytes)` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total size of all the orchestrations by orchestration AND id
query = 'sum by(id, orchestration) (orchestration_bytes)'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {'id': 'e1efb74c-079a-4623-813c-ca031ba42467',
     'orchestration': 'house price orchestration example'},
    'value': [1762289263.751, '730446531']},
   {'metric': {'id': '80740d5b-8c27-45f2-a3d9-78a9f42bff6f',
     'orchestration': 'demand-forecasting-orchestration'},
    'value': [1762289263.751, '687792709']},
   {'metric': {'id': '01e89d0f-0488-4e02-89d3-c02ae9400be9'},
    'value': [1762289263.751, '448441429']},
   {'metric': {'id': 'acd1ca9c-5f07-4718-9066-2c3f6bc04a89',
     'orchestration': 'uploadedbytesdemo'},
    'value': [1762289263.751, '439215189']},
   {'metric': {'id': 'd02e40ca-0705-4ab9-a88e-5274a3f5c613',
     'orchestration': 'demand-forecasting-orchestration'},
    'value': [1762289263.751, '687792709']},
   {'metric': {'id': '550e1bfa-b916-4aaa-957e-6702cf27b9b2',
     'orchestration': 'demand-forecasting-orchestration'},
    'value': [1762289263.751, '687792712']},
   {'metric': {'

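The response above contains several ids that share an orchestration name, and one series with no `orchestration` label at all. As a sketch, the sizes can be totaled per name while tolerating the missing label. `total_bytes_by_orchestration` and the `(no name)` key are hypothetical choices made for this example; the sample data is the six complete series shown above.

```python
from collections import defaultdict


def total_bytes_by_orchestration(result):
    """Sum sample values per orchestration name across all ids."""
    totals = defaultdict(int)
    for series in result:
        name = series["metric"].get("orchestration", "(no name)")
        totals[name] += int(float(series["value"][1]))
    return dict(totals)


result = [
    {"metric": {"id": "e1efb74c-079a-4623-813c-ca031ba42467",
                "orchestration": "house price orchestration example"},
     "value": [1762289263.751, "730446531"]},
    {"metric": {"id": "80740d5b-8c27-45f2-a3d9-78a9f42bff6f",
                "orchestration": "demand-forecasting-orchestration"},
     "value": [1762289263.751, "687792709"]},
    {"metric": {"id": "01e89d0f-0488-4e02-89d3-c02ae9400be9"},
     "value": [1762289263.751, "448441429"]},
    {"metric": {"id": "acd1ca9c-5f07-4718-9066-2c3f6bc04a89",
                "orchestration": "uploadedbytesdemo"},
     "value": [1762289263.751, "439215189"]},
    {"metric": {"id": "d02e40ca-0705-4ab9-a88e-5274a3f5c613",
                "orchestration": "demand-forecasting-orchestration"},
     "value": [1762289263.751, "687792709"]},
    {"metric": {"id": "550e1bfa-b916-4aaa-957e-6702cf27b9b2",
                "orchestration": "demand-forecasting-orchestration"},
     "value": [1762289263.751, "687792712"]},
]

totals = total_bytes_by_orchestration(result)
for name, size in totals.items():
    print(f"{name}: {size} bytes")
```
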
| Name | Query |
|---|---|
| Get the plateau topic storage usage by topic name | `topic_bytes{topic="<topic_name>"}` |



In [12]:
#| Get the plateau topic storage usage by topic name | `topic_bytes{topic="<topic_name>"}` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the plateau topic storage usage by topic name; replace <topic_name> with an actual topic
query = 'topic_bytes{topic="<topic_name>"}'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success', 'data': {'resultType': 'vector', 'result': []}}

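The empty result above is most likely because the literal `<topic_name>` placeholder was sent unchanged, so no series matched. The sketch below substitutes angle-bracket placeholders before a query is sent and fails loudly when one is left unfilled. `render_query` is a hypothetical helper, and `example-topic` is a made-up topic name used only for illustration.

```python
import re

# Matches placeholders such as <topic_name> or <pipeline namespace>.
# Templates that use < and > as comparison operators are out of scope here.
_PLACEHOLDER = re.compile(r"<([^<>]+)>")


def render_query(template, **values):
    """Replace each <placeholder> in the template with a supplied value."""
    def substitute(match):
        # "<pipeline namespace>" is looked up as the keyword pipeline_namespace.
        key = match.group(1).strip().replace(" ", "_")
        if key not in values:
            raise KeyError(f"no value supplied for placeholder <{match.group(1)}>")
        return str(values[key])

    return _PLACEHOLDER.sub(substitute, template)


print(render_query('topic_bytes{topic="<topic_name>"}', topic_name="example-topic"))
```
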
| Name | Query |
|---|---|
| Get the used plateau (log) storage | `sum (topic_bytes)` |

In [None]:
#| Get the used plateau (log) storage | `sum (topic_bytes)` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the used plateau (log) storage
query = 'sum (topic_bytes)'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

| Name |  Parameterized Query |
|---|---|
| Total GPU in use by pods resources | `sum(kube_pod_container_resource_limits{resource=~"nvidia_com_gpu\|qualcomm_com_qaic",app_kubernetes_io_instance="wallaroo"})` |



In [17]:
# | Total GPU in use by pods resources | 
# `sum(kube_pod_container_resource_limits{resource=~"nvidia_com_gpu\|qualcomm_com_qaic",app_kubernetes_io_instance="wallaroo"})` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total GPUs in use by Wallaroo pods
query = 'sum(kube_pod_container_resource_limits{resource=~"nvidia_com_gpu|qualcomm_com_qaic",app_kubernetes_io_instance="wallaroo"})'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success', 'data': {'resultType': 'vector', 'result': []}}

| Name |  Parameterized Query |
|---|---|
| Total GPU allocated in the cluster | `sum(kube_node_status_capacity{resource=~"nvidia_com_gpu\|qualcomm_com_qaic"})` |

In [18]:
# | Total GPU allocated in the cluster | 
# `sum(kube_node_status_capacity{resource=~"nvidia_com_gpu|qualcomm_com_qaic"})` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the total GPUs allocated in the cluster
query = 'sum(kube_node_status_capacity{resource=~"nvidia_com_gpu|qualcomm_com_qaic"})'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success', 'data': {'resultType': 'vector', 'result': []}}

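An empty `result` list means that no series matched the selector. For the GPU queries this may simply indicate that the cluster exposes no matching GPU resources, though that reading is an assumption rather than something the response states. A minimal sketch of treating the empty case as a default value follows; `value_or_default` is a hypothetical name.

```python
def value_or_default(body, default=0.0):
    """Return the first sample value, or `default` when no series matched."""
    results = body["data"]["result"]
    if not results:
        return default
    return float(results[0]["value"][1])


empty = {"status": "success", "data": {"resultType": "vector", "result": []}}
print(value_or_default(empty))
```
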
#### 7 Day Average Queries

The following queries return average results over a 7-day period.

| Name |  Parameterized Query | Example Query |
|---|---|---|
| Memory requests | `avg_over_time(sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})[7d:])` | `avg_over_time(sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})[7d:])` |


In [19]:
#| Memory requests | `avg_over_time(sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})[7d:])` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the 7-day average of memory requests for the Wallaroo workload
query = 'avg_over_time(sum(kube_pod_container_resource_requests{resource="memory",app_kubernetes_io_instance="wallaroo"})[7d:])'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {}, 'value': [1762364329.566, '16282367505.643349']}]}}

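| Name |  Parameterized Query | Example Query |
|---|---|---|
| CPU requests | `avg_over_time(sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})[7d:])` | `avg_over_time(sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})[7d:])` |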


In [21]:
#| CPU requests | `avg_over_time(sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})[7d:])` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the 7-day average of CPU requests for the Wallaroo workload
query = 'avg_over_time(sum(kube_pod_container_resource_requests{resource="cpu",app_kubernetes_io_instance="wallaroo"})[7d:])'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {}, 'value': [1762365221.159, '12.207195960414376']}]}}

| Name |  Parameterized Query | Example Query |
|---|---|---|
| CPU utilization (percentage) for a given pipeline | `avg_over_time(sum(rate(container_cpu_usage_seconds_total{namespace="<pipeline namespace>"}[5m]))[7d:]) / sum(kube_pod_container_resource_requests{namespace="<pipeline namespace>",resource="cpu"}) * 100` | `avg_over_time(sum(rate(container_cpu_usage_seconds_total{namespace="sample-pipeline-114"}[5m]))[7d:]) / sum(kube_pod_container_resource_requests{namespace="sample-pipeline-114",resource="cpu"}) * 100` |


In [28]:
#| CPU utilization (percentage) for a given pipeline | `avg_over_time(sum(rate(container_cpu_usage_seconds_total{namespace="<pipeline namespace>"}[5m]))[7d:]) / sum(kube_pod_container_resource_requests{namespace="<pipeline namespace>",resource="cpu"}) * 100` | `avg_over_time(sum(rate(container_cpu_usage_seconds_total{namespace="sample-pipeline-114"}[5m]))[7d:]) / sum(kube_pod_container_resource_requests{namespace="sample-pipeline-114",resource="cpu"}) * 100` |
 

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the 7-day CPU utilization (percentage) for the vgg16-clustering-pipeline-65 namespace
query = 'avg_over_time(sum(rate(container_cpu_usage_seconds_total{namespace="vgg16-clustering-pipeline-65"}[5m]))[7d:]) / sum(kube_pod_container_resource_requests{namespace="vgg16-clustering-pipeline-65",resource="cpu"}) * 100'


#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {}, 'value': [1762374982.49, '1.027024147239488']}]}}

| Name |  Parameterized Query | Example Query |
|---|---|---|
| Memory utilization (percentage) | `avg_over_time(sum(container_memory_usage_bytes{namespace="<pipeline namespace>"})[7d:]) / sum(kube_pod_container_resource_requests{namespace="<pipeline namespace>", resource="memory"}) * 100` | `avg_over_time(sum(container_memory_usage_bytes{namespace="sample-pipeline-114"})[7d:]) / sum(kube_pod_container_resource_requests{namespace="sample-pipeline-114", resource="memory"}) * 100` |

In [29]:
#| Memory utilization (percentage) | `avg_over_time(sum(container_memory_usage_bytes{namespace="vgg16-clustering-pipeline-65"})[7d:]) / sum(kube_pod_container_resource_requests{namespace="vgg16-clustering-pipeline-65", resource="memory"}) * 100` |

# this is the URL to get prometheus metrics
query_url = f"{wl.api_endpoint}/v1/metrics/api/v1/query"

# Retrieve the token 
headers = wl.auth.auth_header()

# Get the 7-day memory utilization (percentage) for the vgg16-clustering-pipeline-65 namespace
query = 'avg_over_time(sum(container_memory_usage_bytes{namespace="vgg16-clustering-pipeline-65"})[7d:]) / sum(kube_pod_container_resource_requests{namespace="vgg16-clustering-pipeline-65", resource="memory"}) * 100'

#request parameters
params_rps = {
    'query': query,
}

response_rps = requests.get(query_url, headers=headers, params=params_rps)


if response_rps.status_code == 200:
    print("Query Response:")
    display(response_rps.json())
else:
    print("Failed to fetch query response:", response_rps.status_code, response_rps.text)

Query Response:


{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {}, 'value': [1762375000.599, '33.77326707856905']}]}}