# Basic Examples of Metrics with Prometheus Operator

## Prerequisites

 * A Kubernetes cluster with kubectl configured
 * helm
 * curl
 * grpcurl (used for the gRPC requests)
 

## Setup Seldon Core

Install Seldon Core as described in the [docs](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).

Then port-forward the ingress to localhost:8004 in a separate terminal. With Istio, for example:
```bash
kubectl port-forward -n istio-system svc/istio-ingressgateway 8004:80
```

In [23]:
%%bash
kubectl create namespace seldon || echo "Seldon namespace already exists"
kubectl config set-context $(kubectl config current-context) --namespace=seldon

Error from server (AlreadyExists): namespaces "seldon" already exists


Seldon namespace already exists
Context "kind-seldon" modified.


## Install Prometheus Operator

In [24]:
%%bash
kubectl create namespace seldon-monitoring
helm repo add bitnami https://charts.bitnami.com/bitnami
# Note: we set prometheus.scrapeInterval=1s for CI tests reliability here
helm upgrade --install seldon-monitoring kube-prometheus \
    --version 8.3.1 \
    --set fullnameOverride=seldon-monitoring \
    --set prometheus.scrapeInterval=1s \
    --namespace seldon-monitoring \
    --repo https://charts.bitnami.com/bitnami

Error from server (AlreadyExists): namespaces "seldon-monitoring" already exists


"bitnami" already exists with the same configuration, skipping
Release "seldon-monitoring" does not exist. Installing it now.




NAME: seldon-monitoring
LAST DEPLOYED: Tue Jan 17 10:06:02 2023
NAMESPACE: seldon-monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CHART NAME: kube-prometheus
CHART VERSION: 8.3.1
APP VERSION: 0.61.1

** Please be patient while the chart is being deployed **

Watch the Prometheus Operator Deployment status using the command:

    kubectl get deploy -w --namespace seldon-monitoring -l app.kubernetes.io/name=kube-prometheus-operator,app.kubernetes.io/instance=seldon-monitoring

Watch the Prometheus StatefulSet status using the command:

    kubectl get sts -w --namespace seldon-monitoring -l app.kubernetes.io/name=kube-prometheus-prometheus,app.kubernetes.io/instance=seldon-monitoring

Prometheus can be accessed via port "9090" on the following DNS name from within your cluster:

    seldon-monitoring-prometheus.seldon-monitoring.svc.cluster.local

To access Prometheus from outside the cluster execute the following commands:

    echo "Prometheus URL: http://127.0.0.1:9090

In [3]:
!kubectl get all -n seldon-monitoring

NAME                                                        READY   STATUS    RESTARTS   AGE
pod/alertmanager-seldon-monitoring-alertmanager-0           2/2     Running   0          3m2s
pod/prometheus-seldon-monitoring-prometheus-0               2/2     Running   0          3m1s
pod/seldon-monitoring-blackbox-exporter-7d5f5895d8-czwkj    1/1     Running   0          3m4s
pod/seldon-monitoring-kube-state-metrics-5dd77fd87d-sf755   1/1     Running   0          3m4s
pod/seldon-monitoring-node-exporter-gtjvk                   0/1     Pending   0          3m4s
pod/seldon-monitoring-operator-bd4fd7b75-nwfrx              1/1     Running   0          3m4s

NAME                                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-operated                  ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   3m2s
service/prometheus-operated                    ClusterIP   None            <none>        9090/TCP

In [25]:
%%bash
# Extra sleep as statefulset is not always present right away
sleep 5 
kubectl rollout status -n seldon-monitoring deployment/seldon-monitoring-operator
kubectl rollout status -n seldon-monitoring statefulset.apps/prometheus-seldon-monitoring-prometheus
kubectl rollout status -n seldon-monitoring statefulsets/prometheus-seldon-monitoring-prometheus

deployment "seldon-monitoring-operator" successfully rolled out
statefulset rolling update complete 1 pods at revision prometheus-seldon-monitoring-prometheus-7b486666c5...
statefulset rolling update complete 1 pods at revision prometheus-seldon-monitoring-prometheus-7b486666c5...


In [26]:
%%bash
cat <<EOF | kubectl apply -f -
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: seldon-podmonitor
  namespace: seldon-monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/managed-by: seldon-core
  podMetricsEndpoints:
    - port: metrics
      path: /prometheus
  namespaceSelector:
    any: true
EOF

podmonitor.monitoring.coreos.com/seldon-podmonitor unchanged
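
The `PodMonitor` above selects pods through `matchLabels`, which in Kubernetes means every listed key/value pair must be present on the pod. A minimal sketch of that matching rule follows; the function name and the example pod labels are illustrative assumptions, not taken from a real cluster.

```python
def matches_labels(pod_labels, match_labels):
    """Return True when every key/value pair in match_labels is set on the pod."""
    return all(pod_labels.get(key) == value for key, value in match_labels.items())


# Selector copied from the PodMonitor spec above.
SELECTOR = {"app.kubernetes.io/managed-by": "seldon-core"}

# Hypothetical pod label sets, for illustration only.
seldon_pod = {"app.kubernetes.io/managed-by": "seldon-core", "app": "echo"}
other_pod = {"app": "nginx"}

print(matches_labels(seldon_pod, SELECTOR))
print(matches_labels(other_pod, SELECTOR))
```

Because `namespaceSelector.any` is `true`, this rule is applied to pods in every namespace, not only `seldon-monitoring`.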


## Deploy Example Model

In [27]:
%%writefile echo-sdep.yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: echo
  namespace: seldon
spec:
  predictors:
  - name: default
    replicas: 1
    graph:
      name: classifier
      type: MODEL
    componentSpecs:
    - spec:
        containers:
        - image: seldonio/echo-model:1.15.0-dev
          name: classifier

Overwriting echo-sdep.yaml


In [28]:
!kubectl delete -f echo-sdep.yaml
!kubectl apply -f echo-sdep.yaml

Error from server (NotFound): error when deleting "echo-sdep.yaml": seldondeployments.machinelearning.seldon.io "echo" not found
seldondeployment.machinelearning.seldon.io/echo created


In [29]:
%%bash
deployment=$(kubectl get deploy -l seldon-deployment-id=echo -o jsonpath='{.items[0].metadata.name}')
kubectl rollout status deploy/${deployment}

Waiting for deployment "echo-default-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "echo-default-0-classifier" successfully rolled out


## Send a series of REST requests

In [35]:
%%bash

# Wait for the model to become fully ready
echo "Waiting 5s for model to fully ready"
sleep 5

# Send 10 requests to REST endpoint
for i in `seq 1 10`; do sleep 0.1 && \
   curl -s -H "Content-Type: application/json" \
   -d '{"data": {"ndarray":[[1.0, 2.0, 5.0]]}}' \
   http://localhost:8004/seldon/seldon/echo/api/v1.0/predictions > /dev/null ; \
done

# Give time for metrics to get collected by Prometheus
echo "Waiting 10s for Prometheus to scrape metrics"
sleep 10

Waiting 5s for model to fully ready
Waiting 10s for Prometheus to scrape metrics
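
The cell above relies on fixed `sleep` calls to wait for the model and for Prometheus. Where a fixed delay proves flaky, one alternative is to poll a condition until it holds or a timeout expires. The helper below is a sketch of that idea, not part of Seldon Core; `wait_until` and its parameters are hypothetical names.

```python
import time


def wait_until(check, timeout=30.0, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns a truthy value or timeout seconds pass.

    Returns True if the condition was met, False if the deadline passed first.
    sleep and clock are injectable so the helper can be exercised without real delays.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In this notebook, `check` could be a function that runs the Prometheus query and returns whether the result list is non-empty.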


## Check Metrics (REST)

In [36]:
import json

In [37]:
%%writefile get-metrics.sh
QUERY='query=seldon_api_executor_client_requests_seconds_count{deployment_name=~"echo",namespace=~"seldon",method=~"post"}'
QUERY_URL=http://seldon-monitoring-prometheus.seldon-monitoring.svc.cluster.local:9090/api/v1/query

kubectl run --quiet=true -it --rm curlmetrics-$(date +%s) --image=radial/busyboxplus:curl --restart=Never -- \
    curl --data-urlencode ${QUERY} ${QUERY_URL}

Overwriting get-metrics.sh
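
The script above hard-codes a PromQL selector, and the same pattern recurs later with different label values. As a sketch, the selector and the query URL can be assembled from a metric name and a dictionary of labels. The helper names are hypothetical; the `=~` matcher and the `/api/v1/query` path mirror the script.

```python
from urllib.parse import urlencode


def build_selector(metric, labels):
    """Build a PromQL selector such as metric{a=~"x",b=~"y"} using regex matchers."""
    matchers = ",".join(f'{key}=~"{value}"' for key, value in labels.items())
    return f"{metric}{{{matchers}}}"


def build_query_url(base_url, metric, labels):
    """Return an instant-query URL with the selector URL-encoded as the query parameter."""
    return f"{base_url}/api/v1/query?" + urlencode({"query": build_selector(metric, labels)})


selector = build_selector(
    "seldon_api_executor_client_requests_seconds_count",
    {"deployment_name": "echo", "namespace": "seldon", "method": "post"},
)
print(selector)
```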


In [38]:
metrics = ! bash get-metrics.sh
metrics = json.loads(metrics[0])

In [39]:
metrics

{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {'__name__': 'seldon_api_executor_client_requests_seconds_count',
     'code': '200',
     'container': 'seldon-container-engine',
     'deployment_name': 'echo',
     'endpoint': 'metrics',
     'instance': '10.244.0.36:8000',
     'job': 'seldon-monitoring/seldon-podmonitor',
     'method': 'post',
     'model_image': 'seldonio/echo-model',
     'model_name': 'classifier',
     'model_version': '1.15.0-dev',
     'namespace': 'seldon',
     'pod': 'echo-default-0-classifier-6f8fd9d7b8-ht4g2',
     'predictor_name': 'default',
     'service': '/predict'},
    'value': [1673930453.285, '10']}]}}

In [40]:
counter = int(metrics["data"]["result"][0]["value"][1])
assert counter == 10, f"expected 10 requests, got {counter}"
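
Indexing `result[0]` directly raises an `IndexError` if Prometheus has not scraped the metric yet. A slightly more defensive way to read the value is sketched below; `first_sample_value` is a hypothetical helper, and the response shape is the one shown in the output above.

```python
def first_sample_value(response):
    """Return the value of the first series in a Prometheus instant-query response.

    Raises ValueError when the query did not succeed and LookupError when the
    result vector is empty (for example, if the metric was not scraped yet).
    """
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('status')!r}")
    result = response["data"]["result"]
    if not result:
        raise LookupError("query returned no series; metrics may not be scraped yet")
    _timestamp, value = result[0]["value"]
    # Prometheus encodes sample values as strings.
    return float(value)


# Trimmed copy of the response shown above.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"deployment_name": "echo"}, "value": [1673930453.285, "10"]}],
    },
}
print(first_sample_value(sample))
```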

## Send a series of gRPC requests

In [42]:
%%bash
cd ./seldon-core/executor/proto && for i in `seq 1 10`; do sleep 0.1 && \
    grpcurl -d '{"data": {"ndarray":[[1.0, 2.0, 5.0]]}}' \
    -rpc-header seldon:echo -rpc-header namespace:seldon \
    -plaintext -proto ./prediction.proto \
     0.0.0.0:8004 seldon.protos.Seldon/Predict > /dev/null ; \
done

# Give time for metrics to get collected by Prometheus
echo "Waiting 10s for Prometheus to scrape metrics"
sleep 10

Waiting 10s for Prometheus to scrape metrics


## Check Metrics (gRPC)

In [47]:
%%writefile get-metrics.sh
QUERY='query=seldon_api_executor_client_requests_seconds_count{deployment_name=~"echo",namespace=~"seldon",method=~"unary"}'
QUERY_URL=http://seldon-monitoring-prometheus.seldon-monitoring.svc.cluster.local:9090/api/v1/query

kubectl run --quiet=true -it --rm curlmetrics-$(date +%s) --image=radial/busyboxplus:curl --restart=Never -- \
    curl --data-urlencode ${QUERY} ${QUERY_URL}

Overwriting get-metrics.sh


In [48]:
metrics = ! bash get-metrics.sh
metrics = json.loads(metrics[0])

In [49]:
counter = int(metrics["data"]["result"][0]["value"][1])
assert counter == 10, f"expected 10 requests, got {counter}"

AssertionError: expected 10 requests, got 20

## Check Custom Metrics

This model defines a few custom metrics in its `.py` class definition:
```Python
    def metrics(self):
        print("metrics called")
        return [
            # a counter which will increase by the given value
            {"type": "COUNTER", "key": "mycounter", "value": 1},

            # a gauge which will be set to given value
            {"type": "GAUGE", "key": "mygauge", "value": 100},

            # a timer (in msecs) which will be aggregated into HISTOGRAM
            {"type": "TIMER", "key": "mytimer", "value": 20.2},
        ]
```      

We will check the value of the `mygauge` metric.
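
To make the difference between the three metric types concrete, here is a small simulation of the semantics described in the comments above: a counter accumulates, a gauge is overwritten, and timer values are collected as observations. This illustrates the behaviour only and is not Seldon Core's implementation; `new_state` and `apply_metrics` are hypothetical names.

```python
from collections import defaultdict

# Same metric list the model's metrics() method returns.
SAMPLE_METRICS = [
    {"type": "COUNTER", "key": "mycounter", "value": 1},
    {"type": "GAUGE", "key": "mygauge", "value": 100},
    {"type": "TIMER", "key": "mytimer", "value": 20.2},
]


def new_state():
    """Create an empty in-memory store for counters, gauges and timer observations."""
    return {"counters": defaultdict(float), "gauges": {}, "timers": defaultdict(list)}


def apply_metrics(state, metrics):
    """Apply one batch of metric dicts to the store and return it."""
    for metric in metrics:
        kind, key, value = metric["type"], metric["key"], metric["value"]
        if kind == "COUNTER":
            state["counters"][key] += value      # increases by the given value
        elif kind == "GAUGE":
            state["gauges"][key] = value         # set to the given value
        elif kind == "TIMER":
            state["timers"][key].append(value)   # one observation per call
        else:
            raise ValueError(f"unknown metric type: {kind}")
    return state


state = new_state()
for _ in range(3):  # simulate three prediction calls
    apply_metrics(state, SAMPLE_METRICS)
print(state["counters"]["mycounter"], state["gauges"]["mygauge"], len(state["timers"]["mytimer"]))
```

Under this model the gauge stays at 100 regardless of how many requests were sent, which is why the check on `mygauge` does not depend on the request count.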

In [43]:
%%writefile get-metrics.sh
QUERY='query=mygauge{deployment_name=~"echo",namespace=~"seldon"}'
QUERY_URL=http://seldon-monitoring-prometheus.seldon-monitoring.svc.cluster.local:9090/api/v1/query

kubectl run --quiet=true -it --rm curlmetrics-$(date +%s) --image=radial/busyboxplus:curl --restart=Never -- \
    curl --data-urlencode ${QUERY} ${QUERY_URL}

Overwriting get-metrics.sh


In [44]:
metrics = ! bash get-metrics.sh
metrics = json.loads(metrics[0])

In [45]:
metrics

{'status': 'success',
 'data': {'resultType': 'vector',
  'result': [{'metric': {'__name__': 'mygauge',
     'container': 'classifier',
     'deployment_name': 'echo',
     'endpoint': 'metrics',
     'image_name': 'seldonio/echo-model',
     'image_version': '1.15.0-dev',
     'instance': '10.244.0.26:6000',
     'job': 'seldon-monitoring/seldon-podmonitor',
     'method': 'predict',
     'model_image': 'seldonio/echo-model',
     'model_name': 'classifier',
     'model_version': '1.15.0-dev',
     'namespace': 'seldon',
     'pod': 'echo-default-0-classifier-6f8fd9d7b8-sgjjt',
     'predictor_name': 'default',
     'predictor_version': 'default',
     'seldon_deployment_name': 'echo',
     'worker_id': '47'},
    'value': [1673261684.981, '100']}]}}

In [46]:
gauge = int(metrics["data"]["result"][0]["value"][1])
assert gauge == 100, f"expected 100 on gauge, got {gauge}"

## Cleanup

In [22]:
!kubectl delete sdep -n seldon echo
!helm uninstall -n seldon-monitoring seldon-monitoring

seldondeployment.machinelearning.seldon.io "echo" deleted
release "seldon-monitoring" uninstalled
