
Metrics and Monitoring

Getting started with Prometheus-based monitoring of KFServing models.

Table of Contents

  1. Install Prometheus
  2. Access Prometheus Metrics
  3. Metrics-driven experiments and progressive delivery
  4. Removal

Install Prometheus

Prerequisites: Kubernetes cluster and Kustomize v3.

Install Prometheus using Prometheus Operator.

cd kfserving
kubectl apply -k docs/samples/metrics-and-monitoring/prometheus-operator
kubectl wait --for condition=established --timeout=120s crd/prometheuses.monitoring.coreos.com
kubectl wait --for condition=established --timeout=120s crd/servicemonitors.monitoring.coreos.com
kubectl apply -k docs/samples/metrics-and-monitoring/prometheus
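Once these commands complete, you can verify that the Prometheus Operator and Prometheus pods are running in the monitoring namespace, for example:

kubectl get pods -n kfserving-monitoring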

Note: The above steps install Kubernetes resource objects in the kfserving-monitoring namespace. This namespace is customizable with Kustomize. To install under a different namespace, say my-monitoring, change kfserving-monitoring to my-monitoring in the following three files: a) prometheus-operator/namespace.yaml, b) prometheus-operator/kustomization.yaml, and c) prometheus/kustomization.yaml, as sketched below.
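For example, the change might look like the following (a minimal sketch; it assumes the kustomizations set the target namespace via Kustomize's namespace field, and only the changed lines are shown):

# prometheus-operator/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-monitoring

# prometheus-operator/kustomization.yaml and prometheus/kustomization.yaml (excerpt)
namespace: my-monitoring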

How Metrics are Scraped

  • The notes below describe how a Prometheus instance set up with the above operator actually scrapes metrics.

  • The setting serviceMonitorNamespaceSelector: {} in the Prometheus custom resource means that all namespaces are watched for ServiceMonitor objects. You can confirm this by running:

$ kubectl get prometheus -n <name-of-the-namespace> -o yaml
...
  serviceMonitorNamespaceSelector: {}
...
  • The serviceMonitorSelector: field in the Prometheus object indicates the labels that ServiceMonitor objects must carry in order to be selected. For example, one Prometheus configuration looked as follows:
$ kubectl get prometheus -n <name-of-the-namespace> -o yaml
...
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: kube-prometheus-stack-1651295153

This means that every ServiceMonitor object that Prometheus is expected to scrape must carry the label release: kube-prometheus-stack-1651295153.
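If an existing ServiceMonitor object is missing this label, it can be added in place, for example (substitute your own object name and namespace):

kubectl label servicemonitor <name> -n <namespace> release=kube-prometheus-stack-1651295153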

  • ServiceMonitor objects created in the knative-serving, knative-eventing, or application namespaces must carry the above label. For example:
$ kubectl get servicemonitor activator -o yaml -n knative-serving
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  generation: 3
  labels:
    release: kube-prometheus-stack-1651295153    # same as serviceMonitorSelector
  name: activator
  namespace: knative-serving
spec:
  endpoints:
  - interval: 30s
    port: http-metrics
  namespaceSelector:
    matchNames:
    - knative-serving
  selector:
    matchLabels:
      serving.knative.dev/release: v0.22.1   # this label must be present on the Kubernetes Service objects
  • Each ServiceMonitor object, in turn, indicates which Kubernetes Service objects it selects.
  • The selector.matchLabels field in the above ServiceMonitor object does that. All the Kubernetes Service objects created by Knative Serving carried the label serving.knative.dev/release=v0.22.1, hence the above example used that label in matchLabels. Any other label present on the target Service objects would work equally well; a sketch using a different label follows the Service listing below.
$ kubectl get svc -n knative-serving --show-labels
NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                           AGE   LABELS
activator-service            ClusterIP   10.124.9.157    <none>        9090/TCP,8008/TCP,80/TCP,81/TCP   25d   app=activator,serving.knative.dev/release=v0.22.1
autoscaler                   ClusterIP   10.124.8.155    <none>        9090/TCP,8008/TCP,8080/TCP        25d   app=autoscaler,serving.knative.dev/release=v0.22.1
autoscaler-bucket-00-of-01   ClusterIP   10.124.14.250   <none>        8080/TCP                          25d   <none>
controller                   ClusterIP   10.124.4.127    <none>        9090/TCP,8008/TCP                 25d   app=controller,serving.knative.dev/release=v0.22.1
istio-webhook                ClusterIP   10.124.0.96     <none>        9090/TCP,8008/TCP,443/TCP         25d   networking.knative.dev/ingress-provider=istio,role=istio-webhook,serving.knative.dev/release=v0.22.1
webhook                      ClusterIP   10.124.15.22    <none>        9090/TCP,8008/TCP,443/TCP         25d   role=webhook,serving.knative.dev/release=v0.22.1
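For instance, a minimal ServiceMonitor sketch that selects the activator Service by its app=activator label instead of the release label (the http-metrics port name and the release: kube-prometheus-stack-1651295153 label are carried over from the examples above; the object name is hypothetical) could look like this:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: activator-by-app-label                  # hypothetical name, for illustration only
  namespace: knative-serving
  labels:
    release: kube-prometheus-stack-1651295153   # must match serviceMonitorSelector
spec:
  endpoints:
  - interval: 30s
    port: http-metrics
  namespaceSelector:
    matchNames:
    - knative-serving
  selector:
    matchLabels:
      app: activator                            # any label present on the target Service works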

Access Prometheus Metrics

In this section, we will use a v1beta1 InferenceService sample to demonstrate how to access Prometheus metrics that are automatically generated by Knative's queue-proxy container for your KFServing models.

  1. kubectl create ns kfserving-test
  2. cd docs/samples/v1beta1/sklearn
  3. kubectl apply -f sklearn.yaml -n kfserving-test
  4. If you are using a Minikube-based cluster, run minikube tunnel --cleanup in a separate terminal and supply your password if prompted.
  5. In a separate terminal, follow these instructions to find and set your ingress IP, host, and service hostname (a sketch of typical commands appears after the loop below). Then send prediction requests to the sklearn-iris model you created in Step 3 as follows.
while clear; do \
  curl -v \
  -H "Host: ${SERVICE_HOSTNAME}" \
  -d @./iris-input.json \
  http://${INGRESS_HOST}:${INGRESS_PORT}/v2/models/sklearn-iris/infer
  sleep 0.3
done
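If the variables used above are not yet set, a typical way to set them is as follows (a sketch assuming the cluster exposes an Istio ingress gateway via the istio-ingressgateway Service in the istio-system namespace; adjust the names and port selection to your setup):

# Hostname of the InferenceService created in Step 3
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-iris -n kfserving-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
# External IP and HTTP port of the ingress gateway (assumes istio-system/istio-ingressgateway and a port named http2)
INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')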
  6. In a separate terminal, port-forward the Prometheus service.
kubectl port-forward service/prometheus-operated -n kfserving-monitoring 9090:9090
  7. Access the Prometheus UI in your browser at http://localhost:9090
  8. Access the number of prediction requests to the sklearn model over the last 60 seconds. You can use the following query in the Prometheus UI:
sum(increase(revision_app_request_latencies_count{service_name=~"sklearn-iris-predictor-default"}[60s]))

You should see a response similar to the following.

(Image: request count)
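The same query can also be issued against Prometheus's HTTP API instead of the UI, for example:

curl -s http://localhost:9090/api/v1/query --data-urlencode 'query=sum(increase(revision_app_request_latencies_count{service_name=~"sklearn-iris-predictor-default"}[60s]))'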

  9. Access the mean latency for serving prediction requests for the same model as above, over the last 60 seconds. You can use the following query in the Prometheus UI:
sum(increase(revision_app_request_latencies_sum{service_name=~"sklearn-iris-predictor-default"}[60s]))/sum(increase(revision_app_request_latencies_count{service_name=~"sklearn-iris-predictor-default"}[60s]))

You should see a response similar to the following.

(Image: mean latency)

Metrics-driven experiments and progressive delivery

See the Iter8 extensions for KFServing.

Removal

Remove Prometheus and Prometheus Operator as follows.

cd kfserving
kubectl delete -k docs/samples/metrics-and-monitoring/prometheus
kubectl delete -k docs/samples/metrics-and-monitoring/prometheus-operator
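If you also created the kfserving-test namespace for the sample InferenceService above, it can be removed as well:

kubectl delete ns kfserving-test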