# Requirements
- Artefacts:
    - Models are uploaded to a cloud storage bucket or a repository
    - Required containers are pushed to a container registry
- Clean Kubernetes cluster (e.g. minikube, MicroK8s, Azure AKS, AWS EKS, ...)
    - `minikube start --cpus 10 --memory 17000 --kubernetes-version=v1.17.11 -p demo`
- Python dependencies: [requirements.txt](./requirements.txt)
- kubectl access:
    - `az aks get-credentials --resource-group myResourceGroup --name myAKSCluster`
    - `aws eks update-kubeconfig --name cluster_name`
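
Before running the cells below, it can help to confirm that the command-line tools this notebook shells out to are available. The following is a minimal sketch, not part of the original setup; the helper name `missing_tools` and the default tool list are assumptions derived from the commands used in this notebook.

```python
import shutil


def missing_tools(tools=("kubectl", "helm", "curl", "tar")):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing command-line tools:", ", ".join(missing))
else:
    print("All required command-line tools were found.")
```

The check only verifies that the executables exist; it does not verify versions or cluster access.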

# Install KFServing Standalone
See: https://github.com/kubeflow/kfserving#install-kfserving
### Install Istio

In [1]:
%%writefile ./istio/istio_ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
  labels:
    istio-injection: disabled

Overwriting ./istio/istio_ns.yaml


In [2]:
!kubectl apply -f ./istio/istio_ns.yaml

namespace/istio-system created


In [3]:
%%writefile ./istio/istio-minimal-operator.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  values:
    global:
      proxy:
        autoInject: disabled
      useMCP: false
      # The third-party-jwt is not enabled on all k8s.
      # See: https://istio.io/docs/ops/best-practices/security/#configure-third-party-service-account-tokens
      jwtPolicy: first-party-jwt
  addonComponents:
    pilot:
      enabled: false
    tracing:
      enabled: false
    kiali:
      enabled: false
    prometheus:
      enabled: false
    grafana:
      enabled: false
  components:
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
      - name: cluster-local-gateway
        enabled: true
        label:
          istio: cluster-local-gateway
          app: cluster-local-gateway
        k8s:
          service:
            type: ClusterIP
            ports:
            - port: 15020
              name: status-port
            - port: 80
              name: http2
            - port: 443
              name: https

Overwriting ./istio/istio-minimal-operator.yaml


In [4]:
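import time
import platform
import subprocess

os_system = platform.system()
os_machine = platform.machine()
start = time.time()

# Install Istio: download the istioctl 1.6.2 binary for the current platform

if os_system == 'Windows':
    istioctl = 'istioctl.exe'
    !curl -L https://github.com/istio/istio/releases/download/1.6.2/istioctl-1.6.2-win.zip -o istioctl-1.6.2-win.zip
    !tar -xf istioctl-1.6.2-win.zip
elif os_system == 'Linux':
    istioctl = './istioctl'
    # platform.machine() reports 'x86_64' on 64-bit Intel/AMD Linux ('AMD64' is the Windows name)
    if os_machine in ('x86_64', 'AMD64'):
        !curl -L https://github.com/istio/istio/releases/download/1.6.2/istioctl-1.6.2-linux-amd64.tar.gz -o istioctl-1.6.2-linux.tar.gz
    elif os_machine == 'armv7l':
        !curl -L https://github.com/istio/istio/releases/download/1.6.2/istioctl-1.6.2-linux-armv7.tar.gz -o istioctl-1.6.2-linux.tar.gz
    else:
        # e.g. 'aarch64': stop here instead of extracting an archive that was never downloaded
        raise RuntimeError(f'Not supported: {os_system} {os_machine}')
    !tar -zxvf istioctl-1.6.2-linux.tar.gz
else:
    raise RuntimeError(f'Not supported: {os_system}')

# Apply the minimal operator manifest written in the previous cell
subprocess.run([istioctl, "manifest", "apply", "-f", "./istio/istio-minimal-operator.yaml"], check=True)
end = time.time()
print(end-start)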

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   628  100   628    0     0   2352      0 --:--:-- --:--:-- --:--:--  2343

  9 39.4M    9 3960k    0     0  3471k      0  0:00:11  0:00:01  0:00:10 3471k
 28 39.4M   28 11.2M    0     0  5367k      0  0:00:07  0:00:02  0:00:05 7521k
 51 39.4M   51 20.2M    0     0  6595k      0  0:00:06  0:00:03  0:00:03 8375k
 71 39.4M   71 28.1M    0     0  6939k      0  0:00:05  0:00:04  0:00:01 8255k
 89 39.4M   89 35.2M    0     0  7024k      0  0:00:05  0:00:05 --:--:-- 8039k
100 39.4M  100 39.4M    0     0  7229k      0  0:00:05  0:00:05 --:--:-- 8195k


38.83830952644348
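
The platform-dependent download can also be expressed as a small lookup, which makes unsupported platforms fail early. This is a sketch under stated assumptions: the archive names come from the download URLs in the cell above, while the function name `istioctl_url` and the lookup table are illustrative. On 64-bit Intel/AMD Linux, `platform.machine()` typically returns `x86_64`.

```python
import platform

ISTIO_VERSION = "1.6.2"
BASE_URL = f"https://github.com/istio/istio/releases/download/{ISTIO_VERSION}"

# (system, machine) -> archive suffix, limited to the archives used in this notebook
LINUX_ARCHIVES = {
    "x86_64": "linux-amd64.tar.gz",
    "armv7l": "linux-armv7.tar.gz",
}


def istioctl_url(system, machine):
    """Return the istioctl download URL for a platform, or raise ValueError."""
    if system == "Windows":
        suffix = "win.zip"
    elif system == "Linux" and machine in LINUX_ARCHIVES:
        suffix = LINUX_ARCHIVES[machine]
    else:
        raise ValueError(f"Not supported: {system} {machine}")
    return f"{BASE_URL}/istioctl-{ISTIO_VERSION}-{suffix}"


try:
    print(istioctl_url(platform.system(), platform.machine()))
except ValueError as err:
    print(err)
```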


In [5]:
start = time.time()

# Install Knative-Serving
!kubectl apply --filename https://github.com/knative/serving/releases/download/v0.18.0/serving-crds.yaml
!kubectl apply --filename https://github.com/knative/serving/releases/download/v0.18.0/serving-core.yaml
!kubectl apply --filename https://github.com/knative/net-istio/releases/download/v0.18.0/release.yaml

# Install Cert Manager
!kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.15.1/cert-manager.yaml
!kubectl wait --for=condition=available --timeout=600s deployment/cert-manager-webhook -n cert-manager

# Install KFServing
!kubectl apply -f https://raw.githubusercontent.com/kubeflow/kfserving/master/install/v0.5.0/kfserving_crds.yaml
!kubectl apply -f https://raw.githubusercontent.com/kubeflow/kfserving/master/install/v0.5.0/kfserving.yaml

# Install Knative-Eventing
!kubectl apply --filename https://github.com/knative/eventing/releases/download/v0.18.0/eventing.yaml

# Install Knative-Monitoring
!kubectl apply --filename https://github.com/knative/serving/releases/download/v0.18.0/monitoring.yaml

end = time.time()
print(end-start)

customresourcedefinition.apiextensions.k8s.io/certificates.networking.internal.knative.dev created
customresourcedefinition.apiextensions.k8s.io/configurations.serving.knative.dev created
customresourcedefinition.apiextensions.k8s.io/ingresses.networking.internal.knative.dev created
customresourcedefinition.apiextensions.k8s.io/metrics.autoscaling.internal.knative.dev created
customresourcedefinition.apiextensions.k8s.io/podautoscalers.autoscaling.internal.knative.dev created
customresourcedefinition.apiextensions.k8s.io/revisions.serving.knative.dev created
customresourcedefinition.apiextensions.k8s.io/routes.serving.knative.dev created
customresourcedefinition.apiextensions.k8s.io/serverlessservices.networking.internal.knative.dev created
customresourcedefinition.apiextensions.k8s.io/services.serving.knative.dev created
customresourcedefinition.apiextensions.k8s.io/images.caching.internal.knative.dev created
namespace/knative-serving created
clusterrole.rbac.authorization.k8s.io/knat

namespace/knative-eventing created
serviceaccount/eventing-controller created
clusterrolebinding.rbac.authorization.k8s.io/eventing-controller created
clusterrolebinding.rbac.authorization.k8s.io/eventing-controller-resolver created
clusterrolebinding.rbac.authorization.k8s.io/eventing-controller-source-observer created
clusterrolebinding.rbac.authorization.k8s.io/eventing-controller-sources-controller created
clusterrolebinding.rbac.authorization.k8s.io/eventing-controller-manipulator created
serviceaccount/pingsource-mt-adapter created
clusterrolebinding.rbac.authorization.k8s.io/knative-eventing-pingsource-mt-adapter created
serviceaccount/eventing-webhook created
clusterrolebinding.rbac.authorization.k8s.io/eventing-webhook created
clusterrolebinding.rbac.authorization.k8s.io/eventing-webhook-resolver created
clusterrolebinding.rbac.authorization.k8s.io/eventing-webhook-podspecable-binding created
configmap/config-br-default-channel created
configmap/config-br-defaults created
conf
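
`kubectl apply` returns as soon as the resources are created, not when the pods are ready. A possible follow-up step is to wait for the deployments in each namespace, in the same way the cell above waits for `cert-manager-webhook`. The sketch below is an assumption, not part of the original notebook: the helper names are hypothetical, and the namespace list is taken from names that appear elsewhere in this document.

```python
import subprocess


def wait_command(namespace, timeout_seconds=600):
    """Build a `kubectl wait` command for all deployments in a namespace."""
    return [
        "kubectl", "wait", "--for=condition=available",
        f"--timeout={timeout_seconds}s", "deployment", "--all",
        "-n", namespace,
    ]


def wait_for_namespaces(namespaces, run=subprocess.run):
    """Run the wait command per namespace and return the namespaces that failed."""
    failed = []
    for namespace in namespaces:
        result = run(wait_command(namespace))
        if result.returncode != 0:
            failed.append(namespace)
    return failed


# Example call (requires kubectl access to the cluster):
# wait_for_namespaces(["knative-serving", "knative-eventing", "cert-manager"])
```

The `run` parameter exists only so the helper can be exercised without a cluster.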

## Deploy InfluxDB with Helm

In [6]:
start = time.time()

!helm repo add influxdata https://helm.influxdata.com/
!helm repo update
!helm search repo influxdata
!helm install --name-template release-influxdb stable/influxdb

end = time.time()
print(end-start)

"influxdata" already exists with the same configuration, skipping
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "influxdata" chart repository
...Successfully got an update from the "prometheus-community" chart repository
...Successfully got an update from the "stable" chart repository
Update Complete. ⎈Happy Helming!⎈
NAME                          	CHART VERSION	APP VERSION	DESCRIPTION                                       
influxdata/chronograf         	1.1.24       	1.8.9.1    	Open-source web application written in Go and R...
influxdata/influxdb           	4.9.14       	1.8.4      	Scalable datastore for metrics, events, and rea...
influxdata/influxdb-enterprise	0.1.12       	1.8.0      	Run InfluxDB Enterprise on Kubernetes             
influxdata/influxdb2          	2.0.0        	2.0.4      	A Helm chart for InfluxDB v2                      
influxdata/kapacitor          	1.3.1        	1.5.4      	InfluxDB's native 



## Deploy ServiceAccount to store AWS Credentials for S3 Bucket Access

In [7]:
import os
from IPython.core.magic import register_line_cell_magic
from dotenv import load_dotenv
load_dotenv()

@register_line_cell_magic
def writetemplate(line, cell):
    with open(line, 'w') as f:
        f.write(cell.format(**globals()))
        
AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']

In [8]:
%%writetemplate ./credentials/aws-secret_serviceaccount.yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-secret
  namespace: kfserving-test
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: {AWS_ACCESS_KEY_ID}
  AWS_SECRET_ACCESS_KEY: {AWS_SECRET_ACCESS_KEY}
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa
  namespace: kfserving-test
secrets:
  - name: aws-secret

In [9]:
!kubectl create ns kfserving-test
!kubectl apply -f ./credentials/aws-secret_serviceaccount.yaml

namespace/kfserving-test created
secret/aws-secret created
serviceaccount/sa created
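
The `writetemplate` magic defined above fills the manifest through `str.format`, so every `{NAME}` placeholder must exist as a variable, and literal braces in a template have to be doubled. The sketch below illustrates this behaviour in isolation; `render_template` is a hypothetical stand-in for the magic, and the values are dummy data.

```python
def render_template(template, values):
    """Fill {NAME} placeholders in `template` from the `values` mapping."""
    return template.format(**values)


template = "key: {NAME}\nliteral: {{kept}}\n"
rendered = render_template(template, {"NAME": "value"})
print(rendered)

# A placeholder without a matching variable raises KeyError
try:
    render_template("key: {MISSING}\n", {})
except KeyError as err:
    print("Missing template variable:", err)
```

Because the rendered file contains the credentials in plain text, it should not be committed to version control.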


## Deploy docker-registry secret to access the private GitLab Container Registry

In [10]:
!kubectl create secret docker-registry gitlab \
    --docker-server=https://registry.gitlab.com/\
    --docker-username=%DOCKER_USERNAME%\
    --docker-password=%DOCKER_PASSWORD%\
    -n kfserving-test

secret/gitlab created


## Architecture:
<img src="./architektur.png" width="650">

## Deploy Knative Broker

In [11]:
%%writefile ./broker.yaml
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: product-recommender
  namespace: kfserving-test

Overwriting ./broker.yaml


In [12]:
!kubectl create -f ./broker.yaml

broker.eventing.knative.dev/product-recommender created


# Deploy Product Recommender

In [13]:
%%writefile ./tf-deployment-recommender.yaml
apiVersion: "serving.kubeflow.org/v1beta1"
kind: "InferenceService"
metadata:
  namespace: "kfserving-test"
  name: "product-recommender"
spec:
  transformer:
    containers:
      - image: registry.gitlab.com/felix.exel/container_registry/kfserving/model-performance-monitoring
        name: user-container
        imagePullPolicy: Always
    imagePullSecrets:
      - name: gitlab
  predictor:
    serviceAccountName: "sa" # service account for aws credentials
    minReplicas: 1 # if 0: replica will scale down to 0 when there are no requests
    tensorflow:
      runtimeVersion: "2.4.0" # TensorFlow Serving version
      storageUri: "s3://bucket-fex/0/719f2437c2a147d89ab6268cf7379cda/artifacts/saved_model/tfmodel/" # TF Serving expects numerically named version subfolders (e.g. 1/) under this path
    logger:
      mode: all
      url: http://broker-ingress.knative-eventing.svc.cluster.local/kfserving-test/product-recommender

Overwriting ./tf-deployment-recommender.yaml


In [14]:
!kubectl apply -f ./tf-deployment-recommender.yaml

inferenceservice.serving.kubeflow.org/product-recommender created
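
The `logger.url` in the manifest above follows the pattern `http://<broker ingress>/<namespace>/<broker name>`, so it has to match the Broker created earlier. A minimal sketch of that pattern, assuming the default ingress host used in this notebook (the helper name `broker_ingress_url` is hypothetical):

```python
def broker_ingress_url(namespace, broker,
                       ingress_host="broker-ingress.knative-eventing.svc.cluster.local"):
    """Build the in-cluster URL that delivers events to a Knative Broker."""
    return f"http://{ingress_host}/{namespace}/{broker}"


print(broker_ingress_url("kfserving-test", "product-recommender"))
```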


# Deploy Anomaly Detection (Autoencoder)

In [15]:
%%writefile ./outlier_detection/outlier-detection.yaml
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  namespace: kfserving-test
  name: autoencoder-recommender
spec:
  transformer:
    containers:
      - image: registry.gitlab.com/felix.exel/container_registry/kfserving/outlier-detection
        name: user-container
        imagePullPolicy: Always
    imagePullSecrets:
      - name: gitlab

  predictor:
    serviceAccountName: "sa" # service account for aws credentials
    minReplicas: 1 # if 0: replica will scale down to 0 when there are no requests
    tensorflow:
      runtimeVersion: "2.4.0" # TensorFlow Serving version
      storageUri: "s3://bucket-fex/autoencoder_recommender/d052e637a7314c14a092585baf512672/" # TF Serving expects numerically named version subfolders (e.g. 1/) under this path

Overwriting ./outlier_detection/outlier-detection.yaml


In [16]:
!kubectl apply -f ./outlier_detection/outlier-detection.yaml

inferenceservice.serving.kubeflow.org/autoencoder-recommender created


### Trigger Anomaly Detection (Autoencoder)

In [17]:
%%writefile ./outlier_detection/trigger.yaml
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: outlier-trigger
  namespace: kfserving-test
spec:
  broker: product-recommender
  filter:
    attributes:
      type: org.kubeflow.serving.inference.request
  subscriber:
    uri: http://autoencoder-recommender-transformer-default.kfserving-test/v1/models/autoencoder-recommender:predict

Overwriting ./outlier_detection/trigger.yaml


In [18]:
!kubectl apply -f ./outlier_detection/trigger.yaml

trigger.eventing.knative.dev/outlier-trigger created
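
The Trigger forwards an event to the autoencoder only when every attribute listed under `filter.attributes` equals the corresponding CloudEvent attribute. The following is a simplified illustration of that exact-match rule, not Knative's implementation; the function name and the response event type are assumptions.

```python
def trigger_matches(filter_attributes, event_attributes):
    """Return True if every filter attribute equals the event's attribute."""
    return all(
        event_attributes.get(name) == value
        for name, value in filter_attributes.items()
    )


outlier_filter = {"type": "org.kubeflow.serving.inference.request"}

request_event = {"type": "org.kubeflow.serving.inference.request"}
# Assumed type of the logged response event (not shown in this notebook)
response_event = {"type": "org.kubeflow.serving.inference.response"}

print(trigger_matches(outlier_filter, request_event))
print(trigger_matches(outlier_filter, response_event))
```

With `logger.mode: all`, both requests and responses reach the Broker, so the filter is what limits the autoencoder to the request payloads.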


<img src="./architektur.png" width="650">

# Grafana

In [19]:
%%writefile ./istio/loadbalancer.yaml
apiVersion: v1
kind: Service
metadata:
  name: grafana-load-balancer
  namespace: knative-monitoring
spec:
  type: LoadBalancer
  selector:
    app: grafana
  ports:
    - protocol: TCP
      port: 3000
      targetPort: 3000

Overwriting ./istio/loadbalancer.yaml


In [20]:
!kubectl apply -f ./istio/loadbalancer.yaml

service/grafana-load-balancer created


In [21]:
cluster_type = 'aws' # 'azure', 'aws', 'local'

if cluster_type == 'azure': # azure aks
    INGRESS_HOST_LIST = !kubectl -n istio-system get service istio-ingressgateway -o jsonpath={.status.loadBalancer.ingress[0].ip}
    INGRESS_HOST =  INGRESS_HOST_LIST[0]
    INGRESS_PORT = 80
    GRAFANA_HOST_LIST = !kubectl -n knative-monitoring get service grafana-load-balancer -o jsonpath={.status.loadBalancer.ingress[0].ip}
    GRAFANA_HOST = GRAFANA_HOST_LIST[0]
    GRAFANA_PORT = 3000

elif cluster_type == 'aws': # aws eks
    INGRESS_HOST_LIST = !kubectl -n istio-system get service istio-ingressgateway -o jsonpath={.status.loadBalancer.ingress[0].hostname}
    INGRESS_HOST =  INGRESS_HOST_LIST[0]
    INGRESS_PORT = 80
    GRAFANA_HOST_LIST = !kubectl -n knative-monitoring get service grafana-load-balancer -o jsonpath={.status.loadBalancer.ingress[0].hostname}
    GRAFANA_HOST = GRAFANA_HOST_LIST[0]
    GRAFANA_PORT = 3000
    
elif cluster_type == 'local': # e.g. minikube or microk8s
    INGRESS_HOST_LIST = !kubectl get po -l istio=ingressgateway -n istio-system -o jsonpath={.items[0].status.hostIP}
    INGRESS_HOST =  INGRESS_HOST_LIST[0] #eg. '192.168.52.86'
    INGRESS_PORT_LIST = !kubectl get svc -l istio=ingressgateway -n istio-system -o jsonpath={.items[0].spec.ports[1].nodePort}
    INGRESS_PORT = int(INGRESS_PORT_LIST[0])
    GRAFANA_HOST = INGRESS_HOST
    GRAFANA_PORT_LIST = !kubectl -n knative-monitoring get service grafana-load-balancer -o jsonpath={.spec.ports[0].nodePort}
    GRAFANA_PORT = int(GRAFANA_PORT_LIST[0])

print(f"http://{GRAFANA_HOST}:{GRAFANA_PORT}/d/drTDt1LGz/model-performance?orgId=1&refresh=10s&from=now-5m&to=now")

http://ab7ed9338b6944825893606e97c342fe-1340349124.eu-central-1.elb.amazonaws.com:3000/d/drTDt1LGz/model-performance?orgId=1&refresh=10s&from=now-5m&to=now
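
The Azure and AWS branches above differ only in whether the load balancer reports an `ip` or a `hostname`. One way to avoid the duplication is to read the service as JSON (`kubectl get service ... -o json`) and accept either field. This is a sketch under that assumption; `load_balancer_address` is a hypothetical helper, and the sample dictionary uses a documentation IP address rather than real output.

```python
def load_balancer_address(service):
    """Return the first load balancer IP or hostname of a Service, or None."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    if not ingress:
        return None  # address not assigned yet
    first = ingress[0]
    return first.get("ip") or first.get("hostname")


sample_service = {"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}
print(load_balancer_address(sample_service))
```

A return value of `None` indicates that the cloud provider has not assigned an address yet, in which case the command can be retried after a short wait.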


# Test the ML-Service
### Load Test Data

In [22]:
import pandas as pd
import numpy as np
import time
import json
import requests
import urllib3
from IPython.core.interactiveshell import InteractiveShell

urllib3.disable_warnings()
InteractiveShell.ast_node_interactivity = "all"
np.set_printoptions(precision=5)

sessions_padded = np.load('list_sessions_padded.npy')
print(sessions_padded.shape)

last_clicked = np.load('list_last_clicked.npy')
print(last_clicked.shape)

id_mapping = pd.read_csv('ID_Mapping.csv')

(30941, 30, 52)
(30941,)


In [23]:
def request_kf_serving_http(np_array, ground_truth, MODEL_NAME, NAMESPACE, INGRESS_HOST, INGRESS_PORT):
    """POST a batch to the InferenceService through the ingress gateway and return the predictions."""
    # 'id' carries the ground truth alongside the model inputs
    data = json.dumps({"instances": np_array.tolist(),
                       'id': ground_truth.tolist()})
    
    # The Host header selects the InferenceService behind the shared ingress
    headers = {"content-type": "application/json",
               'Host': f'{MODEL_NAME}.{NAMESPACE}.example.com'}
    
    json_response = requests.post(
        f'http://{INGRESS_HOST}:{INGRESS_PORT}/v1/models/{MODEL_NAME}:predict',
        data=data, headers=headers)

    try:
        predictions = json.loads(json_response.text)['predictions']
    except Exception:
        # Print the raw response body to make failures easier to diagnose
        print(json_response.text)
        raise
    return np.array(predictions).astype(np.float32)


NAMESPACE = 'kfserving-test'
MODEL_NAME = 'product-recommender'
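
To inspect what the function above sends without contacting the cluster, the URL, headers, and body can be built separately. This is a minimal sketch using plain lists instead of NumPy arrays; `build_predict_request` is a hypothetical helper, and the host, port, and input values are placeholders.

```python
import json


def build_predict_request(instances, ids, model_name, namespace,
                          ingress_host, ingress_port, domain="example.com"):
    """Return (url, headers, body) for a KFServing v1 predict call."""
    url = f"http://{ingress_host}:{ingress_port}/v1/models/{model_name}:predict"
    headers = {
        "content-type": "application/json",
        # The ingress routes on the Host header, not on the URL host
        "Host": f"{model_name}.{namespace}.{domain}",
    }
    body = json.dumps({"instances": instances, "id": ids})
    return url, headers, body


url, headers, body = build_predict_request(
    [[0.0, 1.0]], [7], "product-recommender", "kfserving-test", "127.0.0.1", 80)
print(url)
print(headers["Host"])
print(body)
```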

## HTTP Request

In [27]:
idx = 15 # example indices: 15, 169, 14 (169 is an anomaly)

start = time.time()
pred = request_kf_serving_http(sessions_padded[idx:idx+1], last_clicked[idx:idx+1],
                          MODEL_NAME, NAMESPACE, INGRESS_HOST, INGRESS_PORT)
end = time.time()
print(f'Time required in Seconds: {end - start}')

# top 5 predictions
top = pred.argsort()[0][::-1][:5]

print("Session:")
session = pd.DataFrame()
session['category_code'] = [id_mapping['category_code'][int(i)-1] for i in sessions_padded[idx,:,0] if i>0]
session['Item_ID'] = [id_mapping['Item_ID'][int(i)-1] for i in sessions_padded[idx,:,0] if i>0]
session['Item_ID_Mapped'] = [int(i) for i in sessions_padded[idx,:,0] if i>0]
session


print("Prediction:")
prediction = pd.DataFrame()
prediction['category_code'] = [id_mapping['category_code'][int(i)-1] for i in top if i>0]
prediction['Item_ID'] = [id_mapping['Item_ID'][int(i)-1] for i in top if i>0]
prediction['Item_ID_Mapped'] = [int(i) for i in top if i>0]
prediction['probability'] = pred[0, top]
prediction
print("Ground Truth:", last_clicked[idx])

Time required in Seconds: 0.5353260040283203
Session:


| | category_code | Item_ID | Item_ID_Mapped |
|---|---|---|---|
| 0 | appliances.environment.vacuum | 3700338.0 | 47 |
| 1 | appliances.environment.vacuum | 3700338.0 | 47 |
| 2 | appliances.environment.vacuum | 3701056.0 | 48 |
| 3 | appliances.environment.vacuum | 3700338.0 | 47 |
| 4 | appliances.environment.vacuum | 3701056.0 | 48 |
| 5 | appliances.environment.vacuum | 3700338.0 | 47 |
| 6 | appliances.environment.vacuum | 3701056.0 | 48 |
| 7 | appliances.environment.vacuum | 3700278.0 | 49 |
| 8 | appliances.environment.vacuum | 3700777.0 | 50 |
| 9 | appliances.environment.vacuum | 3701162.0 | 51 |


Prediction:


| | category_code | Item_ID | Item_ID_Mapped | probability |
|---|---|---|---|---|
| 0 | appliances.environment.vacuum | 3700600.0 | 3444 | 0.641037 |
| 1 | appliances.environment.vacuum | 3700832.0 | 2898 | 0.045066 |
| 2 | appliances.environment.vacuum | 3700787.0 | 1691 | 0.044069 |
| 3 | appliances.environment.vacuum | 3700907.0 | 406 | 0.037994 |
| 4 | appliances.environment.vacuum | 3701164.0 | 293 | 0.017183 |


Ground Truth: 3444
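
The expression `pred.argsort()[0][::-1][:5]` in the cell above selects the indices of the five highest probabilities. The same idea can be sketched in plain Python with made-up probabilities; `top_k` is a hypothetical helper, and its ordering of tied values may differ from NumPy's.

```python
def top_k(probabilities, k=5):
    """Return the indices of the k largest values, highest first."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return order[:k]


probabilities = [0.10, 0.60, 0.05, 0.25]
print(top_k(probabilities, k=2))  # → [1, 3]
```

In the notebook, each index is a mapped item ID, and the cell looks up the corresponding row of `id_mapping` at position `ID - 1`.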


## Grafana Dashboard

In [26]:
print(f"http://{GRAFANA_HOST}:{GRAFANA_PORT}/d/drTDt1LGz/model-performance?orgId=1&refresh=10s&from=now-5m&to=now")

http://ab7ed9338b6944825893606e97c342fe-1340349124.eu-central-1.elb.amazonaws.com:3000/d/drTDt1LGz/model-performance?orgId=1&refresh=10s&from=now-5m&to=now
