# OpenVINO Model Server demo in Kubernetes

All the code needed to follow this demo is in the
[https://github.com/IntelAI/OpenVINO-model-server](https://github.com/IntelAI/OpenVINO-model-server) repository.
The demo assumes that `kubectl` is configured and that you have permission to create Deployment and Service resources.

In [1]:
!git clone https://github.com/IntelAI/OpenVINO-model-server.git

Cloning into 'OpenVINO-model-server'...
remote: Enumerating objects: 178, done.
remote: Counting objects: 100% (178/178), done.
remote: Compressing objects: 100% (125/125), done.
remote: Total 882 (delta 107), reused 89 (delta 53), pack-reused 704
Receiving objects: 100% (882/882), 2.61 MiB | 12.60 MiB/s, done.
Resolving deltas: 100% (516/516), done.


Below is an example Deployment and Service definition to be applied in Kubernetes.
It serves a ResNet50 model quantized to INT8 precision, which was converted to OpenVINO format from the Caffe framework.

In [3]:
!cat OpenVINO-model-server/example_k8s/openvino_model_server_resnet.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ovms
  labels:
    app: ovms
spec:
  selector:
    matchLabels:
       app: ovms
  replicas: 2
  template:
    metadata:
      labels:
        app: ovms
    spec:
      containers:
      - name: ovms-resnet
        image: intelaipg/openvino-model-server:latest
        ports:
        - containerPort: 80
        env:
        - name: LOG_LEVEL
          value: "DEBUG"
        command: ["/ie-serving-py/start_server.sh"]
        args: ["ie_serving", "model", "--model_path", "gs://intelai_public_models/resnet_50_i8", "--model_name", "resnet", "--port", "80", "--batch_size", "auto"]
---
apiVersion: v1
kind: Service
metadata:
  name: ovms-resnet
spec:
  selector:
    app: ovms
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80


In [4]:
!kubectl apply -n kubeflow -f OpenVINO-model-server/example_k8s/openvino_model_server_resnet.yaml

deployment.apps/ovms unchanged
service/ovms-resnet unchanged


Now 2 replicas of OpenVINO Model Server are deployed and the `ovms-resnet` service is running.

In [5]:
!kubectl get service -n kubeflow | grep ovms-resnet

ovms-resnet                              ClusterIP   10.47.249.38    <none>        80/TCP              1h
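The client cells later in this notebook reach the service at `ovms-resnet.kubeflow`, which follows the Kubernetes `<service>.<namespace>` DNS convention. A minimal sketch of the name forms that usually resolve from a pod inside the cluster is shown here. The helper `service_dns_names` is hypothetical, and the `cluster.local` domain is an assumption (it is the common default, but a cluster may be configured differently).

```python
def service_dns_names(service, namespace, cluster_domain="cluster.local"):
    """Return in-cluster DNS names for a Service, shortest first.

    Assumes the default pod DNS search domains; cluster_domain is an
    assumption and may differ on a custom cluster.
    """
    base = "{}.{}".format(service, namespace)
    return [
        base,                                        # ovms-resnet.kubeflow
        "{}.svc".format(base),                       # ovms-resnet.kubeflow.svc
        "{}.svc.{}".format(base, cluster_domain),    # fully qualified name
    ]


names = service_dns_names("ovms-resnet", "kubeflow")
for name in names:
    print(name)
```

The `ClusterIP` type shown above means these names are only reachable from inside the cluster.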


In [6]:
!kubectl get deployment -n kubeflow ovms

NAME   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
ovms   2         2         2            2           1h


To demonstrate the OVMS service, a Python client application sends requests to it.
The same client can also submit inference requests to TensorFlow Serving.
The client submits the JPEG images listed in the input file for classification.

In [7]:
!cd OpenVINO-model-server/example_client && cat input_images.txt

images/airliner.jpeg 404
images/arctic-fox.jpeg 279
images/bee.jpeg 309
images/golden_retriever.jpeg 207
images/gorilla.jpeg 366
images/magnetic_compass.jpeg 635
images/peacock.jpeg 84
images/pelican.jpeg 144
images/snail.jpeg 113
images/zebra.jpeg 340

airliner.jpeg ![example image](https://github.com/IntelAI/OpenVINO-model-server/raw/master/example_client/images/airliner.jpeg)
arctic-fox.jpeg ![example image](https://github.com/IntelAI/OpenVINO-model-server/raw/master/example_client/images/arctic-fox.jpeg)
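Each line of the input file pairs an image path with its expected class label. A minimal sketch of how such a file could be parsed follows. The function `parse_images_list` is hypothetical and is not taken from the repository's client code.

```python
def parse_images_list(text):
    """Parse lines of the form '<image path> <expected label>'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last whitespace so the label is always the final token.
        path, label = line.rsplit(None, 1)
        entries.append((path, int(label)))
    return entries


# Three lines copied from input_images.txt above.
sample = """images/airliner.jpeg 404
images/arctic-fox.jpeg 279
images/bee.jpeg 309
"""
entries = parse_images_list(sample)
for path, label in entries:
    print(path, label)
```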


Client dependencies need to be installed:

In [8]:
!pip install -q -r OpenVINO-model-server/example_client/client_requirements.txt

The `get_serving_meta.py` client displays information about the served model:

In [9]:
!cd OpenVINO-model-server/example_client/ \
&& python get_serving_meta.py --grpc_address ovms-resnet.kubeflow --grpc_port 80 --model_name resnet

('Getting model metadata for model:', 'resnet')
Inputs metadata:
	Input name: data; shape: [1L, 3L, 224L, 224L]; dtype: DT_FLOAT
Outputs metadata:
	Output name: prob; shape: [1L, 1000L]; dtype: DT_FLOAT
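The `prob` output has shape `[1, 1000]`, which suggests one score per class for a single image. Assuming that interpretation, the predicted class is the index of the largest score. The sketch below uses a plain list as a stand-in for the model output, and the helper `top1` is hypothetical.

```python
def top1(scores):
    """Return the index of the highest score (the top-1 class)."""
    return max(range(len(scores)), key=scores.__getitem__)


# Stand-in for one row of the `prob` output: 1000 scores, made-up values.
scores = [0.0001] * 1000
scores[404] = 0.9
predicted = top1(scores)
print("Detected:", predicted)
```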


The `jpeg_classification.py` client runs inference requests with the images listed in the input file.
Classification results are included in the output along with the expected labels.

The output summary lists model metrics calculated from the run:
- accuracy
- average latency
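The two metrics can be sketched as follows, assuming accuracy is the fraction of images whose detected class matches the expected label and average latency is the mean processing time per request. The function `summarize` is hypothetical. The sample records are copied from the first five results of the run in this notebook.

```python
def summarize(results):
    """Compute accuracy and mean latency from (latency_ms, detected, expected) records."""
    total = len(results)
    correct = sum(1 for _, detected, expected in results if detected == expected)
    latency_sum = sum(latency for latency, _, _ in results)
    return {
        "accuracy": float(correct) / total,
        "average_latency_ms": float(latency_sum) / total,
    }


# (processing time in ms, detected class, expected class)
results = [
    (383.0, 404, 404),
    (511.0, 279, 279),
    (192.0, 309, 309),
    (197.0, 207, 207),
    (308.0, 366, 366),
]
summary = summarize(results)
print(summary)
```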

In [10]:
!cd OpenVINO-model-server/example_client/ \
&& python jpeg_classification.py --images_list input_images.txt \
--grpc_address ovms-resnet.kubeflow --grpc_port 80 \
--input_name data --output_name prob --size 224 --model_name resnet

Start processing:
	Model name: resnet
	Images list file: input_images.txt
('images/airliner.jpeg', (1, 3, 224, 224), '; data range:', 0.0, ':', 255.0)
('Processing time: 383.00 ms; speed 2.00 fps', 2.61)
('Detected:', 404, ' Should be:', '404')
('images/arctic-fox.jpeg', (1, 3, 224, 224), '; data range:', 0.0, ':', 255.0)
('Processing time: 511.00 ms; speed 2.00 fps', 1.96)
('Detected:', 279, ' Should be:', '279')
('images/bee.jpeg', (1, 3, 224, 224), '; data range:', 0.0, ':', 255.0)
('Processing time: 192.00 ms; speed 2.00 fps', 5.22)
('Detected:', 309, ' Should be:', '309')
('images/golden_retriever.jpeg', (1, 3, 224, 224), '; data range:', 0.0, ':', 255.0)
('Processing time: 197.00 ms; speed 2.00 fps', 5.07)
('Detected:', 207, ' Should be:', '207')
('images/gorilla.jpeg', (1, 3, 224, 224), '; data range:', 0.0, ':', 255.0)
('Processing time: 308.00 ms; speed 2.00 fps', 3.25)
('Detected:', 366, ' Should be:', '366')
('images/magnetic_compass.jpeg', (1, 3, 224, 224), '; data range:',

For reference, the CPU specification of the Jupyter host is shown below.

In [11]:
!lscpu

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU @ 2.20GHz
Stepping:            0
CPU MHz:             2200.000
BogoMIPS:            4400.00
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            56320K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3d