# Scheduler K8S Test

Assumes you have a Kind cluster running with MetalLB. No other dependencies are required.
You can create one with the [Seldon Ansible Kind Playbook](https://github.com/SeldonIO/ansible-k8s-collection/blob/master/playbooks/kind.yaml).

## Setup

* `make kind-image-install-all`
* `make deploy` 


In [241]:
SCHEDULER_IP=!kubectl get svc seldon-scheduler -n seldon-mesh -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
SCHEDULER_IP=SCHEDULER_IP[0]
import os
os.environ['SCHEDULER_IP'] = SCHEDULER_IP

In [242]:
MESH_IP=!kubectl get svc seldon-mesh -n seldon-mesh -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
MESH_IP=MESH_IP[0]
import os
os.environ['MESH_IP'] = MESH_IP
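
The `!kubectl` shell magic only works inside IPython. A minimal sketch of the same lookup with the standard library follows. The helper names `lb_ip_command` and `lb_ip` are hypothetical, and the code assumes `kubectl` is on the `PATH` and that MetalLB has already assigned an address to the service.

```python
import subprocess

JSONPATH = "{.status.loadBalancer.ingress[0].ip}"

def lb_ip_command(service, namespace="seldon-mesh"):
    """Build the same kubectl argument list used in the cells above."""
    return ["kubectl", "get", "svc", service, "-n", namespace,
            "-o", f"jsonpath={JSONPATH}"]

def lb_ip(service, namespace="seldon-mesh", run=subprocess.run):
    """Return the first load-balancer IP of a service (hypothetical helper)."""
    result = run(lb_ip_command(service, namespace),
                 capture_output=True, text=True, check=True)
    ip = result.stdout.strip()
    if not ip:
        raise RuntimeError(f"service {service} has no load-balancer IP yet")
    return ip

# Usage against a running cluster:
# import os
# os.environ["SCHEDULER_IP"] = lb_ip("seldon-scheduler")
# os.environ["MESH_IP"] = lb_ip("seldon-mesh")
```

The `run` parameter exists only so the lookup can be exercised without a cluster by passing a stub.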

## No Auth Example

In [152]:
!grpcurl -d '{"model":{ \
              "meta":{"name":"iris"},\
              "modelSpec":{"uri":"gs://seldon-models/mlserver/iris",\
                           "requirements":["sklearn"],\
                           "memoryBytes":50000},\
              "deploymentSpec":{"replicas":2}}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/LoadModel

{
  
}
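
Hand-writing the JSON body inside a shell string is easy to get wrong. As a sketch, the same `LoadModel` body can be built in Python and serialised with `json.dumps`. The function name `load_model_request` is hypothetical; the field names are the ones used in the cell above.

```python
import json

def load_model_request(name, uri, requirements, memory_bytes, replicas=1):
    """Build the JSON body passed to `grpcurl -d` for Scheduler/LoadModel."""
    return {
        "model": {
            "meta": {"name": name},
            "modelSpec": {
                "uri": uri,
                "requirements": list(requirements),
                "memoryBytes": memory_bytes,
            },
            "deploymentSpec": {"replicas": replicas},
        }
    }

payload = json.dumps(load_model_request(
    "iris", "gs://seldon-models/mlserver/iris", ["sklearn"], 50000, replicas=2))
print(payload)
```

The printed string can then be passed as the `-d` argument of the `grpcurl` call.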


In [153]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ModelStatus

{
  "modelName": "iris",
  "versions": [
    {
      "version": 1,
      "serverName": "mlserver",
      "modelReplicaState": {
        "1": {
          "state": "Available",
          "lastChangeTimestamp": "2021-12-24T19:11:11.756546197Z"
        },
        "2": {
          "state": "Available",
          "lastChangeTimestamp": "2021-12-24T19:11:11.756629528Z"
        }
      },
      "state": {
        "state": "ModelAvailable",
        "availableReplicas": 2,
        "lastChangeTimestamp": "2021-12-24T19:11:11.756629528Z"
      }
    }
  ]
}
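
When scripting this check, the `ModelStatus` response can be parsed instead of read by eye. The sketch below assumes the JSON shape shown in the output above; `model_ready` is a hypothetical helper, not part of any Seldon client.

```python
def model_ready(status, expected_replicas):
    """True when the newest listed version is ModelAvailable with enough replicas."""
    versions = status.get("versions", [])
    if not versions:
        return False
    latest = max(versions, key=lambda v: v["version"])
    state = latest.get("state", {})
    return (state.get("state") == "ModelAvailable"
            and state.get("availableReplicas", 0) >= expected_replicas)

# Trimmed copy of the response above.
status = {
    "modelName": "iris",
    "versions": [{
        "version": 1,
        "serverName": "mlserver",
        "state": {"state": "ModelAvailable", "availableReplicas": 2},
    }],
}
print(model_ready(status, expected_replicas=2))
```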


In [154]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x55f00ce8b4f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55f00ce8b4f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< date: Fri, 24 Dec 2021 19:11:20 GMT
< server: envoy
< content-length: 194
< content-type: application/json
< x-envoy-upstream-service-time: 1132
< 
* Connection #0 to host 172.18.255.3 left intact
{"model_name":"iris","model_version":"1","id":"8c68aa5a-8dbb-4f8f-be8b-5b80434d77b9","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}
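
The REST call can also be issued from Python. This is a sketch using only `urllib` from the standard library: it builds the same POST as the `curl` cell, including the `Host` header that carries the model name. `infer_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request

def infer_request(mesh_ip, model, values):
    """Build the V2 inference POST used by the curl cell above."""
    body = {"inputs": [{
        "name": "predict",
        "shape": [1, len(values)],
        "datatype": "FP32",
        "data": [values],
    }]}
    return urllib.request.Request(
        url=f"http://{mesh_ip}/v2/models/{model}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Host": model},
        method="POST",
    )

req = infer_request("172.18.255.3", "iris", [1, 2, 3, 4])

# To send it against a live cluster:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["outputs"][0]["data"])
```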

In [155]:
!cd ../v2 && \
    grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
        -plaintext \
        -proto grpc_service.proto \
        -rpc-header Seldon:iris \
       ${MESH_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}


In [156]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "50000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "50000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    }
  ]
}
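
The `ServerStatus` output shows where the scheduler placed the two replicas: each loaded replica reserves the model's `memoryBytes` (50000) on one server replica. A small sketch that computes the reserved memory per replica from the response, assuming the JSON shape above (`used_memory` is a hypothetical helper):

```python
def used_memory(server_status):
    """Return bytes reserved on each server replica, in listing order."""
    return [int(r["memory"]) - int(r["availableMemoryBytes"])
            for r in server_status["resources"]]

# Copy of the response above. grpcurl renders 64-bit integers as strings.
server_status = {
    "serverName": "mlserver",
    "resources": [
        {"memory": "100000", "availableMemoryBytes": "50000"},
        {"memory": "100000", "availableMemoryBytes": "50000"},
        {"memory": "100000", "availableMemoryBytes": "100000"},
    ],
}
print(used_memory(server_status))  # [50000, 50000, 0]
```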


In [157]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/UnloadModel

{
  
}


In [158]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x55e7e994c4f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55e7e994c4f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 404 Not Found
< date: Fri, 24 Dec 2021 19:11:49 GMT
< server: envoy
< connection: close
< content-length: 0
< 
* Closing connection 0


In [159]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ModelStatus

ERROR:
  Code: FailedPrecondition
  Message: Failed to find model iris


In [160]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    }
  ]
}


## Auth Preparation - Prepare Minio with Iris Model

Install MinIO in the Kind cluster with Ansible.

Create `rclone.conf`, changing the MinIO IP address in `endpoint` as appropriate:

```
[s3]
type = s3
provider = minio
env_auth = false
access_key_id = minioadmin
secret_access_key = minioadmin
endpoint = http://172.18.255.1:9000
```
  
Copy the iris model to MinIO:

```
rclone --config ./rclone.conf copy mlrepo/iris s3://test
```
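
Because the MinIO address differs between clusters, it can be convenient to generate `rclone.conf` rather than edit it by hand. The sketch below uses `configparser` from the standard library. `rclone_conf` is a hypothetical helper, and the default credentials are the `minioadmin` values from the example above.

```python
import configparser
import io

def rclone_conf(endpoint, remote="s3",
                access_key="minioadmin", secret_key="minioadmin"):
    """Render an rclone.conf with a single MinIO-backed S3 remote."""
    # Interpolation is disabled so '%' in a value is written literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser[remote] = {
        "type": "s3",
        "provider": "minio",
        "env_auth": "false",
        "access_key_id": access_key,
        "secret_access_key": secret_key,
        "endpoint": endpoint,
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()

print(rclone_conf("http://172.18.255.1:9000"))
```

Write the returned text to `./rclone.conf` before running the `rclone` command.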

## Inline RClone Config Example

Before running this, update the MinIO `endpoint` IP address in the example below to match the address exposed by your cluster.


In [161]:
!grpcurl --format text -d '\
         model { \
            meta: { name:"iris"},\
            modelSpec: {uri:"s3://models/iris",\
                        storageConfig: { \
                        storageRcloneConfig: "{\"type\":\"s3\",\"name\":\"s3\",\"parameters\":{\"provider\":\"minio\",\"env_auth\":\"false\",\"access_key_id\":\"minioadmin\",\"secret_access_key\":\"minioadmin\",\"endpoint\":\"http://172.18.255.1:9000\"}}" \
     } \
                        requirements:["sklearn"],\
                        memoryBytes:500},\
            deploymentSpec:{replicas:1}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/LoadModel
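
The `storageRcloneConfig` field holds a JSON document encoded as a string, which is why the cell above needs so many escaped quotes. As a sketch, the nesting can be produced with two `json.dumps` calls instead. `inline_rclone_config` is a hypothetical helper; the field names are taken from the cell above. The resulting body is JSON, so it would be passed to `grpcurl -d` without `--format text`.

```python
import json

def inline_rclone_config(endpoint, name="s3",
                         access_key="minioadmin", secret_key="minioadmin"):
    """Serialise the rclone remote definition to the string the field expects."""
    return json.dumps({
        "type": "s3",
        "name": name,
        "parameters": {
            "provider": "minio",
            "env_auth": "false",
            "access_key_id": access_key,
            "secret_access_key": secret_key,
            "endpoint": endpoint,
        },
    })

request = {
    "model": {
        "meta": {"name": "iris"},
        "modelSpec": {
            "uri": "s3://models/iris",
            "storageConfig": {
                "storageRcloneConfig": inline_rclone_config("http://172.18.255.1:9000"),
            },
            "requirements": ["sklearn"],
            "memoryBytes": 500,
        },
        "deploymentSpec": {"replicas": 1},
    }
}
print(json.dumps(request))  # the second dumps call escapes the inner quotes
```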




In [162]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "99500"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    }
  ]
}


In [163]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x55aeb90b84f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55aeb90b84f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< date: Fri, 24 Dec 2021 19:12:00 GMT
< server: envoy
< content-length: 194
< content-type: application/json
< x-envoy-upstream-service-time: 1015
< 
* Connection #0 to host 172.18.255.3 left intact
{"model_name":"iris","model_version":"1","id":"506618dc-9b16-44c2-b160-0b82e9c78e4d","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

In [164]:
!cd ../v2 && \
    grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
        -plaintext \
        -proto grpc_service.proto \
        -rpc-header Seldon:iris \
       ${MESH_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}


In [165]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/UnloadModel

{
  
}


## K8S Secret RClone Config Example

 * Update the MinIO `endpoint` in the secret below to match your cluster.

In [166]:
%%writefile minio-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
  namespace: seldon-mesh
type: Opaque
stringData:
  s3: |
    type: s3
    name: s3
    parameters:
      provider: minio
      env_auth: false
      access_key_id: minioadmin
      secret_access_key: minioadmin
      endpoint: http://minio.minio-system:9000

Overwriting minio-secret.yaml
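
If the endpoint changes often, the manifest can be rendered from a template rather than edited in place. This is a sketch with `string.Template`; `render_secret` is a hypothetical helper and the manifest text is the one from the cell above.

```python
from string import Template

SECRET_TEMPLATE = Template("""\
apiVersion: v1
kind: Secret
metadata:
  name: $name
  namespace: $namespace
type: Opaque
stringData:
  s3: |
    type: s3
    name: s3
    parameters:
      provider: minio
      env_auth: false
      access_key_id: minioadmin
      secret_access_key: minioadmin
      endpoint: $endpoint
""")

def render_secret(endpoint, name="minio-secret", namespace="seldon-mesh"):
    """Fill the secret manifest with the given MinIO endpoint."""
    return SECRET_TEMPLATE.substitute(
        name=name, namespace=namespace, endpoint=endpoint)

manifest = render_secret("http://minio.minio-system:9000")
print(manifest)
```

The output can be written to `minio-secret.yaml` and applied with `kubectl apply -f` as in the next cell.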


In [167]:
!kubectl apply -f minio-secret.yaml

secret/minio-secret configured


In [168]:
!grpcurl --format text -d '\
         model { \
            meta: { name:"iris"\
                    kubernetesMeta: {namespace: "seldon-mesh"}},\
            modelSpec: {uri:"s3://models/iris",\
                        storageConfig: { \
                          storageSecretName: "minio-secret" \
                         } \
                        requirements:["sklearn"],\
                        memoryBytes:500},\
            deploymentSpec:{replicas:1}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/LoadModel




In [169]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "99500"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    }
  ]
}


In [170]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x563c9b4be4f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x563c9b4be4f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< date: Fri, 24 Dec 2021 19:12:24 GMT
< server: envoy
< content-length: 194
< content-type: application/json
< x-envoy-upstream-service-time: 1223
< 
* Connection #0 to host 172.18.255.3 left intact
{"model_name":"iris","model_version":"1","id":"8a9c59bb-fb53-4bf5-8812-ce4e86bcf1fb","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

In [171]:
!cd ../v2 && \
    grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
        -plaintext \
        -proto grpc_service.proto \
        -rpc-header Seldon:iris \
       ${MESH_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}


In [172]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/UnloadModel

{
  
}


## Using Agent ConfigMap

 * Edit the `seldon-agent` ConfigMap to include `minio-secret` as one of the rclone defaults, e.g.:

```
apiVersion: v1
kind: ConfigMap
metadata:
  name: seldon-agent
data:
  agent.json: |-
    {
       "rclone" : {
           "config_secrets": ["seldon-rclone-gs-public","minio-secret"]
       }
    }
```
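
The `agent.json` value must stay valid JSON after editing. A minimal sketch of making the change programmatically: it parses the existing document, appends the secret name if it is missing, and serialises it again. `add_config_secret` is a hypothetical helper; the `rclone.config_secrets` key is the one shown in the ConfigMap above.

```python
import json

def add_config_secret(agent_json, secret_name):
    """Return agent.json text with secret_name listed under rclone.config_secrets."""
    config = json.loads(agent_json)
    secrets = config.setdefault("rclone", {}).setdefault("config_secrets", [])
    if secret_name not in secrets:
        secrets.append(secret_name)
    return json.dumps(config, indent=2)

current = '{"rclone": {"config_secrets": ["seldon-rclone-gs-public"]}}'
print(add_config_secret(current, "minio-secret"))
```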

In [173]:
%%writefile minio-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
  namespace: seldon-mesh
type: Opaque
stringData:
  s3: |
    type: s3
    name: s3
    parameters:
      provider: minio
      env_auth: false
      access_key_id: minioadmin
      secret_access_key: minioadmin
      endpoint: http://minio.minio-system:9000

Overwriting minio-secret.yaml


In [174]:
!kubectl apply -f minio-secret.yaml

secret/minio-secret configured


In [175]:
!grpcurl --format text -d '\
         model { \
            meta: { name:"iris"\
                    kubernetesMeta: {namespace: "seldon-mesh"}},\
            modelSpec: {uri:"s3://models/iris",\
                        requirements:["sklearn"],\
                        memoryBytes:500},\
            deploymentSpec:{replicas:1}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto   ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/LoadModel




In [176]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ModelStatus

{
  "modelName": "iris",
  "versions": [
    {
      "version": 1,
      "serverName": "mlserver",
      "kubernetesMeta": {
        "namespace": "seldon-mesh"
      },
      "modelReplicaState": {
        "1": {
          "state": "Available",
          "lastChangeTimestamp": "2021-12-24T19:13:00.389439933Z"
        }
      },
      "state": {
        "state": "ModelAvailable",
        "availableReplicas": 1,
        "lastChangeTimestamp": "2021-12-24T19:13:00.389439933Z"
      }
    }
  ]
}


In [177]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "99500"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    }
  ]
}


In [178]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x558e737214f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x558e737214f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< date: Fri, 24 Dec 2021 19:13:05 GMT
< server: envoy
< content-length: 194
< content-type: application/json
< x-envoy-upstream-service-time: 1326
< 
* Connection #0 to host 172.18.255.3 left intact
{"model_name":"iris","model_version":"1","id":"5dbe5286-ea10-4c22-87cb-b6d549862119","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

In [179]:
!cd ../v2 && \
    grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
        -plaintext \
        -proto grpc_service.proto \
        -rpc-header Seldon:iris \
       ${MESH_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}


In [180]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/UnloadModel

{
  
}


## Kubernetes Resource Example

Set up the scheduler as described in the Setup section above.

Operator setup: from the `operator` folder, run:

 * `make kind-image-load`
 * `make deploy`

In [290]:
!kubectl create -f ../../operator/samples/models/sklearn-iris-gs.yaml

model.mlops.seldon.io/iris created


In [291]:
!kubectl wait --for condition=ready --timeout=300s model --all -n seldon-mesh

model.mlops.seldon.io/iris condition met


In [282]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ModelStatus

{
  "modelName": "iris",
  "versions": [
    {
      "version": 1,
      "serverName": "mlserver",
      "kubernetesMeta": {
        "namespace": "seldon-mesh",
        "generation": "1"
      },
      "modelReplicaState": {
        "2": {
          "state": "Available",
          "lastChangeTimestamp": "2021-12-25T10:51:22.788771514Z"
        }
      },
      "state": {
        "state": "ModelAvailable",
        "availableReplicas": 1,
        "lastChangeTimestamp": "2021-12-25T10:51:22.788771514Z"
      }
    }
  ]
}


In [283]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x55656cb904f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55656cb904f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< date: Sat, 25 Dec 2021 10:51:24 GMT
< server: envoy
< content-length: 194
< content-type: application/json
< x-envoy-upstream-service-time: 1066
< 
* Connection #0 to host 172.18.255.3 left intact
{"model_name":"iris","model_version":"1","id":"f93f06c2-bf21-490f-9a21-e9a81157dfcb","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

In [284]:
!cd ../v2 && \
    grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
        -plaintext \
        -proto grpc_service.proto \
        -rpc-header Seldon:iris \
       ${MESH_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}


In [292]:
!kubectl delete -f ../../operator/samples/models/sklearn-iris-gs.yaml

model.mlops.seldon.io "iris" deleted


# Versions Test

In [204]:
!grpcurl -d '{"model":{ \
              "meta":{"name":"iris"},\
              "modelSpec":{"uri":"gs://seldon-models/mlserver/iris",\
                           "requirements":["sklearn"],\
                           "memoryBytes":50000},\
              "deploymentSpec":{"replicas":2}}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/LoadModel

{
  
}


In [205]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -proto ../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ModelStatus

{
  "modelName": "iris",
  "versions": [
    {
      "version": 1,
      "serverName": "mlserver",
      "modelReplicaState": {
        "0": {
          "state": "Available",
          "lastChangeTimestamp": "2021-12-24T19:22:53.134882821Z"
        },
        "2": {
          "state": "Available",
          "lastChangeTimestamp": "2021-12-24T19:22:53.134770245Z"
        }
      },
      "state": {
        "state": "ModelAvailable",
        "availableReplicas": 2,
        "lastChangeTimestamp": "2021-12-24T19:22:53.134882821Z"
      }
    }
  ]
}


In [206]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x55aa3d3d64f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55aa3d3d64f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< date: Fri, 24 Dec 2021 19:22:55 GMT
< server: envoy
< content-length: 194
< content-type: application/json
< x-envoy-upstream-service-time: 1082
< 
* Connection #0 to host 172.18.255.3 left intact
{"model_name":"iris","model_version":"1","id":"84a3e9a5-c96a-45bf-bdc7-a6a399f22949","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

In [207]:
!cd ../v2 && \
    grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
        -plaintext \
        -proto grpc_service.proto \
        -rpc-header Seldon:iris \
       ${MESH_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris",
  "modelVersion": "1",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}


In [208]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -proto ../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "50000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "50000"
    }
  ]
}


In [209]:
!grpcurl -d '{"model":{ \
              "meta":{"name":"iris"},\
              "modelSpec":{"uri":"gs://seldon-models/mlserver/iris",\
                           "requirements":["sklearn"],\
                           "memoryBytes":20000},\
              "deploymentSpec":{"replicas":1}}}' \
         -plaintext \
         -import-path ../../apis \
         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/LoadModel

{
  
}


In [210]:
!grpcurl -d '{"model":{"name":"iris"},"allVersions":true}' \
         -plaintext \
         -proto ../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ModelStatus

{
  "modelName": "iris",
  "versions": [
    {
      "version": 1,
      "serverName": "mlserver",
      "modelReplicaState": {
        "0": {
          "state": "Unloaded",
          "lastChangeTimestamp": "2021-12-24T19:23:02.626285043Z"
        },
        "2": {
          "state": "Unloaded",
          "lastChangeTimestamp": "2021-12-24T19:23:02.622571235Z"
        }
      },
      "state": {
        "state": "ModelTerminated",
        "lastChangeTimestamp": "2021-12-24T19:23:02.626285043Z"
      }
    },
    {
      "version": 2,
      "serverName": "mlserver",
      "modelReplicaState": {
        "1": {
          "state": "Available",
          "lastChangeTimestamp": "2021-12-24T19:23:02.626877637Z"
        }
      },
      "state": {
        "state": "ModelAvailable",
        "availableReplicas": 1,
        "lastChangeTimestamp": "2021-12-24T19:23:02.626877637Z"
      }
    }
  ]
}
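
With `allVersions` set, the response lists every version, so a script has to pick out the one that is serving. A sketch assuming the JSON shape above (`version_states` and `serving_versions` are hypothetical helpers):

```python
def version_states(status):
    """Map each model version number to its overall state string."""
    return {v["version"]: v["state"]["state"]
            for v in status.get("versions", [])}

def serving_versions(status):
    """Versions currently reported as ModelAvailable, in ascending order."""
    return sorted(n for n, s in version_states(status).items()
                  if s == "ModelAvailable")

# Trimmed copy of the response above.
status = {
    "modelName": "iris",
    "versions": [
        {"version": 1, "state": {"state": "ModelTerminated"}},
        {"version": 2, "state": {"state": "ModelAvailable",
                                 "availableReplicas": 1}},
    ],
}
print(version_states(status))
print(serving_versions(status))
```

For the response above, version 1 is reported as terminated and only version 2 is serving. This matches the `model_version` of `"2"` returned by the inference calls that follow.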


In [211]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x55ebc489a4f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55ebc489a4f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 200 OK
< date: Fri, 24 Dec 2021 19:23:05 GMT
< server: envoy
< content-length: 194
< content-type: application/json
< x-envoy-upstream-service-time: 1066
< 
* Connection #0 to host 172.18.255.3 left intact
{"model_name":"iris","model_version":"2","id":"c9b04a7b-3351-43d0-9427-de69a7cbcffd","parameters":null,"outputs":[{"name":"predict","shape":[1],"datatype":"INT64","parameters":null,"data":[2]}]}

In [212]:
!cd ../v2 && \
    grpcurl -d '{"model_name":"iris","inputs":[{"name":"input","contents":{"fp32_contents":[1,2,3,4]},"datatype":"FP32","shape":[1,4]}]}' \
        -plaintext \
        -proto grpc_service.proto \
        -rpc-header Seldon:iris \
       ${MESH_IP}:80 inference.GRPCInferenceService/ModelInfer

{
  "modelName": "iris",
  "modelVersion": "2",
  "outputs": [
    {
      "name": "predict",
      "datatype": "INT64",
      "shape": [
        "1"
      ],
      "contents": {
        "int64Contents": [
          "2"
        ]
      }
    }
  ]
}


In [213]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -proto ../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "80000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    }
  ]
}


In [214]:
!grpcurl -d '{"model":{"name":"iris"}}' \
         -plaintext \
         -proto ../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/UnloadModel

{
  
}


In [215]:
!curl -v http://${MESH_IP}/v2/models/iris/infer -H "Content-Type: application/json" -H "Host: iris"\
        -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

* Expire in 0 ms for 6 (transfer 0x55b4a31ab4f0)
*   Trying 172.18.255.3...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55b4a31ab4f0)
* Connected to 172.18.255.3 (172.18.255.3) port 80 (#0)
> POST /v2/models/iris/infer HTTP/1.1
> Host: iris
> User-Agent: curl/7.64.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 94
> 
* upload completely sent off: 94 out of 94 bytes
< HTTP/1.1 404 Not Found
< date: Fri, 24 Dec 2021 19:23:15 GMT
< server: envoy
< connection: close
< content-length: 0
< 
* Closing connection 0


In [216]:
!grpcurl -d '{"model":{"name":"iris"},"allVersions":true}' \
         -plaintext \
         -proto ../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ModelStatus

ERROR:
  Code: FailedPrecondition
  Message: Failed to find model iris


In [217]:
!grpcurl -d '{"name":"mlserver"}' \
         -plaintext \
         -proto ../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/ServerStatus

{
  "serverName": "mlserver",
  "resources": [
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    },
    {
      "memory": "100000",
      "availableMemoryBytes": "100000"
    }
  ]
}
