# Increasing the Maximum Message Size for gRPC


## Running this notebook

You will need to start Jupyter with settings to allow for large payloads, for example:

```
jupyter notebook --NotebookApp.iopub_data_rate_limit=1000000000
```

## Setup Seldon Core

Use the setup notebook to [Setup Cluster](seldon_core_setup.ipynb#Setup-Cluster) with [Ambassador Ingress](seldon_core_setup.ipynb#Ambassador) and [Install Seldon Core](seldon_core_setup.ipynb#Install-Seldon-Core). The instructions are [also available online](./seldon_core_setup.html).

In [None]:
!kubectl create namespace seldon

In [None]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon

In [None]:
!pygmentize resources/model_long_timeouts.json

## Create Seldon Deployment

Deploy the runtime graph to Kubernetes.

In [None]:
!kubectl apply -f resources/model_long_timeouts.json -n seldon

In [None]:
!kubectl rollout status deploy/model-long-timeout-test-0

## Get predictions

In [4]:
from seldon_core.seldon_client import SeldonClient

# Raise the client-side gRPC limits to 50 MiB in both directions.
sc = SeldonClient(deployment_name="model-long-timeout", namespace="seldon",
                  grpc_max_send_message_length=50 * 1024 * 1024,
                  grpc_max_receive_message_length=50 * 1024 * 1024)
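
The two client limits set above appear to correspond to the standard gRPC channel arguments `grpc.max_send_message_length` and `grpc.max_receive_message_length`. The following is a minimal sketch of how those options could be built for a raw gRPC channel. The helper name `grpc_size_options` is hypothetical, and whether `SeldonClient` forwards its keyword arguments in exactly this way is an assumption.

```python
def grpc_size_options(max_send_bytes, max_receive_bytes):
    """Build gRPC channel options that raise the message size limits.

    Hypothetical helper for illustration; the option keys are the standard
    gRPC channel argument names.
    """
    return [
        ("grpc.max_send_message_length", max_send_bytes),
        ("grpc.max_receive_message_length", max_receive_bytes),
    ]


FIFTY_MIB = 50 * 1024 * 1024
options = grpc_size_options(FIFTY_MIB, FIFTY_MIB)
print(options)

# With grpcio installed, these options would be passed when opening a channel,
# e.g. grpc.insecure_channel(target, options=options), where `target` is the
# gateway address (not specified in this notebook).
```

Raising the client limits alone is not sufficient: the server side must also accept the larger message.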


Send a small request, which should succeed.

In [None]:
r = sc.predict(gateway="ambassador", transport="grpc")
assert r.success
print(r)

Send a large request, which will fail because the model's default gRPC maximum message size is 4 MB.

In [None]:
r = sc.predict(gateway="ambassador",transport="grpc",shape=(1000000,1))
print(r.success,r.msg)
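
A rough size estimate shows why this request is rejected. The sketch below assumes each value is serialized as an 8-byte double and ignores protobuf framing overhead, so it gives a lower bound. The function name `approx_payload_bytes` is hypothetical.

```python
import math

# gRPC's default maximum receive size is 4 MiB.
GRPC_DEFAULT_MAX_BYTES = 4 * 1024 * 1024
# Assumption: each tensor value is sent as an 8-byte double.
BYTES_PER_VALUE = 8


def approx_payload_bytes(shape):
    """Estimate the raw data size of a dense tensor, ignoring protobuf overhead."""
    return math.prod(shape) * BYTES_PER_VALUE


large = approx_payload_bytes((1000000, 1))
# True means the payload is larger than the default limit.
print(large, large > GRPC_DEFAULT_MAX_BYTES)
```

Under these assumptions the request carries about 8 MB of data, roughly twice the default limit.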

In [None]:
!kubectl delete -f resources/model_long_timeouts.json

## Allowing larger gRPC messages

Now we change our SeldonDeployment to include an annotation for the maximum gRPC message size.
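
The annotation is a string value under `spec.annotations` with the key `seldon.io/grpc-max-message-size`, as in the resource file used in this section. As a minimal sketch, it could also be set programmatically on a deployment loaded as a Python dict. The helper `with_grpc_max_size` and the stand-in `base` resource are hypothetical and only illustrate where the key goes.

```python
import copy
import json

GRPC_SIZE_ANNOTATION = "seldon.io/grpc-max-message-size"


def with_grpc_max_size(deployment, max_bytes):
    """Return a copy of a SeldonDeployment dict with the size annotation set."""
    updated = copy.deepcopy(deployment)
    annotations = updated.setdefault("spec", {}).setdefault("annotations", {})
    # Annotation values are strings in the resource used in this notebook.
    annotations[GRPC_SIZE_ANNOTATION] = str(max_bytes)
    return updated


# Minimal stand-in for a deployment; a real resource also needs predictors.
base = {"kind": "SeldonDeployment", "metadata": {"name": "seldon-model"}, "spec": {}}
patched = with_grpc_max_size(base, 10000000)
print(json.dumps(patched["spec"], indent=4))
```

Working on a deep copy leaves the original dict untouched, which makes it easy to compare the resource before and after the change.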

In [7]:
!pygmentize resources/model_grpc_size.json

{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "seldon-model"
    },
    "spec": {
        "annotations": {
            "seldon.io/grpc-max-message-size": "10000000",
            "seldon.io/rest-timeout": "100000",
            "seldon.io/grpc-timeout": "100000"
        },
        "name": "test-deployment",
        "predictors": [
            {
                "componentSpecs": [{
                    "spec": {
                        "container

In [19]:
!kubectl create -f resources/model_grpc_size.json -n seldon

seldondeployment.machinelearning.seldon.io/seldon-model created


In [20]:
!kubectl rollout status deploy/seldon-model-grpc-size-0 

Waiting for deployment "seldon-model-grpc-size-0" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-grpc-size-0" successfully rolled out


Send a request via Ambassador. This should succeed.

In [21]:
sc = SeldonClient(deployment_name="seldon-model",namespace="seldon",
                  grpc_max_send_message_length=50 * 1024 * 1024, grpc_max_receive_message_length=50 * 1024 * 1024)
r = sc.predict(gateway="ambassador", transport="grpc", shape=(1000000, 1))
assert r.success
print(r.success)

True


In [22]:
!kubectl delete -f resources/model_grpc_size.json -n seldon

seldondeployment.machinelearning.seldon.io "seldon-model" deleted
