Merge branch 'kserve:main' into main

kserve · Apr 2, 2024 · 94a9d51 · 94a9d51
2 parents b697ff4 + e27b5e1
commit 94a9d51

Showing 14 changed files with 542 additions and 21 deletions.
diff --git a/docs/admin/kubernetes_deployment.md b/docs/admin/kubernetes_deployment.md
@@ -7,9 +7,9 @@ Kubernetes version.
 ## Recommended Version Matrix
 | Kubernetes Version | Recommended Istio Version |
 |:-------------------|:--------------------------|
-| 1.25               | 1.15, 1.16                |
-| 1.26               | 1.17                      |
-| 1.27               | 1.17, 1.18                |
+| 1.27               | 1.18, 1.19                |
+| 1.28               | 1.19, 1.20                |
+| 1.29               | 1.20, 1.21                |
 
 ## 1. Install Istio 
 
@@ -46,14 +46,14 @@ The minimally required Cert Manager version is 1.9.0 and you can refer to [Cert
 
 === "kubectl"
     ```bash
-    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve.yaml
+    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve.yaml
     ```
 
 Install KServe default serving runtimes:
 
 === "kubectl"
     ```bash
-    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve-runtimes.yaml
+    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve-runtimes.yaml
     ```
 
 **ii. Change default deployment mode and ingress option**

diff --git a/docs/admin/serverless/serverless.md b/docs/admin/serverless/serverless.md
@@ -8,9 +8,9 @@ Kubernetes version.
 ## Recommended Version Matrix
 | Kubernetes Version | Recommended Istio Version | Recommended Knative Version |
 |:-------------------|:--------------------------|:----------------------------|
-| 1.25               | 1.15, 1.16                | 1.4-1.9                     |
-| 1.26               | 1.17                      | 1.7-1.11                    |
-| 1.27               | 1.17,1.18                 | 1.9-1.11                    |
+| 1.27               | 1.18, 1.19                | 1.10, 1.11                  |
+| 1.28               | 1.19, 1.20                | 1.11, 1.12.4                |
+| 1.29               | 1.20, 1.21                | 1.12.4, 1.13.1              |
 
 ## 1. Install Knative Serving
 Please refer to [Knative Serving install guide](https://knative.dev/docs/admin/install/serving/install-serving-with-yaml/).
@@ -20,7 +20,7 @@ Please refer to [Knative Serving install guide](https://knative.dev/docs/admin/i
     you need to turn on the corresponding [feature flags](https://knative.dev/docs/admin/serving/feature-flags) in your Knative configuration.
 
 !!! warning
-    In Knative 1.8, The cluster domain suffix is changed to `svc.cluster.local` as the default domain. As routes using the cluster domain suffix are not exposed through Ingress, you will need to [configure DNS](https://knative.dev/docs/install/yaml-install/serving/install-serving-with-yaml/#configure-dns) in order to expose their services (most users probably already are).
+    Knative 1.13.1 requires Istio 1.20+; gRPC routing does not work with previous Istio releases. See the [release notes](https://github.com/knative/serving/releases/tag/knative-v1.13.1).
 
 ## 2. Install Networking Layer
 The recommended networking layer for KServe is [Istio](https://istio.io/) as currently it works best with KServe, please refer to the [Istio install guide](https://knative.dev/docs/admin/install/installing-istio).
@@ -35,14 +35,14 @@ The minimally required Cert Manager version is 1.9.0 and you can refer to [Cert
 ## 4. Install KServe
 === "kubectl"
     ```bash
-    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve.yaml
+    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve.yaml
     ```
 
 ## 5. Install KServe Built-in ClusterServingRuntimes
 
 === "kubectl"
     ```bash
-    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve-runtimes.yaml
+    kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve-runtimes.yaml
     ```
 
 !!! note

diff --git a/docs/modelserving/autoscaling/autoscale-gpu-new.yaml b/docs/modelserving/autoscaling/autoscale-gpu-new.yaml
@@ -4,6 +4,8 @@ metadata:
   name: "flowers-sample-gpu"
 spec:
   predictor:
+    scaleTarget: 1
+    scaleMetric: concurrency
     model:
       modelFormat:
         name: tensorflow

diff --git a/docs/modelserving/autoscaling/autoscale-new.yaml b/docs/modelserving/autoscaling/autoscale-new.yaml
@@ -2,10 +2,10 @@ apiVersion: "serving.kserve.io/v1beta1"
 kind: "InferenceService"
 metadata:
   name: "flowers-sample"
-  annotations:
-    autoscaling.knative.dev/target: "1"
 spec:
   predictor:
+    scaleTarget: 1
+    scaleMetric: concurrency
     model:
       modelFormat:
         name: tensorflow

diff --git a/docs/modelserving/autoscaling/autoscaling.md b/docs/modelserving/autoscaling/autoscaling.md
@@ -248,6 +248,8 @@ Apply the tensorflow gpu example CR
       name: "flowers-sample-gpu"
     spec:
       predictor:
+        scaleTarget: 1
+        scaleMetric: concurrency
         model:
           modelFormat:
             name: tensorflow
@@ -265,6 +267,8 @@ Apply the tensorflow gpu example CR
     kind: "InferenceService"
     metadata:
       name: "flowers-sample-gpu"
+      annotations:
+        autoscaling.knative.dev/target: "1"
     spec:
       predictor:
         tensorflow:

diff --git a/docs/modelserving/certificate/images/cert-global-ca-bundle.png b/docs/modelserving/certificate/images/cert-global-ca-bundle.png
diff --git a/docs/modelserving/certificate/images/cert-global-way.png b/docs/modelserving/certificate/images/cert-global-way.png
diff --git a/docs/modelserving/certificate/images/cert-local-ca-bundle.png b/docs/modelserving/certificate/images/cert-local-ca-bundle.png
diff --git a/docs/modelserving/certificate/kserve.md b/docs/modelserving/certificate/kserve.md
@@ -0,0 +1,144 @@
+# KServe with Self Signed Certificate Model Registry
+
+If you are using a model registry with a self-signed certificate, you must either skip SSL verification or apply the appropriate CA bundle to the storage-initializer so that it can connect to the registry.
+This document explains the three methods available in KServe:
+
+- Configure CA bundle for storage-initializer
+    - Global configuration
+    - Namespace scope configuration (using the `storage-config` Secret)
+        - json
+        - annotation
+- Skip SSL Verification
+
+(NOTE) This is only available for `RawDeployment` and `ServerlessDeployment`. For ModelMesh, add the CA bundle content to the [`certificate` parameter in `storage-config`](https://github.com/kserve/modelmesh-serving/blob/bba0cec8ca8c6c6f19958696f39b27b5b49cadd8/docs/predictors/setup-storage.md?plain=1#L65).
+## Configure CA bundle for storage-initializer
+### Global Configuration
+
+KServe uses the `inferenceservice-config` ConfigMap for its default configuration. To add a CA bundle certificate for every inference service, set `caBundleConfigMapName` in that ConfigMap. Before updating it, create a ConfigMap for the CA bundle certificate in the namespace where the KServe controller is running; the data key in that ConfigMap must be `cabundle.crt`.
+
+![Image1](./images/cert-global-way.png)
+
+
+- Create CA ConfigMap with the CA bundle cert
+  ~~~
+  kubectl create configmap cabundle -n kserve --from-file=/path/to/cabundle.crt
+
+  kubectl get configmap cabundle -n kserve -o yaml
+  apiVersion: v1
+  data:
+    cabundle.crt: XXXXX
+  kind: ConfigMap
+  metadata:
+    name: cabundle
+    namespace: kserve
+  ~~~
+- Update `inferenceservice-config` ConfigMap 
+  ~~~
+    storageInitializer: |-
+      {
+          ...
+          "caBundleConfigMapName": "cabundle",
+          ...
+      }
+  ~~~
+
+After you update this configuration, restart the KServe controller pod to pick up the change.
+
+When you create an inference service, the CA bundle is copied to your user namespace and attached to the storage-initializer container.
+
+![Image2](./images/cert-global-ca-bundle.png){ style="display: block; margin: 0 auto" }
+
+### Using storage-config Secret
+
+To apply the CA bundle only to a specific inference service, use an annotation or the `cabundle_configmap` variable on the `storage-config` Secret used by that inference service.
+In this case, you must create the CA bundle ConfigMap in the user namespace before you create the inference service.
+
+![Image3](./images/cert-local-ca-bundle.png){ style="display: block; margin: 0 auto" }
+
+
+- Create a ConfigMap with the cabundle cert
+  ~~~
+  kubectl create configmap local-cabundle -n kserve-demo --from-file=/path/to/cabundle.crt
+
+  kubectl get configmap local-cabundle -n kserve-demo -o yaml
+  apiVersion: v1
+  data:
+    cabundle.crt: XXXXX
+  kind: ConfigMap
+  metadata:
+    name: local-cabundle
+    namespace: kserve-demo
+  ~~~
+
+- Add the annotation `serving.kserve.io/s3-cabundle-configmap` to the `storage-config` Secret
+  ~~~
+  apiVersion: v1
+  data:
+    AWS_ACCESS_KEY_ID: VEhFQUNDRVNTS0VZ
+    AWS_SECRET_ACCESS_KEY: VEhFUEFTU1dPUkQ=
+  kind: Secret
+  metadata:
+    annotations:
+      serving.kserve.io/s3-cabundle-configmap: local-cabundle
+      ...
+    name: storage-config
+    namespace: kserve-demo
+  type: Opaque
+  ~~~
+
+- Or, set the `cabundle_configmap` variable in the `storage-config` Secret
+  ~~~
+  apiVersion: v1
+  stringData:
+    localMinIO: |
+      {
+        "type": "s3",
+        ...
+        "cabundle_configmap": "local-cabundle"
+      }
+  kind: Secret
+  metadata:
+    name: storage-config
+    namespace: kserve-demo
+  type: Opaque
+  ~~~
+
+## Skip SSL Verification
+
+For testing purposes, or when no CA bundle is available, you can connect to the registry by disabling SSL verification.
+This is configured by adding an annotation or setting a variable in the `storage-config` Secret.
+
+- Add the annotation `serving.kserve.io/s3-verifyssl` to the `storage-config` Secret
+  ~~~
+  apiVersion: v1
+  data:
+    AWS_ACCESS_KEY_ID: VEhFQUNDRVNTS0VZ
+    AWS_SECRET_ACCESS_KEY: VEhFUEFTU1dPUkQ=
+  kind: Secret
+  metadata:
+    annotations:
+      serving.kserve.io/s3-verifyssl: "0" # 1 is true, 0 is false
+      ...
+    name: storage-config
+    namespace: kserve-demo
+  type: Opaque
+  ~~~
+
+- Or, set the `verify_ssl` variable in the `storage-config` Secret
+  ~~~
+  apiVersion: v1
+  stringData:
+    localMinIO: |
+      {
+        "type": "s3",
+        ...
+        "verify_ssl": "0"  # 1 is true, 0 is false  (You can set True/true/False/false too)
+      }
+  kind: Secret
+  metadata:
+    name: storage-config
+    namespace: kserve-demo
+  type: Opaque
+  ~~~
+
+[Full Demo Scripts](./full-demo.md)
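The `data` values in the Secret manifests above are base64-encoded placeholders. As a minimal sketch, assuming a shell with GNU coreutils `base64` (the decode flag may differ on other platforms), the sample strings can be decoded, and a value re-encoded for the `data` field, like this:

```bash
# Decode the sample values from the Secret manifests (placeholders, not real credentials).
ACCESS_KEY=$(printf '%s' 'VEhFQUNDRVNTS0VZ' | base64 -d)
SECRET_KEY=$(printf '%s' 'VEhFUEFTU1dPUkQ=' | base64 -d)
echo "access key: ${ACCESS_KEY}"
echo "secret key: ${SECRET_KEY}"

# Encode a value for the data field; printf avoids the trailing newline that echo would add.
ENCODED=$(printf '%s' "${ACCESS_KEY}" | base64)
echo "encoded: ${ENCODED}"
```

Values placed under `stringData` instead of `data` are written in plain text and need no encoding.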
diff --git a/docs/modelserving/storage/gcs/gcs.md b/docs/modelserving/storage/gcs/gcs.md
@@ -0,0 +1,114 @@
+# Deploy InferenceService with a saved model on Google Cloud Storage (GCS)
+
+## Using Public GCS Bucket
+
+If no credential is provided, an anonymous client is used to download the artifact from the GCS bucket.
+The URI has the following format:
+
+
+```
+gs://${BUCKET_NAME}/${PATH}
+```
+
+e.g. ```gs://kfserving-examples/models/tensorflow/flowers```
+
+
+## Using Private GCS Bucket
+
+KServe supports authentication with a Google Service Account Key.
+
+### Create a Service Account Key
+
+* To create a Service Account Key, follow the steps [here](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console).
+* Base64-encode the generated Service Account Key file.
+
+
+## Create Google Secret
+
+### Create secret
+=== "yaml"
+```yaml
+apiVersion: v1
+kind: Secret
+metadata:
+  name: storage-config
+type: Opaque
+stringData:
+  gcs: |
+    {
+      "type": "gs",
+      "bucket": "mlpipeline",
+      "base64_service_account": "c2VydmljZWFjY291bnQ=" # base64 encoded value of the credential file
+    }
+```
+
+=== "kubectl"
+```bash
+kubectl apply -f create-gcs-secret.yaml
+```
+
+## Deploy the model on GCS with `InferenceService`
+
+Create the InferenceService with the Google service account credential
+=== "yaml"
+```yaml
+apiVersion: serving.kserve.io/v1beta1
+kind: InferenceService
+metadata:
+  name: sklearn-gcs
+spec:
+  predictor:
+    sklearn:
+      storage:
+        key: gcs
+        path: models/tensorflow/flowers
+        parameters: # Parameters to override the default values
+          bucket: kfserving-examples
+```
+
+Apply the `sklearn-gcs.yaml`.
+
+=== "kubectl"
+```bash
+kubectl apply -f sklearn-gcs.yaml
+```
+
+## Run a prediction
+
+The ingress can now be accessed at `${INGRESS_HOST}:${INGRESS_PORT}`. If these variables are not set, follow [this instruction](../../../get_started/first_isvc.md#4-determine-the-ingress-ip-and-ports)
+to find the ingress IP and port.
+
+```bash
+SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-gcs -o jsonpath='{.status.url}' | cut -d "/" -f 3)
+
+MODEL_NAME=sklearn-gcs
+INPUT_PATH=@./input.json
+curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/${MODEL_NAME}:predict" -d "${INPUT_PATH}"
+```
+
+!!! success "Expected Output"
+
+    ```{ .bash .no-copy }
+    *   Trying 127.0.0.1:8080...
+    * TCP_NODELAY set
+    * Connected to localhost (127.0.0.1) port 8080 (#0)
+    > POST /v1/models/sklearn-gcs:predict HTTP/1.1
+    > Host: sklearn-gcs.default.example.com
+    > User-Agent: curl/7.68.0
+    > Accept: */*
+    > Content-Length: 84
+    > Content-Type: application/x-www-form-urlencoded
+    >
+    * upload completely sent off: 84 out of 84 bytes
+    * Mark bundle as not supporting multiuse
+    < HTTP/1.1 200 OK
+    < content-length: 23
+    < content-type: application/json; charset=UTF-8
+    < date: Mon, 20 Sep 2021 04:55:50 GMT
+    < server: istio-envoy
+    < x-envoy-upstream-service-time: 6
+    <
+    * Connection #0 to host localhost left intact
+    {"predictions": [1, 1]}
+    ```
+
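The `cut -d "/" -f 3` step in the prediction commands above strips the scheme from the InferenceService URL so that only the host remains for the `Host` header. A minimal sketch, assuming a hypothetical status URL that matches the host shown in the expected output:

```bash
# Hypothetical value of .status.url; a real one comes from `kubectl get inferenceservice`.
STATUS_URL="http://sklearn-gcs.default.example.com"

# Splitting on "/" yields "http:", an empty field, and then the host, so field 3 is the host.
SERVICE_HOSTNAME=$(echo "${STATUS_URL}" | cut -d "/" -f 3)
echo "${SERVICE_HOSTNAME}"
```

If the URL ever carried a path after the host, field 3 would still be the host, since the path starts at field 4.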