Merge branch 'kserve:main' into main
Syntax-Error-1337 committed Apr 2, 2024
2 parents b697ff4 + e27b5e1 commit 94a9d51
Showing 14 changed files with 542 additions and 21 deletions.
10 changes: 5 additions & 5 deletions docs/admin/kubernetes_deployment.md
@@ -7,9 +7,9 @@ Kubernetes version.
## Recommended Version Matrix
| Kubernetes Version | Recommended Istio Version |
|:-------------------|:--------------------------|
-| 1.25 | 1.15, 1.16 |
-| 1.26 | 1.17 |
-| 1.27 | 1.17, 1.18 |
+| 1.27 | 1.18, 1.19 |
+| 1.28 | 1.19, 1.20 |
+| 1.29 | 1.20, 1.21 |

## 1. Install Istio

@@ -46,14 +46,14 @@ The minimally required Cert Manager version is 1.9.0 and you can refer to [Cert

=== "kubectl"
```bash
-kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve.yaml
+kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve.yaml
```

Install KServe default serving runtimes:

=== "kubectl"
```bash
-kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve-runtimes.yaml
+kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve-runtimes.yaml
```

**ii. Change default deployment mode and ingress option**
12 changes: 6 additions & 6 deletions docs/admin/serverless/serverless.md
@@ -8,9 +8,9 @@ Kubernetes version.
## Recommended Version Matrix
| Kubernetes Version | Recommended Istio Version | Recommended Knative Version |
|:-------------------|:--------------------------|:----------------------------|
-| 1.25 | 1.15, 1.16 | 1.4-1.9 |
-| 1.26 | 1.17 | 1.7-1.11 |
-| 1.27 | 1.17,1.18 | 1.9-1.11 |
+| 1.27 | 1.18,1.19 | 1.10,1.11 |
+| 1.28 | 1.19,1.20 | 1.11,1.12.4 |
+| 1.29 | 1.20,1.21 | 1.12.4,1.13.1 |

## 1. Install Knative Serving
Please refer to [Knative Serving install guide](https://knative.dev/docs/admin/install/serving/install-serving-with-yaml/).
@@ -20,7 +20,7 @@ Please refer to [Knative Serving install guide](https://knative.dev/docs/admin/i
you need to turn on the corresponding [feature flags](https://knative.dev/docs/admin/serving/feature-flags) in your Knative configuration.

!!! warning
-In Knative 1.8, The cluster domain suffix is changed to `svc.cluster.local` as the default domain. As routes using the cluster domain suffix are not exposed through Ingress, you will need to [configure DNS](https://knative.dev/docs/install/yaml-install/serving/install-serving-with-yaml/#configure-dns) in order to expose their services (most users probably already are).
+Knative 1.13.1 requires Istio 1.20+, gRPC routing does not work with previous Istio releases, see [release notes](https://github.com/knative/serving/releases/tag/knative-v1.13.1).

## 2. Install Networking Layer
The recommended networking layer for KServe is [Istio](https://istio.io/) as currently it works best with KServe, please refer to the [Istio install guide](https://knative.dev/docs/admin/install/installing-istio).
@@ -35,14 +35,14 @@ The minimally required Cert Manager version is 1.9.0 and you can refer to [Cert
## 4. Install KServe
=== "kubectl"
```bash
-kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve.yaml
+kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve.yaml
```

## 5. Install KServe Built-in ClusterServingRuntimes

=== "kubectl"
```bash
-kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve-runtimes.yaml
+kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.12.0/kserve-runtimes.yaml
```

!!! note
2 changes: 2 additions & 0 deletions docs/modelserving/autoscaling/autoscale-gpu-new.yaml
@@ -4,6 +4,8 @@ metadata:
   name: "flowers-sample-gpu"
 spec:
   predictor:
+    scaleTarget: 1
+    scaleMetric: concurrency
     model:
       modelFormat:
         name: tensorflow
4 changes: 2 additions & 2 deletions docs/modelserving/autoscaling/autoscale-new.yaml
@@ -2,10 +2,10 @@ apiVersion: "serving.kserve.io/v1beta1"
 kind: "InferenceService"
 metadata:
   name: "flowers-sample"
-  annotations:
-    autoscaling.knative.dev/target: "1"
 spec:
   predictor:
+    scaleTarget: 1
+    scaleMetric: concurrency
     model:
       modelFormat:
         name: tensorflow
4 changes: 4 additions & 0 deletions docs/modelserving/autoscaling/autoscaling.md
@@ -248,6 +248,8 @@ Apply the tensorflow gpu example CR
   name: "flowers-sample-gpu"
 spec:
   predictor:
+    scaleTarget: 1
+    scaleMetric: concurrency
     model:
       modelFormat:
         name: tensorflow
@@ -265,6 +267,8 @@ Apply the tensorflow gpu example CR
 kind: "InferenceService"
 metadata:
   name: "flowers-sample-gpu"
+  annotations:
+    autoscaling.knative.dev/target: "1"
 spec:
   predictor:
     tensorflow:
144 changes: 144 additions & 0 deletions docs/modelserving/certificate/kserve.md
@@ -0,0 +1,144 @@
# KServe with Self Signed Certificate Model Registry

If you are using a model registry with a self-signed certificate, you must either skip SSL verification or provide the appropriate CA bundle to the storage-initializer so that it can connect to the registry.
This document explains three methods that can be used in KServe, described below:

- Configure CA bundle for storage-initializer
    - Global configuration
    - Namespace scope configuration (using `storage-config` Secret)
        - json
        - annotation
- Skip SSL Verification

(NOTE) This is only available for `RawDeployment` and `ServerlessDeployment`. For ModelMesh, add the CA bundle content to the [`certificate` parameter in `storage-config`](https://github.com/kserve/modelmesh-serving/blob/bba0cec8ca8c6c6f19958696f39b27b5b49cadd8/docs/predictors/setup-storage.md?plain=1#L65).
## Configure CA bundle for storage-initializer
### Global Configuration

KServe uses the `inferenceservice-config` ConfigMap for its default configuration. To add a `cabundle` cert to every inference service, set `caBundleConfigMapName` in that ConfigMap. Before updating it, create a ConfigMap that holds the CA bundle certificate in the namespace where the KServe controller is running; the data key in that ConfigMap must be `cabundle.crt`.

![Image1](./images/cert-global-way.png)


- Create CA ConfigMap with the CA bundle cert
~~~
kubectl create configmap cabundle --from-file=/path/to/cabundle.crt -n kserve
kubectl get configmap cabundle -n kserve -o yaml
apiVersion: v1
data:
  cabundle.crt: XXXXX
kind: ConfigMap
metadata:
  name: cabundle
  namespace: kserve
~~~
- Update `inferenceservice-config` ConfigMap
~~~
storageInitializer: |-
    {
        ...
        "caBundleConfigMapName": "cabundle",
        ...
    }
~~~

After you update this configuration, restart the KServe controller pod so that it picks up the change.
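
The check and restart described above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: it assumes the controller runs as a Deployment named `kserve-controller-manager` in the `kserve` namespace (verify both in your cluster), and the function names `show_cabundle` and `restart_kserve_controller` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: the Deployment name and namespace defaults are assumptions.

# Print the certificate stored under the required `cabundle.crt` key.
# Empty output suggests the key is missing or misnamed.
show_cabundle() {
  local namespace="${1:-kserve}" configmap="${2:-cabundle}"
  kubectl get configmap "$configmap" -n "$namespace" \
    -o jsonpath='{.data.cabundle\.crt}'
}

# Restart the controller so that it re-reads inferenceservice-config.
restart_kserve_controller() {
  local namespace="${1:-kserve}" deployment="${2:-kserve-controller-manager}"
  kubectl rollout restart "deployment/${deployment}" -n "$namespace"
}

# Usage:
#   show_cabundle kserve cabundle
#   restart_kserve_controller kserve
```

Wrapping the commands in functions keeps the namespace and resource names in one place, which helps if your installation differs from the assumed defaults.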

When you create an inference service, the CA bundle is copied to your user namespace and attached to the storage-initializer container.

![Image2](./images/cert-global-ca-bundle.png){ style="display: block; margin: 0 auto" }

### Using storage-config Secret

To apply the cabundle only to a specific inferenceservice, set either the `serving.kserve.io/s3-cabundle-configmap` annotation or the `cabundle_configmap` variable on the `storage-config` Secret used by that inferenceservice.
In this case, you must create the cabundle ConfigMap in the user namespace before you create the inferenceservice.

![Image3](./images/cert-local-ca-bundle.png){ style="display: block; margin: 0 auto" }


- Create a ConfigMap with the cabundle cert
~~~
kubectl create configmap local-cabundle --from-file=/path/to/cabundle.crt -n kserve-demo
kubectl get configmap local-cabundle -n kserve-demo -o yaml
apiVersion: v1
data:
  cabundle.crt: XXXXX
kind: ConfigMap
metadata:
  name: local-cabundle
  namespace: kserve-demo
~~~

- Add an annotation `serving.kserve.io/s3-cabundle-configmap` to `storage-config` Secret
~~~
apiVersion: v1
data:
  AWS_ACCESS_KEY_ID: VEhFQUNDRVNTS0VZ
  AWS_SECRET_ACCESS_KEY: VEhFUEFTU1dPUkQ=
kind: Secret
metadata:
  annotations:
    serving.kserve.io/s3-cabundle-configmap: local-cabundle
    ...
  name: storage-config
  namespace: kserve-demo
type: Opaque
~~~

- Or, set a variable `cabundle_configmap` to `storage-config` Secret
~~~
apiVersion: v1
stringData:
  localMinIO: |
    {
      "type": "s3",
      ....
      "cabundle_configmap": "local-cabundle"
    }
kind: Secret
metadata:
  name: storage-config
  namespace: kserve-demo
type: Opaque
~~~
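
The two Secret variants above differ in encoding: values under `data` are base64-encoded, while `stringData` takes plain text. The sketch below illustrates the round trip for the placeholder credentials in the annotation example; `encode_secret_value` and `decode_secret_value` are hypothetical helper names, and `base64 -d` assumes a GNU-style `base64` tool.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of KServe or kubectl.

# Encode a plain-text value for use under a Secret's `data` field.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a value copied from a Secret's `data` field.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

encode_secret_value 'THEACCESSKEY'; echo       # VEhFQUNDRVNTS0VZ
decode_secret_value 'VEhFUEFTU1dPUkQ='; echo   # THEPASSWORD
```

Using `printf '%s'` rather than `echo` avoids encoding a trailing newline into the value.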

## Skip SSL Verification

For testing purposes, or when no cabundle is available, you can establish the connection by disabling SSL verification.
This is done by adding an annotation or setting a variable in the `storage-config` Secret.

- Add an annotation(`serving.kserve.io/s3-verifyssl`) to `storage-config` Secret
~~~
apiVersion: v1
data:
  AWS_ACCESS_KEY_ID: VEhFQUNDRVNTS0VZ
  AWS_SECRET_ACCESS_KEY: VEhFUEFTU1dPUkQ=
kind: Secret
metadata:
  annotations:
    serving.kserve.io/s3-verifyssl: "0" # 1 is true, 0 is false
    ...
  name: storage-config
  namespace: kserve-demo
type: Opaque
~~~

- Or, set a variable (`verify_ssl`) to `storage-config` Secret
~~~
apiVersion: v1
stringData:
  localMinIO: |
    {
      "type": "s3",
      ...
      "verify_ssl": "0" # 1 is true, 0 is false (You can set True/true/False/false too)
    }
kind: Secret
metadata:
  name: storage-config
  namespace: kserve-demo
type: Opaque
~~~
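
The comments above list several accepted spellings for the flag. As an illustration only, a wrapper script could normalize them before writing the Secret; `normalize_verify_ssl` is a hypothetical helper and does not reproduce KServe's own parsing logic.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps the spellings mentioned above to "true"/"false".
normalize_verify_ssl() {
  case "$1" in
    1|true|True)   echo "true" ;;
    0|false|False) echo "false" ;;
    *)
      # Reject anything the document does not list as accepted.
      echo "unsupported verify_ssl value: $1" >&2
      return 1
      ;;
  esac
}

# Usage:
#   normalize_verify_ssl 0      # prints: false
#   normalize_verify_ssl True   # prints: true
```

Rejecting unknown values explicitly makes a typo visible instead of silently leaving verification in an unexpected state.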

[Full Demo Scripts](./full-demo.md)
114 changes: 114 additions & 0 deletions docs/modelserving/storage/gcs/gcs.md
@@ -0,0 +1,114 @@
# Deploy InferenceService with a saved model on Google Cloud Storage (GCS)

## Using Public GCS Bucket

If no credential is provided, an anonymous client is used to download the artifact from the GCS bucket.
The URI has the following format:


```
gs://${BUCKET_NAME}/${PATH}
```

e.g. ```gs://kfserving-examples/models/tensorflow/flowers```
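
To make the URI layout concrete, the sketch below splits a `gs://` URI into its bucket and path using shell parameter expansion. `parse_gs_uri` is a hypothetical helper written for this illustration; KServe performs its own parsing internally.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints "<bucket> <path>" for a gs:// URI.
parse_gs_uri() {
  local uri="$1"
  case "$uri" in
    gs://*) ;;
    *) echo "not a gs:// URI: $uri" >&2; return 1 ;;
  esac
  local rest="${uri#gs://}"    # drop the scheme
  local bucket="${rest%%/*}"   # everything before the first "/"
  local path=""
  if [ "$rest" != "$bucket" ]; then
    path="${rest#*/}"          # everything after the first "/"
  fi
  printf '%s %s\n' "$bucket" "$path"
}

parse_gs_uri "gs://kfserving-examples/models/tensorflow/flowers"
# prints: kfserving-examples models/tensorflow/flowers
```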


## Using Private GCS bucket

KServe supports authentication with a Google Service Account Key.

### Create a Service Account Key

* To create a Service Account Key follow the steps [here](https://cloud.google.com/iam/docs/keys-create-delete#iam-service-account-keys-create-console).
* Base64 encode the generated Service Account Key file
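
The encoding step could look like the following sketch. The function name `encode_service_account_key` and the file name in the usage line are hypothetical; the output is a single-line base64 string suitable for a field such as `base64_service_account`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: base64-encode a key file onto a single line.
encode_service_account_key() {
  local key_file="$1"
  if [ ! -r "$key_file" ]; then
    echo "cannot read key file: $key_file" >&2
    return 1
  fi
  # Strip line wraps so the value can be pasted into JSON as one string.
  base64 < "$key_file" | tr -d '\n'
}

# Usage (file name is an example):
#   encode_service_account_key ./service-account-key.json
```

Piping through `tr -d '\n'` avoids relying on the `-w 0` flag, which not every `base64` implementation provides.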


## Create Google Secret

### Create secret
=== "yaml"
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: storage-config
type: Opaque
stringData:
  gcs: |
    {
      "type": "gs",
      "bucket": "mlpipeline",
      "base64_service_account": "c2VydmljZWFjY291bnQ=" # base64 encoded value of the credential file
    }
```

=== "kubectl"
```bash
kubectl apply -f create-gcs-secret.yaml
```

## Deploy the model on GCS with `InferenceService`

Create the InferenceService with the Google service account credential
=== "yaml"
```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-gcs
spec:
  predictor:
    sklearn:
      storage:
        key: gcs
        path: models/tensorflow/flowers
        parameters: # Parameters to override the default values
          bucket: kfserving-examples
```

Apply the `sklearn-gcs.yaml`.

=== "kubectl"
```bash
kubectl apply -f sklearn-gcs.yaml
```
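
Before sending requests, it can help to wait until the service is ready. The sketch below wraps `kubectl wait` for that purpose; it assumes the InferenceService exposes a `Ready` condition in its status, and `wait_for_isvc` with its default timeout is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: block until the InferenceService is Ready or time out.
wait_for_isvc() {
  local name="$1" timeout="${2:-300s}"
  kubectl wait --for=condition=Ready "inferenceservice/${name}" --timeout="$timeout"
}

# Usage:
#   wait_for_isvc sklearn-gcs
#   wait_for_isvc sklearn-gcs 120s
```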

## Run a prediction

The ingress can now be reached at `${INGRESS_HOST}:${INGRESS_PORT}`. If these variables are not set, follow [this instruction](../../../get_started/first_isvc.md#4-determine-the-ingress-ip-and-ports)
to determine the ingress IP and port.

```bash
SERVICE_HOSTNAME=$(kubectl get inferenceservice sklearn-gcs -o jsonpath='{.status.url}' | cut -d "/" -f 3)

MODEL_NAME=sklearn-gcs
INPUT_PATH=@./input.json
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/$MODEL_NAME:predict -d $INPUT_PATH
```
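
The `cut -d "/" -f 3` step in the command above extracts the hostname from the service URL: splitting on `/` yields the scheme, an empty field, and then the host. A small standalone illustration, where `url_host` is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Hypothetical helper: return the host part of a URL such as http://host/path.
url_host() {
  # Fields when split on "/": 1 = "http:", 2 = "", 3 = host
  printf '%s\n' "$1" | cut -d "/" -f 3
}

url_host "http://sklearn-gcs.default.example.com"
# prints: sklearn-gcs.default.example.com
```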

!!! success "Expected Output"

```{ .bash .no-copy }
* Trying 127.0.0.1:8080...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> POST /v1/models/sklearn-gcs:predict HTTP/1.1
> Host: sklearn-gcs.default.example.com
> User-Agent: curl/7.68.0
> Accept: */*
> Content-Length: 84
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 84 out of 84 bytes
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 23
< content-type: application/json; charset=UTF-8
< date: Mon, 20 Sep 2021 04:55:50 GMT
< server: istio-envoy
< x-envoy-upstream-service-time: 6
<
* Connection #0 to host localhost left intact
{"predictions": [1, 1]}
```
