Rework | Administration Guide #2

Merged
merged 1 commit into from
Feb 29, 2024
Binary file modified .DS_Store
Binary file not shown.
6 changes: 3 additions & 3 deletions New website/docs/Getting_Started/README.md
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
# Getting Started with KServe
## Before you begin
!!! warning
KServe Quickstart Environments are for experimentation use only. For production installation, see our [Administrator's Guide](../admin/serverless/serverless.md)

:::warning
KServe Quickstart Environments are for experimentation use only. For production installation, see our [Administrator's Guide](../admin/serverless/serverlesss)
:::
Before you can get started with a KServe Quickstart deployment you must install kind and the Kubernetes CLI.

### Install Kind (Kubernetes in Docker)
4 changes: 2 additions & 2 deletions New website/docs/admin/migration.md
Original file line number Diff line number Diff line change
@@ -2,10 +2,10 @@

This doc explains how to migrate existing inference services from KFServing to KServe without downtime.

!!! note
:::note
The migration job will by default delete the leftover KFServing installation after migrating the inference services from
`serving.kubeflow.org` to `serving.kserve.io`.

:::

### Migrating from standalone KFServing

33 changes: 11 additions & 22 deletions New website/docs/admin/serverless/kourier_networking/README.md
Original file line number Diff line number Diff line change
@@ -1,19 +1,12 @@
Certainly! Below is the Markdown content reformatted for Docusaurus:

```markdown
---
id: deploy-inference-service-kourier
title: Deploy InferenceService with Kourier
sidebar_label: Deploy InferenceService with Kourier
---

KServe creates the top-level `Istio Virtual Service` for routing to `InferenceService` components based on the virtual host or path-based routing.
Now KServe provides an option for disabling the top-level virtual service to allow configuring other networking layers Knative supports.
For example, [Kourier](https://developers.redhat.com/blog/2020/06/30/kourier-a-lightweight-knative-serving-ingress) is an alternative networking layer, and the following steps show how you can deploy KServe with `Kourier`.
# Kourier Networking Layer
## Deploy InferenceService with Alternative Networking Layer
KServe creates the top level `Istio Virtual Service` for routing to `InferenceService` components based on the virtual host or path based routing.
Now KServe provides an option for disabling the top level virtual service to allow configuring other networking layers Knative supports.
For example, [Kourier](https://developers.redhat.com/blog/2020/06/30/kourier-a-lightweight-knative-serving-ingress) is an alternative networking layer and
the following steps show how you can deploy KServe with `Kourier`.

## Install Kourier Networking Layer

Please refer to the [Serverless Installation Guide](../serverless.md) and change the second step to install `Kourier` instead of `Istio`.
Please refer to the [Serverless Installation Guide](../../serverless/serverlesss) and change the second step to install `Kourier` instead of `Istio`.

1. Install the Kourier networking layer:

@@ -51,7 +44,7 @@ Please refer to the [Serverless Installation Guide](../serverless.md) and change
3scale-kourier-gateway-54c49c8ff5-x8tgn 1/1 Running 0 10m
```

4. Edit `inferenceservice-config` configmap to disable Istio top-level virtual host:
4. Edit `inferenceservice-config` configmap to disable Istio top level virtual host:

```bash
kubectl edit configmap/inferenceservice-config --namespace kserve
@@ -110,10 +103,10 @@ kubectl apply -f pmml.yaml

### Run a Prediction

Note that when setting `INGRESS_HOST` and `INGRESS_PORT` following the [determining the ingress IP and ports](../../../get_started/first_isvc.md#4-determine-the-ingress-ip-and-ports) guide you
Note that when setting `INGRESS_HOST` and `INGRESS_PORT` following the [determining the ingress IP and ports](../../../Getting_Started/first_isvc.mdx#4-determine-the-ingress-ip-and-ports) guide you
need to replace `istio-ingressgateway` with `kourier-gateway`.

For example, if you choose to do `Port Forward` for testing, you need to select the `kourier-gateway` pod as following.
For example if you choose to do `Port Forward` for testing you need to select the `kourier-gateway` pod as following.

```bash
kubectl port-forward --namespace kourier-system \
@@ -162,10 +155,6 @@ curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" http:
< server: envoy
< x-envoy-upstream-service-time: 58
<
* Connection #0

to host localhost left intact
* Connection #0 to host localhost left intact
{"predictions": [{"Species": "setosa", "Probability_setosa": 1.0, "Probability_versicolor": 0.0, "Probability_virginica": 0.0, "Node_Id": "2"}]}
```

This documentation provides steps to deploy KServe with Kourier as the networking layer and test the InferenceService with a PMML model.
Original file line number Diff line number Diff line change
@@ -1,4 +1,11 @@

# Serverless Installation Guide
## Serverless Installation Guide

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


KServe Serverless installation enables autoscaling based on request volume and supports scaling down to and up from zero. It also supports revision management
and canary rollout based on revisions.

@@ -15,12 +22,13 @@ Kubernetes version.
## 1. Install Knative Serving
Please refer to the [Knative Serving install guide](https://knative.dev/docs/admin/install/serving/install-serving-with-yaml/).

!!! note
:::note
If you are looking to use PodSpec fields such as nodeSelector, affinity or tolerations which are now supported in the v1beta1 API spec,
you need to turn on the corresponding [feature flags](https://knative.dev/docs/admin/serving/feature-flags) in your Knative configuration.
!!! warning
:::
:::warning
In Knative 1.8, the cluster domain suffix `svc.cluster.local` became the default domain. As routes using the cluster domain suffix are not exposed through Ingress, you will need to [configure DNS](https://knative.dev/docs/install/yaml-install/serving/install-serving-with-yaml/#configure-dns) in order to expose your services (most users probably already have).
:::

## 2. Install Networking Layer
The recommended networking layer for KServe is [Istio](https://istio.io/) because it currently works best with KServe. Please refer to the [Istio install guide](https://knative.dev/docs/admin/install/installing-istio).
@@ -29,21 +37,35 @@ Alternatively you can also choose other networking layers like [Kourier](https:/
## 3. Install Cert Manager
The minimum required Cert Manager version is 1.9.0; refer to the [Cert Manager](https://cert-manager.io/docs/installation/) installation documentation.

!!! note
:::note
Cert Manager is required to provision webhook certs for a production-grade installation. Alternatively, you can run the self-signed certs generation script.
:::
## 4. Install KServe
=== "kubectl"

<Tabs>
<TabItem value="kubectl" label="kubectl" default>

```bash
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve.yaml
```
</TabItem>
</Tabs>



## 5. Install KServe Built-in ClusterServingRuntimes
<Tabs>
<TabItem value="kubectl" label="kubectl" default>

=== "kubectl"
```bash
kubectl apply -f https://github.com/kserve/kserve/releases/download/v0.11.0/kserve-runtimes.yaml
```
</TabItem>
</Tabs>

!!! note

:::note
**ClusterServingRuntimes** are required to create InferenceService for built-in model serving runtimes with KServe v0.8.0 or higher.
:::
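
Several hunks in this change swap MkDocs-style `!!! note` admonitions for Docusaurus `:::note` blocks. Across many files, that rewrite could be scripted. The following is a minimal sketch, not the method used in this pull request: `convertAdmonitions` is a hypothetical helper, and it assumes each admonition body is a run of lines indented by four spaces with no blank lines inside. Any quoted title after the type (for example `"Expected Output"`) is dropped.

```typescript
// Hypothetical helper: converts MkDocs-style admonitions ("!!! type" followed by
// a four-space-indented body) into Docusaurus-style ":::type ... :::" blocks.
function convertAdmonitions(markdown: string): string {
  const lines = markdown.split('\n');
  const out: string[] = [];
  let i = 0;
  while (i < lines.length) {
    const match = /^!!! (\w+)/.exec(lines[i]);
    if (!match) {
      // Not an admonition header: copy the line through unchanged.
      out.push(lines[i]);
      i += 1;
      continue;
    }
    out.push(`:::${match[1]}`);
    i += 1;
    // Assumption: the body is every following line that starts with four spaces.
    while (i < lines.length && lines[i].startsWith('    ')) {
      out.push(lines[i].slice(4));
      i += 1;
    }
    out.push(':::');
  }
  return out.join('\n');
}

const sample = [
  '!!! note',
  '    Cert manager is required to provision webhook certs.',
  '',
  '## 4. Install KServe',
].join('\n');

console.log(convertAdmonitions(sample));
```

A converter like this would still leave the admonition types themselves for manual review, since it copies each type name through as-is.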


Original file line number Diff line number Diff line change
@@ -1,4 +1,11 @@
# Secure InferenceService with ServiceMesh

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


# Istio Service Mesh
## Secure InferenceService with ServiceMesh

A service mesh is a dedicated infrastructure layer that you can add to your InferenceService to allow you to transparently add capabilities like observability, traffic management and security.
In this example we show how you can turn on the Istio service mesh mode to provide a uniform and efficient way to secure service-to-service communication in a cluster with TLS encryption, strong
identity-based authentication and authorization.
@@ -17,9 +24,12 @@ kubectl create namespace user1

- When activator is on the request path, the rule checks for the source namespace `knative-serving`, as the request is proxied through activator.

!!! warning


:::warning

Currently when activator is on the request path, it is not able to check the originating namespace or original identity due to the [net-istio issue](https://github.com/knative-sandbox/net-istio/issues/554).
:::

```yaml
apiVersion: security.istio.io/v1beta1
@@ -77,10 +87,13 @@ kubectl edit configmap/inferenceservice-config --namespace kserve

ingress : |- {
"disableIstioVirtualHost": true
}
}

```
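
The `ingress` entry edited above is a JSON object stored as a string in the ConfigMap. As a rough sketch of the same change done programmatically, assuming the value has already been read into a string, the flag can be set without touching the other keys. `disableIstioVirtualHost` here is a hypothetical helper function, and the `ingressGateway` value in the example is illustrative only.

```typescript
// Assumption: the `ingress` entry of the ConfigMap holds a JSON object serialized as a string.
type IngressConfig = Record<string, unknown> & { disableIstioVirtualHost?: boolean };

// Hypothetical helper: returns the ingress JSON with the top-level virtual host
// disabled, leaving every other key untouched.
function disableIstioVirtualHost(ingressJson: string): string {
  const parsed = JSON.parse(ingressJson) as IngressConfig;
  return JSON.stringify({ ...parsed, disableIstioVirtualHost: true }, null, 2);
}

// Example input; the gateway value is illustrative only.
const exampleIngress = '{"ingressGateway": "knative-serving/knative-ingress-gateway"}';
console.log(disableIstioVirtualHost(exampleIngress));
```

The resulting string would still need to be written back to the ConfigMap, for example with the `kubectl edit` command shown above.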

## Deploy InferenceService with Istio sidecar injection


First label the namespace with `istio-injection=enabled` to turn on the sidecar injection for the namespace.

```bash
@@ -92,7 +105,8 @@ When `autoscaling.knative.dev/targetBurstCapacity` is set to 0,
Knative removes the activator from the request path so the test service can directly establish the mTLS connection to the `InferenceService` and
the authorization policy can check the original namespace of the request to lock down the traffic for namespace isolation.

=== "InferenceService with activator on path"
<Tabs>
<TabItem value="InferenceService with activator on path" label="InferenceService with activator on path" default>

```yaml
apiVersion: "serving.kserve.io/v1beta1"
@@ -109,8 +123,8 @@ the authorization policy can check the original namespace of the request to lock
name: sklearn
storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
```
=== "InferenceService without activator on path"

</TabItem>
<TabItem value="InferenceService without activator on path" label="InferenceService without activator on path">
```yaml
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
@@ -128,11 +142,16 @@ the authorization policy can check the original namespace of the request to lock
storageUri: "gs://kfserving-examples/models/sklearn/1.0/model"
```

</TabItem>
</Tabs>



```bash
kubectl apply -f sklearn_iris.yaml
```

!!! success "Expected Output"
:::success "Expected Output"

```{ .bash .no-copy }
$ inferenceservice.serving.kserve.io/sklearn-iris created
@@ -144,6 +163,7 @@ kubectl apply -f sklearn_iris.yaml
sklearn-iris-burst-predictor-default-00001-deployment-5685n46f6 3/3 Running 0 12h
sklearn-iris-predictor-default-00001-deployment-985d5cd46-zzw4x 3/3 Running 0 12h
```
:::

## Run a prediction from the same namespace
Deploy a test service in the `user1` namespace with [httpbin.yaml](./httpbin.yaml).
@@ -157,7 +177,7 @@ Run a prediction request to the `sklearn-iris` InferenceService without activato
kubectl exec -it httpbin-6484879498-qxqj8 -c istio-proxy -n user1 -- curl -v sklearn-iris-predictor-default.user1.svc.cluster.local/v1/models/sklearn-iris
```

!!! success "Expected Output"
:::success "Expected Output"

```{ .bash .no-copy }
* Connected to sklearn-iris-predictor-default.user1.svc.cluster.local (10.96.137.152) port 80 (#0)
@@ -177,13 +197,13 @@ kubectl exec -it httpbin-6484879498-qxqj8 -c istio-proxy -n user1 -- curl -v skl
* Connection #0 to host sklearn-iris-predictor-default.user1.svc.cluster.local left intact
{"name":"sklearn-iris","ready":true}
```

:::
Run a prediction request to the `sklearn-iris-burst` InferenceService with activator on the path. You should get HTTP 200, as the authorization rule allows traffic from the `knative-serving` namespace.
```bash
kubectl exec -it httpbin-6484879498-qxqj8 -c istio-proxy -n user1 -- curl -v sklearn-iris-burst-predictor-default.user1.svc.cluster.local/v1/models/sklearn-iris-burst
```

!!! success "Expected Output"
:::success "Expected Output"

```{ .bash .no-copy }
* Connected to sklearn-iris-burst-predictor-default.user1.svc.cluster.local (10.96.137.152) port 80 (#0)
@@ -203,7 +223,7 @@ kubectl exec -it httpbin-6484879498-qxqj8 -c istio-proxy -n user1 -- curl -v skl
* Connection #0 to host sklearn-iris-burst-predictor-default.user1.svc.cluster.local left intact
{"name":"sklearn-iris-burst","ready":true}
```

:::
## Run a prediction from a different namespace
Deploy a test service with [sleep.yaml](./sleep.yaml) in the `default` namespace, which is different from the namespace the `InferenceService` is deployed to.

@@ -217,7 +237,7 @@ allows the traffic from the same namespace `user1` where the InferenceService is
kubectl exec -it sleep-6d6b49d8b8-6ths6 -- curl -v sklearn-iris-predictor-default.user1.svc.cluster.local/v1/models/sklearn-iris
```

!!! success "Expected Output"
:::success "Expected Output"

```{ .bash .no-copy }
* Connected to sklearn-iris-predictor-default.user1.svc.cluster.local (10.96.137.152) port 80 (#0)
@@ -236,15 +256,15 @@ kubectl exec -it sleep-6d6b49d8b8-6ths6 -- curl -v sklearn-iris-predictor-defa
<
* Connection #0 to host sklearn-iris-predictor-default.user1.svc.cluster.local left intact
```

:::
When you send a prediction request from a different namespace to the `sklearn-iris-burst` InferenceService with activator on the request path, you actually get an HTTP 200 response because of the limitation above: the request is proxied through activator in the `knative-serving` namespace, so the authorization policy cannot restrict traffic to the same namespace.
We expect to get HTTP 403 once upstream Knative `net-istio` is fixed.

```bash
kubectl exec -it sleep-6d6b49d8b8-6ths6 -- curl -v sklearn-iris-burst-predictor-default.user1.svc.cluster.local/v1/models/sklearn-iris-burst
```

!!! success "Expected Output"
:::success "Expected Output"

```{ .bash .no-copy }
* Connected to sklearn-iris-burst-predictor-default.user1.svc.cluster.local (10.96.137.152) port 80 (#0)
@@ -263,4 +283,5 @@ kubectl exec -it sleep-6d6b49d8b8-6ths6 -- curl -v sklearn-iris-burst-predictor-
<
* Connection #0 to host sklearn-iris-burst-predictor-default.user1.svc.cluster.local left intact
{"name":"sklearn-iris-burst","ready":true}
```
```
:::
8 changes: 7 additions & 1 deletion New website/docusaurus.config.ts
Original file line number Diff line number Diff line change
@@ -67,10 +67,16 @@ const config: Config = {
items: [
{
type: 'docSidebar',
sidebarId: 'tutorialSidebar',
sidebarId: 'GettingStartedSidebar',
position: 'left',
label: 'Getting Started',
},
{
type: 'docSidebar',
sidebarId: 'adminSidebar',
position: 'left',
label: 'Administration Guide',
},
{to: '/blog', label: 'Blog', position: 'left'},

{
1 change: 1 addition & 0 deletions New website/package-lock.json


1 change: 1 addition & 0 deletions New website/package.json
Original file line number Diff line number Diff line change
@@ -20,6 +20,7 @@
"@headlessui/tailwindcss": "^0.2.0",
"@mdx-js/react": "^3.0.0",
"clsx": "^2.0.0",
"micromark-extension-mdxjs-esm": "^3.0.0",
"prism-react-renderer": "^2.3.0",
"react": "^18.0.0",
"react-dom": "^18.0.0"
24 changes: 21 additions & 3 deletions New website/sidebars.ts
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@ const sidebars: SidebarsConfig = {

// But you can create a sidebar manually

tutorialSidebar: [
GettingStartedSidebar: [

{
type: 'category',
@@ -29,7 +29,25 @@ const sidebars: SidebarsConfig = {
},

],

adminSidebar: [
{
type: 'category',
label: 'Administration Guide',
items: [
{
type: 'category',
label: 'Serverless',
items: [
'admin/serverless/serverlesss',
'admin/serverless/servicemesh/README',
'admin/serverless/kourier_networking/README',
],
},
'admin/modelmesh',
'admin/kubernetes_deployment',
'admin/migration',
],
},
],
};

export default sidebars;
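
This change renames `tutorialSidebar` to `GettingStartedSidebar` and adds `adminSidebar`, so every `sidebarId` referenced from the navbar in `docusaurus.config.ts` needs a matching key in `sidebars.ts`. A minimal sketch of such a consistency check follows. It uses simplified local types rather than the Docusaurus ones, and `findMissingSidebarIds` is a hypothetical helper, not part of Docusaurus.

```typescript
// Simplified stand-ins for the Docusaurus types (assumption: only the fields used here).
type NavbarItem = { type?: string; sidebarId?: string; label?: string; to?: string };
type Sidebars = Record<string, unknown[]>;

// Hypothetical helper: returns the sidebarIds referenced by navbar items of
// type 'docSidebar' that have no matching key in the sidebars object.
function findMissingSidebarIds(items: NavbarItem[], sidebars: Sidebars): string[] {
  return items
    .filter((item) => item.type === 'docSidebar' && item.sidebarId !== undefined)
    .map((item) => item.sidebarId as string)
    .filter((id) => !Object.prototype.hasOwnProperty.call(sidebars, id));
}

// Navbar items mirroring the ones in this change.
const navbarItems: NavbarItem[] = [
  { type: 'docSidebar', sidebarId: 'GettingStartedSidebar', label: 'Getting Started' },
  { type: 'docSidebar', sidebarId: 'adminSidebar', label: 'Administration Guide' },
  { to: '/blog', label: 'Blog' },
];

// Sidebar keys mirroring sidebars.ts; the item lists are left empty for brevity.
const sidebarKeys: Sidebars = {
  GettingStartedSidebar: [],
  adminSidebar: [],
};

const missingIds = findMissingSidebarIds(navbarItems, sidebarKeys);
console.log(missingIds.length === 0 ? 'all sidebar ids resolve' : `missing: ${missingIds.join(', ')}`);
```

A stale reference such as `tutorialSidebar` would show up in the returned list, which is the kind of mismatch a rename like this one can introduce.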
10 changes: 10 additions & 0 deletions New website/src/theme/SearchBar.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
import React from 'react';
import SearchBar from '@theme-original/SearchBar';

export default function SearchBarWrapper(props) {
return (
<>
<SearchBar {...props} />
</>
);
}
Binary file added docs/.DS_Store
Binary file not shown.
Binary file added docs/admin/.DS_Store
Binary file not shown.