Model Registry Custom Storage Initializer

Welcome 👋!

Here you'll find an example implementation of a KServe custom storage initializer tailored to the model-registry:// URI format, following the specification outlined in the ClusterStorageContainer CRD.

This implementation is intended to work with any model registry service that exposes a REST interface compatible with the Opendatahub OpenAPI spec, so it can be paired with any generic Model Registry to enhance your model-serving experience.
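
The URI format is model-registry://<model>/<version>. For example, a model registered as iris with version v1 (as in the quickstart below) is referenced as:

model-registry://iris/v1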

Development

Build container image:

make docker-build

By default the container image name is quay.io/${USER}/model-registry-storage-initializer:latest, but it can be overridden by providing the IMG variable, e.g., make IMG=quay.io/ORG/NAME:TAG docker-build.

Push the generated image:

make [IMG=..] docker-push
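
For example, to build and push under a custom organization and tag (quay.io/my-org and v0.0.1 are placeholders here):

make IMG=quay.io/my-org/model-registry-storage-initializer:v0.0.1 docker-build
make IMG=quay.io/my-org/model-registry-storage-initializer:v0.0.1 docker-push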

Workflow

The following diagram showcases the interactions among all participants/actors involved in the model deployment process using the proposed Model Registry Storage Initializer.

sequenceDiagram
    actor U as User
    participant MR as Model Registry
    participant KC as KServe Controller
    participant MD as Model Deployment (Pod)
    participant MRSI as Model Registry Storage Initializer
    U->>+MR: Register ML Model
    MR-->>-U: Indexed Model
    U->>U: Create InferenceService CR
    Note right of U: The InferenceService should<br/>point to the model registry<br/>indexed model, e.g.,:<br/> model-registry://<model>/<version>
    KC->>KC: React to InferenceService creation
    KC->>+MD: Create Model Deployment
    MD->>+MRSI: Initialization (Download Model)
    MRSI->>MRSI: Parse URI
    MRSI->>+MR: Fetch Model Metadata
    MR-->>-MRSI: Model Metadata
    Note over MR,MRSI: The main information that is fetched is the artifact URI which specifies the real model location, e.g.,: https://.. or s3://...
    MRSI->>MRSI: Download Model
    Note right of MRSI: The storage initializer will use<br/> the KServe default providers<br/> to download the model<br/> based on the artifact URI
    MRSI-->>-MD: Downloaded Model
    MD->>-MD: Deploy Model

The same diagram is also available as an exported image.
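
As a concrete example, using the artifact registered in the quickstart below, the resolution performed by the initializer looks like this (illustrative only):

# model-registry://iris/v1
#   -> registry lookup: RegisteredModel "iris", ModelVersion "v1"
#   -> resolved artifact URI: gs://kfserving-examples/models/sklearn/1.0/model
#   -> downloaded via KServe's default gs:// storage provider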

Quickstart

Embark on your journey with this custom storage initializer by exploring a simple hello-world example. Learn how to seamlessly integrate and leverage the power of our tool in just a few steps.

This quickstart is heavily inspired by the Getting Started with KServe guide.

Prerequisites

  • Install kind (Kubernetes in Docker) to run a local Kubernetes cluster with Docker container nodes.
  • Install the Kubernetes CLI (kubectl), which allows you to run commands against Kubernetes clusters.

Environment Preparation

Create the environment

  1. After installing kind, create a kind cluster:
kind create cluster
  2. Configure kubectl to use the kind context:
kubectl config use-context kind-kind
  3. Set up a local deployment of KServe using the provided KServe quick installation script:
curl -s "https://raw.githubusercontent.com/kserve/kserve/release-0.11/hack/quick_install.sh" | bash
  4. Install the model registry in the local cluster:
curl -s "https://raw.githubusercontent.com/lampajr/model-registry-storage-initializer/v0.0.1/hack/install_model_registry.sh" | bash

First InferenceService

In this tutorial, you will deploy an InferenceService with a predictor that loads a model indexed in the model registry. The indexed model is a scikit-learn model trained on the iris dataset, which has three output classes: Iris Setosa, Iris Versicolour, and Iris Virginica.

You will then send an inference request to your deployed model in order to get a prediction for the class of iris plant your request corresponds to.

Since your model is being deployed as an InferenceService, not a raw Kubernetes Service, you just need to provide the storage location of the model using the model-registry:// URI format, and you get some super powers out of the box.

Index the Model into the registry

Port-forward the model registry service so that you can interact with it from outside the cluster.

MODEL_REGISTRY_SERVICE=$(kubectl get svc -n model-registry --selector="component=model-registry" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward --namespace model-registry svc/${MODEL_REGISTRY_SERVICE} 8080:8080

Then, in a separate terminal:

export MR_HOSTNAME=localhost:8080
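
Optionally, verify the registry is reachable before registering anything (this assumes a list endpoint exists at the same path used for the POST calls below):

curl --silent "$MR_HOSTNAME/api/model_registry/v1alpha1/registered_models" -H 'accept: application/json'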
  1. Register an empty RegisteredModel
curl --silent -X 'POST' \
  "$MR_HOSTNAME/api/model_registry/v1alpha1/registered_models" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "description": "Iris scikit-learn model",
  "name": "iris"
}'
  2. Register the first ModelVersion
curl --silent -X 'POST' \
  "$MR_HOSTNAME/api/model_registry/v1alpha1/model_versions" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "description": "Iris model version v1",
  "name": "v1",
  "registeredModelID": "1"
}'
  3. Register the raw ModelArtifact

This artifact defines where the actual trained model is stored, in this case gs://kfserving-examples/models/sklearn/1.0/model.

curl --silent -X 'POST' \
  "$MR_HOSTNAME/api/model_registry/v1alpha1/model_versions/2/artifacts" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "description": "Model artifact for Iris v1",
  "uri": "gs://kfserving-examples/models/sklearn/1.0/model",
  "state": "UNKNOWN",
  "name": "iris-model-v1",
  "modelFormatName": "sklearn",
  "modelFormatVersion": "1",
  "artifactType": "model-artifact"
}'

NOTE: double-check that the IDs used in these requests match the ones actually returned by the registry; the calls above assume the RegisteredModel got ID 1 and the ModelVersion got ID 2.
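
When registering, you can capture each generated ID from the POST response with jq instead of assuming its value; for example, for the RegisteredModel (a sketch, assuming the response body carries the created object's id field):

RM_ID=$(curl --silent -X 'POST' \
  "$MR_HOSTNAME/api/model_registry/v1alpha1/registered_models" \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{"description": "Iris scikit-learn model", "name": "iris"}' | jq -r '.id')
echo "RegisteredModel ID: $RM_ID"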

Apply the ClusterStorageContainer resource

Retrieve the model registry service name and REST port:

MODEL_REGISTRY_SERVICE=$(kubectl get svc -n model-registry --selector="component=model-registry" --output jsonpath='{.items[0].metadata.name}')
MODEL_REGISTRY_REST_PORT=$(kubectl get svc -n model-registry --selector="component=model-registry" --output jsonpath='{.items[0].spec.ports[1].targetPort}')
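
You can print the resolved in-cluster endpoint to confirm both variables are set (this is the value used for MODEL_REGISTRY_BASE_URL below):

echo "$MODEL_REGISTRY_SERVICE.model-registry.svc.cluster.local:$MODEL_REGISTRY_REST_PORT"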

Apply the cluster-scoped ClusterStorageContainer CR to configure the model registry storage initializer for the model-registry:// URI format.

kubectl apply -f - <<EOF
apiVersion: "serving.kserve.io/v1alpha1"
kind: ClusterStorageContainer
metadata:
  name: mr-initializer
spec:
  container:
    name: storage-initializer
    image: quay.io/alampare/model-registry-storage-initializer:latest
    env:
    - name: MODEL_REGISTRY_BASE_URL
      value: "$MODEL_REGISTRY_SERVICE.model-registry.svc.cluster.local:$MODEL_REGISTRY_REST_PORT"
    - name: MODEL_REGISTRY_SCHEME
      value: "http"
    resources:
      requests:
        memory: 100Mi
        cpu: 100m
      limits:
        memory: 1Gi
        cpu: "1"
  supportedUriFormats:
    - prefix: model-registry://

EOF
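
You can verify that the resource was created with:

kubectl get clusterstoragecontainer mr-initializer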

Create an InferenceService

  1. Create a namespace
kubectl create namespace kserve-test
  2. Create the InferenceService
kubectl apply -n kserve-test -f - <<EOF
apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "iris-model"
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: "model-registry://iris/v1"
EOF
  3. Check InferenceService status
kubectl get inferenceservices iris-model -n kserve-test
  4. Determine the ingress IP and ports
kubectl get svc istio-ingressgateway -n istio-system

And then:

INGRESS_GATEWAY_SERVICE=$(kubectl get svc --namespace istio-system --selector="app=istio-ingressgateway" --output jsonpath='{.items[0].metadata.name}')
kubectl port-forward --namespace istio-system svc/${INGRESS_GATEWAY_SERVICE} 8081:80

After that (in another terminal):

export INGRESS_HOST=localhost
export INGRESS_PORT=8081
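
Before sending traffic, you can also wait for the InferenceService to report Ready:

kubectl wait --for=condition=Ready inferenceservice/iris-model -n kserve-test --timeout=5m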
  5. Perform the inference request

Prepare the input data:

cat <<EOF > "./iris-input.json"
{
  "instances": [
    [6.8,  2.8,  4.8,  1.4],
    [6.0,  3.4,  4.5,  1.6]
  ]
}
EOF

If you do not have DNS, you can still curl through the ingress gateway external IP by setting the Host header.

SERVICE_HOSTNAME=$(kubectl get inferenceservice iris-model -n kserve-test -o jsonpath='{.status.url}' | cut -d "/" -f 3)
curl -v -H "Host: ${SERVICE_HOSTNAME}" -H "Content-Type: application/json" "http://${INGRESS_HOST}:${INGRESS_PORT}/v1/models/iris-model:predict" -d @./iris-input.json
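
If everything is set up correctly, the response should contain one prediction per input row, along the lines of:

{"predictions": [1, 1]}

Class 1 corresponds to Iris Versicolour.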