Train a prompt tuning configuration against an LLM and serve it using Kubeflow Pipelines

This repository demonstrates how Kubeflow can be used to prompt-tune a foundation large language model (LLM) and to serve the prompt-tuned model. Specifically:

Train a prompt tuning configuration against an LLM (bigscience/bloomz-560m), using PEFT and the RAFT dataset.
Publish the trained configuration to Hugging Face.
Serve the prompt tuning configuration together with the open-source LLM from Hugging Face.
Automate the above steps with Kubeflow Pipelines.
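
For orientation, the training step roughly amounts to wrapping the base model with a PEFT prompt tuning configuration. The sketch below is not taken from this repository's pipeline: the number of virtual tokens, the initialization text, and the helper names prompt_tuning_settings and build_prompt_tuned_model are illustrative assumptions. Only the model name comes from this README. It assumes the peft and transformers packages are installed.

```python
from typing import Any, Dict

BASE_MODEL = "bigscience/bloomz-560m"  # base LLM named in this README


def prompt_tuning_settings(num_virtual_tokens: int = 8) -> Dict[str, Any]:
    """Plain-Python settings for the soft prompt (illustrative values only)."""
    return {
        # Number of trainable virtual tokens prepended to every input (assumed).
        "num_virtual_tokens": num_virtual_tokens,
        # Text used to initialize the virtual tokens (assumed; it suits a
        # tweet-complaint classification task).
        "prompt_tuning_init_text": "Classify if the tweet is a complaint or not:",
        # Tokenizer used to turn the initialization text into token ids.
        "tokenizer_name_or_path": BASE_MODEL,
    }


def build_prompt_tuned_model(num_virtual_tokens: int = 8):
    """Wrap the frozen base model so that only the soft prompt is trainable."""
    # Imported lazily so the settings helper works without the ML stack.
    from peft import PromptTuningConfig, PromptTuningInit, TaskType, get_peft_model
    from transformers import AutoModelForCausalLM

    config = PromptTuningConfig(
        task_type=TaskType.CAUSAL_LM,
        prompt_tuning_init=PromptTuningInit.TEXT,
        **prompt_tuning_settings(num_virtual_tokens),
    )
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    return get_peft_model(base_model, config)
```

Calling build_prompt_tuned_model().print_trainable_parameters() should show that only a small fraction of the weights is trainable, which is why the published artifact is a lightweight configuration rather than a full model.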

Prerequisites
Kubeflow Installation
KServe Modelmesh Installation
Service Account Permissions
HuggingFace Token
Access Kubeflow UI
Create PodDefault resource
Run prompt tuning pipeline
- Upload IR yaml file
- Import Notebook into Kubeflow JupyterLab

Prerequisites

To run the example provided in this repository, a Kubeflow cluster and KServe ModelMesh must be up and running. Before installing them, make sure the following dependencies are available in your local environment.

Python 3.9+
Docker
kubectl
Kustomize 5.0.0
A local Kubernetes cluster (for example via kind, minikube, or Docker Desktop for Mac)
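
A quick way to confirm that the command-line dependencies are on your PATH is a small check like the one below. This is only a sketch: the helper names missing_tools and python_ok are hypothetical, and the check does not verify tool versions (for example that Kustomize is 5.0.0) or that a cluster is reachable.

```python
import shutil
import sys
from typing import Callable, Iterable, List, Optional, Sequence

# Executable names for the CLI prerequisites listed above.
REQUIRED_TOOLS = ("docker", "kubectl", "kustomize")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version: Sequence[int] = sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.9 or newer."""
    return tuple(version[:2]) >= (3, 9)


if __name__ == "__main__":
    print("Python 3.9+:", python_ok())
    print("Missing tools:", missing_tools() or "none")
```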

Kubeflow Installation

This example has been tested with Kubeflow version 1.8.

To install Kubeflow, clone the manifests repository and run the installation using kustomize:

git clone --branch v1.8-branch https://github.com/kubeflow/manifests.git && cd manifests
while ! kustomize build example | awk '!/well-defined/' | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done

Before continuing, wait for all pods in the kubeflow namespace to become ready:

kubectl -n kubeflow wait --for=condition=Ready pods --all --timeout=1200s
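
If the wait command times out, it helps to see which pods are holding things up. The following sketch parses the JSON output of kubectl get pods and lists the pods whose Ready condition is not "True". The function names fetch_pods and not_ready_pods are hypothetical, and skipping pods in the Succeeded phase is a design choice of this sketch (completed pods do not report Ready), not something prescribed by Kubeflow.

```python
import json
import subprocess
from typing import Any, Dict, List


def fetch_pods(namespace: str = "kubeflow") -> Dict[str, Any]:
    """Return the parsed output of `kubectl get pods -n <namespace> -o json`."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def not_ready_pods(pod_list: Dict[str, Any]) -> List[str]:
    """Return the names of pods whose Ready condition is not "True"."""
    names = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Succeeded":
            continue  # completed pods do not report Ready; ignore them here
        ready = any(
            cond.get("type") == "Ready" and cond.get("status") == "True"
            for cond in status.get("conditions", [])
        )
        if not ready:
            names.append(pod["metadata"]["name"])
    return names
```

With a running cluster, print(not_ready_pods(fetch_pods())) prints the pods that still need attention.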

KServe Modelmesh Installation

ModelMesh Serving is the controller for managing ModelMesh, a general-purpose model serving management and routing layer. The installation steps are given below. For more detailed information on how to get started, see the ModelMesh Serving documentation.

Clone the ModelMesh Serving repository:

RELEASE="release-0.11"
git clone -b $RELEASE --depth 1 --single-branch https://github.com/kserve/modelmesh-serving.git
cd modelmesh-serving

Create a namespace called modelmesh-serving and install ModelMesh into it:

kubectl create namespace modelmesh-serving
./scripts/install.sh --namespace-scope-mode --namespace modelmesh-serving --quickstart

Adjust service account permissions

Grant the default-editor service account in the kubeflow-user-example-com namespace permission to manage inferenceservices and servingruntimes within the modelmesh-serving namespace:

kubectl create clusterrole servicemesh-editor --verb=get,create,delete,list,watch,patch --resource=inferenceservices,servingruntimes
kubectl create rolebinding servicemesh-editor --serviceaccount=kubeflow-user-example-com:default-editor --clusterrole=servicemesh-editor -n modelmesh-serving
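
To confirm that the binding took effect, you can ask the API server whether the service account may perform a given action, using kubectl auth can-i with impersonation. The sketch below only assembles and runs that command. The helper names can_i_command and can_i are hypothetical, and your own user needs impersonation rights for the --as flag to work.

```python
import subprocess
from typing import List

# Fully qualified name of the service account used throughout this README.
SERVICE_ACCOUNT = "system:serviceaccount:kubeflow-user-example-com:default-editor"


def can_i_command(
    verb: str,
    resource: str,
    namespace: str = "modelmesh-serving",
    subject: str = SERVICE_ACCOUNT,
) -> List[str]:
    """Build the `kubectl auth can-i` invocation for the given verb and resource."""
    return ["kubectl", "auth", "can-i", verb, resource, "--as", subject, "-n", namespace]


def can_i(verb: str, resource: str) -> bool:
    """Run the check; kubectl prints "yes" when the action is allowed."""
    result = subprocess.run(
        can_i_command(verb, resource), capture_output=True, text=True
    )
    return result.stdout.strip() == "yes"
```

For example, can_i("create", "inferenceservices") is expected to return True once the role binding above is in place.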

Create k8s secret with your Hugging Face account token

Replace the placeholder in the following command with your Hugging Face account token and run it to store the token as a secret in the Kubeflow cluster. The pipeline uses this secret to publish the prompt tuning configuration to Hugging Face after training. The token must have WRITE permission; you can create one in your Hugging Face account settings.

kubectl create secret generic huggingface-secret --from-literal='token=<HuggingFace_WRITE_Token>' -n kubeflow
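
For reference, the command above produces an Opaque Secret whose token value is stored base64-encoded. The sketch below builds an equivalent manifest, which can be handy if you prefer to keep the token out of your shell history. The helper names huggingface_secret and render are hypothetical, and the manifest reflects what a generic secret looks like in general rather than anything specific to this repository.

```python
import base64
import json
from typing import Any, Dict


def huggingface_secret(token: str, namespace: str = "kubeflow") -> Dict[str, Any]:
    """Build a Secret manifest equivalent to the kubectl command above."""
    # Secret values under `data` must be base64-encoded.
    encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "huggingface-secret", "namespace": namespace},
        "type": "Opaque",
        "data": {"token": encoded},
    }


def render(token: str) -> str:
    """Serialize the manifest as JSON, which kubectl accepts like YAML."""
    return json.dumps(huggingface_secret(token), indent=2)
```

Printing render(...) with a token read from an environment variable and piping the output to kubectl apply -f - creates the same secret.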

Access Kubeflow UI

After Kubeflow has been installed successfully, use port-forward to expose the istio-ingressgateway service:

kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80

Navigate to localhost:8080 and log in with the username user@example.com and the password 12341234.

Run PromptTuning Pipeline

Use IR (Intermediate Representation) Yaml file

You can create the prompt tuning pipeline by uploading the ready-to-use IR YAML file (llm-prompt_tuning_pipeline.yaml) through the Kubeflow Dashboard. Go to "Pipelines" -> "Upload Pipelines" and select "Upload from file". Then create a run from the uploaded pipeline.

Use the Jupyter Notebook

Alternatively, you can use the notebook to create the prompt tuning pipeline. To do so, first create a PodDefault resource that injects a ServiceAccount token volume into your pods. Once the PodDefault exists and is enabled for your notebook, all pods created by the notebook can access Kubeflow Pipelines.

kubectl apply -f - <<EOF
apiVersion: kubeflow.org/v1alpha1
kind: PodDefault
metadata:
  name: access-kf-pipeline
  namespace: kubeflow-user-example-com
spec:
  desc: Allow access to KFP
  selector:
    matchLabels:
      access-kf-pipeline: "true"
  volumeMounts:
    - mountPath: /var/run/secrets/kubeflow/pipelines
      name: volume-kf-pipeline-token
      readOnly: true
  volumes:
    - name: volume-kf-pipeline-token
      projected:
        sources:
          - serviceAccountToken:
              path: token
              expirationSeconds: 7200
              audience: pipelines.kubeflow.org
  env:
    - name: KF_PIPELINES_SA_TOKEN_PATH
      value: /var/run/secrets/kubeflow/pipelines/token
EOF

Import Notebook into Kubeflow JupyterLab

Once logged in to the Kubeflow dashboard, navigate to "Notebooks" and create a new JupyterLab notebook with the kubeflownotebookswg/jupyter-tensorflow-full:v1.8.0-rc.0 image and the "Allow access to Kubeflow Pipelines" configuration enabled (available under "Advanced options").

Once the notebook server is running, connect to it and upload the notebook file prompt_tunning_pipeline.ipynb.
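
Before running the uploaded notebook, it can be worth confirming that the PodDefault was actually applied to the notebook pod. The PodDefault above sets the KF_PIPELINES_SA_TOKEN_PATH environment variable and mounts the token file, so a first cell along these lines can check both. This is a minimal sketch, and the helper name pipeline_token_path is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

TOKEN_ENV = "KF_PIPELINES_SA_TOKEN_PATH"  # set by the PodDefault above


def pipeline_token_path(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the mounted token file, or None if it is not available."""
    value = env.get(TOKEN_ENV)
    if not value:
        return None  # variable missing: the PodDefault was not applied
    path = Path(value)
    return path if path.is_file() else None


if __name__ == "__main__":
    path = pipeline_token_path()
    if path is None:
        print("No pipeline token found; check the notebook's PodDefault configuration.")
    else:
        print(f"Pipeline token mounted at {path}")
```

If no token is found, recreate the notebook with the PodDefault configuration enabled; to the best of our understanding, the Kubeflow Pipelines SDK client relies on this token when it is created inside the cluster without explicit credentials.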

