
Hello World - Python BentoML

A simple machine learning model with API serving, written in Python using BentoML. BentoML is an open source framework for high-performance ML model serving that supports all major machine learning frameworks, including Keras, TensorFlow, PyTorch, Fast.ai, and XGBoost.

This sample walks you through creating and deploying a machine learning model using Python. It uses BentoML to package a classifier model trained on the Iris dataset, then creates a container image and deploys the image to Knative.

A Knative deployment guide is also available in the BentoML documentation.

Before you begin

- A Kubernetes cluster with Knative installed. Follow the Knative installation instructions if you need to create one.
- Docker installed and running on your local machine, and a Docker Hub account configured (Docker Hub will be used as the container registry).
- Python 3.6 or above installed and running on your local machine.
- Install the scikit-learn and bentoml packages:
```
pip install scikit-learn
pip install bentoml
```

Recreating sample code

Run the following code on your local machine to train a machine learning model and deploy it as an API endpoint with Knative Serving.

BentoML creates a model API server through its prediction service abstraction. iris_classifier.py defines a prediction service that loads the saved scikit-learn model and exposes an API, which is the entry point for accessing this machine learning service.

import bentoml
import joblib


@bentoml.service
class IrisClassifier:
    iris_model = bentoml.models.get("iris_classifier:latest")

    def __init__(self):
        self.model = joblib.load(self.iris_model.path_of("model.pkl"))

    @bentoml.api
    def predict(self, df):
        # The model is loaded into self.model in __init__.
        return self.model.predict(df)
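
The `predict` API passes `df` straight to the model, so malformed input only fails inside scikit-learn. A minimal sketch of a pre-check, assuming every row must carry the four Iris measurements; `validate_rows` and `N_FEATURES` are hypothetical names for illustration and are not part of BentoML:

```python
from numbers import Real

# Assumption: rows follow the Iris feature order
# (sepal length, sepal width, petal length, petal width).
N_FEATURES = 4


def validate_rows(rows):
    """Return rows as lists of floats, or raise ValueError on malformed input."""
    if not isinstance(rows, list) or not rows:
        raise ValueError("expected a non-empty list of rows")
    cleaned = []
    for i, row in enumerate(rows):
        if not isinstance(row, (list, tuple)) or len(row) != N_FEATURES:
            raise ValueError(f"row {i}: expected {N_FEATURES} values")
        # bool is a subclass of int, so reject it explicitly.
        if not all(isinstance(v, Real) and not isinstance(v, bool) for v in row):
            raise ValueError(f"row {i}: values must be numeric")
        cleaned.append([float(v) for v in row])
    return cleaned
```

A service could call such a helper at the top of `predict` before handing the rows to the model.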

main.py uses the classic Iris flower data set to train a classification model that predicts the species of an iris flower from its measurements, then saves the model to local disk with BentoML.

import joblib
from sklearn import svm
from sklearn import datasets

import bentoml

if __name__ == "__main__":
    # Load training data
    iris = datasets.load_iris()
    X, y = iris.data, iris.target

    # Model Training
    clf = svm.SVC(gamma='scale')
    clf.fit(X, y)

    with bentoml.models.create("iris_classifier") as bento_model:
        joblib.dump(clf, bento_model.path_of("model.pkl"))
    print(f"Model saved: {bento_model}")

Run the main.py file to train and save the model:
```
python main.py
```

Use the BentoML CLI to check the saved model's information:

bentoml get iris_classifier:latest

Example:

> bentoml get iris_classifier:latest -o json
{
  "service": "iris_classifier:IrisClassifier",
  "name": "iris_classifier",
  "version": "ar67rxqxqcrqi7ol",
  "bentoml_version": "1.2.16",
  "creation_time": "2024-05-21T14:40:20.737900+00:00",
  "labels": {
    "owner": "bentoml-team",
    "project": "gallery"
  },
  "models": [],
  "runners": [],
  "entry_service": "IrisClassifier",
  "services": [
    {
      "name": "IrisClassifier",
      "service": "",
      "models": [
        {
          "tag": "iris_sklearn:ml5evdaxpwrqi7ol",
          "module": "",
          "creation_time": "2024-05-21T14:21:17.070059+00:00"
        }
      ],
      "dependencies": [],
      "config": {}
    }
  ],
  "envs": [],
  "schema": {
    "name": "IrisClassifier",
    "type": "service",
    "routes": [
      {
        "name": "predict",
        "route": "/predict",
        "batchable": false,
        "input": {
          "properties": {
            "df": {
              "title": "Df"
            }
          },
          "required": [
            "df"
          ],
          "title": "Input",
          "type": "object"
        },
        "output": {
          "title": "AnyIODescriptor"
        }
      }
    ]
  },
  "apis": [],
  "docker": {
    "distro": "debian",
    "python_version": "3.11",
    "cuda_version": null,
    "env": null,
    "system_packages": null,
    "setup_script": null,
    "base_image": null,
    "dockerfile_template": null
  },
  "python": {
    "requirements_txt": "./requirements.txt",
    "packages": null,
    "lock_packages": true,
    "pack_git_packages": true,
    "index_url": null,
    "no_index": null,
    "trusted_host": null,
    "find_links": null,
    "extra_index_url": null,
    "pip_args": null,
    "wheels": null
  },
  "conda": {
    "environment_yml": null,
    "channels": null,
    "dependencies": null,
    "pip": null
  }
}
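
The JSON output is convenient to inspect from a script. The sketch below pulls the API routes and their required input fields out of the `schema` section. It works on a trimmed copy of the example output rather than calling the CLI, and `list_routes` is a hypothetical helper name:

```python
import json

# Trimmed copy of the example output above; only the fields used here are kept.
SAMPLE = """
{
  "name": "iris_classifier",
  "version": "ar67rxqxqcrqi7ol",
  "schema": {
    "name": "IrisClassifier",
    "routes": [
      {"name": "predict", "route": "/predict", "input": {"required": ["df"]}}
    ]
  }
}
"""


def list_routes(bento_json):
    """Return (route, required input fields) pairs from the JSON text."""
    info = json.loads(bento_json)
    routes = info.get("schema", {}).get("routes", [])
    return [(r["route"], r.get("input", {}).get("required", [])) for r in routes]


print(list_routes(SAMPLE))  # [('/predict', ['df'])]
```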

Test the API server. BentoML can start an API server from the saved model. Use the BentoML CLI to start an API server locally, then test it with the curl command:
```
bentoml serve iris_classifier:latest
```
In another terminal window, make a curl request with sample data to the API server to get a prediction result:
```
curl -v -i \
--header "Content-Type: application/json" \
--request POST \
--data '{"df": [[5.1, 3.5, 1.4, 0.2]]}' \
127.0.0.1:5000/predict
```
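
The same request can be made from Python with only the standard library. This is a sketch, not part of the sample: `build_predict_request` and `predict` are hypothetical helper names, and the payload passed in should match the JSON body given to curl's `--data` option.

```python
import json
import urllib.request


def build_predict_request(base_url, payload):
    """Build (but do not send) a JSON POST request for the /predict route."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, payload, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the local API server running, calling `predict("http://127.0.0.1:5000", payload)` sends the equivalent of the curl command. Splitting request construction from sending keeps the first function usable without a live server.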

Building and deploying the sample

BentoML supports creating an API server Docker image from its saved model directory, where a Dockerfile is automatically generated when saving the model.

To build an API model server Docker image, replace {username} with your Docker Hub username and run the following command:

# Build and push the container on your local machine.
bentoml containerize iris_classifier:latest -t "{username}/iris-classifier" --push

In service.yaml, replace {username} with your Docker Hub username:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: iris-classifier
  namespace: default
spec:
  template:
    spec:
      containers:
        - image: docker.io/{username}/iris-classifier
          ports:
          - containerPort: 5000 # Port to route to
          livenessProbe:
            httpGet:
              path: /healthz
            initialDelaySeconds: 3
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /healthz
            initialDelaySeconds: 3
            periodSeconds: 5
            failureThreshold: 3
            timeoutSeconds: 60
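
If you prefer not to edit the file by hand, the placeholder can be filled in with a few lines of Python. This is a sketch under simple assumptions: `render_manifest` is a hypothetical helper, and it uses plain string replacement rather than `str.format` so that any other braces in the YAML are left untouched.

```python
def render_manifest(template_text, username):
    """Replace the {username} placeholder in the manifest text."""
    if not username or any(ch.isspace() or ch == "/" for ch in username):
        raise ValueError("username must be non-empty, without spaces or slashes")
    if "{username}" not in template_text:
        raise ValueError("template has no {username} placeholder")
    return template_text.replace("{username}", username)


# Example with a single line from service.yaml; "alice" is a made-up username.
line = "        - image: docker.io/{username}/iris-classifier"
print(render_manifest(line, "alice"))
```

To apply it to the real file, read service.yaml with `pathlib.Path.read_text()`, pass the text through the helper, and write the result back.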

Deploy the Service to Knative Serving with kubectl by running the command:
```
kubectl apply --filename service.yaml
```
Now that your service is created, Knative performs the following steps:
- Creates a new immutable revision for this version of the app.
- Programs the network to create a route, ingress, service, and load balancer for your application.
- Automatically scales your pods up and down (including to zero active pods).

Run the following command to find the domain URL for your service:

kubectl get ksvc iris-classifier --output=custom-columns=NAME:.metadata.name,URL:.status.url

NAME              URL
iris-classifier   http://iris-classifier.default.example.com
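
When scripting the deployment, the two-column output can be turned into a lookup table. The sketch below parses text in the shape shown above; `parse_ksvc_table` is a hypothetical helper, and the sample string stands in for real `kubectl` output.

```python
def parse_ksvc_table(text):
    """Map service name to URL from NAME/URL column output."""
    rows = [line.split() for line in text.strip().splitlines() if line.strip()]
    if not rows or rows[0] != ["NAME", "URL"]:
        raise ValueError("unexpected header; expected NAME and URL columns")
    # Each remaining row must have exactly two whitespace-separated fields.
    return {name: url for name, url in rows[1:]}


SAMPLE = (
    "NAME              URL\n"
    "iris-classifier   http://iris-classifier.default.example.com\n"
)
print(parse_ksvc_table(SAMPLE)["iris-classifier"])
```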

Replace the request URL with the URL returned by the previous command, then run the command to get a prediction result from the deployed model API endpoint:

curl -v -i \
  --header "Content-Type: application/json" \
  --request POST \
  --data '{"df": [[5.1, 3.5, 1.4, 0.2]]}' \
  http://iris-classifier.default.example.com/predict

[0]
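
The response is a list of class indices, one per input row. To show species names instead, the indices can be mapped through the class order used by scikit-learn's `load_iris().target_names`. A minimal sketch; `to_species` is a hypothetical helper:

```python
# Class order used by scikit-learn's Iris dataset (target_names).
IRIS_SPECIES = ("setosa", "versicolor", "virginica")


def to_species(predictions):
    """Translate predicted class indices into species names."""
    names = []
    for label in predictions:
        index = int(label)
        if not 0 <= index < len(IRIS_SPECIES):
            raise ValueError(f"unknown class index: {label}")
        names.append(IRIS_SPECIES[index])
    return names


print(to_species([0]))  # ['setosa']
```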

Removing the sample app deployment

To remove the application from your cluster, delete the service record:

kubectl delete --filename service.yaml
