# Getting Started with BentoML

[BentoML](http://bentoml.ai) is an open source framework for serving and deploying machine learning models. It provides high-level APIs for defining a prediction service and for packaging trained models, source code, dependencies, and configurations into a format that is ready for production deployment.

This is a quick tutorial on how to use BentoML to create a prediction service with a trained scikit-learn model, serve the model via a REST API server, and deploy it to [AWS Lambda](https://aws.amazon.com/lambda/) as a serverless endpoint.


In [None]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
# Install BentoML
!pip install bentoml

# Install pandas and scikit-learn; we will use a scikit-learn model as an example
!pip install pandas scikit-learn

Let's get started with a simple scikit-learn model as an example:

In [None]:
from sklearn import svm
from sklearn import datasets

clf = svm.SVC(gamma='scale')
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)
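
Before packaging the model, a quick sanity check can confirm that it trained as expected. The snippet below is a minimal sketch that retrains the same classifier so it runs on its own. The names `sample`, `predicted_name`, and `train_accuracy` are illustrative and not part of the original tutorial.

```python
from sklearn import datasets, svm

# Retrain the same classifier so this snippet is self-contained.
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf = svm.SVC(gamma="scale")
clf.fit(X, y)

# Predict the class of the first training sample and map it to a species name.
sample = X[:1]
predicted_name = iris.target_names[clf.predict(sample)[0]]

# Accuracy on the training data (an optimistic estimate, not a held-out score).
train_accuracy = clf.score(X, y)
print(predicted_name, round(train_accuracy, 3))
```

Because the score is computed on the training data, it only shows that the fit succeeded; it says nothing about how the model generalizes.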

## Create BentoService for model serving

To package this trained model for model serving in production, you will need to create a custom BentoService class:

In [None]:
%%writefile iris_classifier.py
from bentoml import BentoService, api, env, artifacts
from bentoml.artifact import SklearnModelArtifact
from bentoml.handlers import DataframeHandler

@artifacts([SklearnModelArtifact('model')])
@env(pip_dependencies=["scikit-learn"])
class IrisClassifier(BentoService):

    @api(DataframeHandler)
    def predict(self, df):
        return self.artifacts.model.predict(df)

The `@artifacts` decorator here tells BentoML what artifacts are required when
packaging this BentoService. Besides `SklearnModelArtifact`, BentoML also provides
`KerasModelArtifact`, `PytorchModelArtifact`, `FastaiModelArtifact`,
`PickleArtifact`, and others.

`@env` specifies the system environment this BentoService needs in order to
load. If you already have a requirement.txt file listing all the Python
libraries you need, you can point `@env` at it:
```python
@env(requirement_txt='../myproject/requirement.txt')
```

Lastly, `@api` adds an entry point for accessing this BentoService. Each
`api` will be translated into a REST endpoint when [deploying as API
server](#serving-via-rest-api), or a CLI command when [running as a CLI
tool](#use-as-cli-tool).

Each API also requires a `Handler` for defining the expected input format. In
this case, `DataframeHandler` will transform either an HTTP request or CLI
command arguments into a pandas DataFrame and pass it down to the user-defined
API function. BentoML also supports `JsonHandler`, `ImageHandler` and
`TensorHandler`.
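
To make the handler's role concrete, here is a minimal sketch of the kind of conversion described above: a JSON request body becomes a pandas DataFrame with one row per sample. This is not BentoML's actual implementation; `json_body_to_dataframe` is a hypothetical helper written only for illustration.

```python
import json

import pandas as pd


def json_body_to_dataframe(body: str) -> pd.DataFrame:
    """Parse a JSON array of rows into a DataFrame, one row per sample."""
    rows = json.loads(body)
    return pd.DataFrame(rows)


# The same payload used in the prediction requests later in this guide.
df = json_body_to_dataframe("[[5.1, 3.5, 1.4, 0.2]]")
print(df.shape)
```

The resulting DataFrame has the shape the `predict` function receives as its `df` argument: one row, four feature columns.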


## Save BentoService to file archive

In [None]:
# 1) import the custom BentoService defined above
from iris_classifier import IrisClassifier

# 2) `pack` it with required artifacts
svc = IrisClassifier.pack(model=clf)

# 3) save BentoService to a BentoML bundle
saved_path = svc.save()

_That's it._ You've just created your first BentoML bundle. It's a versioned file archive containing the BentoService you defined, along with the trained model, dependencies, and configurations: everything needed to deploy the exact same service in production.
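
To see what was written to disk, you can walk the bundle directory. The helper below is a small sketch using only the standard library; `list_bundle_files` is a hypothetical name, and it makes no assumption about which files BentoML places in the bundle.

```python
import os


def list_bundle_files(bundle_dir):
    """Return the sorted relative paths of every file under bundle_dir."""
    paths = []
    for root, _dirs, files in os.walk(bundle_dir):
        for name in files:
            full_path = os.path.join(root, name)
            paths.append(os.path.relpath(full_path, bundle_dir))
    return sorted(paths)


# In the notebook, pass the `saved_path` returned by `svc.save()`:
# for path in list_bundle_files(saved_path):
#     print(path)
```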

## Model Serving via REST API

Use the `bentoml serve` command to start a REST API server from a saved BentoML bundle. This allows application developers to easily integrate with the ML model you are developing.

Note that REST API serving **does not work in Google Colab**, because Colab's VM is not accessible from outside. You may download the notebook and run it locally to try the BentoML API server.

In [None]:
!bentoml serve {saved_path}

#### View documentation for REST APIs

Open http://127.0.0.1:5000 in your browser to see more information about the
REST API server.

#### Send prediction request to REST API server

*Run the following command in a terminal to make an HTTP request to the API server:*
```bash
curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '[[5.1, 3.5, 1.4, 0.2]]' \
localhost:5000/predict
```
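
The same request can be made from Python. The sketch below builds the equivalent POST request with the standard library, mirroring the URL, header, and payload of the `curl` command. `build_predict_request` is a hypothetical helper; the call that actually sends the request is left commented out because it needs a running API server, and the response format is not assumed here.

```python
import json
import urllib.request


def build_predict_request(rows, base_url="http://localhost:5000"):
    """Build (but do not send) a POST request for the /predict endpoint."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request([[5.1, 3.5, 1.4, 0.2]])

# With the API server running, send the request and print the raw response:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```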

Note that the `bentoml serve` command requires the pip and conda dependencies
to be available in your Python environment. In most cases, we recommend running
the BentoML API server with Docker instead, as described in the next section.

## Run REST API server with Docker

BentoML supports building a Docker image for your REST API model server.
Simply use the BentoML bundle directory as the Docker build context:

In [None]:
!cd {saved_path} && docker build -t iris-classifier .

Note that `docker` is __not available in Google Colab__. Download the notebook, ensure Docker is installed, and try it locally.

Next, you can `docker push` the image to your choice of registry for deployment,
or run it locally for development and testing:

In [None]:
!docker run -p 5000:5000 iris-classifier
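
Since `docker run` keeps running in the foreground, requests have to come from another terminal or notebook. A simple way to confirm the container is accepting connections before sending a prediction request is to probe the published port. This is a minimal sketch; `is_port_open` is a hypothetical helper, and it only checks that something is listening on the port, not that the service is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: check the port published by `docker run -p 5000:5000`.
print(is_port_open("127.0.0.1", 5000))
```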

## Load saved BentoService

`bentoml.load` is the essential API for loading a Bento into your
Python application:

In [None]:
import bentoml
import pandas as pd

bento_svc = bentoml.load(saved_path)

# Test loaded bentoml service:
bento_svc.predict([X[0]])

## "pip install" a BentoML bundle

BentoML also supports distributing a BentoService as a PyPI package, using the
generated `setup.py` file. A Bento directory can be installed with `pip`:

In [None]:
!pip install {saved_path}

Now you can import your ML service as a regular python package:

In [None]:
import IrisClassifier

installed_svc = IrisClassifier.load()
installed_svc.predict([X[0]])

A Bento PyPI package can also be uploaded to pypi.org
as a public python package, or to your organization's private PyPI index for all
developers in your organization to use:

`cd {saved_path} && python setup.py sdist upload`

*You will need a ".pypirc" config file before doing this: https://docs.python.org/2/distutils/packageindex.html*


# CLI access

`pip install {saved_path}` also installs a CLI tool for accessing the BentoML service. Print the CLI help with `--help`:


In [None]:
!IrisClassifier --help

Print more information about this ML service with the `info` command:

In [None]:
!IrisClassifier info

You can also print help and docs on individual commands:

In [None]:
!IrisClassifier predict --help

Each service API you defined in the BentoService will be exposed as a CLI command with the same name as the API function:

In [None]:
!IrisClassifier predict --input='[[5.1, 3.5, 1.4, 0.2]]'
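
When calling the CLI from a script, building the argument list in Python avoids shell quoting problems with the JSON payload. The sketch below assembles the same `predict --input=...` invocation shown above; `build_predict_command` is a hypothetical helper, and the `subprocess` call is commented out because it requires the package to be installed first.

```python
import json


def build_predict_command(rows, cli_name="IrisClassifier"):
    """Return the argument list for `<cli_name> predict --input=<json>`."""
    return [cli_name, "predict", "--input=" + json.dumps(rows)]


command = build_predict_command([[5.1, 3.5, 1.4, 0.2]])
print(command)

# After `pip install {saved_path}`, the command could be run with:
# import subprocess
# result = subprocess.run(command, capture_output=True, text=True, check=True)
# print(result.stdout)
```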

The BentoML CLI also supports reading input data from `csv` or `json` files, located either on the local machine or at a remote HTTP/S3 location:

In [None]:
# Writing test data to a csv file
pd.DataFrame(iris.data).to_csv('iris_data.csv', index=False)

# Invoke predict from the command line
!IrisClassifier predict --input='./iris_data.csv'

Alternatively, you can use the `bentoml` CLI to load and run a BentoML service archive without installing it:

In [None]:
!bentoml info {saved_path}

In [None]:
!bentoml predict {saved_path} --input='[[5.1, 3.5, 1.4, 0.2]]'

# Deploying to AWS Lambda

AWS Lambda is a serverless computing platform provided by Amazon Web Services. A BentoML service archive can be deployed to AWS Lambda as a REST API endpoint.

To run this demo, make sure to configure your AWS credentials, either by running the `aws configure` command or by setting the environment variables below:

In [None]:
%env AWS_ACCESS_KEY_ID=
%env AWS_SECRET_ACCESS_KEY=
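
If you rely on the environment variables, a quick check can catch an empty value before the deployment command fails. This is a minimal sketch; `missing_aws_vars` is a hypothetical helper. If you configured credentials with `aws configure` instead, these variables may legitimately be unset.

```python
import os

REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=os.environ):
    """Return the names of required AWS variables that are unset or empty."""
    return [name for name in REQUIRED_AWS_VARS if not environ.get(name)]


missing = missing_aws_vars()
if missing:
    print("Missing AWS environment variables:", ", ".join(missing))
```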

Make sure you have [nodejs](https://nodejs.org) installed on your machine:

In [None]:
!node --version
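
The same kind of check can be done from Python for any command-line tool this guide depends on. The sketch below uses `shutil.which` to report tools that are not on the current `PATH`; `missing_tools` is a hypothetical helper.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# `node` is needed for the Lambda deployment; `docker` for the image build.
print(missing_tools(["node", "docker"]))
```

An empty list means every listed tool was found.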

Now you can deploy the BentoML bundle you just created to AWS Lambda with one command:

In [None]:
!bentoml deployment create quick-start-guide-deployment \
    --bento=IrisClassifier:{svc.version} \
    --api-name=predict \
    --platform=aws-lambda \
    --region=us-west-2

Here, 'quick-start-guide-deployment' is the deployment name. You can reference the deployment by this name and query its status. For example, to get the current deployment status:

In [None]:
!bentoml deployment describe quick-start-guide-deployment

To view your deployment configurations:

In [None]:
!bentoml deployment get quick-start-guide-deployment

And to delete an active deployment:

In [None]:
!bentoml deployment delete quick-start-guide-deployment

By default, BentoML stores the deployment metadata on the local machine. For team settings, we recommend hosting a shared BentoML Yatai server so that your entire team can track all the BentoML bundles and deployments they've created in a central place.

# Summary

This is what it looks like to use BentoML to create and deploy a machine learning service, all the way from a training notebook to deployment in production. BentoML also supports many other machine learning frameworks, as well as many other deployment platforms. Take a look at the other [BentoML examples](https://github.com/bentoml/BentoML#examples).