# BentoML Example: Time-Series Statistical Model 


**BentoML makes moving trained ML models to production easy:**

* Package models trained with **any ML framework** and reproduce them for model serving in production
* **Deploy anywhere** for online API serving or offline batch serving
* High-Performance API model server with *adaptive micro-batching* support
* Central hub for managing models and deployment process via Web UI and APIs
* Modular and flexible design making it *adaptable to your infrastructure*

BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps, and to enable teams to deliver prediction services in a fast, repeatable, and scalable way.

Before reading this example project, be sure to check out the [Getting started guide](https://github.com/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb) to learn about the basic concepts in BentoML.


Make sure to __use GPU runtime when running this notebook in Google Colab__. You can set it in the top menu: `Runtime > Change Runtime Type > Hardware accelerator`.



In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [2]:
!pip install -q bentoml statsmodels==0.10.1



In [3]:
%%writefile holt.py

# holt.py
from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.service.artifacts.common import PickleArtifact
import numpy as np

@env(pip_dependencies=["statsmodels==0.10.1","joblib","numpy"])
@artifacts([PickleArtifact('model')])
class holt(BentoService):
    @api(input=DataframeInput(), batch=True)
    def predict(self, df):

        # Printing the dataframe to cross-check
        print(df.head())

        # Parsing the dataframe values
        weeks=int(df.iat[0,0])
        print(weeks)
        return((self.artifacts.model).forecast(weeks))
  

Overwriting holt.py


The bentoml.api decorator defines a service API, which is the entry point for accessing the prediction service. The DataframeInput here denotes that this service API will convert an HTTP JSON request into a pandas.DataFrame object before passing it to the user-defined API function for inference.
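
As a rough illustration of the conversion described above, the sketch below parses a JSON body shaped like the `[[2]]` payload used later in this example and pulls out the single value that `predict` reads with `df.iat[0, 0]`. It uses only the standard library rather than pandas, and `parse_horizon` is a hypothetical helper name, not part of BentoML.

```python
import json


def parse_horizon(payload: str) -> int:
    """Return the forecast horizon from a JSON body such as '[[2]]'.

    Hypothetical helper: it mirrors the cell that predict() reads via
    df.iat[0, 0], but works on the raw JSON rows instead of a DataFrame.
    """
    rows = json.loads(payload)
    is_2d = isinstance(rows, list) and rows and isinstance(rows[0], list) and rows[0]
    if not is_2d:
        raise ValueError("expected a non-empty 2-D JSON array, e.g. [[2]]")
    horizon = int(rows[0][0])  # first row, first column
    if horizon < 1:
        raise ValueError("forecast horizon must be a positive integer")
    return horizon


print(parse_horizon("[[2]]"))
```

Validating the horizon before it reaches the model is a design choice of this sketch; the service code in this example passes the value straight to `forecast`.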

The pip_dependencies argument specifies the libraries that your code needs. Any library listed here is automatically added to the requirements.txt file.

Here we're using the PickleArtifact. However, BentoML also provides model artifacts for other frameworks, such as PytorchModelArtifact, KerasModelArtifact, FastaiModelArtifact, and XgboostModelArtifact.

The following code fits a Holt-Winters exponential smoothing model with statsmodels and bundles the trained model with a holt service instance. The holt instance is then saved to disk in the BentoML SavedBundle format, which is a versioned file archive that is ready for production model serving. The training data is a publicly available shampoo sales dataset.
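
To make concrete what a call like `forecast(n)` on a smoothing model returns, here is a simplified sketch of Holt's additive-trend smoothing in plain Python. It is an illustration under stated assumptions, not the statsmodels implementation: the smoothing parameters `alpha` and `beta` are fixed rather than optimized, there is no seasonal component, and `holt_linear_forecast` and the toy series are hypothetical.

```python
from typing import List, Sequence


def holt_linear_forecast(series: Sequence[float], steps: int,
                         alpha: float = 0.5, beta: float = 0.5) -> List[float]:
    """Forecast `steps` values ahead with Holt's additive-trend smoothing.

    Simplified illustration: fixed alpha/beta and no seasonal component,
    unlike the fitted statsmodels model used in this example.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = float(series[0])
    trend = float(series[1] - series[0])
    for y in series[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step-ahead estimate.
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        # Blend the latest change in level with the previous trend estimate.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # Extrapolate the final level along the final trend, one value per step.
    return [level + h * trend for h in range(1, steps + 1)]


# Hypothetical toy data, not the shampoo dataset.
print(holt_linear_forecast([1, 2, 3, 4, 5], steps=2))
```

The function returns one value per requested step, which is the same shape of result that the service's `predict` API hands back for a given horizon.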

In [5]:
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing
import numpy as np
import joblib
from holt import holt

df=pd.read_csv('https://raw.githubusercontent.com/jbrownlee/Datasets/master/shampoo.csv')

# Take a chronological 80/20 train-test split (copies avoid modifying slices of df)
split = int(len(df) * 0.8)
train = df[:split].copy()
test = df[split:].copy()

# Pre-process the Month field and use it as the index
train.index = pd.to_datetime(train['Month'], format='%m-%d')
test.index = pd.to_datetime(test['Month'], format='%m-%d')

# Fit the model with additive trend and additive seasonality
model = ExponentialSmoothing(np.asarray(train['Sales']), seasonal_periods=7, trend='add', seasonal='add').fit()

#creating an instance of the holt class
holt_obj = holt()


# Pack the newly trained model artifact
holt_obj.pack('model', model)
saved_path = holt_obj.save()

[2020-09-22 21:28:56,042] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.


  


[2020-09-22 21:28:57,412] INFO - Detected non-PyPI-released BentoML installed, copying local BentoML modulefiles to target saved bundle path..


[2020-09-22 21:29:01,475] INFO - BentoService bundle 'holt:20200922212857_02D15F' saved to: /Users/bozhaoyu/bentoml/repository/holt/20200922212857_02D15F
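
The bundle tag in the log line above has the form `name:version`. The sketch below splits such a tag into its parts. Reading the version as a `YYYYMMDDHHMMSS` timestamp followed by a short suffix is an inference from this log output, not a documented format, and `parse_bundle_tag` and `BundleTag` are hypothetical names.

```python
from datetime import datetime
from typing import NamedTuple


class BundleTag(NamedTuple):
    name: str
    version: str
    created: datetime
    suffix: str


def parse_bundle_tag(tag: str) -> BundleTag:
    """Split a 'name:version' tag; the version layout is an assumption."""
    name, sep, version = tag.partition(":")
    if not sep or not name or not version:
        raise ValueError("expected a tag of the form 'name:version'")
    stamp, _, suffix = version.partition("_")
    # Assumed layout: YYYYMMDDHHMMSS timestamp, underscore, short suffix.
    created = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return BundleTag(name, version, created, suffix)


tag = parse_bundle_tag("holt:20200922212857_02D15F")
print(tag.name, tag.created.isoformat(), tag.suffix)
```

Aliases such as `holt:latest` do not follow this layout, so the sketch raises a `ValueError` for them.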


## REST API Model Serving


To start a REST API model server with the BentoService saved above, use the bentoml serve command:

In [6]:
!bentoml serve holt:latest

[2020-09-22 21:29:25,793] INFO - Getting latest version holt:20200922212857_02D15F
[2020-09-22 21:29:25,793] INFO - Starting BentoML API server in development mode..
[2020-09-22 21:29:26,424] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
 * Serving Flask app "holt" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
^C


If you are running this notebook from Google Colab, you can start the dev server with the `--run-with-ngrok` option to gain access to the API through a public endpoint managed by [ngrok](https://ngrok.com/):


In [None]:
!bentoml serve holt:latest --run-with-ngrok

Open http://127.0.0.1:5000 in your browser to see more information about the REST API server.


### Send prediction request to the REST API server

Run the following `curl` command to send the number of periods to forecast to the REST API server and get a prediction result:

```
curl -i \
  --header "Content-Type: application/json" \
  --request POST \
  --data '[[2]]' \
  http://localhost:5000/predict
```
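
The same request can be sketched in Python with only the standard library. This is a minimal sketch, assuming the server from the previous step is listening on `localhost:5000`; `build_predict_request` and `send_predict_request` are hypothetical helper names, and only the request construction runs here, so no server is needed to try it.

```python
import json
import urllib.request


def build_predict_request(periods: int,
                          url: str = "http://localhost:5000/predict") -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    body = json.dumps([[periods]]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_predict_request(periods: int) -> str:
    """Send the request and return the raw response body (server must be running)."""
    with urllib.request.urlopen(build_predict_request(periods), timeout=10) as response:
        return response.read().decode("utf-8")


request = build_predict_request(2)
print(request.get_method(), request.full_url, request.data)
```

Calling `send_predict_request(2)` while the API server is running should return the forecast as a JSON string.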

## Containerize model server with Docker


One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.

Note that Docker is **not available in Google Colab**. You will need to download and run this notebook locally to try out containerization with Docker.

If you already have Docker configured, simply run the following command to produce a Docker image serving the holt prediction service created above:

In [7]:
!bentoml containerize holt:latest 

[2020-09-22 21:30:48,575] INFO - Getting latest version holt:20200922212857_02D15F
Found Bento: /Users/bozhaoyu/bentoml/repository/holt/20200922212857_02D15F
Tag not specified, using tag parsed from BentoService: 'holt:20200922212857_02D15F'
Building Docker image holt:20200922212857_02D15F from holt:latest 
-we in here
processed docker file
(None, None)
root in create archive /Users/bozhaoyu/bentoml/repository/holt/20200922212857_02D15F ['Dockerfile', 'MANIFEST.in', 'README.md', 'bentoml-init.sh', 'bentoml.yml', 'bundled_pip_dependencies', 'bundled_pip_dependencies/BentoML-0.9.0rc0+6.g4beee0a8.dirty.tar.gz', 'docker-entrypoint.sh', 'environment.yml', 'holt', 'holt/__init__.py', 'holt/__pycache__', 'holt/__pycache__/holt.cpython-37.pyc', 'holt/artifacts', 'holt/artifacts/__init__.py', 'holt/artifacts/model.pkl', 'holt/bentoml.yml', 'holt/holt.py', 'python_version', 'requirements.txt', 'setup.py']
about to build
about to upgrade params
check each param and update
if use

Collecting patsy>=0.4.0
  Downloading patsy-0.5.1-py2.py3-none-any.whl (231 kB)
Collecting scipy>=0.18
  Downloading scipy-1.5.2-cp37-cp37m-manylinux1_x86_64.whl (25.9 MB)
Collecting pytz>=2011k
  Downloading pytz-2020.1-py2.py3-none-any.whl (510 kB)
Installing collected packages: pytz, numpy, pandas, patsy, scipy, statsmodels, joblib
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.2
    Uninstalling numpy-1.19.2:
      Successfully uninstalled numpy-1.19.2
Successfully installed joblib-0.14.1 numpy-1.18.4 pandas-0.24.2 patsy-0.5.1 pytz-2020.1 scipy-1.5.2 statsmodels-0.10.1
 ---> 47503115aacd
Step 8/15 : COPY . /bento
 ---> 683ecd7746c3
Step 9/15 : RUN if [ -d /bento/bundled_pip_dependencies ]; then pip install -U bundled_pip_dependencies/* ;fi
 ---> Running in 75c0de695bbf
Process

Building wheels for collected packages: BentoML
  Building wheel for BentoML (PEP 517): started
  Building wheel for BentoML (PEP 517): finished with status 'done'
  Created wheel for BentoML: filename=BentoML-0.9.0rc0+6.g4beee0a8.dirty-py3-none-any.whl size=3058751 sha256=70111534774d574d512f67e82acc43c08e3255b8f20a6268093b77ec2d730339
  Stored in directory: /root/.cache/pip/wheels/04/cb/be/16eecf14d2539252672c27d69a3b88c96604cf94c4c32a2ba7
Successfully built BentoML
Installing collected packages: BentoML
  Attempting uninstall: BentoML
    Found existing installation: BentoML 0.9.0rc0
    Uninstalling BentoML-0.9.0rc0:
      Successfully uninstalled BentoML-0.9.0rc0
Successfully installed BentoML-0.9.0rc0+6.g4beee0a8.dirty
 ---> 860141a5b29e
Step 10/15 : ENV PORT 5000
 ---> Running in f4ea0beb335d
 ---> 9e8069c86aae
Step 11/15 : EXPOSE $PO

In [8]:
!docker run -p 5000:5000 holt:20200922212857_02D15F

[2020-09-23 04:32:30,915] INFO - Starting BentoML API server in production mode..
[2020-09-23 04:32:31,267] INFO - get_gunicorn_num_of_workers: 3, calculated by cpu count
[2020-09-23 04:32:31,277] INFO - Running micro batch service on :5000
[2020-09-23 04:32:31 +0000] [11] [INFO] Starting gunicorn 20.0.4
[2020-09-23 04:32:31 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2020-09-23 04:32:31 +0000] [11] [INFO] Listening at: http://0.0.0.0:5000 (11)
[2020-09-23 04:32:31 +0000] [1] [INFO] Listening at: http://0.0.0.0:59489 (1)
[2020-09-23 04:32:31 +0000] [1] [INFO] Using worker: sync
[2020-09-23 04:32:31 +0000] [11] [INFO] Using worker: aiohttp.worker.GunicornWebWorker
[2020-09-23 04:32:31 +0000] [13] [INFO] Booting worker with pid: 13
[2020-09-23 04:32:31 +0000] [12] [INFO] Booting worker with pid: 12
[2020-09-23 04:32:31 +0000] [14] [INFO] Booting worker with pid: 14
[2020-09-23 04:32:31,355] INFO - Micro batch enabled for API `predict`
[2020-09-23 04:32:31,355] INFO - Your system nofile l

## Load saved BentoService

bentoml.load is the API for loading a BentoML packaged model in Python:

In [13]:
from bentoml import load
import pandas as pd

loaded_svc = load(saved_path)

print(loaded_svc.predict(pd.DataFrame(data=[[2]])))

   0
0  2
2
[487.86681173 415.82743026]


## Launch inference job from CLI

The BentoML CLI supports loading and running a packaged model directly from the command line. With the DataframeInput adapter, the command can read input DataFrame data from a CLI argument or from local CSV or JSON files:

In [10]:
!bentoml run holt:latest predict --input '[[2]]'

[2020-09-22 21:33:21,460] INFO - Getting latest version holt:20200922212857_02D15F
[2020-09-22 21:33:21,831] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
   0
0  2
2
[2020-09-22 21:33:26,555] INFO - {'service_name': 'holt', 'service_version': '20200922212857_02D15F', 'api': 'predict', 'task': {'data': {}, 'task_id': 'f6ab7f63-0dcc-42c1-8bf3-769572ad51dc', 'batch': 1, 'cli_args': ('--input', '[[2]]')}, 'result': {'data': '[487.8668117296382]', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': 'f6ab7f63-0dcc-42c1-8bf3-769572ad51dc'}
[487.8668117296382]


## Deployment Options

If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:
- [AWS Lambda Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)
- [AWS SageMaker Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)
- [Azure Functions Deployment Guide](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)

If the cloud platform you are working with is not on the list above, try out these step-by-step guides on manually deploying a BentoML packaged model to cloud platforms:
- [AWS ECS Deployment](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)
- [Google Cloud Run Deployment](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)
- [Azure container instance Deployment](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)
- [Heroku Deployment](https://docs.bentoml.org/en/latest/deployment/heroku.html)

Lastly, if you have a DevOps or ML Engineering team that operates a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy:
- [Kubernetes Deployment](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)
- [Knative Deployment](https://docs.bentoml.org/en/latest/deployment/knative.html)
- [Kubeflow Deployment](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)
- [KFServing Deployment](https://docs.bentoml.org/en/latest/deployment/kfserving.html)
- [Clipper.ai Deployment Guide](https://docs.bentoml.org/en/latest/deployment/clipper.html)

