# BentoML Example: Keras Text Classification

[BentoML](http://bentoml.ai) is an open source platform for machine learning model serving and deployment. 

This notebook demonstrates how to use BentoML to turn a Keras model into a Docker image containing a REST API server that serves the model, how to use an ML service built with BentoML as a CLI tool, and how to distribute it as a PyPI package.

This notebook is based on Keras's IMDB LSTM tutorial [here](https://github.com/keras-team/keras/blob/master/examples/imdb_lstm.py).


In [1]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [None]:
!pip install bentoml
!pip install tensorflow
!pip install numpy

In [2]:
from __future__ import absolute_import, division, print_function

import numpy as np
import tensorflow as tf
print("Tensorflow Version: %s" % tf.__version__)

from tensorflow import keras
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding
from tensorflow.keras.layers import LSTM
from tensorflow.keras.datasets import imdb

import bentoml
print("BentoML Version: %s" % bentoml.__version__)

Tensorflow Version: 1.13.1
BentoML Version: 0.4.1


In [3]:
max_features = 1000
maxlen = 80 # cut texts after this number of words (among top max_features most common words)
batch_size = 300
index_from = 3 # word index offset

# Prepare Dataset
Download the IMDB dataset

In [4]:
# Download the dataset; the returned arrays are loaded again in a later cell
imdb.load_data(num_words=max_features)

# A dictionary mapping words to an integer index
word_index = imdb.get_word_index()

# The first indices are reserved
word_index = {k:(v+index_from) for k,v in word_index.items()} 
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2  # unknown

# Use decode_review to look at original review text in training/testing data
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])
def decode_review(encoded_text):
    return ' '.join([reverse_word_index.get(i, '?') for i in encoded_text])
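To make the index shift concrete, here is a minimal sketch using a tiny made-up vocabulary. `toy_index` and `decode` are hypothetical names for illustration, not the real IMDB word index. It shows why every raw index is offset by `index_from`: the shift frees 0, 1, and 2 for the reserved tokens.

```python
index_from = 3  # same offset as in the cell above

# Hypothetical raw index, ranked by frequency starting at 1 (assumption for illustration)
toy_index = {"the": 1, "movie": 2, "great": 3}

# Shift every word up by index_from, then claim the freed low indices
shifted = {word: rank + index_from for word, rank in toy_index.items()}
shifted["<PAD>"] = 0
shifted["<START>"] = 1
shifted["<UNK>"] = 2

# Invert the mapping so encoded reviews can be turned back into text
reverse = {idx: word for word, idx in shifted.items()}

def decode(encoded):
    return " ".join(reverse.get(i, "?") for i in encoded)

decoded = decode([1, 4, 5, 6, 0, 0])
print(decoded)
```

Indices that are missing from the reversed mapping decode to `?`, mirroring `decode_review`.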

In [5]:
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features, index_from=index_from)

In [6]:
x_train = sequence.pad_sequences(x_train,
                                 value=word_index["<PAD>"],
                                 padding='post',
                                 maxlen=maxlen)

x_test = sequence.pad_sequences(x_test,
                                value=word_index["<PAD>"],
                                padding='post',
                                maxlen=maxlen)
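The effect of `pad_sequences` with `padding='post'` can be sketched in plain Python. This is a simplified stand-in, not the Keras implementation; it assumes the default `truncating='pre'`, which drops tokens from the front of sequences longer than `maxlen`. `pad_post` is a hypothetical helper name.

```python
def pad_post(seq, maxlen, pad_value=0):
    """Keep the last `maxlen` tokens, then right-pad with `pad_value`."""
    kept = seq[-maxlen:]  # front truncation: only the tail of a long review survives
    return kept + [pad_value] * (maxlen - len(kept))

short = pad_post([1, 14, 22], maxlen=5)
long = pad_post([1, 14, 22, 16, 43, 530, 973], maxlen=5)
print(short)
print(long)
```

Every review therefore ends up with exactly `maxlen` entries, which is what lets them be stacked into a single 2-D array for training.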

# Model Training & Evaluation

In [7]:
model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(128, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))

model.summary()

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, None, 128)         128000    
_________________________________________________________________
lstm (LSTM)                  (None, 128)               131584    
_________________________________________________________________
dense (Dense)                (None, 1)                 129       
Total params: 259,713
Trainable params: 259,713
Non-trainable params: 0
_________________________________________________________________
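The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias vector; the variable names are illustrative.

```python
vocab_size = 1000   # max_features
embed_dim = 128     # embedding output size
lstm_units = 128    # LSTM hidden size

# One embed_dim-sized vector per vocabulary entry
embedding_params = vocab_size * embed_dim

# 4 gates, each with input weights, recurrent weights and a bias
lstm_params = 4 * ((embed_dim + lstm_units) * lstm_units + lstm_units)

# One weight per LSTM unit plus a single bias
dense_params = lstm_units * 1 + 1

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
```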


In [8]:
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=1, # for demo purpose :P
          validation_data=(x_test, y_test))

Train on 25000 samples, validate on 25000 samples
Instructions for updating:
Use tf.cast instead.


<tensorflow.python.keras.callbacks.History at 0xb2a85c5f8>

In [9]:
score, acc = model.evaluate(x_test, y_test,
                            batch_size=batch_size)

print('Test score:', score)
print('Test accuracy:', acc)

Test score: 0.45802617967128756
Test accuracy: 0.79452


## Define BentoService for model serving

In [10]:
%%writefile text_classification_service.py
import pandas as pd
import numpy as np
from tensorflow import keras
from tensorflow.keras.preprocessing import sequence, text
from bentoml import api, env, BentoService, artifacts
from bentoml.artifact import KerasModelArtifact, PickleArtifact
from bentoml.handlers import JsonHandler

max_features = 1000

@artifacts([
    KerasModelArtifact('model'),
    PickleArtifact('word_index')
])
@env(pip_dependencies=['tensorflow', 'numpy', 'pandas'])
class TextClassificationService(BentoService):
   
    def word_to_index(self, word):
        # Indices >= max_features fall outside the embedding table, so map them to <UNK>
        if word in self.artifacts.word_index and self.artifacts.word_index[word] < max_features:
            return self.artifacts.word_index[word]
        else:
            return self.artifacts.word_index["<UNK>"]
    
    def preprocessing(self, text_str):
        # Named `words` to avoid shadowing the imported `sequence` module
        words = text.text_to_word_sequence(text_str)
        return list(map(self.word_to_index, words))
    
    @api(JsonHandler)
    def predict(self, parsed_json):
        if isinstance(parsed_json, list):
            input_data = list(map(self.preprocessing, parsed_json))
        else: # expecting a dict with a 'text' key
            input_data = [self.preprocessing(parsed_json['text'])]

        input_data = sequence.pad_sequences(input_data,
                                            value=self.artifacts.word_index["<PAD>"],
                                            padding='post',
                                            maxlen=80)

        return self.artifacts.model.predict_classes(input_data)

Overwriting text_classification_service.py
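The text-to-index step of the service can be tried without loading the model. The following is a minimal sketch under stated assumptions: `toy_word_index` and `toy_max_features` are made up, and `str.lower().split()` is a simplified stand-in for Keras's `text_to_word_sequence`. Words that are unknown, or whose index does not fit in an embedding table with `toy_max_features` rows, fall back to `<UNK>`.

```python
toy_max_features = 7  # hypothetical vocabulary size; the notebook uses 1000

# Hypothetical word index, already shifted so that 0-2 are reserved
toy_word_index = {"<PAD>": 0, "<START>": 1, "<UNK>": 2,
                  "the": 4, "movie": 5, "great": 6, "terrible": 7}

def word_to_index(word):
    idx = toy_word_index.get(word)
    # Valid embedding rows are 0 .. toy_max_features - 1
    if idx is not None and idx < toy_max_features:
        return idx
    return toy_word_index["<UNK>"]

def preprocess(text_str):
    # Simplified tokenizer (assumption): lowercase and split on whitespace
    return [word_to_index(w) for w in text_str.lower().split()]

encoded = preprocess("The movie was terrible")
print(encoded)
```

Here `was` is missing from the toy index and `terrible` sits just past the cutoff, so both map to the `<UNK>` index.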


## Save BentoService to file archive

In [11]:
# 1) import the custom BentoService defined above
from text_classification_service import TextClassificationService

# 2) `pack` it with required artifacts
bento_svc = TextClassificationService.pack(model=model, word_index=word_index)

# 3) save your BentoService
saved_path = bento_svc.save()

[2019-09-19 15:00:20,515] INFO - Successfully saved Bento 'TextClassificationService:2019_09_19_cbc66362' to path: /Users/chaoyuyang/bentoml/repository/TextClassificationService/2019_09_19_cbc66362


### Test packed BentoML service

In [12]:
bento_svc.predict({ 'text': 'bad worst terrible' })

array([[0]], dtype=int32)

In [13]:
bento_svc.predict(['the best movie I have ever seen', 'This is a bad movie'])

array([[1],
       [1]], dtype=int32)
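The nested shape of the result comes from the model's single sigmoid output unit: each review yields one probability, which is then turned into a 0/1 label. A minimal sketch of that step, assuming a 0.5 threshold; `to_classes` is a hypothetical helper and the probabilities are invented for illustration, not taken from the model above.

```python
def to_classes(probabilities, threshold=0.5):
    """Map each single-unit sigmoid output to a 0/1 label, keeping the nested shape."""
    return [[int(row[0] > threshold)] for row in probabilities]

# Hypothetical sigmoid outputs for two reviews (not produced by the trained model)
example_probabilities = [[0.91], [0.37]]
labels = to_classes(example_probabilities)
print(labels)
```

A model trained for a single epoch can easily place a negative review above the threshold, which would explain both example reviews receiving the label 1.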

# Load BentoML Service from archive

In [14]:
import bentoml

loaded_bento_svc = bentoml.load(saved_path)



In [15]:
loaded_bento_svc.predict({ "text": "the best movie I have ever seen" })

array([[1]], dtype=int32)

In [16]:
loaded_bento_svc.predict(['the best movie I have ever seen', 'This is a bad movie'])

array([[1],
       [1]], dtype=int32)

# Run REST API server locally

A saved BentoML service archive can be served as a REST API server with the `bentoml` CLI:

In [25]:
!bentoml serve {saved_path}

2019-09-19 15:12:07.575527: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Instructions for updating:
Use tf.cast instead.
 * Serving Flask app "TextClassificationService" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [19/Sep/2019 15:12:17] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [19/Sep/2019 15:12:21] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [19/Sep/2019 15:12:23] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [19/Sep/2019 15:12:23] "GET /docs.json HTTP/1.1" 200 -
^C


### Send prediction request to REST API server

*Run the following command in a terminal to make an HTTP request to the API server:*
```bash
curl -i \
--header "Content-Type: application/json" \
--request POST \
--data '{"text": "best movie ever"}' \
localhost:5000/predict
```
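The same request can be prepared from Python with only the standard library. This is a sketch, assuming the server is listening on `localhost:5000` as in the log above; `build_predict_request` is a hypothetical helper name, and the actual send is left commented out so the snippet runs without a server.

```python
import json
import urllib.request

def build_predict_request(text, url="http://localhost:5000/predict"):
    """Build a JSON POST request equivalent to the curl command."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_predict_request("best movie ever")
print(request.get_method(), request.full_url)

# With the API server running, send the request like this:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```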

# "pip install" a BentoML archive

BentoML users can install a saved BentoML archive directly with `pip install $SAVED_PATH` and use it as a regular Python package.

In [18]:
!pip install {saved_path}

Processing /Users/chaoyuyang/bentoml/repository/TextClassificationService/2019_09_19_cbc66362
Building wheels for collected packages: TextClassificationService
  Building wheel for TextClassificationService (setup.py) ... done
  Stored in directory: /private/var/folders/ns/vc9qhmqx5dx_9fws7d869lqh0000gn/T/pip-ephem-wheel-cache-nnk_qa2n/wheels/2f/77/a3/7ae039c8679d8c3ec4c368d63f070368dd96e19bf3ab69d269
Successfully built TextClassificationService
Installing collected packages: TextClassificationService
  Found existing installation: TextClassificationService 2019-07-16-014454ed
    Uninstalling TextClassificationService-2019-07-16-014454ed:
      Successfully uninstalled TextClassificationService-2019-07-16-014454ed
Successfully installed TextClassificationService-2019-09-19-cbc66362


In [19]:
import TextClassificationService

installed_svc = TextClassificationService.load()



In [20]:
installed_svc.predict({ 'text': 'the best movie I have ever seen' })

array([[1]], dtype=int32)

In [21]:
installed_svc.predict({ 'text': 'This is a bad movie' })

array([[1]], dtype=int32)

# CLI access

`pip install $SAVED_PATH` also installs a CLI tool for accessing the BentoML service.

In [22]:
!TextClassificationService --help

Usage: TextClassificationService [OPTIONS] COMMAND [ARGS]...

  BentoML CLI tool

Options:
  -q, --quiet  Hide process logs and only print command results
  --verbose    Print verbose debugging information for BentoML developer
  --version    Show the version and exit.
  --help       Show this message and exit.

Commands:
  <API_NAME>      Run API function
  info            List APIs
  open-api-spec   Display OpenAPI/Swagger JSON specs
  serve           Start local rest server
  serve-gunicorn  Start local gunicorn server


### Print model service information:

In [23]:
!TextClassificationService info

{
  "name": "TextClassificationService",
  "version": "2019_09_19_cbc66362",
  "created_at": "2019-09-19T22:00:20.503864Z",
  "env": {
    "conda_env": "name: bentoml-custom-conda-env\nchannels:\n- defaults\ndependencies:\n- python=3.7.3\n- pip\n- pip:\n  - bentoml[api_server]==0.4.1\n",
    "pip_dependencies": "bentoml==0.4.1\ntensorflow\nnumpy\npandas"
  },
  "artifacts": [
    {
      "name": "model",
      "artifact_type": "KerasModelArtifact"
    },
    {
      "name": "word_index",
      "artifact_type": "PickleArtifact"
    }
  ],
  "apis": [
    {
      "name": "predict",
      "handler_type": "JsonHandler",
      "docs": "BentoML generated API endpoint"
    }
  ]
}


### Run the `predict` API with JSON data:

In [24]:
!TextClassificationService predict --input='{"text": "bad movie"}'

2019-09-19 15:11:58.365967: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
Instructions for updating:
Use tf.cast instead.
[[1]]
