# fastai with Tabular Data

This notebook is based on lesson 4 of fastai's course v3. We are going to train a model that predicts a salary range from the tabular data we provide, then package it as a prediction service with BentoML.


In [7]:
!pip install fastai
!pip install bentoml



In [1]:
from fastai.tabular import *

In [2]:
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')

In [3]:
dep_var = 'salary'
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [FillMissing, Categorify, Normalize]

In [4]:
test = TabularList.from_df(df.iloc[800:1000].copy(), path=path, cat_names=cat_names, cont_names=cont_names)

In [5]:
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                           .split_by_idx(list(range(800,1000)))
                           .label_from_df(cols=dep_var)
                           .add_test(test)
                           .databunch())

In [6]:
data.show_batch(rows=10)

workclass,education,marital-status,occupation,relationship,race,education-num_na,age,fnlwgt,education-num,target
Self-emp-inc,Bachelors,Married-civ-spouse,Prof-specialty,Husband,White,False,-0.3362,-0.1323,1.1422,>=50k
Local-gov,Some-college,Married-civ-spouse,Transport-moving,Husband,White,False,-0.4828,0.134,-0.0312,<50k
Private,10th,Married-civ-spouse,Machine-op-inspct,Wife,White,False,-1.0692,0.3541,-1.5958,<50k
Private,Bachelors,Married-civ-spouse,Prof-specialty,Husband,White,False,-0.9226,2.0221,1.1422,>=50k
Private,Some-college,Married-civ-spouse,Exec-managerial,Husband,White,False,-0.8493,2.2141,-0.0312,<50k
Private,10th,Never-married,Handlers-cleaners,Own-child,White,False,-1.5823,-0.5793,-1.5958,<50k
Private,HS-grad,Married-civ-spouse,Handlers-cleaners,Husband,White,False,-0.6294,-0.0231,-0.4224,<50k
Private,10th,Never-married,Machine-op-inspct,Not-in-family,White,False,0.1769,-0.3601,-1.5958,<50k
Self-emp-not-inc,Bachelors,Married-civ-spouse,Prof-specialty,Husband,White,False,0.4701,2.1926,1.1422,>=50k
Self-emp-not-inc,Some-college,Married-civ-spouse,Farming-fishing,Husband,White,False,0.2502,-0.2172,-0.0312,<50k


In [7]:
learn = tabular_learner(data, layers=[200,100], metrics=accuracy)

In [8]:
learn.fit(1, 1e-2)

epoch,train_loss,valid_loss,accuracy,time
0,0.359373,0.380596,0.83,00:04


In [63]:
row = df.iloc[0]

In [64]:
learn.predict(row)

(Category >=50k, tensor(1), tensor([0.4896, 0.5104]))
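
The prediction above is a tuple of the predicted category, its class index, and one probability per class. As a minimal sketch of how to read it, the snippet below uses plain Python lists with the values copied from that output. The helper name `summarize` is hypothetical, and the class order (`<50k` at index 0, `>=50k` at index 1) is taken from the outputs shown in this notebook.

```python
# Values copied from the prediction shown above; classes are ordered by index.
classes = ["<50k", ">=50k"]
probs = [0.4896, 0.5104]

def summarize(classes, probs):
    """Return the most likely label and its probability (hypothetical helper)."""
    idx = max(range(len(probs)), key=probs.__getitem__)
    return classes[idx], probs[idx]

label, confidence = summarize(classes, probs)
print(f"{label} ({confidence:.1%})")  # prints ">=50k (51.0%)"
```

A probability this close to 0.5 means the model is barely more confident in `>=50k` than in `<50k` for this row.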

# Save model as machine learning service with BentoML

In [65]:
%%writefile tabular_csv.py

from bentoml import env, api, artifacts, BentoService
from bentoml.artifact import FastaiModelArtifact
from bentoml.handlers import DataframeHandler


@env(conda_environment=['fastai'])
@artifacts([FastaiModelArtifact('model')])
class TabularModel(BentoService):
    
    @api(DataframeHandler)
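    def predict(self, df):
        # The fastai learner predicts one row at a time, so iterate over the
        # DataFrame and collect each (category, class index, probabilities) tuple.
        result = []
        for index, row in df.iterrows():
            result.append(self.artifacts.model.predict(row))
        # Serialize the list of predictions to a string for the response.
        return str(result)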

Overwriting tabular_csv.py


In [57]:
from tabular_csv import TabularModel

svc = TabularModel.pack(model=learn)
saved_path = svc.save('/tmp/bento_archive')
print(saved_path)

[2019-07-16 12:42:16,135] INFO - Searching for dependant modules of tabular_csv:/Users/bozhaoyu/src/bento_gallery/fast-ai/tabular-csv/tabular_csv.py
[2019-07-16 12:42:41,630] INFO - Copying local python module '/Users/bozhaoyu/src/bento_gallery/fast-ai/tabular-csv/tabular_csv.py'
[2019-07-16 12:42:41,632] INFO - Done copying local python dependant modules
[2019-07-16 12:42:41,769] INFO - BentoService TabularModel:2019_07_16_d20108b5 saved to /tmp/bento_archive/TabularModel/2019_07_16_d20108b5
/tmp/bento_archive/TabularModel/2019_07_16_d20108b5


## Use the BentoML archive as a CLI tool

In [58]:
!pip install {saved_path}

Processing /tmp/bento_archive/TabularModel/2019_07_16_d20108b5
Building wheels for collected packages: TabularModel
  Building wheel for TabularModel (setup.py) ... done
  Stored in directory: /private/var/folders/kn/xnc9k74x03567n1mx2tfqnpr0000gn/T/pip-ephem-wheel-cache-1kh9mi0d/wheels/44/e1/10/b7c4cadb3f8eaa5e8a52f84cbd49ad912e6d45fe6f15ed211f
Successfully built TabularModel
Installing collected packages: TabularModel
  Found existing installation: TabularModel 2019-07-15-6b40ea82
    Uninstalling TabularModel-2019-07-15-6b40ea82:
      Successfully uninstalled TabularModel-2019-07-15-6b40ea82
Successfully installed TabularModel-2019-07-16-d20108b5


In [66]:
# Use json data
!TabularModel predict --input=test.json

[(Category <50k, tensor(0), tensor([0.7297, 0.2703]))]
[(Category <50k, tensor(0), tensor([0.7297, 0.2703]))]


In [67]:
# Use CSV data
!TabularModel predict --input=test.csv

[(Category >=50k, tensor(1), tensor([0.4896, 0.5104]))]
[(Category >=50k, tensor(1), tensor([0.4896, 0.5104]))]
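
The two commands above read `test.json` and `test.csv`, but the notebook does not show how those files were created. The following is a minimal sketch of one way to produce them with the standard library, assuming a single record with the `adult.csv` columns. The record values are copied from the request examples in this notebook, the helper name `write_samples` is hypothetical, and the files actually used above may have differed.

```python
import csv
import json
from pathlib import Path

# One record with the adult.csv columns. The files the notebook actually
# used are not shown, so this content is an assumption.
RECORD = {
    "age": 49, "workclass": "Private", "fnlwgt": 101320,
    "education": "Assoc-acdm", "education-num": 12.0,
    "marital-status": "Married-civ-spouse", "occupation": "",
    "relationship": "Wife", "race": "White", "sex": "Female",
    "capital-gain": 0, "capital-loss": 1902, "hours-per-week": 40,
    "native-country": "United-States", "salary": ">=50k",
}

def write_samples(out_dir="."):
    """Write RECORD as test.json (a list of records) and test.csv."""
    out = Path(out_dir)
    json_path = out / "test.json"
    csv_path = out / "test.csv"
    json_path.write_text(json.dumps([RECORD], indent=2))
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(RECORD))
        writer.writeheader()
        writer.writerow(RECORD)
    return json_path, csv_path

if __name__ == "__main__":
    write_samples()
```

Both files carry every column of the original data, including `salary`, which mirrors the request examples in this notebook.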


## Use it as a REST API server


*Note: Running a local REST API server does not work in Google Colab. Copy this notebook and run it locally instead.*

In [47]:
!bentoml serve {saved_path}

 * Serving Flask app "TabularModel" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
(Category >=50k, tensor(1), tensor([0.4896, 0.5104]))
response from handler <Response 55 bytes [200 OK]>
127.0.0.1 - - [16/Jul/2019 12:14:11] "POST /predict HTTP/1.1" 200 -
^C


## Make requests to the REST API server

#### Post as JSON

```bash
curl -X POST \
  http://localhost:5000/predict \
  -H 'Content-Type: application/json' \
  -d '[{
  "age": 49,
  "workclass": "Private",
  "fnlwgt": 101320,
  "education": "Assoc-acdm",
  "education-num": 12.0,
  "marital-status": "Married-civ-spouse",
  "occupation": "",
  "relationship": "Wife",
  "race": "White",
  "sex": "Female",
  "capital-gain": 0,
  "capital-loss": 1902,
  "hours-per-week": 40,
  "native-country": "United-States",
  "salary": ">=50k"
}]'
```

#### Post as CSV

```bash
curl -X POST \
  http://localhost:5000/predict \
  -H 'Content-Type: text/csv' \
  -d 'age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,salary
49, Private,101320, Assoc-acdm,12.0, Married-civ-spouse,, Wife, White, Female,0,1902,40, United-States,>=50k'
```
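
The JSON request can also be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on `http://localhost:5000/predict`. The helper names `build_request` and `predict` are hypothetical, and the response is returned as plain text because the service returns a string.

```python
import json
import urllib.request

# Assumed endpoint: the local server started above on port 5000.
URL = "http://localhost:5000/predict"

def build_request(records, url=URL):
    """Build a POST request carrying a JSON list of records."""
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

def predict(records, url=URL, timeout=10):
    """Send the records to the server and return the response body as text."""
    request = build_request(records, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server running, calling `predict([{...}])` with a record shaped like the JSON example above should return the same kind of prediction string as the `curl` call.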