## BERT Based Sentiment Analysis Model Server
The model used here was trained with transfer learning: we took the pretrained BERT model from Hugging Face Transformers and trained it further on a custom dataset of reviews. This yields a sentiment analysis model that builds on BERT's prior knowledge.
The model server receives a list of texts and returns a list of labels, one prediction per text.
The labels express the writer's sentiment towards the topic of the text:
0 for negative sentiment, 1 for neutral, and 2 for positive.
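
The label convention above can be captured in a small lookup helper. This is only an illustrative sketch: `LABEL_NAMES` and `decode_labels` are hypothetical names and are not part of the model server.

```python
# Hypothetical helper: map the server's integer labels to readable names.
LABEL_NAMES = {0: "negative", 1: "neutral", 2: "positive"}


def decode_labels(labels):
    """Translate a list of integer predictions into sentiment names."""
    return [LABEL_NAMES[label] for label in labels]
```

For example, a server response of `[0, 2]` would decode to `["negative", "positive"]`.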

The model file (~430 MB) can be downloaded to your local environment from: https://iguazio-sample-data.s3.amazonaws.com/models/model.pt

In [33]:
# nuclio: ignore
import nuclio

### function code 
Below is the model architecture, implemented in PyTorch; its main component is BERT.

In [34]:
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

PRETRAINED_MODEL = 'bert-base-cased'
tokenizer = BertTokenizer.from_pretrained(PRETRAINED_MODEL)

class BertSentimentClassifier(nn.Module):
    def __init__(self, n_classes):
        super().__init__()
        # Pretrained BERT encoder followed by a dropout + linear classification head
        self.bert = BertModel.from_pretrained(PRETRAINED_MODEL)
        self.dropout = nn.Dropout(p=0.2)
        self.out_linear = nn.Linear(self.bert.config.hidden_size, n_classes)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, input_ids, attention_mask):
        bert_out = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask
        )
        # Index 1 is the pooled output; indexing (rather than tuple unpacking)
        # works whether BertModel returns a plain tuple or a ModelOutput object
        pooled_out = bert_out[1]
        out = self.dropout(pooled_out)
        out = self.out_linear(out)
        return self.softmax(out)

#### serving interface
The load function essentially instantiates our custom model with the architecture defined above and loads the trained weights into it.

In [35]:
import mlrun

class SentimentClassifierServing(mlrun.serving.V2ModelServer):
    def load(self):
        # Fetch the .pt weights file and load it into the architecture defined above
        model_file, _ = self.get_model('.pt')
        device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
        model = BertSentimentClassifier(n_classes=3)
        model.load_state_dict(torch.load(model_file, map_location=device))
        model.eval()
        self.model = model

    def predict(self, body):
        try:
            instances = body['inputs']
            enc = tokenizer.batch_encode_plus(instances, return_tensors='pt', pad_to_max_length=True)
            # Inference only, so skip gradient tracking
            with torch.no_grad():
                outputs = self.model(input_ids=enc['input_ids'], attention_mask=enc['attention_mask'])
            # The predicted label is the index of the highest class probability
            _, preds = torch.max(outputs, dim=1)
            return preds.cpu().tolist()
        except Exception as e:
            raise Exception("Failed to predict %s" % e) from e
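
To make the last step of `predict` concrete: the model returns one row of class probabilities per input text, and `torch.max(outputs, dim=1)` keeps the index of the largest entry in each row. The pure-Python sketch below imitates that step on made-up numbers; `labels_from_probabilities` is a hypothetical name, and the probabilities are illustrative rather than real model output.

```python
def labels_from_probabilities(rows):
    """Return, for each row of class probabilities, the index of the largest one."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


# Made-up probabilities for two texts over (negative, neutral, positive).
probs = [
    [0.05, 0.15, 0.80],
    [0.70, 0.20, 0.10],
]
print(labels_from_probabilities(probs))  # prints [2, 0]
```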

In [36]:
# nuclio: end-code

### mlconfig

In [37]:
from mlrun import mlconf
import os

mlconf.dbpath = mlconf.dbpath or 'http://mlrun-api:8080'
mlconf.artifact_path = mlconf.artifact_path or f'{os.environ["HOME"]}/artifacts'

### test locally
You may change `saved_models_directory` to point to the directory where the `model.pt` file is saved.

In [38]:
# Run this to download the pre-trained model to your `models` directory
import os
model_location = 'https://iguazio-sample-data.s3.amazonaws.com/models/model.pt'
saved_models_directory = os.path.join(os.path.abspath('./'), 'models')

# Create paths
os.makedirs(saved_models_directory, exist_ok=True)
model_filepath = os.path.join(saved_models_directory, os.path.basename(model_location))
!wget -nc -P {saved_models_directory} {model_location} 

File ‘/User/myfunctions/functions/sentiment_analysis_serving/models/model.pt’ already there; not retrieving.
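
Where `wget` is not available, the same "download only if missing" behaviour of `wget -nc` can be approximated with the standard library. This is a minimal sketch under that assumption; `download_if_missing` is a hypothetical helper and is not part of the original notebook.

```python
import os
import urllib.request


def download_if_missing(url, directory):
    """Download `url` into `directory` unless a file with that name already exists."""
    os.makedirs(directory, exist_ok=True)
    target = os.path.join(directory, os.path.basename(url))
    if not os.path.exists(target):
        # Only hit the network when the file is not already on disk
        urllib.request.urlretrieve(url, target)
    return target
```

Calling `download_if_missing(model_location, saved_models_directory)` would return the same path that the cell above stores in `model_filepath`.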



In [41]:
import mlrun
models_path = model_filepath
fn = mlrun.code_to_function('my_server', kind='serving')
# set the topology/router and add models
graph = fn.set_topology("router")
fn.add_model("model1", class_name='SentimentClassifierServing', model_path=models_path)
# create and use the graph simulator
server = fn.to_mock_server()
fn.export("function.yaml")

> 2021-03-15 12:59:59,405 [info] model model1 was loaded
> 2021-03-15 12:59:59,407 [info] Loaded ['model1']
> 2021-03-15 12:59:59,424 [info] function spec saved to path: function.yaml


<mlrun.runtimes.serving.ServingRuntime at 0x7f6bdc183490>

#### test 1
Here we test a pretty straightforward example for positive sentiment.

In [40]:
output = server.test("/v2/models/model1/infer", {"inputs": [
    'I had a pleasure to work with such dedicated team. Looking forward to '
    'cooperate with each and every one of them again.']})

assert output['outputs'] == [2]
print(output['outputs'])

[2]


#### test 2
Now we will test a couple more examples. These are arguably harder because they contain misleading words that, on their own, express the opposite sentiment to that of the full text.

In [24]:
output2 = server.test("/v2/models/model1/infer",{"inputs":['This app is amazingly useless.',
                     'As much as I hate to admit it, the new added feature is surprisingly user friendly.']})

print(output2['outputs'])
assert output['outputs'] == [2]
assert output2['outputs'] == [0,2]

[0, 2]


### remote activation
Create a function object with custom specification.

In [25]:
from mlrun import new_model_server, mount_v3io
import requests

In [26]:
fn = new_model_server('sentiment-analysis-serving',
                      model_class='SentimentClassifierServing')
fn.spec.description = "BERT based sentiment classification model"
fn.metadata.categories = ['serving', 'NLP', 'BERT', 'sentiment analysis']
fn.metadata.labels = {'author': 'roye', 'framework': "pytorch"}
fn.spec.max_replicas = 1
fn.export("function.yaml")

fn.add_model('bert_classifier_v1', model_filepath)

> 2021-03-14 14:49:20,494 [info] function spec saved to path: function.yaml


<mlrun.runtimes.function.RemoteRuntime at 0x7f6bd013f110>

In [31]:
from mlrun import code_to_function, mount_v3io
fn = code_to_function('sentiment-analysis-serving', kind='serving')
fn.add_model('m1', model_path=model_filepath, class_name='SentimentClassifierServing')
fn.apply(mount_v3io())
fn.export("function.yaml")

> 2021-03-14 15:04:48,376 [info] function spec saved to path: function.yaml


<mlrun.runtimes.serving.ServingRuntime at 0x7f6be6c6e150>

In [32]:
addr = fn.deploy(project='nlp-servers')


> 2021-03-14 15:04:49,573 [info] Starting remote function deploy
2021-03-14 15:04:49  (info) Deploying function
2021-03-14 15:04:49  (info) Building
2021-03-14 15:04:50  (info) Staging files and preparing base images
2021-03-14 15:04:50  (info) Building processor image
2021-03-14 15:04:53  (info) Build complete
Failed to deploy. Details:
Downloading: 100%|██████████| 213k/213k [00:01<00:00, 125kB/s]  
Downloading: 100%|██████████| 433/433 [00:00<00:00, 277kB/s]
Downloading: 100%|██████████| 436M/436M [00:08<00:00, 54.1MB/s] 
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/v2_serving.py", line 88, in _load_and_update_state
    self.load()
  File "/opt/nuclio/bert_sentiment_analysis_serving.py", line 34, in load
    model.load_state_dict(torch.load(model_file, map_location=device))
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
    self.__class__.__name__, "
	".join(error_msgs)))
Ru

RunError: cannot deploy Failed to deploy. Details:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/v2_serving.py", line 88, in _load_and_update_state
    self.load()
  File "/opt/nuclio/bert_sentiment_analysis_serving.py", line 34, in load
    model.load_state_dict(torch.load(model_file, map_location=device))
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
    self.__class__.__name__, "
	".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertSentimentClassifier:
	Missing key(s) in state_dict: "bert.embeddings.position_ids". 
 [worker_id="0"]
Exception raised while running init_context [worker_id="0"]
Caught unhandled exception while initializing [err="failed to load model m1, Error(s) in loading state_dict for BertSentimentClassifier:
	Missing key(s) in state_dict: "bert.embeddings.position_ids". " || traceback="Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/v2_serving.py", line 88, in _load_and_update_state
    self.load()
  File "/opt/nuclio/bert_sentiment_analysis_serving.py", line 34, in load
    model.load_state_dict(torch.load(model_file, map_location=device))
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1052, in load_state_dict
    self.__class__.__name__, "
	".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertSentimentClassifier:
	Missing key(s) in state_dict: "bert.embeddings.position_ids". 

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/nuclio/_nuclio_wrapper.py", line 350, in run_wrapper
    args.trigger_name)
  File "/opt/nuclio/_nuclio_wrapper.py", line 80, in __init__
    getattr(entrypoint_module, 'init_context')(self._context)
  File "/opt/nuclio/bert_sentiment_analysis_serving.py", line 50, in init_context
    nuclio_init_hook(context, globals(), 'serving_v2')
  File "/opt/conda/lib/python3.7/site-packages/mlrun/runtimes/nuclio.py", line 31, in nuclio_init_hook
    v2_serving_init(context, data)
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/server.py", line 234, in v2_serving_init
    serving_handler = server.init(context, namespace or get_caller_globals())
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/server.py", line 153, in init
    self.graph.init_object(context, namespace, self.load_mode, reset=True)
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/states.py", line 481, in init_object
    route.init_object(context, namespace, mode, reset=reset)
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/states.py", line 354, in init_object
    self._post_init(mode)
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/states.py", line 379, in _post_init
    self._object.post_init(mode)
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/v2_serving.py", line 104, in post_init
    self._load_and_update_state()
  File "/opt/conda/lib/python3.7/site-packages/mlrun/serving/v2_serving.py", line 92, in _load_and_update_state
    raise RuntimeError(f"failed to load model {self.name}, {exc}")
RuntimeError: failed to load model m1, Error(s) in loading state_dict for BertSentimentClassifier:
	Missing key(s) in state_dict: "bert.embeddings.position_ids". 
" || worker_id="0"]

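The traceback above reports that the checkpoint has no entry for `bert.embeddings.position_ids`. One plausible cause, stated here as an assumption rather than a confirmed diagnosis, is that `model.pt` was saved with a different `transformers` version than the one installed in the deployed image. By default, `load_state_dict` runs in strict mode and fails when the keys expected by the model and the keys stored in the checkpoint differ. The framework-free sketch below shows that comparison; `diff_state_dict_keys` and the key lists are hypothetical and heavily shortened.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) keys, mirroring a strict state-dict check."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # expected by the model, absent from the file
    unexpected = sorted(checkpoint_keys - model_keys)  # present in the file, unknown to the model
    return missing, unexpected


# Illustrative key names only; a real BERT state dict has many more entries.
model_keys = ["bert.embeddings.position_ids", "out_linear.weight", "out_linear.bias"]
checkpoint_keys = ["out_linear.weight", "out_linear.bias"]
missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print(missing, unexpected)
```

Possible remedies include pinning the deployed image to the `transformers` version used for training, or passing `strict=False` to `load_state_dict` after confirming that this is the only missing key. Neither was verified against this deployment.
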
#### remote test
We will send a sentence to the model server via an HTTP request. Note that the URL below uses the V2 model server notation, which directs our event to the `predict` function of model `m1`.

In [None]:
event_data = {'inputs': ['I had a somewhat ok experience buying at that store.']}

# Pass the dict directly: `json=` serializes it, so calling json.dumps first would double-encode it
resp = requests.put(addr + '/v2/models/m1/infer', json=event_data)

In [None]:
print(resp.text)

The model server classified the sentence as neutral. 
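
For environments without `requests`, the same kind of request can be assembled with the standard library. The sketch below only builds the request object and does not send it; `build_infer_request` is a hypothetical helper, the URL layout follows the `/v2/models/<name>/infer` pattern used in the local tests, and the default `PUT` method is an assumption that mirrors the `requests.put` call in this notebook.

```python
import json
import urllib.request


def build_infer_request(addr, model_name, texts, method="PUT"):
    """Build (but do not send) a V2 inference request for a list of texts."""
    url = f"{addr.rstrip('/')}/v2/models/{model_name}/infer"
    # The serving class reads the texts from the "inputs" field of the body
    body = json.dumps({"inputs": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to a deployed function.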