## BERT Embeddings Serverless Function
This notebook demonstrates how to deploy a pretrained BERT model as a serverless function that outputs embeddings for given text sequences. Embeddings are meaningful, contextual representations of text in the form of ndarrays, and they are frequently used as input to various learning tasks in NLP.

In [1]:
!/User/align_mlrun.sh

Both server & client are aligned (0.6.5).


## Running Remotely

In [6]:
from mlrun import import_function, auto_mount

In [10]:
# importing function from the hub
fn = import_function("hub://bert_embeddings").apply(auto_mount())

In [12]:
# deploying the function
addr = fn.deploy(project="function-marketplace")

> 2021-08-08 11:15:17,969 [info] Starting remote function deploy
2021-08-08 11:15:18  (info) Deploying function
2021-08-08 11:15:18  (info) Building
2021-08-08 11:15:18  (info) Staging files and preparing base images
2021-08-08 11:15:18  (info) Building processor image
2021-08-08 11:18:43  (info) Build complete
2021-08-08 11:19:15  (info) Function deploy complete
> 2021-08-08 11:19:17,341 [info] function deployed, address=default-tenant.app.dev39.lab.iguazeng.com:31819


In [13]:
# sentences to compute BERT embeddings for
event_data = ['the quick brown fox jumps over the lazy dog',
              'Hello I am Jacob',
              'Daniel visited Tel-Aviv last month']

In [15]:
import requests
import json
# sending a request to the function endpoint to get the sentences' embeddings
resp = requests.post(addr, json=json.dumps(event_data))

In [19]:
import pickle
output_embeddings = pickle.loads(resp.content)

In [20]:
print(f'embeddings per token shape: {output_embeddings[0].shape}, pooled embeddings shape: {output_embeddings[1].shape}')

embeddings per token shape: (3, 11, 768), pooled embeddings shape: (3, 768)


The first dimension of both outputs has size three because we passed in three sequences. The intermediate dimension of the first output is the maximal number of tokens across all input sequences; sequences with fewer tokens are padded with zero values.<br>
Note that the first output has an intermediate dimension of size 11, which corresponds to the number of tokens in the longest input sequence after the tokenizer adds two special tokens marking the beginning and end of the sequence.
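The shape arithmetic described above can be illustrated with a minimal sketch. This is not the deployed function's code: `count_tokens` and `pad_batch` are hypothetical helpers, the embeddings are random stand-ins, and token counting is approximated by whitespace splitting plus two special tokens. A real WordPiece tokenizer may split some words further, so the per-sentence counts for the shorter sentences could differ.

```python
import numpy as np

HIDDEN_SIZE = 768  # matches the last dimension printed above

sentences = [
    "the quick brown fox jumps over the lazy dog",
    "Hello I am Jacob",
    "Daniel visited Tel-Aviv last month",
]


def count_tokens(sentence):
    # Assumption: one token per whitespace-separated word,
    # plus two special tokens (sequence start and end).
    return len(sentence.split()) + 2


def pad_batch(per_sequence):
    """Stack (n_tokens, hidden) arrays into one zero-padded
    (batch, max_tokens, hidden) array."""
    max_tokens = max(arr.shape[0] for arr in per_sequence)
    hidden = per_sequence[0].shape[1]
    batch = np.zeros((len(per_sequence), max_tokens, hidden), dtype=np.float32)
    for i, arr in enumerate(per_sequence):
        batch[i, : arr.shape[0]] = arr  # rows past the sequence length stay zero
    return batch


rng = np.random.default_rng(0)
# Random stand-ins for real per-token embeddings.
per_sequence = [
    rng.standard_normal((count_tokens(s), HIDDEN_SIZE)).astype(np.float32)
    for s in sentences
]

token_embeddings = pad_batch(per_sequence)
print(token_embeddings.shape)  # (3, 11, 768)
```

Under these assumptions the first sentence has nine words, so it contributes 9 + 2 = 11 tokens and sets the intermediate dimension; the rows beyond each shorter sequence's length remain zero.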

## Running Locally 