## Fill Mask Inference using Online Endpoints

This sample shows how to deploy `fill-mask` type models to an online endpoint for inference.

### Task
The `fill-mask` task is about predicting masked words in a sentence. Models that perform this task have a good understanding of the language structure and of the domain of the dataset they were trained on. `fill-mask` models are typically used as foundation models for more scenario-oriented tasks such as `text-classification` or `token-classification`.
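
To make the task concrete, here is a toy sketch that fills a mask by counting which words appeared between the same left and right neighbours in a tiny corpus. This is deliberately not how `bert-base-uncased` works (it is a neural model); the corpus, the `train_context_counts` and `fill_mask` helpers, and the `<s>`/`</s>` boundary markers are all made up for illustration.

```python
from collections import Counter, defaultdict

MASK = "[MASK]"


def train_context_counts(sentences):
    """Count which word appears between each (left, right) neighbour pair."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for left, word, right in zip(tokens, tokens[1:], tokens[2:]):
            counts[(left, right)][word] += 1
    return counts


def fill_mask(counts, masked_sentence, top_k=3):
    """Return up to top_k candidate words for the single mask in the sentence."""
    tokens = ["<s>"] + masked_sentence.split() + ["</s>"]
    i = tokens.index(MASK)
    context = (tokens[i - 1], tokens[i + 1])
    return [word for word, _ in counts[context].most_common(top_k)]


# Hypothetical three-sentence corpus, only for illustration
corpus = [
    "she set the table for dinner",
    "he set the alarm for six",
    "they cleared the table for lunch",
]
counts = train_context_counts(corpus)
print(fill_mask(counts, "they cleared the [MASK] for lunch", top_k=2))
```

A real `fill-mask` model plays the same role as `fill_mask` here, but conditions on the whole sentence rather than on two neighbouring words.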

### Model
Models that can perform the `fill-mask` task are tagged with `task: fill-mask`. We will use the `bert-base-uncased` model in this notebook. If you opened this notebook from a specific model card, remember to replace the model name with the name of that model. If you don't find a model that suits your scenario or domain, you can discover and [import models from HuggingFace hub](../../import/import-model-from-huggingface.ipynb) and then use them for inference.

### Inference data
We will use the [book corpus](https://huggingface.co/datasets/bookcorpus) dataset. A copy of this dataset is available in the [book-corpus-dataset](./book-corpus-dataset/) folder. 

### Outline
* Set up pre-requisites.
* Pick a model to deploy.
* Prepare data for inference.
* Deploy the model for real-time inference.
* Test the endpoint.
* Clean up resources.

### 1. Set up pre-requisites
* Install dependencies
* Connect to AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace the `subscription_id`, `resource_group_name` and `workspace_name` values in the cell below with those of your own workspace.
* Connect to the `azureml-preview` system registry, which hosts the models.

In [1]:
from azure.ai.ml import MLClient
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
)

try:
    credential = DefaultAzureCredential()
    # check that the default credential can actually obtain a token
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # fall back to an interactive browser login if the default credential chain fails
    credential = InteractiveBrowserCredential()

workspace_ml_client = MLClient(
    credential,
    subscription_id="ea4faa5b-5e44-4236-91f6-5483d5b17d14",
    resource_group_name="amyharrispersonal",
    workspace_name="amyharris-canary",
)
# The models, fine tuning pipelines and environments are available in the AzureML system registry, "azureml-preview"
registry_ml_client = MLClient(credential, registry_name="azureml-preview")



### 2. Pick a model to deploy

Browse models in the Model Catalog in the AzureML Studio, filtering by the `fill-mask` task. In this example, we use the `bert-base-uncased` model. If you have opened this notebook for a different model, replace the model name and version accordingly. 

In [2]:
model_name = "bert-base-uncased"
model_version = "3"
foundation_model = registry_ml_client.models.get(model_name, model_version)
print(
    "\n\nUsing model name: {0}, version: {1}, id: {2} for inferencing".format(
        foundation_model.name, foundation_model.version, foundation_model.id
    )
)



Using model name: bert-base-uncased, version: 3, id: azureml://registries/azureml-preview/models/bert-base-uncased/versions/3 for inferencing


### 3. Prepare data for inference

A subset of the book corpus dataset is available in the [book-corpus-dataset](./book-corpus-dataset/) folder. The next few cells show basic data preparation:
* Visualize some data rows.
* Replace one word in each sentence with the model's mask token (`[MASK]` for `bert-base-uncased`) so that the model can predict the masked word.
* Save a few samples in a format that can be passed as input to the online inference endpoint.

In [3]:
# load the ./book-corpus-dataset/train_100.jsonl file into a pandas dataframe and show the first 5 rows
import pandas as pd

pd.set_option(
    "display.max_colwidth", 0
)  # set the max column width to 0 to display the full text
train_df = pd.read_json("./book-corpus-dataset/train_100.jsonl", lines=True)
train_df.head()

Unnamed: 0,text
0,"after receiving a positive reply , ella watched through the glass as zayn looked over the group of people , all of them shamefaced for some reason ."
1,"`` to me , that 's the essence of who you are ."
2,"he stood up and grabbed his robe out of his closet , his eyes searching for her in the darkness ."
3,"after thirty-four years of marriage to his father , his mother was still trying to civilize him ."
4,"`` then get to work , '' she said and pushed at his shoulders again ."


In [4]:
# Get the right mask token from huggingface
import urllib.request, json

with urllib.request.urlopen(f"https://huggingface.co/api/models/{model_name}") as url:
    data = json.load(url)
    mask_token = data["mask_token"]

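# take the value of the "text" column, replace one randomly chosen word with the mask token and save the result in the "masked_text" column
import random, os


def mask_random_word(text):
    # pick the token by position so that only that word is masked
    # (str.replace could instead hit a substring of an earlier word)
    tokens = text.split()
    tokens[random.randrange(len(tokens))] = mask_token
    return " ".join(tokens)


train_df["masked_text"] = train_df["text"].apply(mask_random_word)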
# save the train_df dataframe to a jsonl file in the ./book-corpus-dataset folder with the masked_ prefix
train_df.to_json(
    os.path.join(".", "book-corpus-dataset", "masked_train.jsonl"),
    orient="records",
    lines=True,
)
train_df.head()

Unnamed: 0,text,masked_text
0,"after receiving a positive reply , ella watched through the glass as zayn looked over the group of people , all of them shamefaced for some reason .","after receiving a positive reply , ella watched through [MASK] glass as zayn looked over the group of people , all of them shamefaced for some reason ."
1,"`` to me , that 's the essence of who you are .","`` to me , that 's the essence [MASK] who you are ."
2,"he stood up and grabbed his robe out of his closet , his eyes searching for her in the darkness .","he stood up [MASK] grabbed his robe out of his closet , his eyes searching for her in the darkness ."
3,"after thirty-four years of marriage to his father , his mother was still trying to civilize him .","after thirty-four years of marriage to his [MASK] , his mother was still trying to civilize him ."
4,"`` then get to work , '' she said and pushed at his shoulders again .","`` then get to work , '' she said and pushed at his shoulders again [MASK]"
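
When masking a word, it matters whether the word is selected by position or by string search: replacing the first textual match of a randomly chosen word can hit a substring of an earlier word. The sketch below contrasts the two approaches on a made-up fragment; `mask_one_token` is a hypothetical helper, and the seeded `random.Random` is there only to make the run repeatable.

```python
import random

MASK_TOKEN = "[MASK]"  # the mask token shown in the output above


def mask_one_token(sentence, rng, mask_token=MASK_TOKEN):
    """Replace exactly one whitespace-delimited token, chosen by position.

    Returns the masked sentence and the word that was hidden.
    """
    tokens = sentence.split()
    index = rng.randrange(len(tokens))  # raises ValueError on an empty sentence
    original = tokens[index]
    tokens[index] = mask_token
    return " ".join(tokens), original


sentence = "after receiving a positive reply"

# Substring replacement: asking to mask the word "a" hits the "a" inside "after"
print(sentence.replace("a", MASK_TOKEN, 1))

# Position-based replacement always masks a whole token
rng = random.Random(0)
masked, answer = mask_one_token(sentence, rng)
print(masked, "| hidden word:", answer)
```

Returning the hidden word alongside the masked sentence also gives a ready-made ground truth for comparing against the model's prediction later.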


### 4. Deploy the model to an online endpoint
Online endpoints give a durable REST API that can be used to integrate with applications that need to use the model.
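
As a rough illustration of what calling that REST API looks like from an application, the sketch below builds (but does not send) a scoring request using only the standard library. It assumes key-based authentication, where the key is passed as a bearer token and the optional `azureml-model-deployment` header routes the call to a specific deployment. The URL, the key placeholder, the example sentence and the `build_scoring_request` helper are all illustrative; the payload shape mirrors the one used later in this notebook.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, payload, deployment_name=None):
    """Build a POST request for an online endpoint's scoring URI."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        # Route the request to one deployment instead of using the traffic split
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


request = build_scoring_request(
    scoring_uri="https://my-endpoint.my-region.inference.ml.azure.com/score",  # placeholder
    api_key="<ENDPOINT_KEY>",  # placeholder, never hard-code a real key
    payload={"inputs": {"input_string": ["paris is the [MASK] of france ."]}},
    deployment_name="demo",
)
print(request.get_method(), request.full_url)

# Once the endpoint exists, the request could be sent with:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

In the SDK, the endpoint's actual scoring URI is available on the endpoint object as `scoring_uri`, and the keys can typically be fetched with `online_endpoints.get_keys`.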

In [5]:
import time
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    OnlineRequestSettings,
)

# Create online endpoint - endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name
timestamp = int(time.time())
online_endpoint_name = "fill-mask-" + str(timestamp)
# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="Online endpoint for " + foundation_model.name + ", for fill-mask task",
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()

In [6]:
# create a deployment
demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=foundation_model.id,
    instance_type="Standard_DS2_v2",
    instance_count=1,
    request_settings=OnlineRequestSettings(
        request_timeout_ms=60000,
    ),
)
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
endpoint.traffic = {"demo": 100}
workspace_ml_client.begin_create_or_update(endpoint).result()

Instance type Standard_DS2_v2 may be too small for compute resources. Minimum recommended compute SKU is Standard_DS3_v2 for general purpose endpoints. Learn more about SKUs here: https://learn.microsoft.com/en-us/azure/machine-learning/referencemanaged-online-endpoints-vm-sku-list
Check: endpoint fill-mask-1684194854 exists
data_collector is not a known attribute of class <class 'azure.ai.ml._restclient.v2022_02_01_preview.models._models_py3.ManagedOnlineDeployment'> and will be ignored



ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://fill-mask-1684194854.eastus2euap.inference.ml.azure.com/score', 'openapi_uri': 'https://fill-mask-1684194854.eastus2euap.inference.ml.azure.com/swagger.json', 'name': 'fill-mask-1684194854', 'description': 'Online endpoint for bert-base-uncased, for fill-mask task', 'tags': {}, 'properties': {'azureml.onlineendpointid': '/subscriptions/ea4faa5b-5e44-4236-91f6-5483d5b17d14/resourcegroups/amyharrispersonal/providers/microsoft.machinelearningservices/workspaces/amyharris-canary/onlineendpoints/fill-mask-1684194854', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/ea4faa5b-5e44-4236-91f6-5483d5b17d14/providers/Microsoft.MachineLearningServices/locations/eastus2euap/mfeOperationsStatus/oe:c76e6446-545b-4141-80f9-e8ad59c471f2:f3b6f1cb-48cb-4cdf-a06c-8ba81b3901a2?api-version=2022-02-01-preview'}, 'print_as_yaml': True, 'id': '/subscriptions/ea4faa5b-5e44-

### 5. Test the endpoint with sample data

We will fetch a sample from the masked dataset and submit it to the online endpoint for inference. We will then display the predicted word alongside the ground truth sentence.

In [7]:
import json

# read the ./book-corpus-dataset/masked_train.jsonl file into a pandas dataframe
df = pd.read_json("./book-corpus-dataset/masked_train.jsonl", lines=True)
# escape single and double quotes in the masked_text column
df["masked_text"] = df["masked_text"].str.replace("'", "\\'").str.replace('"', '\\"')
# pick 1 random row
sample_df = df.sample(1)
# create a json object with the key as "inputs" and value as a list of values from the masked_text column of the sample_df dataframe
test_json = {"inputs": {"input_string": sample_df["masked_text"].tolist()}}
# save the json object to a file named sample_score.json in the ./book-corpus-dataset folder
with open(os.path.join(".", "book-corpus-dataset", "sample_score.json"), "w") as f:
    json.dump(test_json, f)
sample_df.head()

Unnamed: 0,text,masked_text
63,at this rate she 'd never figure out if she had enough tables set up for all the wedding guests .,at this rate she \'d never figure out if she had enough tables [MASK] up for all the wedding guests .
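
A note on quoting: `json.dump` already escapes whatever needs escaping when it serializes the payload, so text that is escaped by hand beforehand reaches the endpoint with a literal backslash in it (visible as `\'d` in the row above). The sketch below demonstrates this with a plain round trip through `json`; whether the extra backslash changes the model's prediction is not something this notebook measures.

```python
import json

masked_text = (
    "at this rate she 'd never figure out if she had enough tables "
    "[MASK] up for all the wedding guests ."
)


def round_trip(text):
    """Serialize the request payload and read the sentence back out of it."""
    payload = {"inputs": {"input_string": [text]}}
    return json.loads(json.dumps(payload))["inputs"]["input_string"][0]


# Without manual escaping, the endpoint receives the sentence unchanged
print(round_trip(masked_text) == masked_text)

# With manual escaping, the backslash becomes part of the sentence
escaped = masked_text.replace("'", "\\'")
print(round_trip(escaped) == masked_text)
```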


In [8]:
# score the sample_score.json file using the online endpoint with the azureml endpoint invoke method
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file="./book-corpus-dataset/sample_score.json",
)
print("raw response: \n", response, "\n")
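# convert the json response to a pandas dataframe
# (wrapped in StringIO because newer pandas versions deprecate passing a literal JSON string)
from io import StringIO

response_df = pd.read_json(StringIO(response))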
response_df.head()

raw response: 
 [{"0": "set"}] 



Unnamed: 0,0
0,set
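
Since the response holds only the predicted word, one way to check it is to substitute the word back into the masked sentence and compare the result with the original text. The sketch below assumes the response shape shown above (a JSON list with one object per input, whose single value is the prediction); `fill_prediction` is a hypothetical helper, and the response string is hard-coded so the example runs without an endpoint.

```python
import json


def fill_prediction(masked_text, raw_response, mask_token="[MASK]"):
    """Put the predicted word back in place of the mask token."""
    predictions = json.loads(raw_response)  # e.g. [{"0": "set"}]
    predicted_word = next(iter(predictions[0].values()))
    return masked_text.replace(mask_token, predicted_word, 1), predicted_word


masked = (
    "at this rate she 'd never figure out if she had enough tables "
    "[MASK] up for all the wedding guests ."
)
truth = (
    "at this rate she 'd never figure out if she had enough tables "
    "set up for all the wedding guests ."
)

filled, word = fill_prediction(masked, '[{"0": "set"}]')
print("predicted word:", word)
print("matches ground truth:", filled == truth)
```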


In [9]:
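# compare the predicted word with the ground truth sentence
# the response contains a single column holding the predicted word (see the raw response above),
# so select it by position rather than by a "sequence" or "score" column name
compare_df = pd.DataFrame(
    {
        "ground_truth_sequence": sample_df["text"].tolist(),
        "masked_sequence": sample_df["masked_text"].tolist(),
        "predicted_word": response_df.iloc[:, 0].tolist(),
    }
)
compare_df.head()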

### 6. Delete the online endpoint
Don't forget to delete the online endpoint; otherwise the billing meter keeps running for the compute used by the endpoint.

In [None]:
workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()
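
If the scoring steps are ever moved into a script, a `try`/`finally` wrapper is one way to make sure the delete call runs even when an earlier step fails. The sketch below shows the pattern with stand-in callables so that it runs without Azure; `temporary_endpoint` is a hypothetical helper, and in a real script `create` and `delete` would wrap the endpoint creation and `begin_delete` calls used in this notebook.

```python
from contextlib import contextmanager


@contextmanager
def temporary_endpoint(create, delete):
    """Run `create`, yield its result, and always run `delete` afterwards."""
    resource = create()
    try:
        yield resource
    finally:
        delete(resource)


# Stand-in callables that only record what happened
events = []
try:
    with temporary_endpoint(
        create=lambda: "fill-mask-demo",
        delete=lambda name: events.append(f"deleted {name}"),
    ) as name:
        events.append(f"scoring against {name}")
        raise RuntimeError("simulated scoring failure")
except RuntimeError:
    pass

print(events)
```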