## Text Generation Inference & Content Moderation using Online Endpoints and Azure Content Safety

This sample shows how to deploy `text-generation` models to an online endpoint for inference, and how to moderate the response with Azure Content Safety before giving the generated content to users.

### Task
`text-generation` is the task of producing new text. These models can, for example, complete unfinished text or paraphrase it. Common applications of text generation include code generation and story generation.

### Model
Models that can perform the `text-generation` task are tagged with `task: text-generation`. We will use the `gpt2` model in this notebook. If you opened this notebook from a specific model card, remember to replace the specific model name. If you don't find a model that suits your scenario or domain, you can discover and [import models from HuggingFace hub](../../import/import-model-from-huggingface.ipynb) and then use them for inference. 

### Inference data
We will use the [book corpus](https://huggingface.co/datasets/bookcorpus) dataset. A copy of this dataset is available in the [book-corpus-dataset](./book-corpus-dataset/) folder.

### Azure Content Safety
We will use [Azure Content Safety](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/content-moderator/#faq) to validate that the content is safe.

### Outline
* Set up pre-requisites.
* Pick a model to deploy.
* Prepare data for inference. 
* Deploy the model for real time inference.
* Test the endpoint.
* Filter content based on the response.
* Clean up resources.

### 1a. Set up pre-requisites for Azure Machine Learning
* Install dependencies
* Connect to the AzureML Workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk). Replace `<WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` below.
* Connect to the `azureml-preview` system registry

In [6]:
from azure.ai.ml import MLClient
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
    ClientSecretCredential,
)
from azure.ai.ml.entities import AmlCompute
import time

try:
    credential = DefaultAzureCredential()
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    credential = InteractiveBrowserCredential()

workspace_ml_client = MLClient(
    credential,
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)
# the models, fine tuning pipelines and environments are available in the AzureML system registry, "azureml-preview"
registry_ml_client = MLClient(credential, registry_name="azureml-preview")

### 1b. Set up pre-requisites for Azure Content Safety
1. Log in to the [Azure portal](https://ms.portal.azure.com/?microsoft_azure_marketplace_ItemHideKey=microsoft_azure_cognitiveservices_contentsafety&feature.canmodifystamps=true&Microsoft_Azure_ProjectOxford=stage1#create/Microsoft.CognitiveServicesContentSafety) and apply for an Azure AI Content Safety resource. Two region options are offered: West Europe and East US.
2. After your Azure AI Content Safety resource is successfully created, you will receive an API key, and you can use the Azure AI Content Safety APIs by referring to this [product document](https://aka.ms/acs-doc).
3. You can also try the service in the [interactive Studio](https://aka.ms/acsstudio).


### 2. Pick a model to deploy

Browse models in the Model Catalog in the AzureML Studio, filtering by the `text-generation` task. In this example, we use the `gpt2` model. If you have opened this notebook for a different model, replace the model name and version accordingly. 

In [7]:
model_name = "gpt2"
model_version = "3"
foundation_model = registry_ml_client.models.get(model_name, model_version)
print(
    "\n\nUsing model name: {0}, version: {1}, id: {2} for inferencing".format(
        foundation_model.name, foundation_model.version, foundation_model.id
    )
)



Using model name: gpt2, version: 3, id: azureml://registries/azureml-preview/models/gpt2/versions/3 for inferencing


### 3. Prepare data for inference.

A copy of the book corpus dataset is available in the [book-corpus-dataset](./book-corpus-dataset/) folder. The next few cells show basic data preparation:
* Visualize some data rows
* Save a few samples in a format that can be passed as input to the online-inference endpoint.

In [8]:
# load the ./book-corpus-dataset/train.jsonl file into a pandas dataframe and show the first 2 rows
import pandas as pd

pd.set_option(
    "display.max_colwidth", 0
)  # set the max column width to 0 to display the full text
train_df = pd.read_json("./book-corpus-dataset/train.jsonl", lines=True)
train_df.head(2)

Unnamed: 0,text
0,"usually , he would be tearing around the living room , playing with his toys ."
1,but just one look at a minion sent him practically catatonic .


### 4. Deploy the model to an online endpoint
Online endpoints give a durable REST API that can be used to integrate with applications that need to use the model.

In [9]:
import time, sys
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    OnlineRequestSettings,
)

# Create online endpoint - endpoint names need to be unique in a region, hence using timestamp to create unique endpoint name
timestamp = int(time.time())
online_endpoint_name = "text-generation-" + str(timestamp)
# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="Online endpoint for "
    + foundation_model.name
    + ", for text-generation task",
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()

In [10]:
# create a deployment
demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=foundation_model.id,
    instance_type="Standard_DS2_v2",
    instance_count=1,
    request_settings=OnlineRequestSettings(
        request_timeout_ms=60000,
    ),
)
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
endpoint.traffic = {"demo": 100}
workspace_ml_client.begin_create_or_update(endpoint).result()

Instance type Standard_DS2_v2 may be too small for compute resources. Minimum recommended compute SKU is Standard_DS3_v2 for general purpose endpoints. Learn more about SKUs here: https://learn.microsoft.com/en-us/azure/machine-learning/referencemanaged-online-endpoints-vm-sku-list
Check: endpoint text-generation-1683912751 exists
data_collector is not a known attribute of class <class 'azure.ai.ml._restclient.v2022_02_01_preview.models._models_py3.ManagedOnlineDeployment'> and will be ignored


...
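
Because the endpoint exposes a REST API, it can also be called without the SDK. The following is a minimal sketch, assuming key authentication (the endpoint above is created with `auth_mode="key"`) and the payload shape used in step 5. The helper name `build_scoring_request`, the placeholder URL, and the placeholder key are hypothetical and introduced only for illustration; the sketch builds the request but does not send it.

```python
import json


def build_scoring_request(scoring_uri, api_key, texts, deployment_name=None):
    """Build the URL, headers and JSON body for a key-authenticated scoring call."""
    headers = {
        "Content-Type": "application/json",
        # Key-based auth: the endpoint key is passed as a bearer token
        "Authorization": "Bearer " + api_key,
    }
    if deployment_name:
        # Optional header that routes the request to one specific deployment
        headers["azureml-model-deployment"] = deployment_name
    # Same payload shape as the sample_score.json file written in step 5
    body = json.dumps({"inputs": {"input_string": list(texts)}})
    return scoring_uri, headers, body


# Placeholder values (assumptions): take the real scoring URI and key
# from your own endpoint, for example from the endpoint page in AzureML Studio.
url, headers, body = build_scoring_request(
    "https://<ENDPOINT_NAME>.<REGION>.inference.ml.azure.com/score",
    "<API_KEY>",
    ["once upon a time"],
    deployment_name="demo",
)
print(url)
print(body)
```

To send the request, pass the three values to an HTTP client, for example `requests.post(url, headers=headers, data=body)`.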

### 5. Test the endpoint with sample data

We will fetch a sample from the dataset and submit it to the online endpoint for inference. We will then display the generated text.

In [11]:
import json
import os

# read the ./book-corpus-dataset/train.jsonl file into a pandas dataframe
df = pd.read_json("./book-corpus-dataset/train.jsonl", lines=True)
# escape single and double quotes in the text column
df["text"] = df["text"].str.replace("'", "\\'").str.replace('"', '\\"')
# pick 1 random row
sample_df = df.sample(1)
# build the request payload: "inputs" holds an "input_string" list with the sampled text
test_json = {"inputs": {"input_string": sample_df["text"].tolist()}}
# save the json object to a file named sample_score.json in the ./book-corpus-dataset folder
with open(os.path.join(".", "book-corpus-dataset", "sample_score.json"), "w") as f:
    json.dump(test_json, f)
sample_df.head()


Unnamed: 0,text
57104,`` so you \'re going to continue to be cavalier about this ? \'\'
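
As a side note on the request file, `json.dump` escapes quotes on its own, so a prompt containing quotes survives serialization without manual escaping. The sketch below illustrates this with the payload shape used in this notebook; `build_payload` is a hypothetical helper name and the prompt is made up for the example.

```python
import json


def build_payload(texts):
    """Wrap a list of prompts in the request shape used in this notebook."""
    return {"inputs": {"input_string": list(texts)}}


# A made-up prompt containing both single and double quotes
prompt = 'she said "it\'s fine"'
payload = build_payload([prompt])
serialized = json.dumps(payload)

# The serializer escapes the double quotes, and parsing restores the original text
round_tripped = json.loads(serialized)["inputs"]["input_string"][0]
print(serialized)
print(round_tripped == prompt)
```

With manual escaping, as in the cell above, the backslashes become part of the prompt itself, which is why they appear in the sampled row and in the model output.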


...

In [76]:
import json
# score the sample_score.json file using the online endpoint with the azureml endpoint invoke method
generated_response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file="./book-corpus-dataset/sample_score.json",
)

print("raw response: \n", generated_response , "\n")

raw response: 
 [{"0": "`` so you \\'re going to continue to be cavalier about this ? \\'\\' \"\n\nThis comment could refer to this one. (Note that the \"Cavalry'' does not have to be a noun, at least in the senses"}] 



In [144]:
# Clean and Parse result
cleaned_response_data = json.loads(generated_response)[0]["0"]
print("cleaned response: \n", cleaned_response_data , "\n")

cleaned response: 
 `` so you \'re going to continue to be cavalier about this ? \'\' "

This comment could refer to this one. (Note that the "Cavalry'' does not have to be a noun, at least in the senses 
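
The parsing step above assumes the response is a JSON list whose first element has a `"0"` key. A slightly more defensive version could look like the sketch below; `extract_generated_text` is a hypothetical helper, and the response shape is taken from the raw response printed earlier rather than from a documented contract.

```python
import json


def extract_generated_text(raw_response):
    """Return the generated string from a response shaped like '[{"0": "..."}]'."""
    parsed = json.loads(raw_response)
    # Validate the shape before indexing so failures produce a readable error
    if not isinstance(parsed, list) or not parsed or not isinstance(parsed[0], dict):
        raise ValueError("unexpected response shape: {!r}".format(parsed))
    if "0" not in parsed[0]:
        raise ValueError("missing key '0' in response: {!r}".format(parsed[0]))
    return parsed[0]["0"]


# Illustrative input with the same shape as the raw response above
example_response = '[{"0": "hello world"}]'
print(extract_generated_text(example_response))
```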



### 6. Azure Content Safety (ACS) Content Moderation
Use Azure Content Safety for content moderation. We will run the generated text through the Azure Content Safety endpoint and, if it is flagged with medium or high severity for [Hate, Self Harm, Sexual, or Violence](https://aka.ms/acs-doc), return a generic message telling the user that the content was filtered.



### Call ACS Text API with a sample request 
1. Go to the Azure Content Safety resource you created in Step 1b. The key can be found in the Keys and Endpoint section in the left pane. 
1. Find your Resource Endpoint URL in your Azure Portal in the **Resource Overview** page under the **Endpoint** field. 
1. Replace the `<ENDPOINT>` placeholder with your Resource Endpoint URL.
1. Replace the `<SUBSCRIPTION_KEY>` placeholder with your subscription key; it is sent in the `Ocp-Apim-Subscription-Key` header.
1. Change the body of the request to whatever string of text you'd like to analyze.

> **NOTE:**
>
> The samples may contain offensive content; user discretion is advised.

In [145]:
import requests


def analyze_text(input_text):
    endpoint = "<ENDPOINT>"
    subscription_key = "<SUBSCRIPTION_KEY>"

    # Build request
    url = endpoint + "/contentsafety/text:analyze?api-version=2023-04-30-preview"
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/json",
    }
    data = {
        "text": input_text,
        "categories": ["Hate", "Sexual", "SelfHarm", "Violence"],
    }

    # Analyze text
    try:
        acs_response = requests.post(url, headers=headers, json=data)
        # Treat HTTP error statuses (4xx/5xx) as failures too
        acs_response.raise_for_status()
    except requests.exceptions.RequestException as e:
        # requests exceptions have no `.error` attribute, so print the exception itself
        print("Error calling the Text API: {}".format(e))
        return

    json_acs_content = acs_response.json()
    print("Text API response: \n", json.dumps(json_acs_content, indent=4))

    # Block the text if any category reports a severity of 2 or higher
    for result_name in json_acs_content:
        if result_name == "blocklistsMatchResults":
            continue
        if json_acs_content[result_name]["severity"] >= 2:
            category = json_acs_content[result_name]["category"]
            return "Text Generated for User: This content was filtered for {0}, as the severity of the response was medium or high".format(category)
    return "Text Generated for User: {0}".format(input_text)

In [146]:
analyze_text(cleaned_response_data)

Text API response: 
 {
    "blocklistsMatchResults": [],
    "hateResult": {
        "category": "Hate",
        "severity": 0
    },
    "selfHarmResult": {
        "category": "SelfHarm",
        "severity": 0
    },
    "sexualResult": {
        "category": "Sexual",
        "severity": 0
    },
    "violenceResult": {
        "category": "Violence",
        "severity": 0
    }
}


'Text Generated for User: `` so you \\\'re going to continue to be cavalier about this ? \\\'\\\' "\n\nThis comment could refer to this one. (Note that the "Cavalry\'\' does not have to be a noun, at least in the senses'

#### Interpret Text API response

You should see results displayed as JSON data. For example:

```json
{
    "blocklistsMatchResults": [],
    "hateResult": {
        "category": "Hate",
        "severity": 0
    },
    "selfHarmResult": {
        "category": "SelfHarm",
        "severity": 0
    },
    "sexualResult": {
        "category": "Sexual",
        "severity": 0
    },
    "violenceResult": {
        "category": "Violence",
        "severity": 0
    }
}
```

The JSON fields in the output are defined in the following table:

| Name         | Description                                                  | Type    |
| :----------- | :----------------------------------------------------------- | ------- |
| **category** | Each output class that the API predicts. Classification can be multi-labeled. For example, when a text is run through a text content safety model, it could be classified as sexual content as well as violence. | String  |
| **severity** | The higher the severity of the input content, the larger this value is. Possible values are 0, 2, 4, and 6. | Integer |


> **NOTE: Why severity levels are not continuous**
>
> Currently, we only use levels 0, 2, 4, and 6. In the future, we may be able to extend the severity levels to 0, 1, 2, 3, 4, 5, 6, 7: eight levels with finer granularity.
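
The filtering decision can be separated from the HTTP call so that it is easy to check offline. The sketch below is one way to do that, assuming the response shape shown above; `flagged_categories` is a hypothetical helper, the threshold of 2 mirrors the check in `analyze_text`, and the `severity` of 4 in the sample is an invented value used only to show a flagged result.

```python
def flagged_categories(acs_result, threshold=2):
    """Return the categories whose severity is at or above the threshold."""
    flagged = []
    for value in acs_result.values():
        # Skip entries such as "blocklistsMatchResults" that carry no severity
        if isinstance(value, dict) and "severity" in value:
            if value["severity"] >= threshold:
                flagged.append(value["category"])
    return flagged


# Same shape as the Text API response above; the Violence severity is made up
sample_result = {
    "blocklistsMatchResults": [],
    "hateResult": {"category": "Hate", "severity": 0},
    "selfHarmResult": {"category": "SelfHarm", "severity": 0},
    "sexualResult": {"category": "Sexual", "severity": 0},
    "violenceResult": {"category": "Violence", "severity": 4},
}
print(flagged_categories(sample_result))  # prints ['Violence']
```

An empty list means no category reached the threshold, so the generated text can be shown to the user as-is.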

### 7. Delete the online endpoint
Don't forget to delete the online endpoint; otherwise the billing meter keeps running for the compute used by the endpoint.

In [None]:
workspace_ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()