# Working with 3rd party (detached) Prompts/Prompt Template Assets (Cloud)


This notebook should be run in a Runtime 22.2 and Python 3.10 (or later) environment. If you are viewing this in Watson Studio and do not see Python 3.10.x in the upper right corner of your screen, please update the runtime now.

The notebook creates a summarization prompt template asset (PTA) in a given project, promotes it to a space, configures OpenScale to monitor that PTA, and evaluates generative AI quality metrics and model health metrics. The data used by this notebook is memory centric.

To run this notebook for task types other than summarization, please consult [this](https://github.com/IBM/watson-openscale-samples/blob/main/IBM%20Cloud/WML/notebooks/watsonx/README.md) document for guidance on evaluating prompt templates for the available task types.

Note: Search for `EDIT THIS` and fill in the required inputs.

## Prerequisites

* Service credentials for IBM Watson OpenScale
* A CSV file containing the test data to be evaluated
* The ID of the project in which you want to create the prompt template asset

### Contents

- [Setup](#settingup)
- [Create Prompt template](#prompt)
- [Prompt Setup](#ptatsetup)
- [Risk evaluations for prompt template asset subscription](#evaluate)
- [Display the Model Risk metrics](#mrmmetric)
- [Display the Generative AI Quality metrics](#genaimetrics)
- [Plot rougel and rougelsum metrics against records](#plotproject)
- [See factsheets information](#factsheetsspace)

## Setup <a name="settingup"></a>

In [None]:
!pip install --upgrade datasets==2.10.0 --no-cache | tail -n 1
!pip install --upgrade evaluate --no-cache | tail -n 1
!pip install --upgrade ibm-aigov-facts-client | tail -n 1
!pip install --upgrade ibm-watson-openscale | tail -n 1
!pip install --upgrade ibm-watsonx-ai | tail -n 1
!pip install --upgrade matplotlib | tail -n 1
!pip install --upgrade pydantic==2.7.4 --no-cache | tail -n 1
!pip install --upgrade sacrebleu --no-cache | tail -n 1
!pip install --upgrade sacremoses --no-cache | tail -n 1
!pip install --upgrade textstat --no-cache | tail -n 1
!pip install --upgrade transformers --no-cache | tail -n 1

Note: you may need to restart the kernel to use updated packages.


### Provision services and configure credentials

If you have not already, provision an instance of IBM Watson OpenScale using the [OpenScale link in the Cloud catalog](https://cloud.ibm.com/catalog/services/watson-openscale).

Your Cloud API key can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below.

**NOTE:** You can also create the `API_KEY` using the IBM Cloud CLI.

To install the IBM Cloud (formerly Bluemix) CLI, see these [instructions](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use).

To create an API key from the CLI:
```
ibmcloud login --sso
ibmcloud iam api-key-create 'my_key'
```
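
Pasting the key into the notebook works, but it is easy to leak when the notebook is shared. A minimal sketch of an alternative follows, assuming the key has been exported to an environment variable. The variable name `IBM_CLOUD_API_KEY` and the helper `load_api_key` are hypothetical and not part of any IBM library.

```python
import os


def load_api_key(env_var="IBM_CLOUD_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return api_key
```

With this in place, the credentials cell could use `CLOUD_API_KEY = load_api_key()` instead of a literal string.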

In [None]:
IAM_URL = "https://iam.cloud.ibm.com"
DATAPLATFORM_URL = "https://api.dataplatform.cloud.ibm.com"
#DATAPLATFORM_URL = "https://api.eu-de.dataplatform.cloud.ibm.com"
#DATAPLATFORM_URL = "https://api.au-syd.dataplatform.cloud.ibm.com"
SERVICE_URL = "https://aiopenscale.cloud.ibm.com"
#SERVICE_URL = "https://au-syd.aiopenscale.cloud.ibm.com"
CLOUD_API_KEY = "<apikey>" # YOUR_CLOUD_API_KEY
WML_CREDENTIALS = {
                "url": "https://us-south.ml.cloud.ibm.com",
                "apikey": CLOUD_API_KEY,
                "auth_url": IAM_URL,
                "wml_location" : "cloud"
}


## Set the project ID

In order to set up a development type subscription, the PTA must be within the project. Please supply the project ID where the PTA needs to be created.

In [None]:
PROJECT_ID = "<project_id>" # YOUR_PROJECT_ID

## Read space id from user

You can use an existing space or create a new one to promote the prompt template asset to. Select the option with the variable below.

In [None]:
use_existing_space = True # Set to False to create a new space
space_id = "<space_id>" # YOUR_SPACE_ID

In [None]:
import json
from ibm_watsonx_ai import APIClient

wml_client = APIClient(WML_CREDENTIALS)
wml_client.version

## Function to create the access token

This function generates an IAM access token using the provided credentials. The API calls for creating and scoring prompt template assets utilize the token generated by this function.

In [None]:
import requests

def generate_access_token():
    """Exchange the Cloud API key for an IAM access token."""
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    }
    data = {
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": CLOUD_API_KEY,
        "response_type": "cloud_iam",
    }
    response = requests.post(IAM_URL + "/identity/token", data=data, headers=headers)
    response.raise_for_status()  # fail early on bad credentials or network errors
    return response.json()["access_token"]

iam_access_token = generate_access_token()
# The token is a bearer credential, so report success without printing it.
print("Access token generated (length {})".format(len(iam_access_token)))
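
IAM access tokens expire, so a long notebook session may need a fresh one before the later REST calls. The sketch below shows one way to decide when to regenerate it. It assumes the parsed token response carries an `expiration` field with a Unix timestamp in seconds; the helper name `token_is_fresh` and the five-minute safety margin are choices made for this illustration.

```python
import time


def token_is_fresh(token_response, now=None, margin_seconds=300):
    """Return True if the token stays valid for more than margin_seconds.

    token_response is the parsed JSON from the IAM token endpoint. This
    sketch assumes it has an "expiration" field (Unix time in seconds).
    """
    now = time.time() if now is None else now
    expiration = token_response.get("expiration")
    if expiration is None:
        return False  # unknown expiry: treat as stale and request a new token
    return expiration - now > margin_seconds
```

To use it, the token function would have to keep the whole JSON response rather than only the `access_token` string.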

# Demo Dataset <a name="alternative"></a>


This small dataset is an alternative for test runs in low-resource CPD clusters.


In [None]:
!wget https://ibm.box.com/shared/static/b8c7kbrnjl1ij9em23cmmeoznwcep9ud.csv


In [None]:
!mv b8c7kbrnjl1ij9em23cmmeoznwcep9ud.csv Summary_data.csv

In [None]:
import pandas as pd

test_data_path = "Summary_data.csv"
llm_data = pd.read_csv(test_data_path, encoding='latin1')
llm_data=llm_data.head(10)


In [None]:
llm_data.to_csv(test_data_path, index=False)  # omit the index so no unnamed column is added

In [None]:
llm_data
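
Later cells read the `input_incident` and `generated_text` columns from this file, so it can save time to confirm they exist before going further. The following is a minimal sketch using only the standard library; the helper name `missing_columns` is hypothetical.

```python
import csv
import io

# Columns that the payload-building cell later in this notebook reads.
REQUIRED_COLUMNS = {"input_incident", "generated_text"}


def missing_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = set(reader.fieldnames or [])
    return sorted(set(required) - present)
```

For the file on disk, pass its contents, for example `missing_columns(open(test_data_path).read())`; an empty list means the header is complete.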

# Create Prompt template <a name="prompt"></a>

Create a prompt template for a summarization task.

In [None]:
from ibm_aigov_facts_client import AIGovFactsClient

facts_client = AIGovFactsClient(
    api_key=CLOUD_API_KEY,
    container_id=PROJECT_ID,
    container_type="project",
    disable_tracing=True
    #region="europe"
)


In [None]:
prompt_input="""
summarise the given context.
{input_incident}
"""

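To see what the model would receive for one record, the template can be previewed locally. This is only a sketch: it uses Python's `str.format` as a stand-in for the substitution that the service performs, and the sample incident text is made up. `str.format` would also fail on templates that contain literal braces.

```python
prompt_input = """
summarise the given context.
{input_incident}
"""


def render_prompt(template, **variables):
    """Fill {placeholders} in the template with the given values."""
    return template.format(**variables)


# Hypothetical sample value, used only for the preview.
preview = render_prompt(
    prompt_input,
    input_incident="A pipe burst in the kitchen and flooded the floor.",
)
print(preview)
```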
In [None]:

from ibm_aigov_facts_client import DetachedPromptTemplate, PromptTemplate

detached_information = DetachedPromptTemplate(
    prompt_id="detached_prompt",
    model_id="meta-llama/llama-3-70b-instruct",
    model_provider="Facebook",
    model_name="llama-3-70b-instruct",
    model_url="https://us-south.ml.cloud.ibm.com/ml/v1/deployments/insurance_test_deployment/text/generation?version=2021-05-01",
    prompt_url="prompt_url",
    prompt_additional_info={"IBM Cloud Region": "us-east1"}
)

task_id = "summarization"
name = "External prompt sample Summarization"
description = "Detached prompt sample - testing_DO_NOT_USE"
model_id = "meta-llama/llama-3-70b-instruct"

# define parameters for PromptTemplate
prompt_variables = {"input_incident": ""}
input_prefix = ""
output_prefix = ""

prompt_template = PromptTemplate(
    input=prompt_input,  # passed directly to avoid shadowing the built-in `input`
    prompt_variables=prompt_variables,
    input_prefix=input_prefix,
    output_prefix=output_prefix,
)

pta_details = facts_client.assets.create_detached_prompt(
    model_id=model_id,
    task_id=task_id,
    name=name,
    description=description,
    prompt_details=prompt_template,
    detached_information=detached_information)
project_pta_id = pta_details.to_dict()["asset_id"]

# See factsheets information <a name="factsheetsspace"></a>

In [None]:
factsheets_url = f"{DATAPLATFORM_URL.replace('api.', '')}/wx/prompt-details/{project_pta_id}/factsheet?context=wx&project_id={PROJECT_ID}"

print(f"User can navigate to the published facts in project {factsheets_url}")
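
The factsheet link is built by dropping the `api.` label from the Data Platform API URL, which also works for the regional URLs listed in the credentials cell. A small sketch of that derivation follows; the helper name `factsheet_url` is hypothetical, and it removes only the first `api.` occurrence.

```python
def factsheet_url(api_url, asset_id, project_id):
    """Build the project factsheet link from the Data Platform API URL."""
    base = api_url.replace("api.", "", 1)  # api.<host> -> <host>
    return (
        f"{base}/wx/prompt-details/{asset_id}/factsheet"
        f"?context=wx&project_id={project_id}"
    )


print(factsheet_url("https://api.eu-de.dataplatform.cloud.ibm.com", "asset-1", "proj-1"))
```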

# Evaluate Prompt template from space <a name="evaluatespace"></a>

Next, promote the prompt template asset to a space and evaluate it there.

# Promote PTA to space <a name="promottospace"></a>

The cell below promotes the prompt template asset from the project to the space.

In [None]:

headers={}
headers["Content-Type"] = "application/json"
headers["Accept"] = "*/*"
headers["Authorization"] = "Bearer {}".format(iam_access_token)
verify = True

url = "{}/v2/assets/{}/promote".format(DATAPLATFORM_URL ,project_pta_id)

params = {
    "project_id":PROJECT_ID
}

payload = {
    "space_id": space_id
}
response = requests.post(url, json=payload, headers=headers, params=params, verify=verify)
response.raise_for_status()  # surface HTTP errors before reading the body
json_data = response.json()
space_pta_id = json_data["metadata"]["asset_id"]
space_pta_id

# Create deployment for prompt template asset in space <a name="ptadeployment"></a>

To create a subscription from space, it is necessary to create a deployment for prompt template assets in spaces.

In [None]:
DEPLOYMENTS_URL = WML_CREDENTIALS["url"] + "/ml/v4/deployments"

payload = {
    "prompt_template": {
      "id": space_pta_id
    },
    "detached": {
    },
    "base_model_id": "meta-llama/llama-3-70b-instruct",
    "description": "rag qa deployment",
    "name": "TEST model deployment_DO_NOT_USE_v93",
    "space_id": space_id
}

version = "2023-07-07" # The version date for the API of the form YYYY-MM-DD. Example : 2023-07-07
params = {
    "version":version,
    "space_id":space_id
}

response = requests.post(DEPLOYMENTS_URL, json=payload, headers=headers, params = params, verify = verify)
json_data = response.json()


if "metadata" in json_data:
    deployment_id = json_data["metadata"]["id"]
    print(deployment_id)
else:
    print(json_data)

In [None]:
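from ibm_cloud_sdk_core.authenticators import IAMAuthenticator, CloudPakForDataAuthenticator

# Note: this wildcard import rebinds APIClient to the OpenScale client
# (earlier cells used the ibm_watsonx_ai APIClient under the same name).
from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *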

service_instance_id = None # Update this to refer to a particular service instance
authenticator = IAMAuthenticator(
    apikey=CLOUD_API_KEY,
    url=IAM_URL
)
wos_client = APIClient(
    authenticator=authenticator,
    service_url=SERVICE_URL,
    service_instance_id=service_instance_id
)
data_mart_id = wos_client.service_instance_id
print(wos_client.version)

# Setup the prompt template asset in space for evaluation with supported monitor dimensions <a name="ptaspace"></a>

Prompt template assets in a space are supported only with the `pre_production` and `production` operational space IDs. The cell below creates a subscription for the prompt template asset promoted to the space, using the type set in `operational_space_id` (`production` here). The `problem_type` value must match the task type specified in the prompt template asset.

In [None]:
label_column = "output"
operational_space_id = "production"
problem_type= "summarization"
input_data_type= "unstructured_text"

monitors = {
    "generative_ai_quality": {
        "parameters": {   
            "min_sample_size": 5,
            "metrics_configuration":{  
                  "content_analysis":{},
                  "pii": { "record_level_max_score": 0.5 },
                  "hap_score": { "record_level_max_score": 0.5 },
                  "pii_input": { "record_level_max_score": 0.5 },
                  "hap_input_score": { "record_level_max_score": 0.5 },
            }
        }
    },
   
}


response = wos_client.wos.execute_prompt_setup(prompt_template_asset_id = space_pta_id, 
                                                                   space_id = space_id,
                                                                   deployment_id = deployment_id,
                                                                   label_column = label_column, 
                                                                   operational_space_id = operational_space_id, 
                                                                   problem_type = problem_type,
                                                                   input_data_type = input_data_type, 
                                                                   supporting_monitors = monitors, 
                                                                   background_mode = False)

result = response.result
result._to_dict()

With the cell below, you can read the prompt setup task and check its status.

In [None]:
response = wos_client.monitor_instances.mrm.get_prompt_setup(prompt_template_asset_id = space_pta_id,
                                                             deployment_id = deployment_id,
                                                             space_id = space_id)

result = response.result
result_json = result._to_dict()
result_json
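
If the setup is started in background mode, the status has to be polled until it completes. The sketch below is a generic polling loop, not an OpenScale API: `fetch_state` is a hypothetical callable that you supply (for example, one that calls `get_prompt_setup` and returns the state string from the response), and the state names `FINISHED` and `FAILED` are assumptions.

```python
import time


def wait_for_state(fetch_state, done_states=("FINISHED",), failed_states=("FAILED",),
                   interval_seconds=5, max_attempts=60, sleep=time.sleep):
    """Call fetch_state() until it reports a done or failed state.

    The state names are assumptions for this sketch; adjust them to
    whatever the prompt setup response actually reports.
    """
    for _ in range(max_attempts):
        state = str(fetch_state()).upper()
        if state in done_states:
            return state
        if state in failed_states:
            raise RuntimeError(f"Prompt setup ended in state {state}")
        sleep(interval_seconds)
    raise TimeoutError("Prompt setup did not finish in time")
```

Passing the sleep function as a parameter keeps the loop easy to test without real waiting.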

### Read subscription id from prompt setup

Once the prompt setup status is finished, read the subscription ID from it.

In [None]:
prod_subscription_id = result_json["subscription_id"]
prod_subscription_id

## The segment below is required only if you chose a production space <a name="Prod"></a>

Now that the WML service has been bound and the subscription has been created, we need to score the prompt template asset. The downloaded csv is used to construct the payload as well as feedback for the deployment.

In [None]:
import csv

feature_fields = ["input_incident"]
prediction = "generated_text"

pl_data = []
prediction_list = []

with open(test_data_path, 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        request = {
            "parameters": {
                "template_variables": {
                }
            }
        }
        for each in feature_fields:
            request["parameters"]["template_variables"][each] = str(row[each])

        predicted_val = row[prediction]
        prediction_list.append(predicted_val)
        response = {
            "results": [
                {
                    prediction: predicted_val,
                    'input_token_count': 1000,
                    'generated_token_count': 200
                }
            ]
        }
        record = {"request": request, "response": response,  "response_time": 3000}
        pl_data.append(record)
pl_data


In [None]:
import time
from ibm_watson_openscale.supporting_classes.enums import *

time.sleep(5)
payload_data_set_id = None
# Look up the payload logging data set; the list may be empty if setup is incomplete.
data_sets = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING,
                                      target_target_id=prod_subscription_id,
                                      target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
if data_sets:
    payload_data_set_id = data_sets[0].metadata.id
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: ", payload_data_set_id)

In [None]:
import uuid
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord
time.sleep(5)
pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)
print("Number of records in the payload logging table: {}".format(pl_records_count))
if pl_records_count < 110:
    print("Payload logging did not happen, performing explicit payload logging.")
    wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=pl_data,background_mode=False)
    time.sleep(5)
    pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)
    print("Number of records in the payload logging table: {}".format(pl_records_count))
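
The monitor configuration earlier sets `min_sample_size` to 5, so an evaluation needs at least that many logged records. A small local check is sketched below. The helper name `meets_min_sample_size` is hypothetical, and treating the condition as "record count greater than or equal to the minimum" is an assumption about how the service applies it.

```python
def meets_min_sample_size(record_count, monitors, monitor_name="generative_ai_quality"):
    """Return True if record_count reaches the configured min_sample_size.

    Assumes the condition is record_count >= min_sample_size.
    """
    parameters = monitors.get(monitor_name, {}).get("parameters", {})
    return record_count >= parameters.get("min_sample_size", 0)


# Same shape as the monitors dictionary defined earlier, reduced to the relevant key.
monitors = {"generative_ai_quality": {"parameters": {"min_sample_size": 5}}}
print(meets_min_sample_size(10, monitors))
```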

Run the 2 cells below to trigger the evaluation manually if you have a Development / Validation deployment space. In a Production deployment space, the auto scheduler triggers the evaluation every hour once the minimum sample size condition is met.

In [None]:
monitor_definition_id = "mrm"
target_target_id = prod_subscription_id
result = wos_client.monitor_instances.list(data_mart_id=data_mart_id,
                                           monitor_definition_id=monitor_definition_id,
                                           target_target_id=target_target_id,
                                           space_id=space_id).result
result_json = result._to_dict()
print(result_json)
mrm_monitor_id = result_json["monitor_instances"][0]["metadata"]["id"]
mrm_monitor_id

In [None]:
#####################################################################################
######### For pre_production flow 
######################################################################################
body = {}
#content_type = 'text/csv'
response  = wos_client.monitor_instances.mrm.evaluate_risk(monitor_instance_id=mrm_monitor_id, 
                                                    body = body,
                                                    space_id = space_id,
                                                    evaluation_tests = ["model_health", "generative_ai_quality"],
                                                    background_mode = False)

In [None]:
factsheets_url = "https://dataplatform.cloud.ibm.com/ml-runtime/deployments/{}/details?space_id={}&context=wx&flush=true".format(deployment_id, space_id)

print("User can navigate to the published facts in space {}".format(factsheets_url))

In [None]:
custom_monitor_id = 'user_feedback_metrics'
integrated_system_id='0194baf3-1f96-7ff7-9474-73caac3cf071'
print("custom monitor id : ", custom_monitor_id)
print("integrated_system_id :", integrated_system_id)

In [None]:
custom_monitor_details = wos_client.monitor_definitions.get(monitor_definition_id=custom_monitor_id).result
print('Monitor definition details:', custom_monitor_details)


In [None]:
target = Target(
        target_type=TargetTypes.SUBSCRIPTION,
        target_id=prod_subscription_id
    )
 
#thresholds = [MetricThresholdOverride(metric_id='positive feedback', type = MetricThresholdTypes.LOWER_LIMIT, value=0.9)]
 
custom_monitor_instance_details = wos_client.monitor_instances.create(
            data_mart_id=data_mart_id,
            background_mode=False,
            monitor_definition_id=custom_monitor_id,
            parameters={'custom_metrics_provider_id': integrated_system_id},
            target=target
).result


In [None]:
print(custom_monitor_instance_details)
custom_monitor_instance_id = custom_monitor_instance_details.metadata.id
print(custom_monitor_instance_id)

In [None]:
from datetime import datetime, timezone, timedelta
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import MonitorMeasurementRequest
custom_monitoring_run_id=integrated_system_id
measurement_request = [MonitorMeasurementRequest(timestamp=datetime.now(timezone.utc),
                                                 metrics=[{"positive_feedback": 0.9, "negative_feedback" : 0.7, "comment": "user feedback data"}], run_id=custom_monitoring_run_id)]
print(measurement_request[0])
 
published_measurement_response = wos_client.monitor_instances.measurements.add(
    monitor_instance_id=custom_monitor_instance_id,
    monitor_measurement_request=measurement_request).result
published_measurement_id = published_measurement_response[0]["measurement_id"]
print(published_measurement_response)


In [None]:
published_measurement = wos_client.monitor_instances.measurements.get(monitor_instance_id=custom_monitor_instance_id, measurement_id=published_measurement_id).result
print(published_measurement)
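
The measurement above uses hard-coded values for `positive_feedback` and `negative_feedback`. In practice these would be derived from collected feedback. The sketch below shows one possible derivation, assuming feedback arrives as a list of booleans (`True` for a thumbs-up); that representation and the helper name `feedback_metrics` are hypothetical.

```python
def feedback_metrics(votes):
    """Turn boolean votes (True = positive) into feedback fractions."""
    votes = list(votes)
    if not votes:
        raise ValueError("At least one vote is required.")
    positive = sum(1 for vote in votes if vote) / len(votes)
    return {
        "positive_feedback": round(positive, 4),
        "negative_feedback": round(1 - positive, 4),
    }


print(feedback_metrics([True, True, True, False]))
```

The resulting dictionary has the same metric keys as the `metrics` entry passed to `MonitorMeasurementRequest` above.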


In [None]:
factsheets_url = "https://dataplatform.cloud.ibm.com/ml-runtime/deployments/{}/details?space_id={}&context=wx&flush=true".format(deployment_id, space_id)

print("User can navigate to the published facts in space {}".format(factsheets_url))

## Congratulations!

This notebook publishes the evaluation metrics for summarization and also supports publishing custom metrics for user feedback.