# Create an Azure OpenAI Generative AI Model as a prompt template in watsonx.governance 2.x

This notebook has been adapted for the watsonx.governance Level 4 PoX hands-on lab. It is originally based on [this notebook](https://github.com/rreno85/wxgovlab/blob/main/watsonxgov%20detached%20prompt%20-%20Azure%20OpenAI.ipynb) by Bob Reno.

This notebook will create a *detached* prompt template asset that references a generative AI model in Azure OpenAI to start governing this model in **watsonx.governance**.

**Notes**

- This notebook should be run with Runtime 22.2 and a Python 3.10 or later runtime environment (e.g., 3.11 or 3.12). If you are viewing this in Watson Studio and do not see "Python 3.10/3.11" in the upper right corner of your screen, please update the runtime now.
- This notebook assumes you have **access to an Azure OpenAI account that has the `opeanai-gpt-3.5` model deployed**. If you don't have access to this account, try reserving the `Access to Azure OpenAI GPT 3.5 Model` environment available in IBM's [TechZone](https://techzone.ibm.com/) (as of September 2024).
- To run this notebook for task types other than summarization, consult [this document](https://github.com/IBM/watson-openscale-samples/blob/main/IBM%20Cloud/WML/notebooks/watsonx/README.md) for guidance on evaluating prompt templates for the available task types.
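
The Python version requirement in the notes can also be checked from inside the notebook. The following is a minimal sketch that uses only the standard library; the helper name `meets_minimum_python` is hypothetical and not part of any IBM package.

```python
import sys

def meets_minimum_python(version_info=sys.version_info, minimum=(3, 10)):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)

# Warn early instead of failing later during package installation.
if not meets_minimum_python():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        "switch to a Python 3.10 or later runtime before continuing."
    )
```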


## Setup <a name="settingup"></a>

Run the cell below to install the required packages.

In [None]:
!pip install --upgrade datasets==2.10.0 --no-cache | tail -n 1
!pip install --upgrade evaluate --no-cache | tail -n 1
!pip install --upgrade --extra-index-url https://test.pypi.org/simple/ ibm-aigov-facts-client | tail -n 1
!pip install --upgrade "ibm-watson-openscale>=3.0.4" | tail -n 1
!pip install "ibm-watson-machine-learning"
!pip install --upgrade matplotlib | tail -n 1
!pip install --upgrade pydantic==1.10.11 --no-cache | tail -n 1
!pip install --upgrade sacrebleu --no-cache | tail -n 1
!pip install --upgrade sacremoses --no-cache | tail -n 1
!pip install --upgrade textstat --no-cache | tail -n 1
!pip install --upgrade openai rich azure-identity --no-cache | tail -n 1
# !pip install --upgrade transformers --no-cache | tail -n 1

**Note:** You may need to *restart the kernel* to use the updated packages. You don't need to run the cell above again after restarting.

Fill in your platform and Azure credentials:

In [None]:
import os
from rich import print
from IPython.display import display, Markdown

CPD_URL = "https://cpd-cpd.apps.________________.ocp.techzone.ibm.com/"
CPD_USERNAME = "complianceofficer"
CPD_API_KEY = "<EDIT THIS>"

AZURE_OPENAI_ENDPOINT = "<EDIT THIS>"
AZURE_OPENAI_DEPLOYMENT_NAME = "<EDIT THIS>"
AZURE_CLIENT_ID = "<EDIT THIS>"
AZURE_CLIENT_SECRET = "<EDIT THIS>"
AZURE_TENANT_ID = "<EDIT THIS>"

PROJECT_ID = os.environ.get('PROJECT_ID', "<YOUR_PROJECT_ID>")
print(f"Your project id is '{PROJECT_ID}'")
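
Before moving on, it can help to confirm that no placeholder value was left unedited. The sketch below is one possible check, not part of the original lab; the helper `find_unedited_settings` and the example values are hypothetical, and the placeholder markers are taken from the cell above.

```python
def find_unedited_settings(settings: dict) -> list:
    """Return the names of settings that are empty or still hold a placeholder."""
    placeholder_markers = ("<EDIT THIS>", "<YOUR_PROJECT_ID>", "____")
    return [
        name
        for name, value in settings.items()
        if not value or any(marker in value for marker in placeholder_markers)
    ]

# Made-up example values; in the notebook you would pass the real variables,
# e.g. {"CPD_URL": CPD_URL, "CPD_API_KEY": CPD_API_KEY, ...}.
example_settings = {
    "CPD_API_KEY": "<EDIT THIS>",
    "AZURE_OPENAI_ENDPOINT": "https://example.openai.azure.com/",
}
missing = find_unedited_settings(example_settings)
print(f"Settings still to fill in: {missing}")
```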

### Function to create the access token

This function generates an IAM access token using the provided credentials. The API calls for creating and scoring prompt template assets utilize the token generated by this function.

In [None]:
import requests
import urllib3, json  # noqa: E401
urllib3.disable_warnings()

def generate_access_token():
    headers={}
    headers["Content-Type"] = "application/json"
    headers["Accept"] = "application/json"
    data = {
        "username":CPD_USERNAME,
        "api_key":CPD_API_KEY
    }
    data = json.dumps(data).encode("utf-8")
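    # Strip any trailing slash from CPD_URL so the path is not joined with "//"
    url = CPD_URL.rstrip("/") + "/icp4d-api/v1/authorize"
    # verify=False skips TLS certificate verification (lab clusters often use self-signed certificates)
    response = requests.post(url=url, data=data, headers=headers, verify=False)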
    response.raise_for_status()
    json_data = response.json()
    iam_access_token = json_data['token']
    print("Access token generated successfully!")
    return iam_access_token

iam_access_token = generate_access_token()
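
The sketch below illustrates how a token like this is typically attached to later REST calls, assuming the platform accepts it as a standard `Bearer` token in the `Authorization` header. The helpers `build_auth_headers` and `join_url` are hypothetical names introduced for illustration; they are not part of the IBM client libraries.

```python
def build_auth_headers(access_token: str) -> dict:
    """Build JSON request headers carrying the token as a Bearer credential (assumed scheme)."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }

def join_url(base_url: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")

# Example with a made-up host and token.
print(join_url("https://cpd.example.com/", "/icp4d-api/v1/authorize"))
print(build_auth_headers("example-token")["Authorization"])
```

Normalizing the slash matters here because the `CPD_URL` value in this notebook ends with `/`.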

## Creating the Prompt Template

The following cell shows the development of a prompt template used to summarize resumes from job applicants. 

We will test inference on Azure OpenAI and create a detached prompt template in our project in watsonx that references the model and prompt.
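
The template marks where the resume goes with a `{text}` placeholder, which is filled in with `str.format` at inference time. The following is a minimal sketch of that substitution; it repeats the template used in this notebook, and the sample resume text is made up.

```python
# Same template text as the notebook's prompt template.
PROMPT_TEMPLATE = """
You will be given a resume. Please summarize the resume in 100 words or less.

--- start of text ---
{text}
--- end of text ---
""".strip()

# Made-up sample input for illustration.
sample_resume = "Jane Doe. Data engineer with five years of experience building ETL pipelines."

# str.format replaces the {text} placeholder with the resume.
prompt = PROMPT_TEMPLATE.format(text=sample_resume)
print(prompt)
```

Only braces in the template itself are interpreted by `str.format`, so a resume that happens to contain `{` or `}` is inserted unchanged.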

In [None]:
PROMPT_TEMPLATE = """
You will be given a resume. Please summarize the resume in 100 words or less.

--- start of text ---
{text}
--- end of text ---
""".strip()

In [None]:
import pandas as pd
import asyncio
from openai import AsyncAzureOpenAI
from azure.identity import ClientSecretCredential, get_bearer_token_provider

def get_azure_token_provider():
    default_scope = "https://cognitiveservices.azure.com/.default"
    credential = ClientSecretCredential(
        tenant_id=os.environ.get('AZURE_TENANT_ID', AZURE_TENANT_ID),
        client_id=os.environ.get('AZURE_CLIENT_ID', AZURE_CLIENT_ID),
        client_secret=os.environ.get('AZURE_CLIENT_SECRET', AZURE_CLIENT_SECRET)
    )
    return get_bearer_token_provider(credential, default_scope)

async def summarize_resume(text:str, max_tokens:int=200, token_provider=None):
    """
    This function uses the Azure OpenAI API to summarize the text of the resume given.
    Usage: `summary = await summarize_resume('[resume text to summarize]')`
    """
    if token_provider is None:
        token_provider = get_azure_token_provider()
    client = AsyncAzureOpenAI(
        azure_endpoint=os.environ.get('AZURE_OPENAI_ENDPOINT', AZURE_OPENAI_ENDPOINT),
        api_version="2024-02-15-preview",
        azure_ad_token_provider=token_provider
    )
    model_response = await client.chat.completions.create(
        model=os.environ.get('AZURE_OPENAI_DEPLOYMENT_NAME', AZURE_OPENAI_DEPLOYMENT_NAME),
        messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(text=text)}],
        max_tokens=max_tokens
    )
    return model_response.choices[0].message.content

async def summarize_batch(resumes:list) -> list:
    """Summarize all the resumes given"""
    token_provider = get_azure_token_provider()
    summaries = await asyncio.gather(
        *[summarize_resume(resume, token_provider=token_provider) for resume in resumes]
    )
    return summaries
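
`asyncio.gather` starts all requests at once, which is fine for ten resumes but may run into rate limits on larger batches. The sketch below shows one common way to cap the number of in-flight calls with `asyncio.Semaphore`. It is an illustration, not part of the original lab: `fake_summarize` is a stand-in for the real model call, and `summarize_bounded` and `max_concurrency` are hypothetical names.

```python
import asyncio

async def fake_summarize(text: str) -> str:
    """Stand-in for a model call: waits briefly, then returns the first five words."""
    await asyncio.sleep(0.01)
    return " ".join(text.split()[:5])

async def summarize_bounded(texts, summarize=fake_summarize, max_concurrency: int = 3) -> list:
    """Run `summarize` over `texts` with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(text):
        # Each worker waits here until a slot is free.
        async with semaphore:
            return await summarize(text)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(worker(text) for text in texts))
```

In a notebook this would be called as `await summarize_bounded(texts)`; in a plain script, wrap it in `asyncio.run(...)`. Passing `summarize_resume` as the `summarize` argument would apply the same limit to the real calls.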

### Load the resume data

In [None]:
data = pd.read_csv("https://raw.githubusercontent.com/CloudPak-Outcomes/Outcomes-Projects/main/watsonx-governance-l4/data/resume_summarization_test_data.csv").head(10)
print(f"{len(data)} rows of data loaded")
data.head()

### Generate the summaries of the resumes

**Note:** This might take a while to finish running.

In [None]:
data['generated_text'] = await summarize_batch(data['Resume'].values)
data.head()
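
The prompt asks for summaries of 100 words or less, but a model does not always respect that limit. The following sketch shows a simple whitespace-based word count check; the helpers `word_count` and `over_limit` are hypothetical, and the example strings are made up.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())

def over_limit(summaries, limit: int = 100) -> list:
    """Return (index, word_count) pairs for summaries longer than `limit` words."""
    return [
        (index, word_count(summary))
        for index, summary in enumerate(summaries)
        if word_count(summary) > limit
    ]

# Made-up example: the second entry has 120 words.
example = ["short summary", "word " * 120]
print(over_limit(example))
```

In the notebook, `over_limit(data['generated_text'])` should list the positions of any generated summaries that exceed the limit.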

Display the results

In [None]:
# you can run this multiple times to show the results from different row samples
def display_result(row):
    print(f"[bold]Resume:[/bold]\n[red]{row.Resume}[/red]")
    print(f"[bold]AI Generated Summary:[/bold]\n[blue1]{row.generated_text}[/blue1]")
    print(f"[bold]Reference (Labeled) Summary:[/bold]\n[green]{row.Summarization}[/green]")

display_result(data.sample().iloc[0])

### Create the detached prompt template <a name="detached_prompt"></a>

Create a detached prompt template in your project for the summarization task that references the Azure OpenAI model.

In [None]:
from ibm_aigov_facts_client import (
    AIGovFactsClient, CloudPakforDataConfig,
    DetachedPromptTemplate, PromptTemplate
)
from ibm_aigov_facts_client.utils.enums import Task

creds = CloudPakforDataConfig(
    service_url=CPD_URL,
    username=CPD_USERNAME,
    api_key=CPD_API_KEY
)
facts_client = AIGovFactsClient(
    cloud_pak_for_data_configs=creds,
    container_id=PROJECT_ID,
    container_type="project",
    disable_tracing=True
)

In [None]:
detached_information = DetachedPromptTemplate(
    prompt_id="detached-aoai-prompt",
    model_id=f"azure/{AZURE_OPENAI_DEPLOYMENT_NAME}",
    model_provider="Azure OpenAI",
    model_name="GPT-3.5-turbo",
    model_url=AZURE_OPENAI_ENDPOINT,
    prompt_url="prompt_url",
    prompt_additional_info={"model_owner": "Microsoft", "model_version": "gpt-3.5-turbo-1106"}
)
prompt_name = "Detached prompt for Azure OpenAI GPT-3.5-turbo"
prompt_description = "A detached prompt for summarization using Azure OpenAI's GPT-3.5-turbo model"

# define parameters for PromptTemplate
prompt_template = PromptTemplate(
    input=PROMPT_TEMPLATE,
    prompt_variables={"text": ""},
)
pta_details = facts_client.assets.create_detached_prompt(
    model_id=f"azure/{AZURE_OPENAI_DEPLOYMENT_NAME}",
    task_id=Task.SUMMARIZATION, # 'summarization' task
    name=prompt_name,
    description=prompt_description,
    prompt_details=prompt_template,
    detached_information=detached_information
)
project_pta_id = pta_details.to_dict()["asset_id"]
print(f"Detached Prompt template ID: '{project_pta_id}'")
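
The `prompt_variables` passed to `PromptTemplate` should match the placeholders that appear in the template text. The sketch below shows one way to check this with the standard library's `string.Formatter`; the helper names `template_fields` and `undeclared_variables` are hypothetical and not part of `ibm_aigov_facts_client`.

```python
from string import Formatter

def template_fields(template: str) -> set:
    """Return the named placeholders found in a str.format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}

def undeclared_variables(template: str, declared) -> set:
    """Return placeholders used in the template but missing from `declared`."""
    return template_fields(template) - set(declared)

# Made-up example template with two placeholders, only one of them declared.
example_template = "Summarize {text} for {audience}."
print(undeclared_variables(example_template, {"text": ""}))
```

An empty result for `undeclared_variables(PROMPT_TEMPLATE, {"text": ""})` would indicate that every placeholder in the template has a declared variable.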

In [None]:
factsheets_url = f"{CPD_URL.strip('/')}/wx/prompt-details/{project_pta_id}/factsheet?context=wx&project_id={PROJECT_ID}"
display(Markdown(f"[Click here to navigate to the published factsheet in the project]({factsheets_url})"))

**<u>Click the link above to go to the newly published factsheet in your watsonx project</u>**