# Custom generative AI evaluator using a Python function wrapper on a watsonx.ai foundation model

This notebook demonstrates how to create a custom generative AI evaluator by deploying a Python function, on Cloud or CPD (Cloud Pak for Data), that wraps a watsonx.ai foundation model.
The wrapped foundation model serves as the evaluator.

The custom generative AI evaluator endpoint must support the input and output formats described below.

Input format
```json
{
  "input_data": [
    {
      "fields": ["input"],
      "values": [["<prompt_1>"], ["<prompt_2>"]]
    }
  ]
}
```
e.g: `{"input_data": [{"fields": ["input"], "values": [["tell me about IBM"], ["tell me about openscale"]]}]}`
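
As a minimal sketch, a list of prompts can be wrapped in this input format with a small helper. The name `build_payload` is hypothetical and is not part of any IBM library; it only assembles the dictionary shown above.

```python
def build_payload(prompts):
    """Wrap a list of prompt strings in the evaluator's input format."""
    if not all(isinstance(p, str) for p in prompts):
        raise TypeError("every prompt must be a string")
    return {
        "input_data": [
            {
                "fields": ["input"],
                # One single-element row per prompt.
                "values": [[p] for p in prompts],
            }
        ]
    }


payload = build_payload(["tell me about IBM", "tell me about openscale"])
```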

Output format
```json
{
  "predictions": [
    {
      "fields": ["generated_text"],
      "values": [
        [
          "<generated_text_value_1>"
        ],
        [
          "<generated_text_value_2>"
        ]
      ]
    }
  ]
}
```
e.g: `{"predictions": [{"fields": ["generated_text"], "values": [["International Business Machines Corporation (IBM) is a multinational technology company..."], ["IBM Watson OpenScale is a machine learning model ...."]]}]}`

**Note**: The `generated_text` field name is mandatory in the output response.
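
One way to check a response against this requirement is sketched below. The helper `extract_generated_text` is a hypothetical name used for illustration only; it assumes nothing beyond the output format described above.

```python
def extract_generated_text(response):
    """Return the generated texts from a response, or raise ValueError."""
    predictions = response.get("predictions")
    if not predictions:
        raise ValueError("response has no 'predictions' entries")
    first = predictions[0]
    fields = first.get("fields", [])
    if "generated_text" not in fields:
        raise ValueError("'generated_text' is missing from 'fields'")
    # Locate the column so extra fields in the response do not break parsing.
    column = fields.index("generated_text")
    return [row[column] for row in first.get("values", [])]


sample_response = {
    "predictions": [
        {"fields": ["generated_text"], "values": [["first answer"], ["second answer"]]}
    ]
}
texts = extract_generated_text(sample_response)
```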

## Learning goals
- Configure the foundation model
- Create the Python function
- Deploy the Python function
- Test the deployment

## Contents

- [Step 1 - Setup](#step-1)
- [Step 2 - Python function creation and deployment in watsonx.ai](#step-2)
- [Step 3 - Testing python function deployment](#step-3)

## Step 1 - Setup <a id="step-1"></a>

### Install the necessary libraries

In [None]:
!pip install --upgrade ibm-watsonx-ai | tail -n 1

### Configure credentials

In [None]:
CLOUD_API_KEY = "<API_KEY>"

CREDENTIALS = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "apikey": CLOUD_API_KEY,
}

Uncomment and run the cell below only if the Python function is to be deployed on CPD.

In [None]:
# CREDENTIALS = {
#     "url": "<CPD_URL>",
#     "username": "<USERNAME>",
#     "password": "<PASSWORD>",
#     "instance_id": "openshift",
#     "apikey": "<API_KEY>",
#     "version": "5.0",
# }

In [2]:
from ibm_watsonx_ai import APIClient

watsonx_ai_client = APIClient(CREDENTIALS)
watsonx_ai_client.version

'1.1.22'

In [None]:
space_id = "<DEPLOYMENT_SPACE_ID>"
watsonx_ai_client.set.default_space(space_id)

'SUCCESS'

### Foundation model credentials

In [None]:
FM_CREDENTIALS = {  # credentials to score the foundation model in cloud
    "url": "https://us-south.ml.cloud.ibm.com",
    "iamurl": "https://iam.cloud.ibm.com",
    "apikey": "<API_KEY>",
}

Uncomment and run the cell below only if the foundation model is hosted on CPD.

In [None]:
# FM_CREDENTIALS = {  # credential to score the foundation model in CPD
#     "url": "<CPD_URL>",
#     "username": "<USERNAME>",
#     "password": "<PASSWORD>",
#     "instance_id": "openshift",
#     "apikey": "<API_KEY>",
#     "version": "5.0",
# }

In [None]:
params = {
    "fm_credentials": FM_CREDENTIALS,
    "space_id": "<SPACE_ID>",
    # "project_id": "<PROJECT_ID>"
}

## Step 2 - Python function creation and deployment in watsonx.ai <a id="step-2"></a>

This wrapper function asynchronously scores against the `google/flan-ul2` foundation model. The model's response is then converted into the required output format.

In [6]:
def scoring_wrapper(params=params):
    import requests
    import json
    import asyncio
    import aiohttp

    space_id = params.get("space_id")
    api_key = params["fm_credentials"]["apikey"]
    score_endpoint = params["fm_credentials"]["url"]
    url = f"{score_endpoint}/ml/v1/text/generation?version=2023-05-29"
    retries = 3
    delay = 2

    if iam_url := params["fm_credentials"].get("iamurl"):
        auth_headers = {"Content-Type": "application/x-www-form-urlencoded"}
        auth_url = f"{iam_url}/oidc/token"
        auth_body = {
            "apikey": api_key,
            "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        }
        token_str = "access_token"
    else:
        auth_headers = {"Content-Type": "application/json"}
        auth_url = f"{score_endpoint}/icp4d-api/v1/authorize"
        auth_body = json.dumps(
            {"username": params["fm_credentials"].get("username"), "api_key": api_key}
        )
        token_str = "token"

    def score(payload):
        auth_resp = requests.post(
            auth_url,
            verify=False,
            headers=auth_headers,
            data=auth_body,
        )
        token = auth_resp.json().get(token_str)
        headers = {
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        }
        values = payload["input_data"][0]["values"]
        inputs = [value[0] for value in values]

        async def score_async_wrap():
            async def parallel_request(session, input, retries=retries, delay=delay):
                attempt = 0
                body = {
                    "input": f"{input}",
                    "parameters": {
                        "decoding_method": "greedy",
                        "max_new_tokens": 900,
                        "repetition_penalty": 1,
                    },
                    "model_id": "google/flan-ul2",
                    "space_id": space_id,
                }
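                while attempt < retries:
                    try:
                        # ssl=False skips certificate verification (e.g. for
                        # self-signed CPD certificates); it replaces the
                        # deprecated verify_ssl argument.
                        async with session.post(
                            url, headers=headers, json=body, ssl=False
                        ) as response:
                            result = await response.json()
                            return result["results"][0]["generated_text"]
                    except Exception as e:
                        attempt += 1
                        if attempt < retries:
                            await asyncio.sleep(delay)
                        else:
                            # Return the error text so that the response
                            # remains JSON-serializable.
                            return str(e)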

            async with aiohttp.ClientSession() as session:
                tasks = [parallel_request(session, input) for input in inputs]
                responses = await asyncio.gather(*tasks)

            return {
                "predictions": [
                    {
                        "fields": ["generated_text"],
                        "values": [[response] for response in responses],
                    }
                ]
            }

        return asyncio.run(score_async_wrap())

    return score
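
The retry-and-gather pattern inside the wrapper can be tried in isolation, without network access. The sketch below is only an illustration: `call_with_retries`, `make_flaky_model`, and `score_all` are hypothetical names, and the fake model stands in for the HTTP call to the foundation model.

```python
import asyncio


async def call_with_retries(make_call, retries=3, delay=0.0):
    """Await make_call() up to `retries` times; return the error text if all fail."""
    last_error = None
    for attempt in range(retries):
        try:
            return await make_call()
        except Exception as exc:
            last_error = exc
            if attempt < retries - 1:
                await asyncio.sleep(delay)
    return f"error: {last_error}"


def make_flaky_model(fail_times):
    """Hypothetical stand-in for the model call: fails `fail_times` times, then answers."""
    calls = {"n": 0}

    async def model(prompt):
        calls["n"] += 1
        if calls["n"] <= fail_times:
            raise RuntimeError("temporary failure")
        return prompt.upper()

    return model


async def score_all(prompts, model):
    # One task per prompt; gather returns results in the order of the prompts.
    tasks = [call_with_retries(lambda p=p: model(p)) for p in prompts]
    return await asyncio.gather(*tasks)


results = asyncio.run(score_all(["hi", "what is 1+1"], make_flaky_model(fail_times=1)))
```

Returning the error text instead of raising keeps one failed prompt from discarding the results of the others, at the cost of mixing error strings into the `generated_text` values.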

### Storing the Python function

In [None]:
software_spec_uid = watsonx_ai_client.software_specifications.get_id_by_name(
    "runtime-24.1-py3.11"
)

func_name = "<FUNCTION_NAME>"
meta_data = {
    watsonx_ai_client.repository.FunctionMetaNames.NAME: func_name,
    watsonx_ai_client.repository.FunctionMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,
}

function_details = watsonx_ai_client.repository.store_function(
    meta_props=meta_data, function=scoring_wrapper
)

In [8]:
function_details

{'entity': {'software_spec': {'id': '45f12dfe-aa78-5b8d-9f38-0ee223c47309',
   'name': 'runtime-24.1-py3.11'},
  'type': 'python'},
 'metadata': {'created_at': '2024-11-08T10:56:10.538Z',
  'id': '1114792c-3f0c-4e43-a0f9-a80846ddeafd',
  'modified_at': '2024-11-08T10:56:10.538Z',
  'name': 'cloud_test',
  'owner': 'IBMid-693000DYYL',
  'space_id': '74557a01-62df-49f8-9be1-571f7d26ee28'},

In [9]:
function_uid = function_details["metadata"]["id"]
print("Function UID:" + function_uid)

Function UID:1114792c-3f0c-4e43-a0f9-a80846ddeafd


### Deploying the function

In [10]:
function_deployment_details = watsonx_ai_client.deployments.create(
    function_uid,
    {
        watsonx_ai_client.deployments.ConfigurationMetaNames.NAME: func_name + "_deployment",
        watsonx_ai_client.deployments.ConfigurationMetaNames.ONLINE: {},
    },
)



######################################################################################

Synchronous deployment creation for id: '1114792c-3f0c-4e43-a0f9-a80846ddeafd' started

######################################################################################


initializing
Note: online_url and serving_urls are deprecated and will be removed in a future release. Use inference instead.
..
ready


-----------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_id='abed00f4-a5d8-4826-8083-5bc5ded3a9b2'
-----------------------------------------------------------------------------------------------




In [11]:
func_deployment_uid = watsonx_ai_client.deployments.get_uid(function_deployment_details)
print("Function Deployment UID:" + func_deployment_uid)

Function Deployment UID:abed00f4-a5d8-4826-8083-5bc5ded3a9b2


## Step 3 - Testing python function deployment <a id="step-3"></a>

In [12]:
func_scoring_url = watsonx_ai_client.deployments.get_scoring_href(function_deployment_details)
print("Scoring URL:" + func_scoring_url)

Scoring URL:https://us-south.ml.cloud.ibm.com/ml/v4/deployments/abed00f4-a5d8-4826-8083-5bc5ded3a9b2/predictions


In [13]:
payload_scoring = {
    "input_data": [{"fields": ["input"], "values": [["hi"], ["what is 1+1"]]}]
}

scores_function_response = watsonx_ai_client.deployments.score(
    func_deployment_uid, payload_scoring
)
print(scores_function_response)

{'predictions': [{'fields': ['generated_text'], 'values': [['hi'], ['2']]}]}
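
To make the result easier to read, each prompt can be paired with its generated text. This is a minimal sketch using the payload and response shown above; `pair_prompts_with_outputs`, `example_payload`, and `example_response` are hypothetical names introduced for illustration.

```python
example_payload = {
    "input_data": [{"fields": ["input"], "values": [["hi"], ["what is 1+1"]]}]
}
example_response = {
    "predictions": [{"fields": ["generated_text"], "values": [["hi"], ["2"]]}]
}


def pair_prompts_with_outputs(payload, response):
    """Return (prompt, generated_text) tuples in request order."""
    prompts = [row[0] for row in payload["input_data"][0]["values"]]
    prediction = response["predictions"][0]
    column = prediction["fields"].index("generated_text")
    outputs = [row[column] for row in prediction["values"]]
    if len(prompts) != len(outputs):
        raise ValueError("number of outputs does not match number of prompts")
    return list(zip(prompts, outputs))


pairs = pair_prompts_with_outputs(example_payload, example_response)
for prompt, output in pairs:
    print(f"{prompt!r} -> {output!r}")
```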
