### Introduction
In this notebook, we test [Microsoft's Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct) served by the [custom inference server built on the Hugging Face transformers library](app/server.py) in SAP AI Core. We call it through the [SAP Generative AI Hub SDK](https://pypi.org/project/generative-ai-hub-sdk/), which significantly simplifies integrating self-hosted open-source LLMs in SAP AI Core with your own application and provides the same interface as for proprietary LLMs in SAP Generative AI Hub.<br/><br/>

You can also run other transformer-based open-source LLMs from [Hugging Face](https://huggingface.co), such as [microsoft/Phi-3-medium-128k-instruct](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct).

### Prerequisites
Before running this notebook, please ensure you have completed the [Prerequisites](../../README.md) and [01-deployment.ipynb](01-deployment.ipynb), so that a deployment of the transformer scenario is running in SAP AI Core. <br/><br/>

If the configuration and deployment were created through SAP AI Launchpad, please manually update the configuration_id and deployment_id in [env.json](env.json):
```json
{
    "configuration_id": "<YOUR_CONFIGURATION_ID_OF_TRANSFORMER_SCENARIO>",
    "deployment_id": "<YOUR_DEPLOYMENT_ID_BASED_ON_CONFIG_ABOVE>"
}
```
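
A quick sanity check on the loaded values can catch a forgotten manual update before any request is sent. The sketch below is only an illustration: the helper name `check_env` and the `<YOUR_` placeholder prefix are assumptions based on the template above, not part of any SDK.

```python
import json

# Assumption: unfilled values still look like the template placeholders above.
PLACEHOLDER_PREFIX = "<YOUR_"

def check_env(env: dict) -> list:
    """Return the required keys that are missing, empty, or still placeholders."""
    problems = []
    for key in ("configuration_id", "deployment_id"):
        value = str(env.get(key) or "")
        if not value or value.startswith(PLACEHOLDER_PREFIX):
            problems.append(key)
    return problems

# Hypothetical content of env.json with only the deployment id filled in.
sample = json.loads(
    '{"configuration_id": "<YOUR_CONFIGURATION_ID_OF_TRANSFORMER_SCENARIO>",'
    ' "deployment_id": "d1234abcd"}'
)
print(check_env(sample))  # ['configuration_id']
```

An empty list means both ids look usable.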

### The high-level flow:
- Load the configuration info
- Connect to SAP AI Core via its SDK
- Check the status and logs of the deployment
- Register the byom-transformer-server scenario as a foundation model scenario via SAP Generative AI Hub SDK
- Run inference on the model through **SAP Generative AI Hub SDK**
    - Option 1: Proxy with OpenAI-like interface
    - Option 2: Proxy with Langchain-like interface
    - Option 3: Proxy with Langchain-like interface, together with Langchain components


#### 1.Load config info 
- resource_group loaded from [config.json](../config.json)
- deployment_id (created in 01-deployment.ipynb) loaded from [env.json](env.json)

In [None]:
import requests, json
from ai_core_sdk.ai_core_v2_client import AICoreV2Client

In [None]:
# Load the resource group from config.json and the deployment id from env.json.
# env.json is written by 01-deployment.ipynb, or updated manually if you used SAP AI Launchpad.
with open("../config.json") as f:
    config = json.load(f)

with open("./env.json") as f:
    env = json.load(f)

deployment_id = env["deployment_id"]
resource_group = config.get("resource_group", "default")
print("deployment id: ", deployment_id, " resource group: ", resource_group)

#### 2.Initiate connection to SAP AI Core 

In [None]:
# Initiate an AI Core SDK client with the information of service key
ai_core_sk = config["ai_core_service_key"]
base_url = ai_core_sk.get("serviceurls").get("AI_API_URL") + "/v2/lm"
ai_core_client = AICoreV2Client(base_url=ai_core_sk.get("serviceurls").get("AI_API_URL")+"/v2",
                        auth_url=ai_core_sk.get("url")+"/oauth/token",
                        client_id=ai_core_sk.get("clientid"),
                        client_secret=ai_core_sk.get("clientsecret"),
                        resource_group=resource_group)
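
The three URLs used in this cell are all derived from two fields of the service key. As a minimal sketch, the hypothetical helper `build_endpoints` below gathers that derivation in one place and uses plain indexing, so a missing field fails with a clear `KeyError` instead of a `None` concatenation error. The example host names are placeholders, not real endpoints.

```python
def build_endpoints(service_key: dict) -> dict:
    """Derive the URLs used in this notebook from an AI Core service key."""
    api_url = service_key["serviceurls"]["AI_API_URL"].rstrip("/")
    auth_base = service_key["url"].rstrip("/")
    return {
        "client_base_url": f"{api_url}/v2",    # passed to AICoreV2Client
        "rest_base_url": f"{api_url}/v2/lm",   # used for direct REST calls
        "auth_url": f"{auth_base}/oauth/token",
    }

# Placeholder service key with made-up hosts, for illustration only.
example_key = {
    "serviceurls": {"AI_API_URL": "https://api.example.invalid"},
    "url": "https://auth.example.invalid",
}
for name, url in build_endpoints(example_key).items():
    print(name, "->", url)
```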


In [None]:
token = ai_core_client.rest_client.get_token()
headers = {
        "Authorization": token,
        'ai-resource-group': resource_group,
        "Content-Type": "application/json"}


#### 3.Check the deployment status 

In [None]:
# Check deployment status before inference request
deployment_url = f"{base_url}/deployments/{deployment_id}"
response = requests.get(url=deployment_url, headers=headers)
resp = response.json()    
status = resp['status']

deployment_log_url = f"{base_url}/deployments/{deployment_id}/logs"
if status == "RUNNING":
        print(f"Deployment-{deployment_id} is running. Ready for inference request")
else:
        print(f"Deployment-{deployment_id} status: {status}. Not yet ready for inference request")
        #retrieve deployment logs
        #{{apiurl}}/v2/lm/deployments/{{deploymentid}}/logs.

        response = requests.get(deployment_log_url, headers=headers)
        print('Deployment Logs:\n', response.text)
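
The check above is a one-off. If the deployment is still starting, a small polling loop can wait for it. The sketch below is one possible approach rather than part of the SDK: `wait_until_running` is a hypothetical helper, the status source is passed in as a callable so the loop can be tried without a network, and treating `DEAD` and `STOPPED` as terminal states is an assumption.

```python
import time

def wait_until_running(fetch_status, timeout_s=600.0, interval_s=10.0, sleep=time.sleep):
    """Poll fetch_status() until it returns "RUNNING".

    Returns True once running, False on timeout or on a terminal state.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == "RUNNING":
            return True
        if status in ("DEAD", "STOPPED"):  # assumed terminal states
            return False
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)

# Simulated status sequence instead of real REST calls.
statuses = iter(["PENDING", "PENDING", "RUNNING"])
print(wait_until_running(lambda: next(statuses), sleep=lambda seconds: None))  # True
```

In this notebook, `fetch_status` could be a lambda that issues the same `requests.get` call as above and returns the `status` field of the JSON response.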


#### 4.Register the scenario as a foundation model scenario for SAP Generative AI Hub SDK

In [None]:
from gen_ai_hub.proxy.gen_ai_hub_proxy import GenAIHubProxyClient

GenAIHubProxyClient.add_foundation_model_scenario(
    scenario_id="byom-transformer-server",
    config_names="transformer*",
    prediction_url_suffix="/v1/chat/completions",
)

proxy_client = GenAIHubProxyClient(ai_core_client = ai_core_client)
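
The `config_names="transformer*"` argument reads like a wildcard pattern over configuration names. Assuming shell-style wildcard semantics (an assumption this sketch does not verify against the SDK), the standard library's `fnmatch` shows which names such a pattern would select. The configuration names below are made up.

```python
from fnmatch import fnmatchcase

def matching_configs(config_names, pattern):
    """Return the names selected by a shell-style wildcard pattern."""
    return [name for name in config_names if fnmatchcase(name, pattern)]

# Hypothetical configuration names in a resource group.
names = ["transformer-phi3-vision", "transformer", "ollama-mistral"]
print(matching_configs(names, "transformer*"))
# ['transformer-phi3-vision', 'transformer']
```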

Set the target model to be used with SAP Generative AI Hub. It must be identical to the modelName set in the configuration in SAP AI Core.

In [None]:
model = "microsoft/Phi-3-vision-128k-instruct"

#### 5. Basic Samples of the Three Options for Using SAP Generative AI Hub SDK with BYOM Open-Source LLMs

##### 5.1 Option 1-Proxy with OpenAI-like interface
Now let's test the server's OpenAI-compatible Chat Completion API via the proxy with an OpenAI-like interface in SAP Generative AI Hub SDK. This is the same API interface as Chat Completion for GPT-3.5/4 in SAP Generative AI Hub. 

In [None]:
# Option 1: Proxy with OpenAI-like interface
from gen_ai_hub.proxy.native.openai import OpenAI

openai = OpenAI(proxy_client=proxy_client)
messages = [{"role": "user", "content": "Tell me a joke"}]
# kwargs = dict(deployment_id='xxxxxxx', model=model,messages = messages)
result = openai.chat.completions.create(
    # **kwargs
    deployment_id=deployment_id,
    model=model,
    messages=messages
)

print("Option 1: Proxy with OpenAI-like interface\n", result.choices[0].message.content)
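
For orientation, the call above corresponds to an OpenAI-style chat completion request and response. The sketch below builds such a request body and reads the reply out of a response dictionary with plain Python. Both helper names are hypothetical, and the sample response is hand-written rather than real model output.

```python
import json

def build_chat_request(model: str, prompt: str, **options) -> dict:
    """Assemble an OpenAI-style chat completion request body."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    body.update(options)  # for example temperature or max_tokens
    return body

def first_reply(response: dict) -> str:
    """Dictionary equivalent of result.choices[0].message.content."""
    return response["choices"][0]["message"]["content"]

request_body = build_chat_request(
    "microsoft/Phi-3-vision-128k-instruct", "Tell me a joke", max_tokens=100
)
print(json.dumps(request_body, indent=2))

# Hand-written sample response, not real model output.
sample_response = {"choices": [{"message": {"role": "assistant", "content": "(joke text)"}}]}
print(first_reply(sample_response))
```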

##### 5.2 Option 2-Proxy with Langchain-like interface
Now let's test the same OpenAI-compatible Chat Completion API via the proxy with a Langchain-like interface in SAP Generative AI Hub SDK. This is the same interface used for GPT-3.5/4 chat models in SAP Generative AI Hub. 

In [None]:
# Option 2: Proxy with Langchain-like interface
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from langchain.schema.messages import HumanMessage

messages = [HumanMessage(content="Tell me a joke")]
llm = ChatOpenAI(
    proxy_client=proxy_client,
    deployment_id=deployment_id,
    model_name=model
)
completion = llm.invoke(messages)
print("Option 2: Proxy with Langchain-like interface\n", completion.content)

##### 5.3 Option 3-Proxy with Langchain-like interface, together with Langchain components
Now let's combine the proxy with a Langchain-like interface and Langchain components such as prompt templates and chains. The interface is again the same one used for GPT-3.5/4 chat models in SAP Generative AI Hub. 

In [None]:
# Option 3: Proxy with Langchain-like interface, together with Langchain components
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(
    proxy_client=proxy_client,
    deployment_id=deployment_id,
    model_name=model,
    temperature=0.5,
    max_tokens=400,
    # model_kwargs={
    #     "frequency_penalty": -2, "presence_penalty": -1
    # }
)

template = "Tell me a joke about {topic}"
prompt = PromptTemplate(template=template, input_variables=["topic"])
llm_chain = prompt | llm

completion = llm_chain.invoke({"topic": "Generative AI"})

print("Option 3: Proxy with Langchain-like interface, together with Langchain components\n",completion.content)
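
Conceptually, `prompt | llm` fills the template and then passes the result to the model. The sketch below imitates that flow with plain functions to make the data flow visible. It is not how Langchain implements chains, and `fake_llm` is a stand-in that only echoes its input.

```python
template = "Tell me a joke about {topic}"

def render(template_text: str, **variables) -> str:
    """Fill the placeholders, similar in spirit to a prompt template."""
    return template_text.format(**variables)

def fake_llm(prompt: str) -> str:
    """Stand-in for the model call; it only echoes the prompt."""
    return f"[model reply to: {prompt}]"

def pipe(*steps):
    """Compose steps left to right, each receiving the previous output."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run

chain = pipe(lambda inputs: render(template, **inputs), fake_llm)
print(chain({"topic": "Generative AI"}))
# [model reply to: Tell me a joke about Generative AI]
```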

##### 5.4 Vision Sample#4: Public Facilities Issue Spotter for [Citizen Reporting use case](https://github.com/SAP-samples/btp-generative-ai-hub-use-cases/tree/main/01-social-media-citizen-reporting) (e.g. a dirty street)

In the next sample, we'll ask the [Microsoft's Phi-3-vision-128k-instruct](https://huggingface.co/microsoft/Phi-3-vision-128k-instruct) model to act as a Public Facilities Issue Spotter assistant for a city council, responsible for analyzing images reported by citizens through a mobile app to identify issues related to public facilities. <br/>
Here are the tasks: <br/>

- 1. Analyze images reported by citizens through a mobile app to identify issues related to public facilities. If no issue is identified, go to step 5; otherwise continue with the next steps.
- 2. Extract photographic date and location information from images for accurate documentation.
- 3. Categorize identified issues based on predefined categories (e.g., infrastructure damage, cleanliness, safety hazards).
- 4. Assess the severity and priority of identified issues to determine appropriate action plans.
- 5. Output JSON in triple quotes, following the schema below:

```json
{ "issue_identified": "{{true or false}}",
#below section only output when there is an issue identified
"title": "{{A title about the issue less than 100 characters}}",
"description": "{{A short description about the issue less than 300 characters}}",
"photo_date": "{{Extracted photographic date from its metadata in yyyy-mm-dd:hh:mm:ss format}}",
"longitude": "{{Extracted the longitude of photographic location from its metadata. Output -1 if fails to extract location info from image}}",
"latitude": "{{Extracted the latitude of photographic location from its metadata. Output -1 if fails to extract location info from image}}",
"category": "{{Identified category: 01-Infrastructure Damage, 02-Cleanliness, 03-Safety Hazards, 04-Duplicated}}",
"priority": "{{Identified priority: 01-Very High, 02-High, 03-Medium, 04-Low}}",
"suggested_action": "{{01-Immediate Attendance, 02-Schedule Inspection, 03-Schedule Service, 04-Refer to similar issue}}"
}
```

<br/>
We'll run inference on the vision model through SAP Generative AI Hub.<br/>
First, let's have a look at the image:

In [None]:
from IPython.display import display, Image
image_url = "https://raw.githubusercontent.com/SAP-samples/btp-generative-ai-hub-use-cases/main/10-byom-oss-llm-ai-core/resources/11-dirty-street.jpg"
# Display the image
display(Image(url=image_url))

Prepare the request as an OpenAI-like chat completion.

In [None]:
user_msg = 'You are a helpful Assistant of Public Facilities Issue Spotter for city council. \
Responsible for analyzing images reported by citizens through a mobile app to identify issues related to public facilities. \
Here are your tasks: \
1.Analyze images reported by citizens through a mobile app to identify issues related to public facilities. \
If no issue is identified, go to step 5, otherwise continue with next steps. \
2.Extract photographic date and location information from images for accurate documentation. \
3.Categorize identified issues based on predefined categories (e.g., infrastructure damage, cleanliness, safety hazards). \
4.Assess the severity and priority of identified issues to determine appropriate action plans. \
5.Output with JSON schema in triple quote as below: \
""" \
{ "issue_identified": "{{true or false}}", \
#below section only output when there is an issue identified \
"title": "{{A short title about the issue}}", \
"description": "{{A detailed description about the issue}}", \
"photo_date": "{{Extracted photographic date from its metadata in yyyy-mm-dd:hh:mm:ss format. Leave it blank if no metadata found in it.}}", \
"longitude": "{{Extracted longitude of photographic location from its metadata. Do not make up any number. Output -1 if fails to extract location info from image}}",\
"latitude": "{{Extracted latitude of photographic location from its metadata. Do not make up any number. Output -1 if fails to extract location info from image}}",\
"category": "{{Identified category: 01-Infrastructure Damage, 02-Cleanliness, 03-Safety Hazards, 04-Duplicated}}",\
"priority": "{{Suggested Priority: 01-Very High, 02-High, 03-Medium, 04-Low}}",\
"suggested_action": "{{01-Immediate Attendance, 02-Schedule Inspection, 03-Schedule Service, 04-Refer to similar issue }}"\
} \
"""\
'

messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_msg},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_url
                    },
                }
            ]
        }
    ]

#JSON Mode
response_format={"type": "json_object"} 
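
The request above references the image by a public URL. For an image that only exists locally, OpenAI-style APIs commonly accept a base64 data URL in the same `image_url` field. Whether the custom inference server accepts data URLs is an assumption that should be checked against [app/server.py](app/server.py). The helper names below are hypothetical and the bytes are a placeholder, not a real image.

```python
import base64

def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

def image_part(url: str) -> dict:
    """Build the image entry of an OpenAI-style multimodal message."""
    return {"type": "image_url", "image_url": {"url": url}}

# Placeholder bytes (a JPEG header only), standing in for open(path, "rb").read().
fake_jpeg = b"\xff\xd8\xff\xe0"
part = image_part(to_data_url(fake_jpeg))
print(part["image_url"]["url"])
```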

Run inference on the model through the OpenAI-like interface of the SDK's native OpenAI proxy.

In [None]:
# Option 1: Proxy with OpenAI-like interface
from gen_ai_hub.proxy.native.openai import OpenAI
openai = OpenAI(proxy_client=proxy_client)
result = openai.chat.completions.create(
    deployment_id=deployment_id,
    model=model,
    response_format=response_format,
    messages=messages
)

print("Option 1: Proxy with OpenAI-like interface\n", result.choices[0].message.content)
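
The reply is expected to be JSON, possibly wrapped in triple quotes as the prompt requests, with `-1` marking coordinates that could not be extracted. A minimal sketch of turning such a reply into a dictionary follows. The helper names and the sample reply are hypothetical, and real model output may need more defensive handling.

```python
import json

def extract_json(reply: str) -> dict:
    """Parse the outermost {...} object found in a model reply."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model reply")
    return json.loads(reply[start:end + 1])

def coordinates(report: dict):
    """Return (latitude, longitude), or None when the -1 sentinel is used."""
    try:
        lon = float(report.get("longitude", -1))
        lat = float(report.get("latitude", -1))
    except (TypeError, ValueError):
        return None
    if lon == -1 or lat == -1:
        return None
    return lat, lon

# Hand-written sample reply, not real model output.
reply = '"""\n{"issue_identified": "true", "title": "Litter on street", "longitude": "-1", "latitude": "-1"}\n"""'
report = extract_json(reply)
print(report["issue_identified"], coordinates(report))  # true None
```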

##### 5.5 Sample#5: Citizen Reporting App with Option 2-Langchain-compatible Interface

In [None]:
# Option 2: Proxy with Langchain-like interface
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from langchain.schema.messages import HumanMessage
human_msg = [
                {"type": "text", "text": user_msg},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": image_url
                    },
                },
            ]
messages = [HumanMessage(content=human_msg)]
llm = ChatOpenAI(
    proxy_client=proxy_client,
    deployment_id=deployment_id,
    model_name=model
).bind(
 response_format=response_format
)

completion = llm.invoke(messages)
print("Option 2: Proxy with Langchain-like interface\n", completion.content)

##### 5.6 Sample#6: Citizen Reporting Use Case with Option 3-Langchain-compatible Interface

Prepare the schema of the entities to extract from a social media post in the citizen reporting app.<br/>
Output example:<br/>
```json
{
    "address": "Oakwood Road",
    "category": "PUBLIC CLEANLINESS",
    "description": "The public area on Oakwood Road in Sagenai is in a disgraceful state with piles of rubbish and litter scattered everywhere. The author is frustrated with the local authorities for not maintaining cleanliness despite the taxes they pay. They hope for immediate action.",
    "location": "51.57470453612761,0.003792117010085437",
    "priority": "3-Medium",
    "sentiment": "NEGATIVE",
    "summary": "Dirty public area on Oakwood Road"
}
```

In [None]:
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser
        
category = '''Identify if the social media reports a situation related to one of the following categories: 
            1. PUBLIC CLEANLINESS: Dirty public areas, overflowing dustbins and littering. Bulky waste in common areas.  
            2. ROADS & FOOTPATHS: Including covered linkways, signboards & streetlights. E.g. Pot holes, huge cracks, etc.
            3. FACILITY & PARK MAINTENANCE: Fallen trees, overgrown grass, and maintenance of park lighting and facilities.
            4. PESTS: Sighting of bees and hornets, potential mosquito breeding sites, and more.
            5. DRAINS & SEWERS: Choked, overflowing, or damaged drains, bad sewage smells, flooding.   
            Output the category name. If none of the categories fits, or in doubt, return OTHER - PLEASE CHECK.  
            '''

            
priority = '''Identify the priority to be given to the reported issues:
            4-Low : the issue does not pose any problem with public safety and does not necessarily need to be handled urgently. 
            3-Medium : the issue does not cause any immediate danger, but it has significant and negative impact on the daily life of people in the neighborhood.
            2-High : the issue needs to be resolved quickly because it can potentially cause dangerous situations or disruptions. 
            1-Very High : the issue needs to be handled as soon as possible, as it is a matter of public safety. 
            Return the priority level. If in doubt, return 3-Medium '''
            
        
sentiment ='''Extract the sentiment of the post: 
            1. NEUTRAL: if the issue is reported politely
            2. NEGATIVE: if the post shows irritation, impatience, annoyance
            3. VERY NEGATIVE: the post expresses rage, hatred
            '''

address = ResponseSchema(name="address",
            description="Extract the address where the issue has been noticed. Return the street only and omit the town or country. For example: Oakwood Road.")
category = ResponseSchema(name="category",
            description=category)
description = ResponseSchema(name="description",
            description="Summarize the issue that is being reported in not more than 300 characters and a neutral tone.")
location = ResponseSchema(name="location",
            description="Extract the coordinates where the issue has been noticed. The format should be: (51.57470453612761,0.003792117010085437).")
priority = ResponseSchema(name="priority",
            description=priority)
sentiment = ResponseSchema(name="sentiment",
            description=sentiment)
summary = ResponseSchema(name="summary",
            description="Summarize the issue that is being reported in 40 characters and a neutral tone.")
        
response_schemas = [
            address,
            category,
            description,
            location,
            priority,
            sentiment,
            summary
        ]

Helper function to convert the social media post into a string.

In [None]:
def post_to_str(input_message):
    #message = f"redditPostId: {input_message["id"]}, author: {input_message["author"]}, title: {input_message["title"]}, message: {input_message["longText"]}, postingDate: {input_message["postingDate"]}"
    message = "redditPostId: " + input_message["id"]+\
            ", author: "+input_message["author"]+", title: "+input_message["title"]+\
            ", message: "+input_message["longText"]+", postingDate: "+input_message["postingDate"]
    return message

Prepare the final prompt to extract the entities from a social media post about a public facility issue reported through the citizen reporting app.

In [None]:
template_string = '''Extract information from the following social media post: 
            {post}
            {format_instructions}
            '''
        
output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
print(format_instructions)
        
prompt_template = ChatPromptTemplate.from_template(template=template_string)
input_message = {
        "id": "198qqqm",
        "author": "jacobtan89",
        "title": "Dirty public area",
        "longText": "The public area on Oakwood Road in Sagenai is in a disgraceful state with piles of rubbish and litter scattered everywhere. The author is frustrated with the local authorities for not maintaining cleanliness despite the taxes they pay. They hope for immediate action. #CleanUpYourAct #OakwoodRoadNightmare #DisgustingNeighborhood Coordinates:(51.57470453612761,0.003792117010085437)",
        "postingDate": "2024-01-17T07:13:48.000Z"
    }

message = post_to_str(input_message)
complete_prompt = prompt_template.format_messages(
            post = message,
            format_instructions = format_instructions
)

print(complete_prompt)

In [None]:
llm = ChatOpenAI(
    proxy_client=proxy_client,
    deployment_id=deployment_id,
    model_name=model,
    temperature=0.5,
    max_tokens=400,
    # model_kwargs={
    #     "frequency_penalty": -2, "presence_penalty": -1
    # }
)

#llm_chain = complete_prompt | llm
completion = llm.invoke(complete_prompt)

print("Option 3: Proxy with Langchain-like interface, together with Langchain components\n",completion.content)
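
The format instructions ask the model to answer with a fenced JSON snippet, so the raw completion still needs to be parsed. In the notebook, the `output_parser` created above is the natural tool for that. The standard-library sketch below only illustrates the idea and also splits the `location` string into numbers. The helper names and the sample completion are hypothetical.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence

def parse_fenced_json(text: str) -> dict:
    """Parse the first fenced JSON snippet in text, or the whole text if unfenced."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    payload = match.group(1) if match else text
    return json.loads(payload)

def parse_location(value: str):
    """Turn "(lat,lon)" or "lat,lon" into a pair of floats."""
    lat_text, lon_text = value.strip().strip("()").split(",")
    return float(lat_text), float(lon_text)

# Hand-written sample completion, not real model output.
sample = (
    FENCE + "json\n"
    '{"address": "Oakwood Road", "location": "(51.57470453612761,0.003792117010085437)"}\n'
    + FENCE
)
fields = parse_fenced_json(sample)
print(fields["address"], parse_location(fields["location"]))
```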