## Retrieve Access Tokens

Use the deployment URL to make LLM calls.

The first step is to retrieve the access token. Read the service key JSON file and extract the authorization URL, client ID, client secret, and API URL.

A POST request to the `/oauth/token` endpoint of the authorization URL then returns the access token.

In [None]:
import json
import requests

with open("irpa-r1208-hands-on-exercises-sk.json", "r") as key_file:
    svcKey = json.load(key_file)
authUrl = svcKey["url"]
clientid = svcKey["clientid"]
clientsecret = svcKey["clientsecret"]
apiUrl = svcKey["serviceurls"]["AI_API_URL"]

# Request a token using the client-credentials grant
params = {"grant_type": "client_credentials"}
resp = requests.post(f"{authUrl}/oauth/token",
                     auth=(clientid, clientsecret),
                     params=params)
resp.raise_for_status()  # fail early if authentication was rejected

BtpLlmApiUrl = apiUrl
BtpLlmAccessToken = resp.json()["access_token"]
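
The key-extraction step above can be wrapped in a small helper that reports a missing field clearly instead of failing later in the token request. This is a minimal sketch, not part of the original notebook: the function name `parse_service_key` and the sample values are hypothetical, while the field names are the ones read from the key file above.

```python
def parse_service_key(svc_key: dict) -> dict:
    """Extract the fields needed for the token request from a service key dict."""
    required = ("url", "clientid", "clientsecret", "serviceurls")
    missing = [name for name in required if name not in svc_key]
    if missing:
        raise KeyError(f"service key is missing fields: {', '.join(missing)}")
    if "AI_API_URL" not in svc_key["serviceurls"]:
        raise KeyError("service key is missing serviceurls.AI_API_URL")
    return {
        "token_url": svc_key["url"].rstrip("/") + "/oauth/token",
        "client_id": svc_key["clientid"],
        "client_secret": svc_key["clientsecret"],
        "api_url": svc_key["serviceurls"]["AI_API_URL"],
    }


# Hypothetical sample values, for illustration only
sample_key = {
    "url": "https://example.authentication.test/",
    "clientid": "demo-client",
    "clientsecret": "demo-secret",
    "serviceurls": {"AI_API_URL": "https://api.example.test"},
}
parsed = parse_service_key(sample_key)
print(parsed["token_url"])
```

The returned `token_url`, `client_id`, and `client_secret` map directly onto the arguments of the `requests.post` call used above.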

The deployment URLs are initialized below.

In [None]:
# gemini-1.0-pro (OK)
deploymentUrl_gemini_1_0 = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/d74bdeb62e0aace3"

# gpt-35-turbo (OK)
deploymentUrl_gpt35_turbo = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/db892a872898f277"

# text-embedding-ada-002 (OK)
deploymentUrl_ada_002 = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/d50d5c53990484d7"

# gpt-4o (OK)
deploymentUrl_gpt_4o = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/d02fb5127194087a"

# gemini-pro-vision, whisper
deploymentUrl_cpit = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/dc3567fb9fdc9729"

# meta--llama3-70b-instruct (OK)
deploymentUrl_llama = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/ddcaf9eb805927da"
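
All the URLs above share the same prefix and differ only in the deployment ID, so they can be built from the API URL rather than repeated. The helper below is a sketch under one assumption: that the `AI_API_URL` value from the service key is the host part of these URLs, without the `/v2` path. The function name `build_deployment_url` is hypothetical.

```python
def build_deployment_url(api_url: str, deployment_id: str) -> str:
    """Join the AI API base URL and a deployment ID into an inference URL."""
    return f"{api_url.rstrip('/')}/v2/inference/deployments/{deployment_id}"


# Assumed base URL (taken from the prefix of the URLs listed above)
base_url = "https://api.ai.prod.eu-central-1.aws.ml.hana.ondemand.com"
gemini_url = build_deployment_url(base_url, "d74bdeb62e0aace3")
print(gemini_url)
```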

## GPT-3.5-Turbo | GPT-4o
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#early-access-playground

The API endpoint is `/chat/completions`, passed together with an `api-version` query parameter.

Build the headers with the resource group set to `default`, the content type set to JSON, and the access token retrieved above.

The payload carries the prompt as the message content, along with parameters such as `temperature` and `max_tokens`.

In [None]:
import requests
import json

# Define the necessary variables
deployment_url = deploymentUrl_gpt35_turbo + "/chat/completions?api-version=2024-02-01"
resource_group_id = 'default'

# Define the headers
headers = {
    'AI-Resource-Group': resource_group_id,
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {BtpLlmAccessToken}',
}

# Define the data payload
data = {
    "messages": [
        {
            "role": "user",
            "content": "Provide me a brief on SAP S/4HANA public Cloud"
        }
    ],
    "max_tokens": 100,
    "temperature": 0.0,
    "frequency_penalty": 0,
    "presence_penalty": 0
}

# Send the POST request
response = requests.post(deployment_url, headers=headers, data=json.dumps(data))

# Print the response
print(response.json())


In [None]:
response.json()['choices'][0]['message']

In [None]:
response.json()['choices'][0]['message']['content']
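
The request pieces used in this section (endpoint, headers, payload) and the response lookup can be grouped into two small functions. This is a sketch only: the names `build_chat_request` and `extract_reply` are hypothetical, and the sample response mirrors just the fields the cells above read.

```python
def build_chat_request(deployment_url, access_token, prompt,
                       api_version="2024-02-01", resource_group="default",
                       max_tokens=100, temperature=0.0):
    """Return (url, headers, payload) for a chat completions call."""
    url = f"{deployment_url}/chat/completions?api-version={api_version}"
    headers = {
        "AI-Resource-Group": resource_group,
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",
    }
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return url, headers, payload


def extract_reply(response_json: dict) -> str:
    """Return the text of the first choice, or raise if there is none."""
    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError(f"no choices in response: {response_json}")
    return choices[0]["message"]["content"]


# Placeholder deployment URL and token, for illustration only
url, headers, payload = build_chat_request(
    "https://api.example.test/v2/inference/deployments/demo", "demo-token", "What is SAP?"
)
sample_response = {"choices": [{"message": {"role": "assistant", "content": "Hello"}}]}
print(url)
print(extract_reply(sample_response))
```

With real values, the three returned objects can be passed to `requests.post(url, headers=headers, json=payload)`.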

## text-embedding-ada-002

In [None]:
import requests
import json

# Define the URL and headers
url = f"{deploymentUrl_ada_002}/embeddings?api-version=2023-05-15"
headers = {
    'AI-Resource-Group': '<Resource Group Id>',
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {BtpLlmAccessToken}'
}

# Define the payload
data = {
    "input": "What is SAP?"
}

# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))

# Print the response
print(response.json())
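
The embedding vector sits under `data[0]["embedding"]` in the response, and two such vectors are typically compared with cosine similarity. The sketch below uses short made-up vectors in place of real embeddings; the function names are hypothetical.

```python
import math


def embedding_from_response(response_json: dict) -> list:
    """Return the first embedding vector from an embeddings response."""
    return response_json["data"][0]["embedding"]


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up three-dimensional vectors standing in for real embeddings
sample_response = {"data": [{"embedding": [1.0, 0.0, 0.0]}]}
query_vector = embedding_from_response(sample_response)
same_direction = cosine_similarity(query_vector, [2.0, 0.0, 0.0])
orthogonal = cosine_similarity(query_vector, [0.0, 1.0, 0.0])
print(same_direction, orthogonal)
```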


## Gemini 1.0 pro

In [None]:
import requests
import json

# Define the URL and the token
deployment_url = deploymentUrl_gemini_1_0 + "/models/gemini-1.0-pro:generateContent"
resource_group_id = "default"


# Set the headers
headers = {
    'AI-Resource-Group': resource_group_id,
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {BtpLlmAccessToken}'
}

# Define the data payload
data = {
    "contents": [
        {
            "role": "user",
            "parts": {"text": "Hello!"}
        },
        {
            "role": "model",
            "parts": {"text": "Argh! What brings ye to my ship?"}
        },
        {
            "role": "user",
            "parts": {"text": "Tell me about SAP"}
        }
    ],
    "safety_settings": {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_LOW_AND_ABOVE"
    },
    "generation_config": {
        "temperature": 0.9,
        "topP": 1,
        "candidateCount": 1,
        "maxOutputTokens": 2048
    }
}

# Make the POST request
response = requests.post(deployment_url, headers=headers, data=json.dumps(data))

# Print the response
print(response.status_code)
print(response.json())
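
The generated text in a `generateContent` response is nested under `candidates`, then `content`, then `parts`. A minimal sketch for pulling it out follows; the function name `gemini_text` is hypothetical and the sample response contains only the fields the function reads.

```python
def gemini_text(response_json: dict) -> str:
    """Concatenate the text parts of the first candidate."""
    candidates = response_json.get("candidates") or []
    if not candidates:
        raise ValueError("response contains no candidates")
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


# Trimmed-down sample response, for illustration only
sample_response = {
    "candidates": [
        {"content": {"role": "model", "parts": [{"text": "SAP is "}, {"text": "a software company."}]}}
    ]
}
reply = gemini_text(sample_response)
print(reply)
```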


## meta--llama3-70b-instruct

In [None]:
import requests

# Resource group used for the request
resource_group_id = 'default'

url = f'{deploymentUrl_llama}/chat/completions'
headers = {
    'AI-Resource-Group': resource_group_id,
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {BtpLlmAccessToken}'
}
data = {
    "model": "meta--llama3-70b-instruct",
    "messages": [
        {
            "role": "user",
            "content": "What is SAP?"
        }
    ],
    "max_tokens": 100
}

response = requests.post(url, headers=headers, json=data)
print(response.json())


## gpt-4o Image Input

In [None]:
import requests
import json

# Define the necessary variables
deployment_url = deploymentUrl_gpt_4o + "/chat/completions?api-version=2024-02-01"
resource_group_id = 'default'

# Define the headers
headers = {
    'AI-Resource-Group': resource_group_id,
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {BtpLlmAccessToken}',
}

# Define the data payload
data = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                    }
                }
            ]
        }
    ],
    "max_tokens": 1000
}

# Send the POST request
response = requests.post(deployment_url, headers=headers, data=json.dumps(data))

# Print the response
print(response.json())


In [None]:
import requests
import base64
import json

# Define the necessary variables
deployment_url = deploymentUrl_gpt_4o + "/chat/completions?api-version=2024-02-01"
resource_group_id = 'default'

# Define the headers
headers = {
    'AI-Resource-Group': resource_group_id,
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {BtpLlmAccessToken}',
}

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "graph.png"

# Getting the base64 string
base64_image = encode_image(image_path)

# Define the data payload
data = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{base64_image}",
                        "detail": "high"
                    }
                }
            ]
        }
    ],
    "max_tokens": 1000
}

# Send the POST request
response = requests.post(deployment_url, headers=headers, data=json.dumps(data))

# Print the response
print(response.json())
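
The MIME type in the data URL should match the image file being sent. One way to avoid hard-coding it is to derive it from the file name with the standard library's `mimetypes` module. This is a sketch; the function name `to_data_url` is hypothetical, and the three bytes stand in for real image data.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Build a base64 data URL whose MIME type is guessed from the file name."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# b"abc" is placeholder content, not a real image
png_url = to_data_url(b"abc", "graph.png")
print(png_url)  # data:image/png;base64,YWJj
```

For a real file, read the bytes with `open(image_path, "rb")` as in `encode_image` above and pass them together with the path.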


## gemini-pro-vision

In [None]:
import requests
import base64
import json

# Define the necessary variables
deployment_url = deploymentUrl_cpit + "/v1/chat/completions"
resource_group_id = 'default'

# Define the headers
headers = {
    'AI-Resource-Group': resource_group_id,
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {BtpLlmAccessToken}',
}

# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "graph.png"

# Getting the base64 string
base64_image = encode_image(image_path)


# Define the data payload
data = {
    "model": "gemini-pro-vision",
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{base64_image}",
                        "detail": "high"
                    }
                }
            ]
        }
    ],
    "max_tokens": 500
}

# Send the POST request
response = requests.post(deployment_url, headers=headers, data=json.dumps(data))

# Print the response
print(response.json())


## Whisper

In [None]:
import requests

# Define the necessary variables
deployment_url = deploymentUrl_cpit + "/v1/audio/transcriptions?api-version=2024-02-15-preview"
resource_group_id = 'default'

# Define the headers
headers = {
    'AI-Resource-Group': resource_group_id,
    'Authorization': f'Bearer {BtpLlmAccessToken}',
}

# Read the audio file
audio_file_path = "harvard.mp3"
with open(audio_file_path, 'rb') as f:
    audio_data = f.read()

# Define the files payload
files = {
    'file': ('harvard.mp3', audio_data, 'audio/mpeg')
}

# Define the data payload (if needed; adjust as necessary for Whisper API)
data = {
    'model': 'whisper',  # or specify the model version if different
    'language': 'en'  # optional, specify language if known
}

# Send the POST request
response = requests.post(deployment_url, headers=headers, data=data, files=files)

# Print the response
print(response.json())


## Using generative-ai-hub-sdk & RAG with HANA Vector Engine

"HANA_VECTOR_USER" - Administrator User (DBADMIN)

"HANA_VECTOR_PASS" - Administrator Password provided during setup

"HANA_HOST_VECTOR" - SQL Endpoint (remove :443 while copying from endpoint url)

Add the HANA DB and AI Core key information to environment variables. This can be done via JSON file entries or a .env file; do not hard-code credentials in your application code.

In [None]:
import os

# Define keys here
env_vars = {
    "HANA_VECTOR_USER": "DBADMIN",
    "HANA_VECTOR_PASS": "<>",
    "HANA_HOST_VECTOR": "<>",
    "AICORE_AUTH_URL": authUrl,
    "AICORE_CLIENT_ID": clientid,
    "AICORE_CLIENT_SECRET": clientsecret,
    "AICORE_RESOURCE_GROUP": "default",
    "AICORE_BASE_URL": apiUrl
}

os.environ.update(env_vars)
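
As an alternative to defining the values in a cell, they can be read from a .env file. The sketch below is a deliberately simple standard-library parser, not a full .env implementation (a dedicated library such as python-dotenv handles more cases). The function name `load_env_file` is hypothetical, and the demo writes a temporary file so the example runs on its own.

```python
import os
import tempfile


def load_env_file(path: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    with open(path, "r", encoding="utf-8") as env_file:
        for raw_line in env_file:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo: write a throwaway .env file, load it, then remove it
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write('HANA_VECTOR_USER=DBADMIN\n# a comment\nAICORE_RESOURCE_GROUP="default"\n')
    demo_path = tmp.name

loaded = load_env_file(demo_path)
os.environ.update(loaded)
os.remove(demo_path)
print(loaded)
```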

## generative-ai-hub-sdk

## gpt-35-turbo

Approach 1: use `chat` from `gen_ai_hub.proxy.native.openai`.

In [None]:
from gen_ai_hub.proxy.native.openai import chat

messages = [ {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
            {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
            {"role": "user", "content": "Do other Azure Cognitive Services support this too?"} ]

kwargs = dict(model_name='gpt-35-turbo', messages=messages)
response = chat.completions.create(**kwargs)

print(response)

## gpt-35-turbo | gpt-4o

Approach 2: use `ChatOpenAI` from `gen_ai_hub.proxy.langchain.openai`.

In [None]:
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI

proxy_client=get_proxy_client('gen-ai-hub')
chat_llm = ChatOpenAI(proxy_model_name = 'gpt-4o', proxy_client = proxy_client, temperature=0.0)

response = chat_llm.invoke("who is the CEO of SAP?")
print(response.content)

In [None]:
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI

proxy_client=get_proxy_client('gen-ai-hub')
chat_llm = ChatOpenAI(proxy_model_name = 'gpt-4o', proxy_client = proxy_client, temperature=0.0)

response = chat_llm.invoke("What is the Cataract surgery policy in SAP?")
print(response.content)

## text-embedding-ada-002

In [None]:
from gen_ai_hub.proxy.native.openai import embeddings

response = embeddings.create(
    input="Every decoding is another encoding.",
    model_name="text-embedding-ada-002"
)
print(response.data)

## gemini-1.0-pro

In [None]:
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
from gen_ai_hub.proxy.native.google.clients import GenerativeModel

proxy_client = get_proxy_client('gen-ai-hub')
kwargs = dict({'model_name': 'gemini-1.0-pro'})
model = GenerativeModel(proxy_client=proxy_client, **kwargs)
content = [{
    "role": "user",
    "parts": [{
        "text": "Write a story about a magic backpack."
    }]
}]
model_response = model.generate_content(content)
print(model_response)

## meta--llama3-70b-instruct

In [None]:
from gen_ai_hub.proxy.native.openai import completions

response = completions.create(
  model_name="meta--llama3-70b-instruct",
  prompt="What is SAP?",
  max_tokens=70,
  temperature=0
)
print(response)

## RAG with HANA Vector Engine

Install the required libraries with `pip install hdbcli` and `pip install hana_ml` (or `pip3 install ...`).

In [None]:
# Import dbapi from hdbcli library
from hdbcli import dbapi
import hana_ml.dataframe as dataframe
from gen_ai_hub.proxy.langchain.openai import OpenAIEmbeddings
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client
from langchain.chains import RetrievalQA # deprecated
from langchain.text_splitter import CharacterTextSplitter
from langchain.text_splitter import TokenTextSplitter
from langchain.document_loaders import TextLoader
from langchain.document_loaders import PyPDFLoader
from langchain_community.vectorstores.hanavector import HanaDB

The code below reads the HANA user, password, and host from environment variables. You can also pass these values directly instead of setting environment variables first.

In [None]:
# Get the HANA Cloud username from environment variables
HANA_USER_VDB = os.getenv('HANA_VECTOR_USER')
# Get the HANA Cloud password from environment variables
HANA_PASSWORD_VDB = os.getenv('HANA_VECTOR_PASS')
# Get the HANA Cloud host from environment variables
HANA_HOST  = os.getenv('HANA_HOST_VECTOR')

Establish the connection to HANA DB

In [None]:
# Use connection settings from the environment
connection = dbapi.connect(
    address=HANA_HOST,
    port=443,
    user=HANA_USER_VDB,
    password=HANA_PASSWORD_VDB,
    encrypt='true',
    autocommit=True
)

In [None]:
# Connection Context
conn = dataframe.ConnectionContext(
    address=HANA_HOST,  
    port=443,
    user=HANA_USER_VDB,
    password=HANA_PASSWORD_VDB,
    encrypt='true',
    autocommit=True
)

Import the necessary LangChain modules, set the proxy client to `gen-ai-hub`, and initialize `chat_llm` with gpt-35-turbo. The model can be changed based on model availability in the Generative AI Hub.

In [None]:
from gen_ai_hub.proxy.langchain.openai import ChatOpenAI
from gen_ai_hub.proxy.core.proxy_clients import get_proxy_client

proxy_client = get_proxy_client('gen-ai-hub')
chat_llm = ChatOpenAI(proxy_model_name='gpt-35-turbo', proxy_client=proxy_client, temperature=0.0)

This code loads the document with PyPDFLoader (available in LangChain) and splits it into chunks of 200 tokens with an overlap of 25 tokens. To use a text file instead, use the TextLoader imported above (for other supported file types, refer to https://python.langchain.com/docs/modules/data_connection/document_loaders/).

After the split, the embedding model is initialized with text-embedding-ada-002.

In [None]:
#Load the PDF file & Create Chunks
loader = PyPDFLoader('./India Medical Insurance Policy 2024.pdf')

text_splitter = TokenTextSplitter(chunk_size=200, chunk_overlap=25)

#pages = loader.load_and_split(text_splitter)

documents = loader.load()

texts = text_splitter.split_documents(documents)

embedding_model = OpenAIEmbeddings(proxy_model_name='text-embedding-ada-002')
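
To make the effect of `chunk_size` and `chunk_overlap` concrete, the sketch below splits a list into overlapping windows. It is a simplified illustration, not TokenTextSplitter's actual implementation: integers stand in for tokens, and the function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(tokens: list, chunk_size: int, chunk_overlap: int) -> list:
    """Split tokens into windows of chunk_size that share chunk_overlap items."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    chunks = []
    start = 0
    while start < len(tokens):
        end = min(start + chunk_size, len(tokens))
        chunks.append(tokens[start:end])
        if end == len(tokens):
            break  # the last window reached the end of the input
        start += chunk_size - chunk_overlap  # step forward, keeping the overlap
    return chunks


# Ten stand-in tokens, windows of 4 with 1 shared token
chunks = split_with_overlap(list(range(10)), chunk_size=4, chunk_overlap=1)
print(chunks)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Each window starts `chunk_size - chunk_overlap` items after the previous one, so with the settings used above (200 and 25) consecutive chunks share 25 tokens.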

In [None]:
db = HanaDB(
    embedding=embedding_model, connection=connection, table_name="INSURANCE"
)

# Delete already existing documents from the table
db.delete(filter={})

# add the loaded document chunks
db.add_documents(texts)

In [None]:
# take a look at the table
hdf = conn.sql(''' SELECT "VEC_TEXT", "VEC_META", TO_NVARCHAR("VEC_VECTOR") AS "VEC_VECTOR" FROM "INSURANCE" ''')
df = hdf.head(10).collect()
df


In [None]:
retriever = db.as_retriever()

RetrievalQA is deprecated; the new approach is to use create_retrieval_chain.

First, define the prompt.

In [None]:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_template("""Provide answers based on context provided:

<context>
{context}
</context>

Question: {input}""")

Create `document_chain` from `chat_llm` (defined earlier) and the prompt.

Then build the retrieval chain from the HANA DB retriever and the document chain.

In [None]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain

document_chain = create_stuff_documents_chain(chat_llm, prompt)
retrieval_chain = create_retrieval_chain(retriever, document_chain)

Invoke the chain with a query.

In [None]:
response = retrieval_chain.invoke({"input": "what is the insurance cap?"})
print(response["answer"])

In [None]:
response = retrieval_chain.invoke({"input": "what is the cataract surgery policy?"})
print(response["answer"])
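
Besides `answer`, the dictionary returned by the retrieval chain also holds the retrieved chunks under `context`, which is useful for checking where an answer came from. The sketch below lists the source and page of each chunk. The function name `summarize_sources` is hypothetical, and `SimpleNamespace` objects with made-up values stand in for LangChain documents so the example runs without a database.

```python
from types import SimpleNamespace


def summarize_sources(response: dict, max_chars: int = 60) -> list:
    """Return one line per retrieved chunk: source, page, and a short snippet."""
    lines = []
    for doc in response.get("context", []):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page", "?")
        snippet = " ".join(doc.page_content.split())[:max_chars]
        lines.append(f"{source} (page {page}): {snippet}")
    return lines


# Stand-in response with placeholder text, for illustration only
fake_response = {
    "input": "example question",
    "answer": "example answer",
    "context": [
        SimpleNamespace(page_content="Example chunk\ntext.", metadata={"source": "policy.pdf", "page": 3}),
    ],
}
sources = summarize_sources(fake_response)
print(sources)
```

With the real chain, calling `summarize_sources(response)` on the result of `retrieval_chain.invoke(...)` would list the chunks that were passed to the model as context.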

## Reference:

https://python.langchain.com/docs/integrations/vectorstores/sap_hanavector/

https://github.tools.sap/Artificial-Intelligence-CoE/ies-genai-platform-models-cookbook/tree/main/examples/python/SAP-HANA-Cloud-VectorEngine-PoC

https://github.wdf.sap.corp/hana-multi-model/vector-getting-started

https://discovery-center.cloud.sap/protected/index.html#/mymissiondetail/95888/card/10571818/?tab=projectboard

generative-ai-hub-sdk code source - https://github.wdf.sap.corp/AI/generative-ai-hub-sdk/blob/main/docs/gen_ai_hub/examples/gen_ai_hub.ipynb

LLM endpoints - https://pages.github.tools.sap/Artificial-Intelligence-CoE/ies-genai-platform-models-cookbook/general/api-calls/inference-endpoints/

genaihub llm postman collections - https://github.tools.sap/Artificial-Intelligence-CoE/ies-genai-platform-models-cookbook/blob/8d6748546931457171edfacc09cf9dcaa0e046a8/examples/postman/GenAI-Hub-LLM-Deployments-postman-collection.json

OpenAI API versions - https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation