# How to consume Amazon Bedrock models like Amazon Nova, Anthropic Claude and Amazon Titan Models via SAP GenAI hub
## Part 1 Use AI Core REST APIs

Use this notebook to call the SAP AI Core REST APIs and send your payloads to LLMs hosted on SAP GenAI hub. The API documentation is available [here](https://api.sap.com/api/AI_CORE_API/resource/Deployment) and [here](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/consume-generative-ai-models-using-sap-ai-core#aws-bedrock).

The objective of this notebook is to help you understand how to consume Bedrock models via SAP GenAI hub in your GenAI applications.

### Step 1: Load your SAP AI Core credentials

First, make sure your AI Core credentials are stored in `~/.aicore/config.json`. If you do not have credentials yet, create a service key as described [here](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/create-service-key). This notebook reads the values from that file and copies them into environment variables for the current process.

A sample config.json is below:

```sh
$ cat ~/.aicore/config.json  
{
  "AICORE_AUTH_URL": "https://<>.authentication.us10.hana.ondemand.com",
  "AICORE_CLIENT_ID": "sb-b...64",
  "AICORE_CLIENT_SECRET": "21...xc=",
  "AICORE_RESOURCE_GROUP": "default",
  "AICORE_BASE_URL": "https://api.ai.prod.<>.ondemand.com"
}
```

In [None]:
import os
import json
import base64

def load_config():
    with open(os.path.expanduser('~/.aicore/config.json')) as f:
        config = json.load(f)
    for key, value in config.items():
        os.environ[key] = value
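
Before loading the file, it can help to check that every key used later in this notebook is present. The following is a minimal sketch, not part of the original notebook: `REQUIRED_KEYS` and `missing_config_keys` are hypothetical names, and the demonstration writes a throwaway file rather than touching the real `~/.aicore/config.json`.

```python
import json
import os
import tempfile

# Keys that the rest of this notebook reads from the environment.
REQUIRED_KEYS = (
    "AICORE_AUTH_URL",
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "AICORE_RESOURCE_GROUP",
    "AICORE_BASE_URL",
)


def missing_config_keys(path: str) -> list:
    """Return the required keys that are absent or empty in the JSON file at `path`."""
    with open(os.path.expanduser(path)) as f:
        config = json.load(f)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Demonstration with a temporary, deliberately incomplete config file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"AICORE_AUTH_URL": "https://example.invalid", "AICORE_CLIENT_ID": "demo"}, tmp)

demo_missing = missing_config_keys(tmp.name)
os.remove(tmp.name)
print(demo_missing)
```

An empty list means the file contains everything `load_config()` and the later cells expect.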

### Step 2: Create Utility functions
Create a utility function that sends API calls to the AI Core instance and retries when a call fails. It requires the `requests` library:
```sh
pip install requests
```

In [None]:
import requests
import sys
import logging
import time
from util.logging import initLogger

TIME_RETRY_API_CALL = 20
TIMEOUT_API_CALL = 3600

log = logging.getLogger(__name__)
initLogger()


# Call a REST API and retry on non-OK responses until TIMEOUT_API_CALL is reached.
# Note: log.success is not a standard logging method; it is assumed to be
# registered by initLogger() from the local util.logging module.
def call_api(
    type: str, url: str, headers: dict, data: dict = None, message: str = None
):
    if type not in ("POST", "GET"):
        raise ValueError(f"Unsupported request type: {type}")
    timeNeeded = 0
    while timeNeeded < TIMEOUT_API_CALL:
        try:
            # Send the request
            if type == "POST":
                r = requests.post(url=url, headers=headers, data=data)
            else:
                r = requests.get(url=url, headers=headers)
            # If the response is OK, return the JSON response
            if r.ok:
                log.success(f"{message}")
                return r.json()
            log.info(
                f"response ({message}): {r.status_code} ({r.reason}): {r.text}"
            )
            log.warning(
                f"Could not {message}! Re-trying in {TIME_RETRY_API_CALL} seconds..."
            )
            time.sleep(TIME_RETRY_API_CALL)
            timeNeeded += TIME_RETRY_API_CALL
        except requests.exceptions.RequestException as e:
            log.warning(str(e))
            log.error(f"Could not {message}! Exiting...")
            sys.exit(1)
    # The retry budget is exhausted: stop instead of returning None to the caller
    log.error(f"Could not {message} after {TIMEOUT_API_CALL} seconds! Exiting...")
    sys.exit(1)
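
The retry pattern used by `call_api` (a fixed delay between attempts and a total time budget) can be tried out without any network access. The sketch below is an illustration only: `retry_until` and `flaky_call` are hypothetical names, and the sleep function is injectable so the demonstration runs instantly.

```python
import time
from typing import Callable, Optional


def retry_until(
    attempt: Callable[[], Optional[dict]],
    delay_seconds: float,
    budget_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call `attempt` until it returns a non-None value or the time budget is spent."""
    waited = 0.0
    while waited < budget_seconds:
        result = attempt()
        if result is not None:
            return result
        sleep(delay_seconds)
        waited += delay_seconds
    return None


# Demonstration: the fake call fails twice and succeeds on the third attempt.
calls = {"count": 0}


def flaky_call() -> Optional[dict]:
    calls["count"] += 1
    return {"status": "ok"} if calls["count"] >= 3 else None


# Sleeping is stubbed out so the example does not actually wait.
demo_result = retry_until(flaky_call, delay_seconds=20, budget_seconds=3600, sleep=lambda _: None)
print(demo_result, calls["count"])
```

Only the elapsed delay is counted against the budget, mirroring how `call_api` adds `TIME_RETRY_API_CALL` to `timeNeeded`; the duration of the request itself is not measured.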


Create a template that holds the deployment URL and other important parameters.


In [None]:
from dataclasses import dataclass
from json import JSONEncoder


class AiCoreMetadataJsonEncoder(JSONEncoder):
    def default(self, o):
        return o.__dict__


@dataclass
class AiCoreMetadataDefinition:
    authUrl: str
    clientId: str
    clientSecret: str
    apiBase: str
    resourceGroup: str
    targetAiCoreModel: str
    apiAccessToken: str
    deploymentUrl : str

    def __getitem__(self, item):
        return getattr(self, item)

Create a dataclass that is initialized from the environment variables holding the AI Core credentials and the target model.

In [None]:
@dataclass
class AiCoreMetadata(AiCoreMetadataDefinition):
    def __init__(self):

        load_config()

        self.authUrl = os.environ.get("AICORE_AUTH_URL")
        self.clientId = os.environ.get("AICORE_CLIENT_ID")
        self.clientSecret = os.environ.get("AICORE_CLIENT_SECRET")
        self.resourceGroup = os.environ.get("AICORE_RESOURCE_GROUP")
        self.apiBase = os.environ.get("AICORE_BASE_URL")
        self.targetAiCoreModel = os.environ.get("TARGET_AI_CORE_MODEL")
        self.apiAccessToken = get_api_access_token(self)
        self.deploymentUrl = get_deployment_details_for_model(self)

Create a utility function that retrieves the API access token.

In [None]:
def get_api_access_token(aiCoreMetadata: AiCoreMetadataDefinition) -> str:
    clientId = aiCoreMetadata.clientId
    clientSecret = aiCoreMetadata.clientSecret
    authUrl = aiCoreMetadata.authUrl

    # Create the authorization string
    authorizationString = f"{clientId}:{clientSecret}"
    # Encode the authorization string
    byte_data = authorizationString.encode("utf-8")
    # Base64 encode the byte data
    clientSecretBase64 = base64.b64encode(byte_data).decode("utf-8")

    # Create the URL to retrieve the access token
    aiCoreLocation = f"{authUrl}/oauth/token?grant_type=client_credentials"
    # Create the headers for the request
    headers = {"Authorization": f"Basic {clientSecretBase64}"}

    response = call_api(
        "POST",
        aiCoreLocation,
        headers,
        None,
        "retrieve access token from AI Core system",
    )
    # json_response = json.dumps(response, indent=2)

    return response["access_token"]
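
To see what the token request actually sends, the header construction can be isolated from the network call. This is a small sketch with made-up credentials; `build_basic_auth_header` is a hypothetical helper name, not a function from the notebook or from any SAP library.

```python
import base64


def build_basic_auth_header(client_id: str, client_secret: str) -> dict:
    """Build the HTTP Basic Authorization header used for the token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("utf-8")
    return {"Authorization": f"Basic {encoded}"}


# Dummy values for illustration only; never print real secrets.
demo_header = build_basic_auth_header("demo-id", "demo-secret")
print(demo_header)

# Decoding the value again recovers the "client_id:client_secret" pair.
decoded = base64.b64decode(demo_header["Authorization"].split(" ", 1)[1]).decode("utf-8")
print(decoded)
```

Base64 is an encoding, not encryption, so the header must only ever travel over HTTPS.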

Once we have the API access token, let's get the deployment URL that corresponds to the model you want to use.


In [None]:
# Retrieve the deployment URL from the AI Core system metadata
def get_deployment_details_for_model(aiCoreMetadata: AiCoreMetadataDefinition):
    apiBase = aiCoreMetadata.apiBase
    token = aiCoreMetadata.apiAccessToken
    resourceGroup = aiCoreMetadata.resourceGroup

    # Create the URL to get the deployment 
    aiCoreLocation = f"{apiBase}/v2/lm/deployments"
    # Create the headers for the request
    headers = {}
    headers["AI-Resource-Group"] = resourceGroup
    headers["Authorization"] = f"Bearer {token}"
    allDeploymentDetails = None

    timeNeeded = 0
    message = f"retrieve deployment details for model id {aiCoreMetadata.targetAiCoreModel}"
    while timeNeeded < TIMEOUT_API_CALL:
        # Send the request to get the list of deployments
        response = call_api("GET", aiCoreLocation, headers, None, message)
        # json_response = json.dumps(response, indent=2)
        # log.check(
        #         f"API response from retrieveing deployment details for model id {aiCoreMetadata.targetAiCoreModel}:\n{json_response}"
        # )

        allDeploymentDetails = response
    
        for resource in allDeploymentDetails["resources"]:
            if resource["scenarioId"] == "foundation-models":
                model_name = resource["details"]["resources"]["backend_details"]["model"]["name"]
                if model_name == aiCoreMetadata.targetAiCoreModel:
                    return resource["deploymentUrl"]
        
        log.warning(
            f"Could not {message}! Re-trying in {TIME_RETRY_API_CALL} seconds..."
        )

        time.sleep(TIME_RETRY_API_CALL)
        timeNeeded += TIME_RETRY_API_CALL

    log.error(
        f"Could not retrieve deployment details for id '{aiCoreMetadata.targetAiCoreModel}'! Exiting..."
    )
    sys.exit(1)
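
The lookup inside `get_deployment_details_for_model` can be exercised offline against a hand-written response. The sketch below assumes the nested structure that the function above reads (`resources[].details.resources.backend_details.model.name`); the sample data and the name `find_deployment_url` are made up for illustration.

```python
from typing import Optional


def find_deployment_url(deployments: dict, model_name: str) -> Optional[str]:
    """Return the deployment URL of the first foundation-model deployment for `model_name`."""
    for resource in deployments.get("resources", []):
        if resource.get("scenarioId") != "foundation-models":
            continue
        backend = resource.get("details", {}).get("resources", {}).get("backend_details", {})
        if backend.get("model", {}).get("name") == model_name:
            return resource.get("deploymentUrl")
    return None


# Hand-written sample that mimics the shape of the deployments list response.
sample_deployments = {
    "resources": [
        {"scenarioId": "other-scenario", "deploymentUrl": "https://example.invalid/d0", "details": {}},
        {
            "scenarioId": "foundation-models",
            "deploymentUrl": "https://example.invalid/d1",
            "details": {"resources": {"backend_details": {"model": {"name": "amazon--nova-pro"}}}},
        },
    ]
}

print(find_deployment_url(sample_deployments, "amazon--nova-pro"))
print(find_deployment_url(sample_deployments, "amazon--nova-lite"))
```

Using `.get()` with defaults keeps the lookup from raising `KeyError` when a deployment entry lacks the nested model details.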

Another utility function retrieves the deployment details when a deployment ID is provided.

In [None]:
# Retrieve the deployment URL from the AI Core system metadata
def get_deployment_details(aiCoreMetadata: AiCoreMetadataDefinition, deploymenId: str):
    apiBase = aiCoreMetadata.apiBase
    token = aiCoreMetadata.apiAccessToken
    resourceGroup = aiCoreMetadata.resourceGroup

    # Create the URL to retrieve the deployment
    aiCoreLocation = f"{apiBase}/v2/lm/deployments/{deploymenId}"
    # Create the headers for the request
    headers = {}
    headers["AI-Resource-Group"] = resourceGroup
    headers["Authorization"] = f"Bearer {token}"
    deploymentDetails = None

    timeNeeded = 0
    message = f"retrieve deployment details for deployment id {deploymenId}"
    while timeNeeded < TIMEOUT_API_CALL:
        # Send the request to retrieve the deployment details
        response = call_api("GET", aiCoreLocation, headers, None, message)
        # json_response = json.dumps(response, indent=2)

        deploymentDetails = response
        deploymentUrl = deploymentDetails["deploymentUrl"]
        if deploymentUrl != "":
            log.success(f"AI Core deployment id '{deploymenId}' is now accessible!")
            return deploymentDetails
        else:
            log.warning(
                f"Could not {message} (deployment not finished)! Re-trying in {TIME_RETRY_API_CALL} seconds..."
            )

            time.sleep(TIME_RETRY_API_CALL)
            timeNeeded += TIME_RETRY_API_CALL

    log.error(
        f"Could not retrieve deployment details for id '{deploymenId}'! Exiting..."
    )
    sys.exit(1)

The class is instantiated in Step 3. Instantiating it loads the credentials into environment variables, retrieves the access token, and looks up the deployment URL for the specified model.

The following utility function sends your input prompt to GenAI Hub.

In [None]:
# Send a payload to the deployed model and return the parsed JSON response
def invoke(aiCoreMetadata: AiCoreMetadataDefinition,
           data) -> dict:

    token = aiCoreMetadata.apiAccessToken
    deploymentUrl = aiCoreMetadata.deploymentUrl
    
    # Determine the endpoint based on the target AI Core model
    nova_models = ["amazon--nova-pro", "amazon--nova-micro", "amazon--nova-lite"]
    aiCoreLocation = f"{deploymentUrl}/converse" if aiCoreMetadata.targetAiCoreModel in nova_models else f"{deploymentUrl}/invoke"
    
    # Create the headers for the request
    headers = {
        "AI-Resource-Group": aiCoreMetadata.resourceGroup,
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    
    response = call_api(
        "POST",
        aiCoreLocation,
        headers,
        json.dumps(data),
        "sending invoke",
    )
    # call_api already returns the parsed JSON response as a dict
    return response
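
The endpoint choice made inside `invoke` is the only model-specific part of the request, so it is worth isolating. The sketch below restates that rule as a pure function; `build_endpoint` is a hypothetical name and the URL is a placeholder.

```python
# Models that are called through the /converse endpoint in this notebook.
NOVA_MODELS = {"amazon--nova-pro", "amazon--nova-micro", "amazon--nova-lite"}


def build_endpoint(deployment_url: str, model_name: str) -> str:
    """Append /converse for Amazon Nova models and /invoke for all other models."""
    suffix = "converse" if model_name in NOVA_MODELS else "invoke"
    return f"{deployment_url.rstrip('/')}/{suffix}"


print(build_endpoint("https://example.invalid/deployments/d1", "amazon--nova-pro"))
print(build_endpoint("https://example.invalid/deployments/d1/", "anthropic--claude-3-haiku"))
```

The two endpoints expect different payload formats, which is why Step 3 builds a separate `data` dictionary for each model family.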


### Step 3: Send the prompt to the GenAI Hub

In [None]:
# For Amazon Nova models
os.environ["TARGET_AI_CORE_MODEL"] = "amazon--nova-pro"
# os.environ["TARGET_AI_CORE_MODEL"] = "amazon--nova-lite"
# os.environ["TARGET_AI_CORE_MODEL"] = "amazon--nova-micro"

# Define your system prompt(s).
system_list = [
    { "text": "You should respond to all messages in german" }
]

# Define one or more messages using the "user" and "assistant" roles.
message_list = [
    {
        "role": "user", 
        "content": [
            {
                "text": "What is the capital of United States?"
            }
        ]
    },
]

# Configure the inference parameters.
inf_params = {"maxTokens": 150, "temperature": 0.7}

data = {
    "messages": message_list,
    "system": system_list,
    "inferenceConfig": inf_params,
}

In [None]:
# Initialize and save the metadata for the AI Core 
ai_core_metadata = AiCoreMetadata()

In [None]:
invoke(aiCoreMetadata= ai_core_metadata, 
       data=data)
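
`invoke` returns the whole response dictionary. The sketch below pulls out just the generated text, assuming the REST response has the same shape that Part 2 of this notebook reads from the SDK (`output.message.content[].text`). The sample response is hand-written for illustration and is not real model output; `extract_converse_text` is a hypothetical helper name.

```python
def extract_converse_text(response: dict) -> str:
    """Join all text blocks of a Converse-style response into one string."""
    content = response.get("output", {}).get("message", {}).get("content", [])
    return "".join(block["text"] for block in content if "text" in block)


# Hand-written sample in the assumed response shape.
sample_converse_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "Die Hauptstadt ist Washington, D.C."}],
        }
    },
    "stopReason": "end_turn",
    "usage": {"inputTokens": 20, "outputTokens": 10, "totalTokens": 30},
}

print(extract_converse_text(sample_converse_response))
```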

Let's define a payload for Anthropic models.

In [None]:
# For Anthropic Claude models

# Multimodal models:
os.environ["TARGET_AI_CORE_MODEL"] = "anthropic--claude-3.5-sonnet"
# os.environ["TARGET_AI_CORE_MODEL"] = "anthropic--claude-3-opus"
# os.environ["TARGET_AI_CORE_MODEL"] = "anthropic--claude-3-sonnet"
# os.environ["TARGET_AI_CORE_MODEL"] = "anthropic--claude-3-haiku"

data = {}
messages = [
        {
            "role": "user",
            "content": "Hello, What is the capital of United States?"
        }
    ]
data["anthropic_version"] = "bedrock-2023-05-31"
data["max_tokens"] = 1000
data["messages"] = messages


In [None]:
# Extract the metadata for the AI Core system
ai_core_metadata = AiCoreMetadata()

In [None]:
invoke(aiCoreMetadata= ai_core_metadata, 
       data=data)
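
For Claude models the text sits in a different place than in the Nova response. The sketch below assumes the response follows the Anthropic Messages format, where `content` is a list of typed blocks; the sample is hand-written, and `extract_claude_text` is a hypothetical helper name.

```python
def extract_claude_text(response: dict) -> str:
    """Join the text of all blocks of type "text" in a Messages-style response."""
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )


# Hand-written sample in the assumed response shape, not real model output.
sample_claude_response = {
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "The capital is Washington, D.C."}],
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 18, "output_tokens": 9},
}

print(extract_claude_text(sample_claude_response))
```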

In [None]:
# For Amazon Titan Text Models

# Text-based models:
os.environ["TARGET_AI_CORE_MODEL"] = "amazon--titan-text-lite"
# os.environ["TARGET_AI_CORE_MODEL"] = "amazon--titan-text-express"

data = {
    "inputText": "What is the capital of United States?",
    "textGenerationConfig": {
        "maxTokenCount": 1000,
        "stopSequences": [],
        "temperature": 0,
        "topP": 1
     }
}

In [None]:
# Extract the metadata for the AI Core system
ai_core_metadata = AiCoreMetadata()

In [None]:
invoke(aiCoreMetadata= ai_core_metadata, 
       data=data)
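
Titan Text models return their output in yet another structure. The sketch below assumes a response with a `results` list whose entries carry `outputText`; the sample is hand-written for illustration, and `extract_titan_text` is a hypothetical helper name.

```python
def extract_titan_text(response: dict) -> str:
    """Return the outputText of the first result, or an empty string if there is none."""
    results = response.get("results", [])
    return results[0].get("outputText", "") if results else ""


# Hand-written sample in the assumed response shape, not real model output.
sample_titan_response = {
    "inputTextTokenCount": 9,
    "results": [
        {
            "tokenCount": 8,
            "outputText": "The capital is Washington, D.C.",
            "completionReason": "FINISH",
        }
    ],
}

print(extract_titan_text(sample_titan_response))
```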

In [None]:
# For Amazon Titan Embedding Models
os.environ["TARGET_AI_CORE_MODEL"] = "amazon--titan-embed-text"

data = {
    "inputText": "What is the capital of United States?"
}

In [None]:
# Extract the metadata for the AI Core system
ai_core_metadata = AiCoreMetadata()

Finally, send the request and get the result.

In [None]:
invoke(aiCoreMetadata= ai_core_metadata, 
       data=data)
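
An embedding on its own is just a list of numbers; it becomes useful when two embeddings are compared. The sketch below computes cosine similarity between two vectors, assuming each response carries the vector under an `embedding` key. The short vectors are toy values chosen for illustration, since real embeddings are much longer, and `cosine_similarity` is a helper written for this example.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy responses in the assumed shape, one per embedded text.
response_a = {"embedding": [1.0, 0.0, 0.0], "inputTextTokenCount": 3}
response_b = {"embedding": [0.0, 1.0, 0.0], "inputTextTokenCount": 4}

same = cosine_similarity(response_a["embedding"], response_a["embedding"])
different = cosine_similarity(response_a["embedding"], response_b["embedding"])
print(same, different)
```

Values close to 1 indicate texts the model considers similar, while values near 0 indicate unrelated texts.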

## Part 2 Use SAP GenAI Hub SDK
Reference code [here](https://help.sap.com/doc/generative-ai-hub-sdk/CLOUD/en-US/_reference/gen_ai_hub.html).

Run the cell below if the SDK is not installed yet.


In [None]:
!pip install "generative-ai-hub-sdk[all]==4.4.3" "boto3==1.35.27" "langchain==0.3.20" "langgraph==0.3.20"

In [None]:
from gen_ai_hub.proxy.native.amazon.clients import Session

# Initialize the Bedrock client
bedrock = Session().client(model_name="amazon--nova-lite")
# bedrock = Session().client(model_name="anthropic--claude-3-haiku")

# Define your system prompt(s).
system_list = [
    {"text": "You should respond to all messages in german"}
]

# Define one or more messages using the "user" and "assistant" roles.
message_list = [
    {
        "role": "user", 
        "content": [
            {
                "text": "What is the capital of United States?"
            }
        ]
    },
]

# Configure the inference parameters.
inf_params = {"maxTokens": 150, "temperature": 0.7}

# Get the response from the model
response = bedrock.converse(
    messages=message_list,
    system=system_list,
    inferenceConfig=inf_params
)

# Extract and print the assistant's response
if 'output' in response and 'message' in response['output']:
    assistant_message = response['output']['message']['content'][0]['text']
    print("Response:")
    print(assistant_message)
else:
    print("No valid response received.")

# Extract and print additional details: stopReason, usage, and metrics
if 'stopReason' in response:
    print("\nStop Reason:")
    print(response['stopReason'])

if 'usage' in response:
    print("\nUsage Details:")
    for key, value in response['usage'].items():
        print(f"  {key}: {value}")

if 'metrics' in response:
    print("\nMetrics:")
    for key, value in response['metrics'].items():
        print(f"  {key}: {value}")
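
The example above sends a single user message. For a follow-up question, the earlier turns are passed along in the same `messages` list, alternating between the `user` and `assistant` roles. The sketch below only builds that list and makes no API call; `append_turn` is a hypothetical helper, and the assistant text is a placeholder for what a previous `converse()` call would have returned.

```python
def append_turn(messages: list, role: str, text: str) -> list:
    """Return a new message list with one more turn in the Converse message format."""
    if role not in ("user", "assistant"):
        raise ValueError(f"Unsupported role: {role}")
    return messages + [{"role": role, "content": [{"text": text}]}]


history = append_turn([], "user", "What is the capital of United States?")
# Placeholder for the text extracted from the previous model response.
history = append_turn(history, "assistant", "Washington, D.C.")
history = append_turn(history, "user", "And what is the capital of Germany?")

for turn in history:
    print(turn["role"], "->", turn["content"][0]["text"])
```

The resulting `history` could then be passed as `messages=history` to `bedrock.converse(...)` in the same way `message_list` is used above.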
