# Deploying a Large Language Model in the GenAI Hub of SAP AI Core on BTP

## Pre-requisites

- Have [Python](https://www.python.org/downloads/) installed
- Create an instance of SAP AI Core in your BTP sub-account: [documentation](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/initial-setup?locale=en-US)
- Find information about available LLMs in [SAP Note 3437766](https://me.sap.com/notes/3437766)
- Optionally, have the [Cloud Foundry CLI installed](https://docs.cloudfoundry.org/cf-cli/install-go-cli.html)


In [35]:
# Install the AI Core SDK with pip

%pip install ai_core_sdk

Note: you may need to restart the kernel to use updated packages.


In [40]:
# In your working directory, log on to BTP with cf login
# Download the credentials of your AI Core instance with cf service-key <service-name> <key-name> > key.json
# Remove the first line of the key.json file (the CLI status message, which is not JSON)
# Alternatively, create and download a service key in the BTP cockpit

import json

with open('key.json') as f:
    btp_key = json.load(f).get('credentials')   # when using cf service-key
    # btp_key = json.load(f)                    # when using a downloaded service key from BTP cockpit

print(btp_key["serviceurls"]["AI_API_URL"])

https://api.ai.intprod-eu12.eu-central-1.aws.ml.hana.ondemand.com
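
The manual step of removing the first line of `key.json` can be avoided by parsing from the first `{` onward. The following is a minimal sketch, not part of the SDK: the helper name `parse_service_key` is hypothetical, and it assumes the status text printed by `cf service-key` contains no `{` before the JSON body. It accepts both the `cf service-key` output (credentials possibly wrapped in a `credentials` object) and a key downloaded from the BTP cockpit.

```python
import json


def parse_service_key(text):
    """Return the credentials dict from raw service key text.

    Hypothetical helper: skips any non-JSON status line(s) before the
    first '{' and unwraps the "credentials" object when it is present.
    """
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in service key text")
    data = json.loads(text[start:])
    # cf service-key may wrap the key in "credentials"; a cockpit download does not
    return data.get("credentials", data)


# Usage (assumes key.json exists in the working directory):
# with open("key.json") as f:
#     btp_key = parse_service_key(f.read())
```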


In [42]:
# Load AI Core SDK
from ai_core_sdk.ai_core_v2_client import AICoreV2Client

# Create Connection using credentials from downloaded key.json
ai_core_client = AICoreV2Client(
    base_url=btp_key["serviceurls"]["AI_API_URL"] + "/v2",  # The present SAP AI Core API version is 2
    auth_url=btp_key["url"] + "/oauth/token",  # Suffix to add
    client_id=btp_key["clientid"],
    client_secret=btp_key["clientsecret"],
)


In [43]:
# Query existing resource groups. It is expected that group "default" is present
response = ai_core_client.resource_groups.query()

for rg in response.resources:
    print(rg.resource_group_id)

default


In [44]:
# Find available models and their corresponding executables
# Alternatively, find this information in SAP Note 3437766 - https://me.sap.com/notes/3437766

exc = ai_core_client.executable.query(resource_group="default", scenario_id="foundation-models")

for e in exc.resources:
    p = e.parameters
    for i in p:
        print(f"Executable: {e.id} - {i.description}")



Executable: aicore-ibm - supportedModels: ibm--granite-13b-chat
Executable: aicore-ibm - None
Executable: aicore-mistralai - supportedModels: mistralai--mistral-large-instruct
Executable: aicore-mistralai - None
Executable: aicore-nvidia - supportedModels: NV-Rerank-QA-Mistral-4B, NV-Embed-QA
Executable: aicore-nvidia - None
Executable: aicore-opensource - None
Executable: aicore-opensource - None
Executable: aws-bedrock - supportedModels: amazon--titan-text-express, amazon--titan-text-lite, amazon--titan-embed-text, amazon--titan-image-generator, anthropic--claude-3-haiku, anthropic--claude-3-opus, anthropic--claude-3-sonnet, anthropic--claude-3.5-sonnet
Executable: aws-bedrock - None
Executable: azure-openai - supportedModels: gpt-35-turbo, gpt-35-turbo-0125, gpt-35-turbo-16k, gpt-4o, gpt-4, gpt-4-32k, text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large, dall-e-3, gpt-4o-mini
Executable: azure-openai - None
Executable: gcp-vertexai - supportedModels: text-bison, ch
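
The listing above mixes `supportedModels: ...` descriptions with `None` entries. As a sketch, the descriptions can be turned into lists of model names. The helper `parse_supported_models` is hypothetical and assumes the description format seen in the output above (a `supportedModels:` prefix followed by comma-separated names); any other description, including `None`, yields an empty list.

```python
def parse_supported_models(description):
    """Split a 'supportedModels: a, b, c' description into a list of names.

    Hypothetical helper based on the description format printed above.
    Returns an empty list for None or for descriptions without the prefix.
    """
    prefix = "supportedModels:"
    if not description or not description.startswith(prefix):
        return []
    names = description[len(prefix):].split(",")
    return [name.strip() for name in names if name.strip()]


# Usage with the query result from the cell above:
# models_by_executable = {}
# for e in exc.resources:
#     for p in e.parameters:
#         models_by_executable.setdefault(e.id, []).extend(
#             parse_supported_models(p.description))
```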

In [None]:
# Create a new Configuration for OpenAI gpt-4o

from ai_api_client_sdk.models.parameter_binding import ParameterBinding

model_to_deploy = "gpt-4o"
executable_id = "azure-openai"

pb1 = ParameterBinding(key="modelName", value=model_to_deploy)
pb2 = ParameterBinding(key="modelVersion", value="latest")

ai_core_client.configuration.create(
    name=model_to_deploy,
    executable_id=executable_id,
    scenario_id="foundation-models",
    resource_group="default",
    parameter_bindings=[pb1, pb2],
)

In [45]:
# Query existing Configurations
confs = ai_core_client.configuration.query(scenario_id="foundation-models", resource_group="default")

for resource in confs.resources:
    print(f"Configuration ID: {resource.id}, Executable ID: {resource.executable_id}, Name: {resource.name}, Param0: {resource.parameter_bindings[0].value}, Param1: {resource.parameter_bindings[1].value}")


Configuration ID: fb1a73a5-8842-4e5f-be18-e8bc5911790d, Executable ID: azure-openai, Name: gpt-35-turbo, Param0: gpt-35-turbo, Param1: latest
Configuration ID: 53a15731-338e-4873-ba66-cbcc0d45e7e6, Executable ID: aicore-mistralai, Name: mistral, Param0: mistralai--mistral-large-instruct, Param1: latest
Configuration ID: fb4f93a9-a998-4079-9a7c-dfb6cf3ad428, Executable ID: azure-openai, Name: gpt-4o-mini, Param0: gpt-4o-mini, Param1: latest
Configuration ID: ce93b5ef-9e1f-46c5-a5de-497667d1e63b, Executable ID: aws-bedrock, Name: claude, Param0: anthropic--claude-3.5-sonnet, Param1: latest
Configuration ID: 4be0f740-3d39-4522-aa90-0cb2061c4e1d, Executable ID: azure-openai, Name: gpt-40, Param0: gpt-4o, Param1: latest
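
Rather than copying a configuration ID by hand from the listing, the ID can be looked up by model name. This is a minimal sketch under assumptions: `find_configuration_ids` is a hypothetical helper, and it relies only on the attributes already used above (`id`, `parameter_bindings`) plus the `key` and `value` fields that `ParameterBinding` is constructed with. It returns a list because several configurations may bind the same model.

```python
def find_configuration_ids(configurations, model_name):
    """Return the IDs of all configurations whose modelName binding matches.

    Hypothetical helper: 'configurations' is any iterable of objects with
    'id' and 'parameter_bindings' (each binding having 'key' and 'value').
    """
    matches = []
    for conf in configurations:
        for binding in conf.parameter_bindings or []:
            if binding.key == "modelName" and binding.value == model_name:
                matches.append(conf.id)
                break  # one match per configuration is enough
    return matches


# Usage with the query result from the cell above:
# print(find_configuration_ids(confs.resources, "gpt-4o"))
```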


In [46]:
# Attention: use the configuration ID that corresponds to the model you want to deploy

conf_id = "4be0f740-3d39-4522-aa90-0cb2061c4e1d"

# Create a deployment for the configuration
ai_core_client.deployment.create(configuration_id=conf_id, resource_group="default")

<ai_api_client_sdk.models.deployment_create_response.DeploymentCreateResponse at 0x10ddaded0>
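
A newly created deployment is not usable immediately, so it can help to poll until it is running. The sketch below is an assumption-laden illustration, not the SDK's own mechanism: `wait_for_deployment` is a hypothetical helper, it assumes the object returned by `deployment.get` exposes a `status` attribute (an enum or a string), and it assumes the status names `RUNNING`, `DEAD` and `STOPPED`. The poll interval and timeout are arbitrary defaults.

```python
import time


def wait_for_deployment(client, deployment_id, resource_group="default",
                        poll_seconds=30, timeout_seconds=1800):
    """Poll a deployment until it reports RUNNING (hypothetical helper).

    Assumes deployment.get(...) returns an object with a 'status' attribute
    that is either a string or an enum whose .value is the status name.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        dep = client.deployment.get(deployment_id=deployment_id,
                                    resource_group=resource_group)
        status = getattr(dep.status, "value", dep.status)  # enum or plain string
        if status == "RUNNING":
            return dep
        if status in ("DEAD", "STOPPED"):  # assumed terminal states
            raise RuntimeError(f"deployment {deployment_id} ended in status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"deployment {deployment_id} still {status} after timeout")
        time.sleep(poll_seconds)
```

A possible use, assuming the create response exposes the new deployment's ID as `.id`: keep the result of `ai_core_client.deployment.create(...)` in a variable and pass its `id` together with `ai_core_client` to `wait_for_deployment`.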

In [47]:
# Optional: Find existing deployments for a configuration

deps = ai_core_client.deployment.query(resource_group="default", configuration_id=conf_id)

print(deps)

Resources: [{Deployment id: d0800c3392042c93}, {Deployment id: dab3e0a7598c5ebd}], Count: 2


In [48]:
# Optional: Get details of a deployment

for dep in deps.resources:
    print(dep.id)
    dep_detail = ai_core_client.deployment.get(deployment_id=dep.id, resource_group="default")
    print(dep_detail.details['resources']['backend_details']['model']['name'])
    print(dep_detail.details['resources']['backend_details']['model']['version'])



d0800c3392042c93
gpt-4o
latest
dab3e0a7598c5ebd
gpt-4o
latest
