# 3. Configure SAP AI Core & Launchpad for Evaluation

### 🔀 Workflow

1. Create **Object Store Secret**:

   Lets AI Core access the S3 bucket

2. GenAI Evaluation **Scenario**:

   Retrieve the scenario ID for `genai-evaluations`, which is preconfigured in AI Core

3. GenAI Evaluation **Executable**:

   Retrieve the executable ID for `genai-evaluations`, which is preconfigured in AI Core

4. Create an **Artifact**:

   Facade that maps the S3 bucket so it is accessible inside AI Core

5. Orchestration **Deployment** URL:

   Retrieve the orchestration deployment URL for the AI Core instance

6. Create **Configuration**:

   Create a configuration for the GenAI evaluation with the artifact, scenario, executable and orchestration URL

7. Create **Execution**:

   Create an execution for the GenAI evaluation from the configuration created in the previous step

8. Observe **Results**:

   Review the evaluation results in the SAP AI Core Launchpad


In [None]:
from ai_core_sdk.ai_core_v2_client import AICoreV2Client
from dotenv import load_dotenv
import os

load_dotenv()

client_aicore = AICoreV2Client(
    base_url=os.getenv("AICORE_API_BASE") + "/v2",
    auth_url=os.getenv("AICORE_AUTH_URL") + "/oauth/token",
    client_id=os.getenv("AICORE_CLIENT_ID"),
    client_secret=os.getenv("AICORE_CLIENT_SECRET"),
    resource_group="default",
)
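
The client setup above concatenates strings onto `os.getenv(...)`, which raises a `TypeError` when a variable is missing from `.env`. A minimal sketch of an up-front check follows; `require_env` is a hypothetical helper, not part of the SDK.

```python
import os
from typing import Dict


def require_env(*names: str) -> Dict[str, str]:
    """Return the named environment variables, or raise if any is unset or empty.

    Hypothetical helper for this notebook; not part of ai_core_sdk.
    """
    missing = [name for name in names if not os.getenv(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: os.environ[name] for name in names}
```

Calling `require_env("AICORE_API_BASE", "AICORE_AUTH_URL", "AICORE_CLIENT_ID", "AICORE_CLIENT_SECRET")` right after `load_dotenv()` would report every missing variable in one message, before the client is built.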

## Step 1: Create Object Store Secret

SAP Help Docs - https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/register-your-object-store-secret


In [84]:
object_store_secret_name = "ade-v2"

In [None]:
# Create an object store secret for the S3 bucket

secret_create = client_aicore.object_store_secrets.create(
    name=object_store_secret_name,
    type="S3",
    bucket=os.getenv("AWS_S3_BUCKET_NAME"),
    endpoint=os.getenv("AWS_S3_ENDPOINT"),
    path_prefix="ade-v2",
    region=os.getenv("AWS_REGION"),
    data={
        "AWS_ACCESS_KEY_ID": os.getenv("AWS_ACCESS_KEY_ID"),
        "AWS_SECRET_ACCESS_KEY": os.getenv("AWS_SECRET_ACCESS_KEY"),
    },
)

print(f"Object store secret '{object_store_secret_name}': {secret_create}")

Object store secret 'ade-v2': Message: secret has been created


In [None]:
# Get the object store secret to verify creation

secret_get = client_aicore.object_store_secrets.get(name=object_store_secret_name)
print(f"{secret_get.name}")

ade-v2


## Step 2: Identify the GenAI Evaluation Scenario

SAP AI Core ships with a preconfigured `genai-evaluations` scenario, which is used to evaluate GenAI models.


In [77]:
scenarioid_genai_eval = "genai-evaluations"

scenario_get = client_aicore.scenario.get(scenario_id=scenarioid_genai_eval)

print(f"Scenario: {scenario_get.id} - {scenario_get.name}")

Scenario: genai-evaluations - genai-evaluations


## Step 3: Get the GenAI Evaluation Executable

The `genai-evaluations` scenario in SAP AI Core includes a preconfigured executable of the same name, which runs the evaluation of GenAI models.

In [78]:
executableid_genai_eval = "genai-evaluations"

executable_get = client_aicore.executable.get(
    scenario_id=scenarioid_genai_eval, executable_id=executableid_genai_eval
)
print(f"Executable Get: {executable_get.id} - {executable_get.name}")

Executable Get: genai-evaluations - genai-evaluations


## Step 4: Map the Object Store as an Artifact

An artifact acts as a facade for the S3 bucket: it maps a location in the object store so that it is accessible inside AI Core.


In [80]:
artifactname_genai_eval = "ade-v2-eval-3"

In [None]:
from ai_api_client_sdk.models.artifact import Artifact

artifact_create = client_aicore.artifact.create(
    kind=Artifact.Kind.OTHER,  # "Other Artifact" section
    name=artifactname_genai_eval,
    description="ADE v2 Evaluation Artifact",
    scenario_id=scenarioid_genai_eval,
    url="ai://ade-v2/evals",
)

print(f"Artifact: {artifact_create}")

In [65]:
artifact_get = client_aicore.artifact.query(
    scenario_id=scenarioid_genai_eval,
    kind=Artifact.Kind.OTHER,
)

artifact_id = next(
    (a.id for a in artifact_get.resources if a.name == artifactname_genai_eval), None
)
print(f"Artifact Get: {artifact_id}")

Artifact Get: 31a15492-f6a5-4885-b251-d9a26e33ebc9
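
The artifact URL `ai://ade-v2/evals` names the object store secret first, followed by a path. The sketch below shows how such a URL is expected to resolve to an S3 location, assuming the path is interpreted relative to the secret's `path_prefix`. `resolve_artifact_url` is a hypothetical helper for reasoning about paths and the bucket name in the usage note is made up; neither is an SDK call.

```python
def resolve_artifact_url(
    artifact_url: str, secret_name: str, bucket: str, path_prefix: str
) -> str:
    """Translate an `ai://<secret-name>/<path>` URL into the S3 URI it should point at.

    Assumption: the path after the secret name is relative to the secret's path prefix.
    """
    scheme = "ai://"
    if not artifact_url.startswith(scheme):
        raise ValueError(f"Expected an ai:// URL, got: {artifact_url}")
    name, _, relative_path = artifact_url[len(scheme):].partition("/")
    if name != secret_name:
        raise ValueError(f"URL refers to secret '{name}', expected '{secret_name}'")
    # Join prefix and relative path, skipping empty parts and stray slashes.
    parts = [p.strip("/") for p in (path_prefix, relative_path) if p.strip("/")]
    return f"s3://{bucket}/" + "/".join(parts)
```

With the values used in this notebook and a made-up bucket name, `resolve_artifact_url("ai://ade-v2/evals", "ade-v2", "my-bucket", "ade-v2")` returns `s3://my-bucket/ade-v2/evals`, which is where the `runs` and `testdata` folders would need to be.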


## Step 5: Get the Orchestration Deployment URL

SAP AI Core ships with an orchestration deployment whose configuration is named `defaultOrchestrationConfig`. It provides harmonized API access to all LLMs available in the AI Core instance, and it is used to run the GenAI evaluation.

In [59]:
deployment_query = client_aicore.deployment.query(
    scenario_id="orchestration", executable_ids=["orchestration"]
)

print(f"Deployment Get: {deployment_query}")

# find deployment by configuration name - 'defaultOrchestrationConfig'
deployment_id, deployment_url = next(
    (
        (d.id, d.deployment_url)
        for d in deployment_query.resources
        if d.configuration_name == "defaultOrchestrationConfig"
    ),
    (None, None),
)
print(f"Deployment ID: {deployment_id}")
print(f"Deployment URL: {deployment_url}")

Deployment Get: Resources: [{Deployment id: d79f9a643173f554}, {Deployment id: d54d3a65125a8b27}], Count: 2
Deployment ID: d79f9a643173f554
Deployment URL: https://api.ai.internalprod.eu-central-1.aws.ml.hana.ondemand.com/v2/inference/deployments/d79f9a643173f554
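
The lookup above falls back to `(None, None)` when no deployment matches, and the artifact and configuration lookups in this notebook use the same `next(..., None)` pattern. A small sketch of a stricter variant that fails immediately is shown below. `find_one` is a hypothetical helper that only assumes the resources expose the attribute being compared.

```python
from typing import Any, Iterable


def find_one(resources: Iterable[Any], attr: str, value: Any) -> Any:
    """Return the single resource whose `attr` equals `value`.

    Raises LookupError when nothing matches or when the match is ambiguous.
    Hypothetical helper; not part of the AI Core SDK.
    """
    matches = [r for r in resources if getattr(r, attr, None) == value]
    if not matches:
        raise LookupError(f"No resource with {attr}={value!r}")
    if len(matches) > 1:
        raise LookupError(f"{len(matches)} resources with {attr}={value!r}")
    return matches[0]
```

For example, `find_one(deployment_query.resources, "configuration_name", "defaultOrchestrationConfig")` would raise at this step instead of passing `None` on to the configuration.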


## Step 6: Create Configuration

Create a configuration in AI Core, from which the execution is scheduled. The configuration binds the `genai-evaluations` scenario and executable, so the subsequent execution runs that executable.

In [66]:
configname_genai_eval = "config-genai-aicore-evals-ade-v2-2"

In [None]:
import json
from ai_api_client_sdk.models.input_artifact_binding import InputArtifactBinding
from ai_api_client_sdk.models.parameter_binding import ParameterBinding

# folder which contains the orchestration configs for the various LLMs - see `evals/runs/*.json` files
dir_runs = "runs"

# test dataset descriptor (file path and type), serialized as a JSON string - see `evals/testdata/ade-v2-300.json` file
dir_testdata = json.dumps({"path": "testdata/ade-v2-300.json", "type": "json"})

# comma-separated list of metrics to be used for evaluation
metrics_list = "bert_score,bleu"

# variable mapping between the metrics and the test dataset, serialized as a JSON string:
# the `reference` of all metrics is read from `data/golden-truth`
variable_mapping = json.dumps({"all_metrics/reference": "data/golden-truth"})

configuration_create = client_aicore.configuration.create(
    name=configname_genai_eval,
    scenario_id=scenarioid_genai_eval,
    executable_id=executableid_genai_eval,
    input_artifact_bindings=[
        InputArtifactBinding(key="rootFolder", artifact_id=artifact_id)
    ],
    parameter_bindings=[
        ParameterBinding(key="runs", value=dir_runs),
        ParameterBinding(key="testDataset", value=dir_testdata),
        ParameterBinding(key="metrics", value=metrics_list),
        ParameterBinding(key="variableMapping", value=variable_mapping),
        ParameterBinding(
            key="orchestrationDeploymentURL", value=deployment_url
        ),
    ],
)

print(f"Configuration Create: {configuration_create}")

Configuration Create: Id: 2c522e49-80c1-4b3a-80cc-480476e14e09, Message: Configuration created
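
The variable mapping points `all_metrics/reference` at `data/golden-truth`. A local pre-check of the test dataset can catch missing reference values before an execution is scheduled. This is a sketch under two assumptions that the notebook does not confirm: the dataset is a JSON array of flat objects, and `data/<name>` refers to a field of each record. The helper name and the `text` field are hypothetical.

```python
from typing import Any, Dict, List


def rows_missing_mapped_columns(
    records: List[Dict[str, Any]], variable_mapping: Dict[str, str]
) -> Dict[str, List[int]]:
    """For every `data/<column>` target in the mapping, list record indices lacking that column."""
    columns = [
        target.split("/", 1)[1]
        for target in variable_mapping.values()
        if target.startswith("data/")
    ]
    return {
        column: [i for i, record in enumerate(records) if column not in record]
        for column in columns
    }


# Hypothetical two-record sample; the real data lives in `testdata/ade-v2-300.json`.
sample_records = [
    {"text": "example input", "golden-truth": "expected output"},
    {"text": "another input"},  # deliberately missing the reference field
]
mapping = {"all_metrics/reference": "data/golden-truth"}
print(rows_missing_mapped_columns(sample_records, mapping))  # {'golden-truth': [1]}
```

An empty list for every column means each record carries the mapped field.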


In [None]:
# Query the configuration to get the ID

configuration_query = client_aicore.configuration.query(
    scenario_id=scenarioid_genai_eval,
    executable_ids=[executableid_genai_eval],
)

configid_genai_eval = next(
    (c.id for c in configuration_query.resources if c.name == configname_genai_eval),
    None,
)
print(f"Configuration ID: {configid_genai_eval}")

Configuration ID: 2c522e49-80c1-4b3a-80cc-480476e14e09


## Step 7: Schedule an Execution

Using the configuration above, schedule an execution to run the GenAI evaluation. The run time depends on the number of models and the size of the test dataset.

The execution runs in the background.

In [90]:
execution_create = client_aicore.execution.create(configuration_id=configid_genai_eval)

print(f"Execution Create: {execution_create}")

Execution Create: Id: e8938839ddbe242f, Message: Execution scheduled


You can monitor the execution status in the 'Executions' tab of the 'ML Operations' section of the SAP AI Core Launchpad, or by querying the execution status through the AI Core API.

![Monitoring AI Core execution status](../docs/images/execution-monitoring.png)
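
Querying the status through the API usually means polling until the execution reaches a terminal state. The sketch below keeps the polling loop independent of the SDK by taking a callable that returns the current status as a string. The terminal status names (`COMPLETED`, `DEAD`, `STOPPED`) and the default intervals are assumptions to adjust for your environment.

```python
import time
from typing import Callable, Sequence


def wait_for_terminal_status(
    get_status: Callable[[], str],
    terminal: Sequence[str] = ("COMPLETED", "DEAD", "STOPPED"),  # assumed status names
    interval_s: float = 30.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `get_status` repeatedly until it returns a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

In this notebook, the callable could be something like `lambda: client_aicore.execution.get(execution_id=execution_create.id).status.value`, assuming the returned execution object exposes its status as an enum with a string value.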

## Step 8: Observe the Results

Go to the 'Evaluations' tab in the 'Generative AI Hub' section of the SAP AI Core Launchpad to observe the results of the GenAI evaluation.

![Observing GenAI evaluation summary](../docs/images/evaluation-summary.png)
![Observing GenAI evaluation comparison results](../docs/images/evaluation-compare.png)
