# Ingest Live Data into your House Price Predictor with SAP AI Core

Author: https://github.com/dhrubpaul

You need Docker to complete this tutorial. If you are running this Jupyter notebook in a web-hosted environment, we recommend using your local system alongside it.

The steps mirror the tutorial at https://developers.sap.com/tutorials/ai-core-data.html
Open the tutorial and this notebook side by side to follow along.

## Prerequisite
Create a connection to SAP AI Core. Edit the cell below with your own credentials.

In [2]:
%pip install ai_core_sdk

Note: you may need to restart the kernel to use updated packages.


In [3]:
import os
import yaml

# Load Library
from ai_core_sdk.ai_core_v2_client import AICoreV2Client

with open('credentials.yaml', 'r') as file:
    env_vars = yaml.safe_load(file)
    
os.environ['AICORE_CLIENT_ID'] = env_vars['AICORE_CLIENT_ID']
os.environ['AICORE_CLIENT_SECRET'] = env_vars['AICORE_CLIENT_SECRET']
os.environ['AICORE_RESOURCE_GROUP'] = env_vars['AICORE_RESOURCE_GROUP']
os.environ['AICORE_AUTH_URL'] = env_vars['AICORE_AUTH_URL']
os.environ['AICORE_BASE_URL'] = env_vars['AICORE_BASE_URL']

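# Create the connection.
# Replace the placeholders with the values from your SAP AI Core service key,
# or pass the values loaded from credentials.yaml above (for example env_vars['AICORE_CLIENT_ID']).
ai_core_client = AICoreV2Client(
    base_url = "<YOUR_AI_API_URL>" + "/v2", # the current SAP AI Core API version is 2
    auth_url = "<YOUR_url>" + "/oauth/token", # suffix for the token endpoint
    client_id = "<YOUR_clientid>",
    client_secret = "<YOUR_clientsecret>"
)
# No output is expected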
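
Before creating the connection, it can help to confirm that `credentials.yaml` contains every key the cell above reads. The following is a minimal sketch, not part of the SDK: the helper name `missing_credentials` is hypothetical, and an in-memory dictionary stands in for the result of `yaml.safe_load(file)`.

```python
# Keys that the prerequisite cell reads from credentials.yaml
REQUIRED_KEYS = (
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "AICORE_RESOURCE_GROUP",
    "AICORE_AUTH_URL",
    "AICORE_BASE_URL",
)


def missing_credentials(config):
    """Return the required keys that are absent or empty in the loaded config."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Hypothetical example values; a real run would use yaml.safe_load(file)
example_config = {
    "AICORE_CLIENT_ID": "my-client-id",
    "AICORE_CLIENT_SECRET": "",
    "AICORE_AUTH_URL": "https://example.invalid/oauth/token",
}
print(missing_credentials(example_config))
# ['AICORE_CLIENT_SECRET', 'AICORE_RESOURCE_GROUP', 'AICORE_BASE_URL']
```

An empty list means all five keys are present and non-empty; it does not verify that the values themselves are valid.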

## Step 1: Modify AI code

Refer to step 1 of the tutorial: https://developers.sap.com/tutorials/ai-core-code.html#cf7b33ab-c455-47ee-a812-33a1ff587cf0

## Step 2: Create placeholders for datasets in workflows

*Refer to step 2 of the tutorial: https://developers.sap.com/tutorials/ai-core-data.html#9d10c18c-a5af-4b8d-ab61-a54738e88dfd*

## Step 3: Create placeholders for hyperparameters

*Refer to step 3 of the tutorial: https://developers.sap.com/tutorials/ai-core-data.html#6bd72f0d-4342-4a63-92af-5e598ec4c429*
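
A hyperparameter placeholder is only useful if the training code reads the value at run time. The sketch below shows one way that might look, assuming the workflow exposes the parameter to the container as an environment variable named `DT_MAX_DEPTH` (the parameter key used later in this notebook). The helper name `read_max_depth` and the fallback default of 3 are hypothetical.

```python
import os


def read_max_depth(environ=os.environ, default=3):
    """Parse DT_MAX_DEPTH from the environment; parameter values arrive as strings."""
    raw = environ.get("DT_MAX_DEPTH")
    if raw is None or raw.strip() == "":
        return default  # assumed fallback when the variable is not set
    depth = int(raw)  # raises ValueError for non-numeric input
    if depth < 1:
        raise ValueError(f"DT_MAX_DEPTH must be >= 1, got {depth}")
    return depth


# Plain dictionaries stand in for os.environ here
print(read_max_depth({"DT_MAX_DEPTH": "5"}))  # 5
print(read_max_depth({}))                     # 3
```

Parsing the string explicitly matters because parameter bindings are passed as strings (for example `value = "3"` in the configuration cells of this notebook).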

## Step 4: Observe your scenario and placeholders

In [8]:
import yaml

with open('credentials.yaml', 'r') as file:
    config = yaml.safe_load(file)

scenario_id = config['INGEST_DATA_SCENARIO_ID']
resource_group = config['AICORE_RESOURCE_GROUP']

response = ai_core_client.executable.query(
    scenario_id = scenario_id, resource_group = resource_group
)

for executable in response.resources:
    for key, value in executable.__dict__.items():
        if "artifact" in key or "parameter" in key:
            print(f"{key} :")
            for placeholder in value:
                print(f" {placeholder.__dict__}")
        else:
            print(f"{key} : {value}")

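Note: if this cell fails with `AIAPIAuthenticatorException: Could not retrieve Authorization token`, check the client ID, client secret, and authentication URL used in the prerequisite cell.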

## Step 5: Create cloud storage for datasets and models

Refer to step 5 of the tutorial: https://developers.sap.com/tutorials/ai-core-data.html#f85e50fd-5e5b-40d1-8159-a8e6e1cb93c0

## Step 6: Connect local system to AWS S3

Refer to step 6 of the tutorial: https://developers.sap.com/tutorials/ai-core-data.html#13bce519-1304-484c-aaf2-cd4296531a54

## Step 7: Upload datasets to AWS S3

Refer to step 7 of the tutorial: https://developers.sap.com/tutorials/ai-core-data.html#0e5a8f81-d0cc-48e3-9da7-a8847e8c338d

## Step 8: Store an object store secret in SAP AI Core

In [None]:
# Create object store secret
response = ai_core_client.object_store_secrets.create(
    name = "mys3", # identifier for this secret within your SAP AI Core
    path_prefix = "example-dataset/house-price-toy", # path within the bucket that this secret's access is restricted to
    type = "S3",
    data = { # dictionary of AWS credentials
        "AWS_ACCESS_KEY_ID": "<YOUR_AWS_ID>",
        "AWS_SECRET_ACCESS_KEY": "<YOUR_AWS_KEY>"
    },
    bucket = "<YOUR_BUCKET_NAME>", # Edit this
    region = "eu-central-1", # Edit this
    endpoint = "s3-eu-central-1.amazonaws.com", # Edit this
    resource_group = "default" # an object store secret is restricted to this resource group; change it when creating a secret for another resource group
)
print(response.__dict__)

## Step 9: Create artifact to specify folder of dataset

In [None]:
# Create Artifact
from ai_api_client_sdk.models.artifact import Artifact
from ai_api_client_sdk.models.label import Label

response = ai_core_client.artifact.create(
    name = "House Price Dataset 101", # custom, non-unique identifier
    kind = Artifact.Kind.DATASET,
    url = "ai://mys3/data/jan", # format: ai://<object store secret name>/<path relative to the secret's path_prefix>
    scenario_id = "learning-datalines",
    description = "Prices in the month of Jan",
    labels = [
        Label(key="ext.ai.sap.com/month", value="Jan"), # any descriptive key-value pair; helps with filtering. The key must have the prefix ext.ai.sap.com/
    ],
    resource_group = "default" # resource group in which the object store secret was created
)

print(response.__dict__)
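
The artifact URL does not contain the bucket or the full path; it names the object store secret and a path relative to that secret's `path_prefix`. The sketch below illustrates how `ai://mys3/data/jan` maps to a location in S3. It is a local illustration, not an SDK function: the helper `resolve_artifact_url` and the bucket name `my-bucket` are hypothetical.

```python
def resolve_artifact_url(url, secrets):
    """Map 'ai://<secret-name>/<relative-path>' to an s3:// URI.

    `secrets` maps a secret name to a (bucket, path_prefix) tuple.
    """
    scheme = "ai://"
    if not url.startswith(scheme):
        raise ValueError(f"expected an ai:// URL, got {url!r}")
    # The first path segment is the secret name; the rest is the relative path
    secret_name, _, relative_path = url[len(scheme):].partition("/")
    if secret_name not in secrets:
        raise KeyError(f"no object store secret named {secret_name!r}")
    bucket, path_prefix = secrets[secret_name]
    parts = [path_prefix.strip("/"), relative_path.strip("/")]
    key = "/".join(part for part in parts if part)
    return f"s3://{bucket}/{key}"


# Mirrors the secret created in Step 8, with a made-up bucket name
secrets = {"mys3": ("my-bucket", "example-dataset/house-price-toy")}
print(resolve_artifact_url("ai://mys3/data/jan", secrets))
# s3://my-bucket/example-dataset/house-price-toy/data/jan
```

If the dataset was uploaded to a different folder in Step 7, the relative path in the artifact URL needs to change accordingly.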

## Step 10: Locate artifacts

In [None]:
# List artifacts
response = ai_core_client.artifact.query(resource_group="default")

for artifact in response.resources:
    for key, value in artifact.__dict__.items():
        if "label" in key:
            if value is None:
                continue
            print(f"{key} :")
            for label in value:
                print(f" {label.__dict__}")
        else:
            print(f"{key} : {value}")
    print('-'*3)

*Please refer to step 5: https://developers.sap.com/tutorials/ai-core-code.html#beb0c055-7441-41d3-a285-304a1c41b6fb*
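
The configuration in the next step needs the ID of a specific artifact. Rather than reading it from the printed output, the labels can be used to filter the list. This is a minimal sketch, assuming each artifact object exposes `id` and `labels` attributes and each label exposes `key` and `value`; the helper `find_artifact_ids` and the sample IDs are hypothetical, and `SimpleNamespace` objects stand in for `response.resources`.

```python
from types import SimpleNamespace


def find_artifact_ids(artifacts, label_key, label_value):
    """Return the IDs of artifacts that carry the given label key-value pair."""
    matches = []
    for artifact in artifacts:
        labels = getattr(artifact, "labels", None) or []  # labels may be None
        if any(l.key == label_key and l.value == label_value for l in labels):
            matches.append(artifact.id)
    return matches


# Stand-in objects mimicking response.resources; the IDs are made up
sample = [
    SimpleNamespace(id="artifact-jan", labels=[SimpleNamespace(key="ext.ai.sap.com/month", value="Jan")]),
    SimpleNamespace(id="artifact-feb", labels=[SimpleNamespace(key="ext.ai.sap.com/month", value="Feb")]),
    SimpleNamespace(id="artifact-none", labels=None),
]
print(find_artifact_ids(sample, "ext.ai.sap.com/month", "Jan"))
# ['artifact-jan']
```

Because artifact names are not unique, several IDs can match the same label; inspect the results before picking one.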

## Step 11: Use artifacts with workflows using configuration

In [None]:
from ai_api_client_sdk.models.parameter_binding import ParameterBinding
from ai_api_client_sdk.models.input_artifact_binding import InputArtifactBinding

response = ai_core_client.configuration.create(
    name = "House Price January 1",
    scenario_id = "learning-datalines",
    executable_id = "data-pipeline",
    input_artifact_bindings = [
        InputArtifactBinding(key = "housedataset", artifact_id = "<YOUR_JAN_ARTIFACT_ID>") # key is the artifact placeholder name from the workflow
    ],
    parameter_bindings = [
        ParameterBinding(key = "DT_MAX_DEPTH", value = "3") # key is the parameter placeholder name from the workflow
    ],
    resource_group = "default"
)
print(response.__dict__)


## Step 12: Run your workflow using execution

In [None]:
# Create execution
response = ai_core_client.execution.create(
    configuration_id = '<YOUR_CONFIGURATION_ID>',
    resource_group = 'default'
)

response.__dict__


In [None]:
# Show execution logs
from datetime import datetime

response = ai_core_client.execution.query_logs(
    execution_id = '<YOUR_EXECUTION_ID>',
    resource_group = 'default',
    start = datetime(1990, 1, 1) # Optional; without it, only logs from the last hour are shown
)

for log in response.data.result:
    print(log.__dict__)


## Step 13: Set model pipeline

*Refer to step 13 of the tutorial: https://developers.sap.com/tutorials/ai-core-data.html#8a3de5a6-565a-4836-9ccf-4d4cbe0c4252*

## Step 14: Create required object store secret `default` for model


In [None]:
# Create object store secret for storing models in S3
response = ai_core_client.object_store_secrets.create(
    name = "default", # the name must be `default`; it is unrelated to the resource group name, which also happens to be `default` here
    path_prefix = "example-dataset/house-price-toy/model", # point the path prefix at the subdirectory where you want your models to be stored
    type = "S3",
    data = { # dictionary of AWS credentials
        "AWS_ACCESS_KEY_ID": "<YOUR_AWS_ID>",
        "AWS_SECRET_ACCESS_KEY": "<YOUR_AWS_KEY>"
    },
    bucket = "<YOUR_BUCKET_NAME>", # Edit this
    region = "eu-central-1", # Edit this
    endpoint = "s3-eu-central-1.amazonaws.com", # Edit this
    resource_group = "default"
)
print(response.__dict__)


## Step 15: Create another configuration with new data

In [None]:
from ai_api_client_sdk.models.parameter_binding import ParameterBinding
from ai_api_client_sdk.models.input_artifact_binding import InputArtifactBinding

response = ai_core_client.configuration.create(
    name = "House Price February 1",
    scenario_id = "learning-datalines",
    executable_id = "data-pipeline",
    input_artifact_bindings = [
        InputArtifactBinding(key = "housedataset", artifact_id = "<YOUR_FEB_ARTIFACT_ID>") # key is the artifact placeholder name from the workflow
    ],
    parameter_bindings = [
        ParameterBinding(key = "DT_MAX_DEPTH", value = "5") # key is the parameter placeholder name from the workflow
    ],
    resource_group = "default"
)
print(response.__dict__)


## Step 16: Create execution

In [None]:
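# Create an execution for the new configuration first (as in Step 12), then check its status here.
# Run this cell repeatedly, about every 30 seconds, until the execution reaches a final status.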
response = ai_core_client.execution.get(
    execution_id = '<YOUR_EXECUTION_ID>',
    resource_group = 'default'
)

for key, value in response.__dict__.items():
    if "output" in key:
        print(f"{key} : ")
        for artifact in value:
            print(f" {artifact.__dict__}")
    else:
        print(f"{key} : {value}")
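
Re-running the status cell by hand can be replaced with a small polling loop. The sketch below is one possible shape, not part of the SDK: `wait_for_execution` is a hypothetical helper that takes any callable returning the status as a string, and it assumes `COMPLETED`, `DEAD`, and `STOPPED` are the final states. A fake status sequence stands in for the real lookup so the example runs without a connection.

```python
import time

# Assumed final states of an execution
TERMINAL_STATES = {"COMPLETED", "DEAD", "STOPPED"}


def wait_for_execution(get_status, interval=30, timeout=1800, sleep=time.sleep):
    """Call get_status() until it returns a final state or the timeout elapses."""
    waited = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"execution still {status} after {waited} seconds")
        sleep(interval)
        waited += interval


# Stand-in for the real status lookup; sleeping is disabled for the demo
fake_statuses = iter(["PENDING", "RUNNING", "COMPLETED"])
print(wait_for_execution(lambda: next(fake_statuses), sleep=lambda seconds: None))
# COMPLETED
```

With the real client, `get_status` could wrap the `ai_core_client.execution.get(...)` call from the cell above and return the status of the response as a string; how the status is exposed on the response object is an assumption to verify against your SDK version.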


## Step 17: Locate your model in AWS S3

*Refer to step 17 of the tutorial: https://developers.sap.com/tutorials/ai-core-data.html#a74a958b-509c-42a8-a29d-d89dd240e0a4*