## Payment Delay Explanation
While [SHapley Additive exPlanations (SHAP) values](https://shap.readthedocs.io/en/latest/index.html) offer data scientists a powerful explanation of a prediction, they can be difficult for business users to interpret directly. In this notebook we use the Large Language Model (LLM) services of SAP AI Foundation to interpret the prediction results in natural language.

We demonstrate this through the example of [SAP AI Core Orchestration Service](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/orchestration-8d022355037643cebf775cd3bf662cc5?locale=en-US), which provides harmonized access to a wide range of frontier AI / LLM models. The orchestration service will be exposed as an MLflow model, for fully integrated downstream processing in SAP Databricks.

1. Install and import packages
2. Load prediction data and SHAP Values
3. Define Explanation Model class
4. Run Explanation Model
5. Validate explanation output

## Prerequisites
> [IMPORTANT]
> This step has already been done by the administrator. However, if you need to do it on your own, you can follow the steps described here to create a secret scope for secure access of the SAP Cloud SDK for AI:

To securely store and access SAP AI Core credentials from within SAP Databricks, create a Databricks secret scope (refer to the [Databricks documentation on creating secret scopes](https://docs.databricks.com/aws/en/security/secrets)).

Then, store the following access parameters as case-sensitive key-value pairs within this scope, as they are [defined in the documentation of the generative AI SDK](https://help.sap.com/doc/generative-ai-hub-sdk/CLOUD/en-US/_reference/README_sphynx.html#environment-variables):

- AICORE_BASE_URL
- AICORE_AUTH_URL
- AICORE_CLIENT_ID
- AICORE_CLIENT_SECRET
- AICORE_RESOURCE_GROUP

Setting these credentials as environment variables allows the SAP Cloud SDK for AI (Python) to seamlessly authenticate and interact with the SAP AI Core orchestration service within the Databricks runtime. The SAP Cloud SDK for AI (Python) automatically manages the configuration and deployment of orchestration service endpoints and the desired Large Language Models (LLMs) upon initial use.
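
As a minimal sketch of this step, the helper below copies the five parameters from a secret store into environment variables and fails early if one of them is missing or empty. The function name `export_aicore_credentials` and the `get_secret` callable are hypothetical and not part of the SDK; in a Databricks notebook the callable would typically wrap `dbutils.secrets.get`.

```python
import os
from typing import Callable, MutableMapping, Optional

# The five keys listed above, as read by the SAP Cloud SDK for AI (Python).
AICORE_KEYS = (
    "AICORE_BASE_URL",
    "AICORE_AUTH_URL",
    "AICORE_CLIENT_ID",
    "AICORE_CLIENT_SECRET",
    "AICORE_RESOURCE_GROUP",
)


def export_aicore_credentials(
    get_secret: Callable[[str], Optional[str]],
    environ: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Copy all AI Core parameters into `environ` (default: os.environ).

    `get_secret` is a hypothetical callable that returns the secret value
    for a key, or None / an empty string if the key is not set.
    """
    target = os.environ if environ is None else environ
    values = {key: get_secret(key) for key in AICORE_KEYS}
    missing = [key for key, value in values.items() if not value]
    if missing:
        # Nothing is written unless all five parameters are available.
        raise KeyError(f"Missing secrets: {', '.join(missing)}")
    target.update(values)
```

Inside the notebook, a call could look like `export_aicore_credentials(lambda key: dbutils.secrets.get(scope=SECRET_SCOPE, key=key))`, assuming the scope name is held in a variable such as `SECRET_SCOPE`.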


## 1. Install and import packages
In the next few cells, we install and import the required packages. Please make sure that the right versions of the libraries are used. While SAP AI Core capabilities are exposed via a REST API (see the [documentation on the SAP Business Accelerator Hub](https://api.sap.com/package/SAPAICore/rest)), this guide leverages the SAP Cloud SDK for AI (Python) to simplify consumption within Databricks notebooks. This SDK, [sap-ai-sdk-gen](https://pypi.org/project/sap-ai-sdk-gen/), is available as a free download from PyPI and is documented on the [SAP Help Portal](https://help.sap.com/doc/generative-ai-hub-sdk/CLOUD/en-US/index.html).

In [0]:
%pip install "sap-ai-sdk-gen[all]"
%pip install "mlflow[databricks]==3.1.3"
%restart_python
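
One way to confirm which versions ended up in the environment after the restart is to query the package metadata. This is a small sketch using only the standard library; the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return the installed version per package, or None if it is not installed."""
    result: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# Package names taken from the install cell above.
print(installed_versions(["sap-ai-sdk-gen", "mlflow"]))
```

Since the install cell pins `mlflow` to `3.1.3`, that is the version the check should report for `mlflow` if the installation succeeded.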

In [0]:
import os
import json
import random
import datetime, time
from typing import List, Optional, Iterable, Union, Dict, Any
import httpx
import mlflow
import gen_ai_hub

from gen_ai_hub.orchestration.service import OrchestrationService
from gen_ai_hub.orchestration.models.config import OrchestrationConfig
from gen_ai_hub.orchestration.exceptions import OrchestrationError

#### Set Parameters
Please replace the values `<CATALOG_NAME>` and `<SCHEMA_NAME>` with the specific values that match your use case and group. You can find the correct names by checking the **Unity Catalog** and looking for the specific catalog and schema names: `uc_XXX`, `grpX`.

In [0]:
%sql
-- CREATE CATALOG IF NOT EXISTS <CATALOG_NAME>;
SET CATALOG <CATALOG_NAME>;
CREATE SCHEMA IF NOT EXISTS <SCHEMA_NAME>;
USE SCHEMA <SCHEMA_NAME>;

In [0]:
SECRET_SCOPE = "aicore_service_params"
# single env var provided to model
os.environ["SECRET_SCOPE"] = SECRET_SCOPE


## 2. Load prediction data and SHAP Values
Replace the table name in the next cell (the `<DELAY_PREDICTION_SHAP>` placeholder, pre-filled here with an example value) with the name of the delay prediction result table from the previous exercise.

In [0]:
shap_table_df = spark.read.table("uc_delayed_payment.grp1.delay_prediction_dataset_shap_martin")
# display(shap_table_df.limit(10))
print(f"rows: {shap_table_df.count()}, columns: {len(shap_table_df.columns)}")

#### Filter on Top 5 payment delay predictions 

We will use only a subset of the prediction dataset for the LLM explanation. For that we create a data sample of the top 5 delays. Replace `<LIMIT>` with the value `5`.

In [0]:
filtered_shap_table_df = shap_table_df.orderBy("delay_prediction", ascending=False).limit(<LIMIT>).toPandas()
display(filtered_shap_table_df)

## 3. Define Explanation Model class

In the following we show how the additional capabilities of the SAP AI Core Orchestration Service can be leveraged in combination with MLflow in Databricks to generate our model explanations for the delay prediction. We use the orchestration service of GenAI Hub via the SAP Cloud SDK for AI (Python) as documented [here](https://help.sap.com/doc/generative-ai-hub-sdk/CLOUD/en-US/_reference/orchestration-service.html).

In our exercise, we use a **GPT-4o-mini** model with a maximum token limit of **1000**. We use the prompt template of SAP AI Core directly, without the prompt registry functionality.
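
The class defined later in this section uses a template with a single placeholder, `{{?prompt}}`, that is filled with the complete prompt text. To illustrate what filling a template value means, the sketch below substitutes `{{?name}}` placeholders with plain Python. It is only a stand-in: the actual substitution happens inside the orchestration service, and the helper `render_template` is hypothetical.

```python
import re
from typing import Dict

# Matches placeholders of the form {{?name}}.
PLACEHOLDER = re.compile(r"\{\{\?(\w+)\}\}")


def render_template(template: str, values: Dict[str, str]) -> str:
    """Replace each {{?name}} placeholder in `template` with values[name]."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"No value provided for placeholder '{name}'")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


message = render_template("{{?prompt}}", {"prompt": "Explain this prediction."})
print(message)
```

With a single placeholder, the whole prompt is assembled in Python and the template only passes it through, which matches the choice of not using the prompt registry in this exercise.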

For the integration into MLflow, we provide a custom Python class derived from [mlflow.pyfunc.PythonModel](https://mlflow.org/docs/latest/api_reference/python_api/mlflow.pyfunc.html#mlflow.pyfunc.PythonModel). The class is described in the following sections and has these methods:

- **\__init\__(…)** to instantiate the class
- **load_context(…)** to instantiate the orchestration service
- **predict(…)** to generate the verbal explanations

#### init(...)
The init function defines three main attributes:
- llm_model: stores the model name,
- max_tokens: stores the maximum number of tokens for the LLM response, and
- orchestration_service: stores the orchestration service instance.

#### load_context(...)

The load_context method is used to establish the connection to an external service from the MLflow custom class. A detailed explanation of the setup of the MLflow pyfunc PythonModel class can be found in this [tutorial article](https://www.mlflow.org/docs/latest/ml/traditional-ml/tutorials/creating-custom-pyfunc/part2-pyfunc-components/).



Instead of consuming an LLM directly for the explanation, we use the orchestration service of the Generative AI Hub because it offers a variety of additional features, such as
- the Prompt Registry for the management of prompt templates,
- the capability to mask personal or enterprise-critical information.

For that we configure and create an instance of the SAP AI Core OrchestrationService using the model name and the maximum number of tokens.

Replace the variable `<LLM_MODEL>` with the value `gpt-4o-mini` and the variable `<MAX_TOKENS>` with the value `1000`.

#### predict(...)
The predict method expects the *model_input* argument, which receives the filtered SHAP values as input.

For the overall output generation, we iterate over each individual row of the pandas DataFrame. In the very first step, we extract the primary keys so that they can be returned together with the generated delay_explanation. The variables shap_array and delay_prediction are used to fill the prompt template that we defined in the orchestration configuration.
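
The per-row bookkeeping described above can be sketched without the LLM call. This is only an illustration: the helper `build_results` is hypothetical, the sample row contains made-up values, and `explain` is a stand-in for the request to the orchestration service.

```python
from typing import Any, Callable, Dict, List


def build_results(
    rows: List[Dict[str, Any]],
    key_columns: List[str],
    explain: Callable[[Dict[str, Any]], str],
) -> List[Dict[str, Any]]:
    """Return one result record per input row: keys, prediction, explanation."""
    results = []
    for row in rows:
        # Carry over the primary keys so each explanation can be joined back to its document.
        result = {key: row[key] for key in key_columns}
        result["delay_prediction"] = row["delay_prediction"]
        result["delay_explanation"] = explain(row)
        results.append(result)
    return results


# Made-up sample row with two of the key columns used in this notebook.
sample_rows = [
    {"CompanyCode": "1010", "AccountingDocument": "0001", "delay_prediction": 12.5, "shap_array": []},
]
print(build_results(sample_rows, ["CompanyCode", "AccountingDocument"], lambda row: "placeholder text"))
```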

We focus on the five variables with the highest absolute SHAP values, which we select by replacing `<TOP_FEATURES>` with the value `5`.
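
The selection by absolute SHAP value can be tried in isolation. The sketch below uses made-up sample entries and a hypothetical helper `top_features`; real entries of `shap_array` also carry a description and a value.

```python
from typing import Any, Dict, List


def top_features(shap_array: List[Dict[str, Any]], k: int) -> List[Dict[str, Any]]:
    """Return the k entries with the largest absolute SHAP value, largest first."""
    return sorted(shap_array, key=lambda item: abs(item["shap_value"]), reverse=True)[:k]


# Made-up sample entries for illustration only.
sample = [
    {"column_name": "a", "shap_value": 0.5},
    {"column_name": "b", "shap_value": -3.0},
    {"column_name": "c", "shap_value": 1.2},
]
print([item["column_name"] for item in top_features(sample, 2)])  # ['b', 'c']
```

Sorting by the absolute value keeps strongly negative contributions (features that reduce the predicted delay) in the selection, and slicing returns fewer than `k` entries without an error when a row has fewer features.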

Once we have prepared the input of our data, we start our MLflow span to log our traces for the individual completion task.

In [0]:
import os
import json
from typing import Any, Dict

import pandas as pd
import mlflow
from mlflow.tracing import set_span_chat_messages
from gen_ai_hub.orchestration.models.llm import LLM
from gen_ai_hub.orchestration.models.message import UserMessage
from gen_ai_hub.orchestration.models.template import Template, TemplateValue
class ExplanationModel(mlflow.pyfunc.PythonModel):

    def __init__(self):
        self.feature_descrptions = None
        self.key_columns = None
        self.llm_model = "gpt-4o-mini"
        self.max_tokens = 1000
        self.orchestration_service = None

        # since the model is not saved and loaded, we need to load the context here
        self.load_context()


    def load_context(self, context=None) -> None:
        """
        Read the SAP AI Core credentials from the secret scope and create the orchestration service.
        """
        # get secret scope
        secret_scope = os.environ['SECRET_SCOPE']

        # set secret values as environment variables
        for key in ["AICORE_CLIENT_ID", "AICORE_CLIENT_SECRET", "AICORE_AUTH_URL", "AICORE_BASE_URL", "AICORE_RESOURCE_GROUP"]:
            os.environ[key] = dbutils.secrets.get(scope=secret_scope, key=key)

        # the complete prompt is passed in as a single template value
        user_message_content = "{{?prompt}}"

        orchestration_config = OrchestrationConfig(
            template=Template(messages=[UserMessage(user_message_content)]),
            llm=LLM(name=self.llm_model, parameters={"max_tokens": self.max_tokens})
        )
        self.orchestration_service = OrchestrationService(config=orchestration_config)

    def load_key_column_names(self, key_columns):
        self.key_columns = key_columns


    def build_prompt(self, row: Dict[str, Any]) -> str:
        """
        We construct the prompt from "shap_array" as a dynamic set of parameters.
        """

        # We focus on the features with the highest absolute SHAP values
        # (in the exercise, <TOP_FEATURES> is replaced with the value 5).
        included_feature_number = 5
        shap_value_array = sorted(row["shap_array"], key=lambda x: abs(x["shap_value"]), reverse=True)[:included_feature_number]

        feature_descriptions_md = ""
        feature_values_md = ""
        shap_values_md = ""

        # Iterate over the selected entries directly, so that rows with fewer features do not raise an IndexError
        for shap_array_item in shap_value_array:
            feature_descriptions_md += f"- '{shap_array_item['column_name']}': {shap_array_item['column_description']}\n"
            feature_values_md += f"- Value of '{shap_array_item['column_name']}': {shap_array_item['column_value']}\n"
            shap_values_md += f"- SHAP value for '{shap_array_item['column_name']}': {shap_array_item['shap_value']} days\n"


        prompt = "You are a data scientist who explains predictions of a business artificial intelligence model.\n"
        prompt += "The model predicts the expected delay for a payment.\n"

        prompt += "\nThe following payment attributes are relevant for the prediction:\n"
        prompt += feature_descriptions_md

        prompt += "\nValues of these payment attributes:\n"
        prompt += feature_values_md

        prompt += "\nSHAP values for the attributes:\n"
        prompt += shap_values_md

        #TODO: add confidence etc

        prompt += f"\nThe SHAP value for a feature describes to what amount the predicted delay of {row['delay_prediction']} days deviates from the average value. A negative SHAP value for a feature means that the feature value reduces the prediction below average, while a positive value means that the feature value increases the prediction above average.\n"
        prompt += "Your task is to explain this specific prediction in a concise manner to a business person who is not a data scientist and who wants to understand which features are relevant for the prediction."
        prompt += " Do not mention the SHAP values in your explanation. You can use the feature values."

        return prompt



    def predict(self, model_input, params=None):
        """ 
        Process the model input.
        Expected input: Pandas DataFrame with
        - key columns, with column names as listed in the list 'key_columns'
        - column 'shap_array', where each entry has the keys 'column_name', 'column_description', 'column_value', 'shap_value' (all double)
        - column 'delay_prediction'
        Generated output: Pandas DataFrame with
        - key columns,
        - column 'delay_prediction' (double), and
        - column 'delay_explanation' (string)
        """
        
        model_output = []
        rows = json.loads(model_input.to_json(orient='records'))

        for row in rows:

            row_result = {}
            for key in self.key_columns:
                row_result[key] = row[key]

            row_result["delay_prediction"] = row["delay_prediction"]

            template_values = []
            prompt = self.build_prompt(row)
            template_values.append(TemplateValue(name="prompt", value=prompt))

            with mlflow.start_span(name="shap_explanation", span_type="LLM") as span:

                span.set_inputs(row)
                try:
                    orchestration_response = self.orchestration_service.run(template_values=template_values)
                    orchestration_result = orchestration_response.orchestration_result

                    choice = orchestration_result.choices[0]
                    completion = choice.message.content
                    row_result["delay_explanation"] = completion
                    messages = [{"role": message.role.value, "content": message.content} for message in orchestration_response.module_results.templating]
                    messages.append({"role": "assistant", "content": completion})
                    set_span_chat_messages(span, messages)
                    span.set_attributes({"model_name": self.llm_model, "max_tokens": self.max_tokens})
                    span.set_outputs({"prompt": prompt, "delay_explanation": completion})

                except OrchestrationError as error:
                    # Keep the row in the output and record the error instead of an explanation,
                    # so that model_output stays a list of uniform records.
                    row_result["delay_explanation"] = f"ERROR: {error.message}"
                    span.set_outputs({"ERROR": error.message})

                model_output.append(row_result)
        
        
        pd_model_output = pd.DataFrame.from_records(model_output)
        return pd_model_output

## 4. Run explanation model

In [0]:
mlflow.set_tracking_uri("databricks")
mlflow.set_registry_uri("databricks-uc")


In [0]:
key_columns = ["CompanyCode", "AccountingDocument", "FiscalYear", "AccountingDocumentItem"]

In [0]:
explanation_model = ExplanationModel()

explanation_model.load_key_column_names(key_columns)
# explanation_model.load_feature_decriptions(feature_descrptions)


# generate the explanations for the filtered predictions
explanation_output = explanation_model.predict(filtered_shap_table_df)

display(explanation_output)


## 5. Validate the explanation output

When executed successfully, you should see the output table and the MLflow Trace UI. There you will find all logged information well organized for review:
1. The _explanation_output_ table
    - contains 5 records
    - each record has the column *delay_explanation*, where the LLM result is stored
2. The MLflow Trace UI
    - The tab _Chat_ contains all the messages we logged during the execution. Here you can find the LLM explanation in the `Assistant` section.
    - The tab _Inputs/Outputs_ contains the inputs and outputs for the explanation.
    - The tab _Attributes_ contains the parameters we set in the explanation model class.
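
The checks on the output table can also be scripted. The following is a minimal sketch with a hypothetical helper `validate_explanations`; it works on a list of records, which could be obtained from the output DataFrame with `explanation_output.to_dict(orient="records")`.

```python
from typing import Any, Dict, List


def validate_explanations(records: List[Dict[str, Any]], expected_count: int = 5) -> List[str]:
    """Return a list of problems; an empty list means the output looks as expected."""
    problems = []
    if len(records) != expected_count:
        problems.append(f"expected {expected_count} records, found {len(records)}")
    for index, record in enumerate(records):
        explanation = record.get("delay_explanation")
        # An explanation must be a non-empty string.
        if not isinstance(explanation, str) or not explanation.strip():
            problems.append(f"record {index} has no delay_explanation")
    return problems
```

An empty result list indicates that all five records carry an explanation; any entries in the list point to the records that need a closer look in the Trace UI.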