<a target="_blank" href="https://colab.research.google.com/github/amanichopra/sap-genai-hub/blob/main/orchestration_templating.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Preparation

## Install Libraries

In [None]:
!pip install "generative-ai-hub-sdk[all]"
!pip install "numpy<2.0.0" --force-reinstall

Now restart the runtime. In Google Colab, you can do this by clicking `Runtime` and then `Restart Session`, as shown here:

<img src="assets/colab_restart_session.png" style="width:500px">

You can now continue by running the cells below. The packages installed above remain available in the runtime after the restart.

## Authentication

Before any requests can be sent to the orchestration service, we need to provide authentication details to the SDK. This can be done either via a configuration file or via environment variables. Read the [Generative AI Hub SDK docs](https://help.sap.com/doc/generative-ai-hub-sdk/CLOUD/en-US/index.html) for more details. Below is an example of authenticating via environment variables from within this notebook. Store your credentials in a file called `env_vars.env` for the command below to work. If you are using Google Colab, you can place this file in the project folder by clicking the folder icon on the left and dropping the file into the workspace, as shown:

<img src="./assets/upload_env.png" style="width:500px">

In [1]:
import os
from dotenv import load_dotenv

load_dotenv(dotenv_path='env_vars.env')

True

## Initializing the Orchestration Service

Typically, a virtual deployment of Orchestration must be configured before any interactions can occur. Once deployed, you will have access to a unique endpoint URL and deployment ID. You can use either the URL or the ID with the SDK. In this exercise, we will use the deployment ID, which should be defined in the `AICORE_ORCH_DEPLOYMENT_ID` environment variable.
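Before moving on, it can help to confirm that the deployment ID is actually present in the environment. The following is a minimal sketch, not part of the SDK: `missing_env_vars` is a hypothetical helper, and the only variable it checks here is the `AICORE_ORCH_DEPLOYMENT_ID` named above. Extend the list with whatever credentials your `env_vars.env` file is expected to define.

```python
import os


def missing_env_vars(required, env=None):
    """Hypothetical helper: return the names in `required` that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Only the variable named in this notebook is checked; add others as needed.
required = ["AICORE_ORCH_DEPLOYMENT_ID"]

missing = missing_env_vars(required)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

If the check reports a missing variable, verify that `env_vars.env` was uploaded to the workspace and that `load_dotenv` returned `True`.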

# Templating Module in the Orchestration Service

Now that everything is prepared, we can write our first basic orchestration pipeline. The first fundamental module we will look at is templating. The templating module lets you define prompt skeletons that are then parameterized on each inference call. To see how this works, we first select a Large Language Model (LLM) to use for inference.

In [14]:
from gen_ai_hub.orchestration.models.llm import LLM

llm = LLM(
    name="gemini-1.5-flash",
    version="latest",
    parameters={"max_tokens": 256, "temperature": 0.2},
)

Now we can create a template using the `Template` class provided by the Generative AI Hub SDK.

In [15]:
from gen_ai_hub.orchestration.models.message import SystemMessage, UserMessage
from gen_ai_hub.orchestration.models.template import Template, TemplateValue

template = Template(
    messages=[
        SystemMessage("You are a helpful translation assistant."),
        UserMessage(
            "Translate the following text to {{?to_lang}}: {{?text}}",
        ),
    ],
    defaults=[
        TemplateValue(name="to_lang", value="English"),
    ],
)

The code above creates a template that provides

- a system message,
- a user message that leverages templating syntax,
- a default value for one of the introduced template parameters (`to_lang`).

Currently there are three message types available:
- `SystemMessage`: A message for priming AI behavior. The system message is usually passed in as the first of a sequence of input messages.
- `UserMessage`: A message from a user.
- `AssistantMessage`: A message from the LLM.

Parameters are defined within the message string using the following syntax: `{{?param_name}}`.
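To make the placeholder syntax concrete, here is a minimal local sketch of how `{{?param_name}}` placeholders could be resolved from call-time values and defaults. This is an illustration only: `render` and `PLACEHOLDER` are hypothetical names that are not part of the SDK, the sketch does not reproduce how the orchestration service performs substitution internally, and raising `KeyError` for an unresolved parameter is a choice made for this example.

```python
import re
from typing import Dict, Optional

# Matches placeholders of the form {{?param_name}}.
PLACEHOLDER = re.compile(r"\{\{\?(\w+)\}\}")


def render(message: str, values: Dict[str, str],
           defaults: Optional[Dict[str, str]] = None) -> str:
    """Hypothetical helper: fill placeholders, letting call-time values override defaults."""
    merged = {**(defaults or {}), **values}

    def substitute(match):
        name = match.group(1)
        if name not in merged:
            # Assumption of this sketch: an unresolved parameter is an error.
            raise KeyError(f"no value or default for template parameter '{name}'")
        return str(merged[name])

    return PLACEHOLDER.sub(substitute, message)


user_message = "Translate the following text to {{?to_lang}}: {{?text}}"
defaults = {"to_lang": "English"}

# Only `text` is supplied, so `to_lang` falls back to its default.
with_default = render(user_message, {"text": "Interaktives Lernen mit SAP."}, defaults)

# Supplying `to_lang` explicitly overrides the default.
with_override = render(
    user_message,
    {"text": "Interaktives Lernen mit SAP.", "to_lang": "French"},
    defaults,
)

print(with_default)
print(with_override)
```

The first call mirrors the template defined above, where `to_lang` has a default and `text` does not.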

Next, we create an orchestration configuration from the objects created above.

In [16]:
from gen_ai_hub.orchestration.models.config import OrchestrationConfig

config = OrchestrationConfig(
    template=template,
    llm=llm,
)

Lastly, we can call the orchestration service. Note that the actual template values are now passed to the `run` method. The `name` of the `TemplateValue` corresponds to the parameter `text` in the user message string. The parameter `to_lang` is omitted, so the default defined in the `Template` is used.

In [22]:
from gen_ai_hub.orchestration.service import OrchestrationService

orchestration_service = OrchestrationService(
    deployment_id=os.environ['AICORE_ORCH_DEPLOYMENT_ID'],
    config=config,
)
result = orchestration_service.run(
    template_values=[
        TemplateValue(
            name="text",
            value="Interaktives Lernen mit SAP.",
        )
    ]
)
print(result.orchestration_result.choices[0].message.content)

Interactive learning with SAP.



# Model Harmonization

Orchestration harmonizes model usage, removing the need to prompt each model in a vendor-specific fashion. You can easily switch between a variety of models. Check out this SAP [note](https://me.sap.com/notes/3437766) for further information regarding model availability. The code below demonstrates how to switch between models, reusing the templating code above.

In [26]:
orchestration_service.config.llm = LLM(
    name="o3-mini",
) # switch out gemini-1.5-flash with o3-mini

result = orchestration_service.run(
    template_values=[
        TemplateValue(
            name="text",
            value="Interaktives Lernen mit SAP.",
        )
    ]
)
print(result.orchestration_result.choices[0].message.content)

Interactive learning with SAP.


You can switch between these two models and compare their responses throughout all exercises. Simply change the `name` parameter of the `LLM` configuration.
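One way to organize such a comparison is to loop over model names and collect one response per model. The sketch below is self-contained and uses a stand-in function instead of a real service call: `compare_models` and `fake_run` are hypothetical names introduced for illustration, not SDK functions.

```python
def compare_models(model_names, run_fn):
    """Hypothetical helper: call run_fn once per model name and collect responses by name."""
    return {name: run_fn(name) for name in model_names}


def fake_run(name):
    # Stand-in for a real call. In this notebook, a real version would set
    # orchestration_service.config.llm = LLM(name=name), call
    # orchestration_service.run(...), and return
    # result.orchestration_result.choices[0].message.content.
    return f"<response from {name}>"


responses = compare_models(["gemini-1.5-flash", "o3-mini"], fake_run)
for name, text in responses.items():
    print(f"{name}: {text}")
```

Passing the service call in as a function keeps the comparison loop independent of how each request is issued, so the same loop can be reused in later exercises.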

# Summary

In this exercise, you learned how to create a basic orchestration pipeline that uses the templating module. You also changed the model used for inference with ease. Let's explore more modules in the following exercises. Continue to [Exercise 3 - Orchestration Content Filtering](./orchestration_content_filtering.ipynb).