In [1]:
!pip install -U "generative-ai-hub-sdk>=3.1" tqdm



# ReAct Agent From Scratch Using SAP's Generative AI Hub

We are going to build a [ReAct Agent](https://arxiv.org/abs/2210.03629) without any AI agent framework, using the generative AI hub's **Orchestration service** for the LLM calls.

![Screenshot taken from ](./orchestration_service.png)

The Orchestration service helps you design robust AI workflows, connecting diverse components like data pipelines, AI models, and prebuilt modules (grounding, content filtering, and more), and gain peace of mind with less maintenance. Focus on innovation, not integration, and bring your AI vision to life faster. Learn more about the Orchestration service [here](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/orchestration-workflow).

The Orchestration service provides access to all LLMs available on the Generative AI Hub through a harmonized API. This API allows the integration of both proprietary and open-source models, providing a unified interface for an ever-growing and changing model landscape.


#### Create an Orchestration Deployment

To use orchestration, we need to create a deployment for it in AI Core.

The following code will check your instance for an existing deployment, and if none is found, create one and wait for it to become available.

In [2]:
from typing import Literal, Type, Union, Dict, Any, List, Callable
from gen_ai_hub.proxy import get_proxy_client
from ai_core_sdk.ai_core_v2_client import AICoreV2Client
from ai_api_client_sdk.models.status import Status
import time
from IPython.display import clear_output

client = get_proxy_client()

def spinner(check_callback: Callable, timeout: int = 300, check_every_n_seconds: int = 10):
    """Poll `check_callback` until it returns a truthy value or `timeout` seconds pass."""
    start = time.time()
    last_check = start
    while time.time() - start < timeout:
        now = time.time()
        if now - last_check > check_every_n_seconds:
            last_check = now  # remember when we polled so we wait before polling again
            return_value = check_callback()
            if return_value:
                return return_value
        for char in '|/-\\':
            clear_output(wait=True)  # Clears the output to show a fresh update
            print(f'Waiting for the deployment to become ready... {char}')
            time.sleep(0.2)  # Adjust the speed as needed
    raise TimeoutError(f"Deployment did not become ready within {timeout} seconds")


def retrieve_or_deploy_orchestration(ai_core_client: AICoreV2Client,
                                     scenario_id: str = "orchestration",
                                     executable_id: str = "orchestration",
                                     config_suffix: str = "simple",
                                     start_timeout: int = 300):
    if not config_suffix:
        raise ValueError("Empty `config_suffix` not allowed")
    deployments = ai_core_client.deployment.query(
        scenario_id=scenario_id,
        executable_ids=[executable_id],
        status=Status.RUNNING
    )
    if deployments.count > 0:
        return sorted(deployments.resources, key=lambda x: x.start_time)[0]
    config_name = f"{config_suffix}-orchestration"
    configs = ai_core_client.configuration.query(
        scenario_id=scenario_id,
        executable_ids=[executable_id],
        search=config_name
    )
    if configs.count > 0:
        # Reuse an existing configuration (the original line read from `deployments` by mistake)
        config = configs.resources[0]
    else:
        config = ai_core_client.configuration.create(
            scenario_id=scenario_id,
            executable_id=executable_id,
            name=config_name,
        )
    deployment = ai_core_client.deployment.create(configuration_id=config.id)

    def check_ready():
        updated_deployment = ai_core_client.deployment.get(deployment.id)
        return None if updated_deployment.status != Status.RUNNING else updated_deployment
    
    return spinner(check_ready, timeout=start_timeout)

#### Helper Function to run LLM requests

We will define a simple helper function that accepts a prompt template, model name and values for the placeholders in the prompt template.

In [3]:
import pathlib

import yaml

from gen_ai_hub.proxy import get_proxy_client
from ai_api_client_sdk.models.status import Status

from gen_ai_hub.orchestration.models.config import OrchestrationConfig
from gen_ai_hub.orchestration.models.llm import LLM
from gen_ai_hub.orchestration.models.message import SystemMessage, UserMessage
from gen_ai_hub.orchestration.models.template import Template, TemplateValue
from gen_ai_hub.orchestration.models.content_filter import AzureContentFilter
from gen_ai_hub.orchestration.service import OrchestrationService

client = get_proxy_client()
deployment = retrieve_or_deploy_orchestration(client.ai_core_client)
orchestration_service = OrchestrationService(api_url=deployment.deployment_url, proxy_client=client)

def send_request(prompt, _print=False, _model='meta--llama3-70b-instruct', **kwargs):
    config = OrchestrationConfig(
        llm=LLM(name=_model, parameters={"temperature": 0}),
        template=Template(messages=[UserMessage(prompt)])
    )
    template_values = [TemplateValue(name=key, value=value) for key, value in kwargs.items()]
    answer = orchestration_service.run(config=config, template_values=template_values)
    result = answer.module_results.llm.choices[0].message.content 
    if _print:
        formatted_prompt = answer.module_results.templating[0].content
        print(f"<-- PROMPT --->\n{formatted_prompt}\n<--- RESPONSE --->\n{result}")
    return result


In [4]:
template = "Write a haiku about {{?topic}}"

_ = send_request(template, topic="SAP Devtover", _print=True, _model='meta--llama3-70b-instruct')

<-- PROMPT --->
Write a haiku about SAP Devtover
<--- RESPONSE --->
Here is a haiku about SAP Devtober:

Code flows like a stream
Innovate with SAP pride
Devtober dreams big
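
To make the `{{?topic}}` placeholder syntax concrete, here is a minimal local sketch of how such a template could be filled. In the notebook the Orchestration service does the actual rendering; the `render_template` helper below is hypothetical and only illustrates the idea.

```python
import re


def render_template(template: str, values: dict) -> str:
    """Fill {{?name}} placeholders locally (illustration only, not the service's code)."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"No value provided for placeholder '{name}'")
        return str(values[name])

    # Matches placeholders of the form {{?name}}
    return re.sub(r"\{\{\?(\w+)\}\}", substitute, template)


rendered = render_template("Write a haiku about {{?topic}}", {"topic": "SAP Devtober"})
print(rendered)
```

Each keyword argument passed to `send_request` plays the role of one entry in `values`.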


## Building a ReAct Agent from Scratch

We need to define tools for the agent. To simplify this, we will write a helper class that extracts a tool's name, description, and input schema from a Python function definition.

In [5]:
from dataclasses import dataclass
from typing import Callable
import json

@dataclass
class ToolInfo:
    name: str
    description: str
    schema: dict

    @classmethod
    def from_callable(cls, func: Callable):
        # Get the function's name
        func_name = func.__name__
        # Get the function's docstring, default to an empty string if it's None
        func_description = func.__doc__ or ""
        # Get the function's type annotations
        annotations = func.__annotations__
        # Build the schema from the annotated parameters, skipping the return annotation
        schema = {k: str(v) for k, v in annotations.items() if k != 'return'}
        # Return an instance of ToolInfo with the extracted name, description, and schema
        return cls(name=func_name, description=func_description, schema=schema)

    def __str__(self):
        return f"Tool: {self.name}\nUsage Info: {self.description}\nSchema: {self.schema}"



In [6]:
def add(a: int, b: int) -> int:
    """Adds two numbers."""
    return a + b

print(str(ToolInfo.from_callable(add)))


Tool: add
Usage Info: Adds two numbers.
Schema: {'a': "<class 'int'>", 'b': "<class 'int'>"}
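
The schema above renders types as `<class 'int'>`, which works but is noisy inside a prompt. As a possible variation (not part of the original notebook), the sketch below uses `inspect.signature` to produce short type names and to keep parameters that have no annotation. The `readable_schema` helper is a hypothetical name.

```python
import inspect
from typing import Callable


def readable_schema(func: Callable) -> dict:
    """Map each parameter name to a short type name such as 'int'."""
    schema = {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            schema[name] = "any"  # unannotated parameters are kept, not dropped
        else:
            # Classes expose __name__; other annotations fall back to str()
            schema[name] = getattr(annotation, "__name__", str(annotation))
    return schema


def add(a: int, b: int) -> int:
    """Adds two numbers."""
    return a + b


print(readable_schema(add))
```

Whether the shorter form helps a given model is an empirical question; the rest of this notebook keeps the original `ToolInfo` output.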


The key component of the ReAct agent is the prompt.
We need to instruct the model to follow the `Thought -> Action -> Observation` flow and tell it which tools are available and how to use them.

In [7]:
REACT_PROMPT = """Respond to the human as helpfully and accurately as possible. You are an agent that gets a set of tools and can use these tools in actions.

The goal is to provide an answer to the query of the user. To find the correct answer you can use tools. As input you are given the user query and the steps already taken.
Your response should always follow the pattern shown below. In your response you have to replace ... with actual output. Text in [] is to explain what kind of output is expected.

--- start: response format in case of an action---
Thought: ... [Your thought on what to do next]
Action:
```
... [A json blob describing the next action]
```
Observation: [always finish with "Observation" to trigger the execution of the action and to see the result]
--- end: response format in case of an action---
If you don't have to use a tool because you know the final answer instead of specifying an Action, give the final Answer like this:
--- start: response format to provide the final answer---
Thought: ... [Your thought on what to do next]
Final Answer: the final answer to the original input question
--- end: response format to provide the final answer---

You have access to the following tools:

{{?tools}}

Use a json blob to specify a tool by providing a "name" key (tool name) and an "input" key (tool input).
Provide only ONE action per json, as shown:
--- start: json blob ---
{
  "name": ... [Name of the tool, valid tools, are: {{?tool_names}}]
  "input": ... [Arguments of the tool, expected schemas of the tools are listed above]
}
--- end: json blob ---
You can't use comments in the json blob.


To recap follow this format:

--- start: recap ---
User Query: ... [input query to answer]
Thought: ... [your thoughts on the action to take]
Action:
```
{
  "name": ... [Name of the tool, valid tools, are: {{?tool_names}}]
  "input": ... [Arguments of the tool, expected schemas of the tools are listed above]
}
```
Observation: ... [Result of the action]

... [repeat Thought/Action/Observation N times]

Thought: I know what to respond
Action:
```
{
  "name": "finish",
  "input": {
      "answer": ... [The final answer, i.e. the result of executing the plan to answer the original input. Using the "finish" tool is the only way to finish the conversation.]
  }
}
```
--- end: recap ---

Begin! Always start your response with "Thought" and finish by an Action + json blob with key "name" and "input". No text outside that schema. Only one Action per response. Never give an Observation as part of your response.

User Query: {{?query}}

{{?scratchpad}}"""

Now we need the code that:
1. Calls the LLM
2. Parses the response
3. Executes the tools
4. Triggers the next agent loop iteration if the agent is not finished
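
Before wiring in the real LLM, the four steps can be sketched with canned responses in place of the model. Everything here (`scripted_response`, `run_loop`, `TOOLS`) is hypothetical scaffolding for illustration. Unlike the agent built in this notebook, the sketch checks for the `finish` action inline instead of raising an exception.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly so the example stays readable
ACTION_PATTERN = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def scripted_response(thought: str, name: str, tool_input: dict) -> str:
    """Build a canned model response in the Thought/Action format."""
    action = json.dumps({"name": name, "input": tool_input})
    return f"Thought: {thought}\nAction:\n{FENCE}\n{action}\n{FENCE}"


def add(a: int, b: int) -> int:
    return a + b


TOOLS = {"add": add}


def run_loop(responses: list, max_turns: int = 5):
    scratchpad = []
    for response in responses[:max_turns]:  # 1. "call" the LLM (canned here)
        action = json.loads(ACTION_PATTERN.search(response).group(1))  # 2. parse
        if action["name"] == "finish":
            return action["input"]["answer"], scratchpad
        observation = TOOLS[action["name"]](**action["input"])  # 3. execute the tool
        scratchpad.append(f"{response}\nObservation: {observation}")  # 4. next turn
    return None, scratchpad


answer, scratchpad = run_loop([
    scripted_response("I should add the numbers.", "add", {"a": 2, "b": 3}),
    scripted_response("I know what to respond.", "finish", {"answer": "5"}),
])
print(answer)
print(scratchpad[0])
```

The real agent differs mainly in step 1: it sends the prompt, including the joined scratchpad, to the model on every turn.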

In [8]:
# helper functions to parse the response and append observation to the response
import re

def extract_and_parse_json(text):
    # Use regex to find text encapsulated within ``` ... ``` or ```json ... ```blocks
    pattern = r"```(?:json)?\s*(.*?)```"
    match = re.search(pattern, text, re.DOTALL)
    
    if match:
        json_text = match.group(1).strip()
        try:
            # Parse the extracted JSON-like text
            parsed_json = json.loads(json_text)
            return parsed_json
        except json.JSONDecodeError:
            raise ValueError(f"Invalid JSON format: {text}")
    else:
        raise ValueError(f"No JSON found in text: {text}")

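def append_for_scratchpad(input_text: str, value: str, suffix: str = "Observation: ") -> str:
    # The line the text should end with, e.g. "Observation: 10"
    desired_line = f"{suffix}{value}"
    # Look at the last line of the response (empty if there is no text at all)
    lines = input_text.splitlines()
    last_line = lines[-1] if lines else ""
    # If the model already started the line (e.g. "Observation:"), complete it in place
    if desired_line.startswith(last_line):
        return f"{input_text}{desired_line[len(last_line):]}"
    # Otherwise put the observation on a new line
    return f"{input_text}\n{desired_line}"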

def print_scratchpad(scratchpad):
    for i, step in enumerate(scratchpad):
        print(f"\x1b[31m >>> STEP {i+1} <<< \x1b[0m" + f"\n{step}")

In [9]:
# Tool to return the final answer
class FinalAnswer(Exception):
    def __init__(self, text: str, *args, **kwargs):
        self.text = text
        super().__init__(*args, **kwargs)

def finish(answer: str) -> str:
    """Give the final answer to the user query."""
    raise FinalAnswer(text=answer) 
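
Raising an exception from the `finish` tool lets the agent loop separate three outcomes: a normal observation, a final answer, and a tool error that is reported back to the model. The sketch below isolates that dispatch logic. The names `Done`, `shout`, and `dispatch` are hypothetical and exist only for this illustration.

```python
class Done(Exception):
    """Raised by the finishing tool to signal that the answer is ready."""

    def __init__(self, text: str):
        super().__init__(text)
        self.text = text


def finish(answer: str) -> str:
    raise Done(answer)


def shout(text: str) -> str:
    return text.upper()


def dispatch(tools: dict, name: str, kwargs: dict):
    """Return (finished, text): the final answer, an observation, or an error message."""
    try:
        return False, str(tools[name](**kwargs))
    except Done as done:
        return True, done.text
    except Exception as exc:  # unknown tool, bad arguments, tool failure, ...
        return False, f"Error when calling the tool: {exc!r}"


tools = {"shout": shout, "finish": finish}
print(dispatch(tools, "shout", {"text": "hi"}))
print(dispatch(tools, "finish", {"answer": "42"}))
print(dispatch(tools, "missing", {}))
```

Because errors come back as ordinary observations, the model gets a chance to correct a malformed action on the next turn.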



In [10]:
import traceback

class ReactAgent:
    def __init__(self, tools: List[Callable], model='meta--llama3-70b-instruct'):
        self.model = model
        self.tools = {}
        self.tool_names = []
        self.tool_descriptions = []
        for tool in tools + [finish]: # always add the finish tool
            tool_info = ToolInfo.from_callable(tool)
            self.tool_names.append(tool_info.name)
            self.tool_descriptions.append(str(tool_info))
            self.tools[tool_info.name] = tool
    
    
    def run(self, query: str, max_turns: int = 20):
        scratchpad = []
        turns = 0
        while turns < max_turns:
            response = send_request(
                REACT_PROMPT,
                _model=self.model,
                _print=False,
                query=query,
                scratchpad="\n".join(scratchpad),
                tools="\n\n".join(self.tool_descriptions),
                tool_names=", ".join(self.tool_names)
            )
            try:
                action = extract_and_parse_json(response)
                observation = str(self.tools[action["name"]](**action["input"]))
            except FinalAnswer as answer:
                scratchpad.append(append_for_scratchpad(response, answer.text))
                return answer.text, scratchpad
            except Exception:
                # Feed the last lines of the traceback back to the model as the observation
                observation = "Error when calling the tool:\n" + "\n".join(traceback.format_exc().splitlines()[-10:])
            scratchpad.append(append_for_scratchpad(response, observation))
            turns += 1
        return None, scratchpad          

Now, let's test the agent.

As is well known, LLMs often struggle with even simple mathematical calculations. Therefore, we will provide our agent with tools for addition, subtraction, multiplication, and division.

In [11]:
# Tools
def add(a: int, b: int) -> int:
    """Adds two numbers."""
    return a + b

def subtract(a: int, b: int) -> int:
    """Subtract two numbers."""
    return a - b

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

def divide(a: int, b: int) -> float:
    """Divide two numbers."""
    return a / b

agent = ReactAgent([add, multiply, subtract, divide])
res, scratchpad = agent.run("What is 5+5*2-10?")

Let us take a look at the scratchpad.

In [12]:
print_scratchpad(scratchpad)

>>> STEP 1 <<<
Thought: To evaluate this expression, I need to follow the order of operations (PEMDAS). First, I'll multiply 5 and 2, then add 5, and finally subtract 10.

Action:
```
{
  "name": "multiply",
  "input": {
    "a": 5,
    "b": 2
  }
}
```
Observation: 10
>>> STEP 2 <<<
Thought: Now that I have the result of the multiplication, I'll add 5 to it.

Action:
```
{
  "name": "add",
  "input": {
    "a": 5,
    "b": 10
  }
}
```
Observation: 15
>>> STEP 3 <<<
Thought: Now that I have the result of the addition, I'll subtract 10 from it.

Action:
```
{
  "name": "subtract",
  "input": {
    "a": 15,
    "b": 10
  }
}
```
Observation: 5
>>> STEP 4 <<<
Thought: I have the final result, which is 5.

Action:
```
{
  "name": "finish",
  "input": {
    "answer": "5"
  }
}
```
Observation: 5


Elementary school math is a bit lame. Let us make our agent smarter and reduce hallucination by adding a Wikipedia search tool.

In [13]:
"""Util that calls Wikipedia."""
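from typing import Any, Optional
import logging
import wikipedia

# Module-level logger used by wiki_search to report search errors
logger = logging.getLogger(__name__)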

WIKIPEDIA_MAX_QUERY_LENGTH = 300

class WikipediaAPIWrapper:
    """Wrapper around WikipediaAPI.

    This wrapper will use the Wikipedia API to conduct searches and
    fetch page summaries. By default, it will return the page summaries
    of the top-k results. It limits the content by doc_content_chars_max.
    """

    def __init__(self, top_k_results=3, lang="en", doc_content_chars_max=1000):
        """Initialize WikipediaAPIWrapper with optional parameters."""
        self.top_k_results = top_k_results
        self.doc_content_chars_max = doc_content_chars_max
        wikipedia.set_lang(lang)

    def wiki_search(self, query: str) -> str:
        """Perform a Wikipedia search and retrieve page summaries.
        Good queries name specific people, places, or other things that have existing articles in the lexicon, rather than asking full questions."""
        try:
            page_titles = wikipedia.search(query[:WIKIPEDIA_MAX_QUERY_LENGTH], results=self.top_k_results)
        except Exception as e:
            logger.error(f"Error searching Wikipedia: {e}")
            return "Error occurred during Wikipedia search."

        summaries = []
        for page_title in page_titles[:self.top_k_results]:
            wiki_page = self._fetch_page(page_title)
            if wiki_page:
                summary = self._formatted_page_summary(page_title, wiki_page)
                if summary:
                    summaries.append(summary)

        if not summaries:
            return "No good Wikipedia Search Result was found"

        return "\n\n".join(summaries)[:self.doc_content_chars_max]

    @staticmethod
    def _formatted_page_summary(page_title: str, wiki_page: Any) -> Optional[str]:
        return f"Page: {page_title}\nSummary: {wiki_page.summary}"

    @staticmethod
    def _fetch_page(page_title: str) -> Optional[Any]:
        """Fetch the Wikipedia page."""
        try:
            return wikipedia.page(title=page_title, auto_suggest=False)
        except (wikipedia.exceptions.PageError, wikipedia.exceptions.DisambiguationError):
            return None


In [14]:
wiki_wrapper = WikipediaAPIWrapper()

In [15]:
wiki_wrapper.wiki_search("SAP")

"Page: SAP\nSummary: SAP SE (; German pronunciation: [ɛsʔaːˈpeː] ) is a German multinational software company based in Walldorf, Baden-Württemberg, Germany. It develops enterprise software to manage business operation and customer relations. The company is the world's largest enterprise resource planning (ERP) software vendor. \nFounded in 1972 as a private partnership named Systemanalyse und Programmentwicklung (System Analysis and Software Development). SAP GbR became in 1981 fully Systeme, Anwendungen und Produkte in der Datenverarbeitung abbreviated SAP GmbH after a five-year transition period beginning in 1976.:\u200a1972–1980\u200a In 2005, it further restructured itself as SAP AG. Since 7 July 2014, its corporate structure is that of a pan-European societas Europaea (SE); as such, its former German corporate identity is now a subsidiary, SAP Deutschland SE & Co. KG. It has regional offices in 180 countries and over 111,961 employees. \nSAP is a component of the DAX and Euro Stox

In [16]:
task = "What is ([Age of barack obama] * 1000) / ([SAP founding year] - [Age of Taylor Swift] - 437)?"

In [17]:
print(send_request(task))

A math problem!

Let's break it down step by step:

1. Age of Barack Obama: As of 2023, Barack Obama was born on August 4, 1961, which makes him 62 years old.
2. SAP founding year: SAP was founded in 1972.
3. Age of Taylor Swift: As of 2023, Taylor Swift was born on December 13, 1989, which makes her 34 years old.

Now, let's plug in the values:

([Age of Barack Obama] * 1000) = (62 * 1000) = 62,000

([SAP founding year] - [Age of Taylor Swift] - 437) = (1972 - 34 - 437) = 1501

Now, divide the two results:

62,000 ÷ 1501 ≈ 41.37

So, the answer is approximately 41.37!
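
The answer above can be checked with a few lines of plain Python. Using the model's own inputs (62 and 34), the division gives about 41.31, not 41.37. The ages also depend on the date at which they are evaluated. The sketch below uses the birth dates quoted in the model output and an assumed reference date of 2024-10-01, chosen only to make the example deterministic.

```python
from datetime import date


def age_on(birth: date, reference: date) -> int:
    """Full years elapsed between birth and reference."""
    had_birthday = (reference.month, reference.day) >= (birth.month, birth.day)
    return reference.year - birth.year - (0 if had_birthday else 1)


# Birth dates and founding year as stated in the model output above
OBAMA_BORN = date(1961, 8, 4)
SWIFT_BORN = date(1989, 12, 13)
SAP_FOUNDED = 1972

# Recompute the model's own calculation with its inputs (62 and 34)
model_inputs_result = (62 * 1000) / (SAP_FOUNDED - 34 - 437)
print(round(model_inputs_result, 2))

# Assumed reference date (hypothetical, for a reproducible example)
reference = date(2024, 10, 1)
obama_age = age_on(OBAMA_BORN, reference)
swift_age = age_on(SWIFT_BORN, reference)
result = (obama_age * 1000) / (SAP_FOUNDED - swift_age - 437)
print(obama_age, swift_age, round(result, 2))
```

Both the arithmetic slip and the implicit "as of 2023" assumption are the kind of error that calculator and date tools are meant to catch.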


In [18]:
agent = ReactAgent([add, multiply, subtract, divide, wiki_wrapper.wiki_search])
res, scratchpad = agent.run(task)
print_scratchpad(scratchpad)

>>> STEP 1 <<<
Thought: To evaluate this expression, I need to find the values of Barack Obama's age, SAP's founding year, and Taylor Swift's age.

Action:
```
{
  "name": "wiki_search",
  "input": {
      "query": "Barack Obama"
  }
}
```
Observation: Page: Barack Obama
Summary: Barack Hussein Obama II (born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. As a member of the Democratic Party, he was the first African-American  president in U.S. history. Obama previously served as a U.S. senator representing Illinois from 2005 to 2008 and as an Illinois state senator from 1997 to 2004. 
Obama was born in Honolulu, Hawaii. He graduated from Columbia University in 1983 with a Bachelor of Arts degree in political science and later worked as a community organizer in Chicago. In 1988, Obama would enroll in Harvard Law School, where he became the first black president of the Harvard Law Review. He became a civil rig

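And we can add more tools! For example, a tool that returns the current date, so the agent does not have to guess it when computing ages.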

In [19]:
from datetime import datetime
def today() -> str:
    """Get the current date and time in ISO format."""
    return datetime.now().isoformat()

In [20]:
agent = ReactAgent([add, multiply, subtract, divide, today, wiki_wrapper.wiki_search])
res, scratchpad = agent.run(task)
print_scratchpad(scratchpad)

>>> STEP 1 <<<
Thought: To evaluate this expression, I need to find the age of Barack Obama, the founding year of SAP, and the age of Taylor Swift.

Action:
```
{
  "name": "wiki_search",
  "input": {
      "query": "Barack Obama"
  }
}
```
Observation: Page: Barack Obama
Summary: Barack Hussein Obama II (born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. As a member of the Democratic Party, he was the first African-American  president in U.S. history. Obama previously served as a U.S. senator representing Illinois from 2005 to 2008 and as an Illinois state senator from 1997 to 2004. 
Obama was born in Honolulu, Hawaii. He graduated from Columbia University in 1983 with a Bachelor of Arts degree in political science and later worked as a community organizer in Chicago. In 1988, Obama would enroll in Harvard Law School, where he became the first black president of the Harvard Law Review. He became a civil ri

In [21]:
res, scratchpad = agent.run("How many days ago was the 2024 Super Bowl? Spend some thought on how to do proper calculations with dates.")
print_scratchpad(scratchpad)

>>> STEP 1 <<<
Thought: To find the number of days ago the 2024 Super Bowl was, I need to know the date of the 2024 Super Bowl and today's date. I'll use the wiki_search tool to find the date of the 2024 Super Bowl and the today tool to get today's date.
Action:
```
{
  "name": "wiki_search",
  "input": {
      "query": "2024 Super Bowl"
  }
}
```
Observation: Page: Super Bowl LVIII
Summary: Super Bowl LVIII was an American football game played to determine the champion of the National Football League (NFL) for the 2023 season. In a rematch of Super Bowl LIV from four years earlier, the American Football Conference (AFC) champion and defending Super Bowl champion Kansas City Chiefs defeated the National Football Conference (NFC) champion San Francisco 49ers 25–22 in overtime. The Chiefs became the first team to win back-to-back Super Bowls since the 2004 New England Patriots. The game was played on February 11, 2024, at Allegiant Stadium in Paradise, Nevada. This was the fir
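
Date differences are awkward to build from `add` and `subtract` alone, so a dedicated tool may be a better fit. The `days_since` function below is a hypothetical addition, not part of the original notebook. The optional `today` argument exists only to make the example reproducible, and the game date is the one from the Wikipedia summary above.

```python
from datetime import date
from typing import Optional


def days_since(iso_date: str, today: Optional[str] = None) -> int:
    """Number of days from iso_date (YYYY-MM-DD) until today (defaults to the current date)."""
    start = date.fromisoformat(iso_date)
    end = date.fromisoformat(today) if today else date.today()
    return (end - start).days


# Super Bowl LVIII was played on February 11, 2024
print(days_since("2024-02-11", today="2024-10-01"))
```

Like the other tools, it could be passed to `ReactAgent` in the tool list; its docstring would then serve as the usage description in the prompt.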