# Retrieval Augmented Generation with Amazon Bedrock - Retrieving Data Automatically from APIs

> *PLEASE NOTE: This notebook should work well with the **`Data Science 3.0`** kernel in SageMaker Studio*

---

Throughout this workshop so far, we have been working with unstructured text retrieval via semantic similarity search. However, another important type of retrieval which customers can take advantage of with Amazon Bedrock is **structured data retrieval** from APIs. Structured data retrieval is extremely useful for augmenting LLM applications with up-to-date information: the retrieval itself is repeatable, but the data it returns changes over time. An example of a question you might ask an LLM which uses this type of retrieval is "How long will it take for my Amazon.com order containing socks to arrive?". In this notebook, we will show how to integrate an LLM with a backend API service so that it can answer a user's question through RAG.

Specifically, we will be building a tool which is able to tell you the weather based on natural language. This is a fairly trivial example, but it does a good job of showing how multiple API tools can be used by an LLM to retrieve dynamic data to augment a prompt. Here is a visual of the architecture we will be building today.

![api](./images/api.png)

Let's get started!

---
## Setup `boto3` Connection

In [None]:
import json
import os
import sys

import boto3
import botocore

import xmltodict
from rich import print as rprint

module_path = ".."
sys.path.append(os.path.abspath(module_path))
from utils import bedrock
from utils.prompt_utils import extract_docstring_info, construct_format_tool_for_claude_prompt

boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", None)
)

---
## Defining the API Tools

The first thing we need to do for our LLM is define the tools it has access to. In this case we will be defining local Python functions, but it is important to note that these could be any type of application service. Examples of what these tools might be on AWS include...

* An AWS Lambda function
* An Amazon RDS database connection
* An Amazon DynamoDB table
  
More generic examples include...

* REST APIs
* Data warehouses, data lakes, and databases
* Computation engines

In this case, we define two tools below as Python functions that call external APIs:
1. the ability to retrieve the latitude and longitude of a place given a natural language input
2. the ability to retrieve the weather given an input latitude and longitude

In [None]:
import requests

def get_weather(latitude: str, longitude: str):
    """
    Fetches and returns the current weather for a given latitude and longitude from the Open-Meteo API.

    Args:
        latitude (str): The latitude of the location to fetch the weather for.
        longitude (str): The longitude of the location to fetch the weather for.

    Returns:
        dict: A dictionary containing the current weather data for the given location.
    """
    
    
    url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current_weather=true"
    headers = {'User-Agent': 'Mozilla/5.0'}
    # A timeout keeps the notebook from hanging indefinitely if the API does not respond
    response = requests.get(url, headers=headers, timeout=10)
    return response.json()

def get_lat_long(place: str):
    
    """
    Fetches and returns the latitude and longitude of a given place from the Nominatim OpenStreetMap API.

    Args:
        place (str): The name of the place to fetch the latitude and longitude for.

    Returns:
        dict: A dictionary containing the latitude and longitude of the given place, or None if the place was not found.
    """
    
    url = "https://nominatim.openstreetmap.org/search"
    params = {'q': place, 'format': 'json', 'limit': 1}
    headers = {'User-Agent': 'Mozilla/5.0'}
    # A timeout keeps the notebook from hanging indefinitely if the API does not respond
    response = requests.get(url, params=params, headers=headers, timeout=10).json()
    if response:
        lat = response[0]["lat"]
        lon = response[0]["lon"]
        return {"latitude": lat, "longitude": lon}
    else:
        return None

def call_function(tool_name, parameters):
    
    """
    Calls a function with the given name and parameters.

    This function uses the `globals()` function to find a function with the given name in the global scope, and then calls it with the given parameters.

    Args:
        tool_name (str): The name of the function to call.
        parameters (dict): A dictionary of parameters to pass to the function.

    Returns:
        The output of the called function.
    """
    
    
    func = globals()[tool_name]
    output = func(**parameters)
    return output

We also define a function called `call_function`, which looks up a tool by its name and calls it with the given parameters. You can see an example of determining the weather in Las Vegas below.

In [None]:
place = 'Las Vegas'
lat_long_response = call_function('get_lat_long', {'place' : place})
weather_response = call_function('get_weather', lat_long_response)
print(f'Weather in {place} is...')
weather_response
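
Because `call_function` resolves names through `globals()`, it will call any global function whose name the model emits. As a minimal sketch of a stricter alternative, the block below dispatches through an explicit allowlist. The names `TOOL_REGISTRY`, `call_registered_function`, and the two stub tools with canned values are hypothetical and exist only so the example runs without network access.

```python
from typing import Any, Callable, Dict


# Stub tools with canned values (hypothetical stand-ins for the real API calls)
def stub_get_lat_long(place: str) -> Dict[str, str]:
    return {"latitude": "36.17", "longitude": "-115.14"}


def stub_get_weather(latitude: str, longitude: str) -> Dict[str, Any]:
    return {
        "latitude": latitude,
        "longitude": longitude,
        "current_weather": {"temperature": 30.0},
    }


# Explicit allowlist: only names registered here can be called
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_lat_long": stub_get_lat_long,
    "get_weather": stub_get_weather,
}


def call_registered_function(tool_name: str, parameters: Dict[str, Any]) -> Any:
    """Look up tool_name in the registry and call it with the given parameters."""
    if tool_name not in TOOL_REGISTRY:
        raise ValueError(f"Unknown tool: {tool_name}")
    return TOOL_REGISTRY[tool_name](**parameters)


# Chain the two tools the same way the Las Vegas example does
coords = call_registered_function("get_lat_long", {"place": "Las Vegas"})
weather = call_registered_function("get_weather", coords)
print(weather)
```

The output of the first tool is passed directly as the keyword arguments of the second, which works because the keys returned by the lat/long tool match the parameter names of the weather tool.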

As you might expect, we have to describe our tools to our LLM so it knows how to use them. Later in this notebook, we generate those descriptions from the docstrings of the lat/long and weather Python functions in an XML-friendly format which we have seen previously in the workshop. First, the cell below sets up the Claude model through LangChain.

In [None]:
from langchain.chat_models import BedrockChat
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.output_parsers import StrOutputParser

model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
model_parameter = {"temperature": 0.0, "max_tokens": 2000}
llm = BedrockChat(model_id=model_id, client=boto3_bedrock, model_kwargs=model_parameter, verbose=True)

---
## Define Prompts to Orchestrate our LLM Using Tools

Now that the tools are defined, we can start orchestrating the flow which will answer user questions. The first step is creating a prompt which defines the rules of operation for Claude. In the prompt below, we provide explicit direction on how Claude should use tools to answer these questions. The same cell also builds a string description of each tool from its docstring and inserts it into the prompt.

In [None]:
function_calling_prompt = """

Your job is to formulate a solution to a given <user-request> based on the instructions and tools below.

Use these Instructions: 
1. In this environment you have access to a set of tools and functions you can use to answer the question.
2. You can call the functions by using the <function_calls> format below.
3. Only invoke one function at a time and wait for the results before invoking another function.
4. The results of the function will be in xml tag <function_results>. 
5. Only use the information in the <function_results> to answer the question.
6. Only make function calls if the answer is not already in the <function_results>.
7. If you need to use the output of a function as an input to another function, you can do so.
8. Put any intermediate reasoning steps into the <thoughts></thoughts> tags.
9. Once you truly know the answer to the <user-request>, place the answer in <answer></answer> tags. Make sure to answer in a full sentence which is friendly.

Here are the tools you can use:
{tools_prompt}

You may call them like this:
<function_calls>
<invoke>
<tool_name>$TOOL_NAME</tool_name>
<parameters>
<$PARAMETER_NAME>$PARAMETER_VALUE</$PARAMETER_NAME>
...
</parameters>
</invoke>
</function_calls>

Results from prior function calls are shown here. Do not repeat the same function call if you already have the results here.
<function_results>
{function_results}
</function_results>

<user-request>
{user_request}
</user-request>
"""

tools = [get_lat_long, get_weather]
tool_prompt = ""
for tool in tools:
    docstring_info = extract_docstring_info(tool.__doc__)
    tool_prompt += construct_format_tool_for_claude_prompt(tool.__name__, docstring_info['description'], docstring_info['params'])


function_calling_prompt = ChatPromptTemplate.from_template(function_calling_prompt, partial_variables={"tools_prompt": tool_prompt})
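
The helpers `extract_docstring_info` and `construct_format_tool_for_claude_prompt` come from the workshop's `utils` module, and their implementation is not shown in this notebook. As a rough illustration of the general idea only, the sketch below turns a function's name, signature, and docstring into an XML-style description. The function `describe_tool_as_xml` and the tag layout it emits are assumptions made for this example, not the format the workshop helpers necessarily produce.

```python
import inspect


def describe_tool_as_xml(func) -> str:
    """Build an XML-style description from a function's name, signature, and docstring."""
    doc = inspect.getdoc(func) or ""
    # Use the first paragraph of the docstring as the description
    description = doc.split("\n\n")[0].replace("\n", " ").strip()

    params_xml = ""
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = "any"
        else:
            type_name = getattr(annotation, "__name__", str(annotation))
        params_xml += (
            f"<parameter>\n<name>{name}</name>\n<type>{type_name}</type>\n</parameter>\n"
        )

    return (
        "<tool_description>\n"
        f"<tool_name>{func.__name__}</tool_name>\n"
        f"<description>{description}</description>\n"
        f"<parameters>\n{params_xml}</parameters>\n"
        "</tool_description>\n"
    )


# A trimmed copy of the lat/long tool, with the API call removed for the example
def get_lat_long(place: str):
    """
    Fetches and returns the latitude and longitude of a given place.

    Args:
        place (str): The name of the place to fetch the latitude and longitude for.
    """
    return None


tool_xml = describe_tool_as_xml(get_lat_long)
print(tool_xml)
```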


When the model determines that it needs to invoke a tool, it will generate an XML output that specifies the tool to use and the input parameters to the tool. Since we can't use XML strings to invoke our tools directly, we convert the XML output to a dictionary and then use the `call_function` function to invoke the tool.

In [None]:
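def xml_to_dict(model_output: str):
    """
    Extracts the tool name and parameters from the <function_calls> block in the model output.
    """
    start_tag = "<function_calls>"
    func_call_idx = model_output.find(start_tag)
    if func_call_idx == -1:
        # str.find returns -1 when the tag is missing, which would otherwise produce a wrong slice
        raise ValueError("No <function_calls> block found in the model output")

    # Everything after the opening tag is expected to be a single <invoke> element,
    # because generation stops at the closing </function_calls> tag
    invoke_xml = model_output[func_call_idx + len(start_tag):].strip()

    invoke_dict = xmltodict.parse(invoke_xml)

    return {
        "tool_name": invoke_dict["invoke"]["tool_name"],
        "parameters": invoke_dict["invoke"]["parameters"],
    }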
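
To make the parsing step concrete, here is a small sketch that applies the same idea to a hand-written sample of model output. It uses the standard library's `xml.etree.ElementTree` instead of `xmltodict` so it runs without extra dependencies. The function name `parse_function_call` and the sample text are hypothetical. The sample assumes the text ends right after `</invoke>`, as it would when generation stops at `</function_calls>`.

```python
import xml.etree.ElementTree as ET


def parse_function_call(model_output: str) -> dict:
    """Extract the tool name and parameters from a <function_calls> block."""
    start_tag = "<function_calls>"
    idx = model_output.find(start_tag)
    if idx == -1:
        raise ValueError("No <function_calls> block found in the model output")

    # Everything after the opening tag should be a single <invoke> element
    invoke_xml = model_output[idx + len(start_tag):].strip()
    invoke = ET.fromstring(invoke_xml)

    # Each child of <parameters> becomes one keyword argument
    parameters = {
        child.tag: (child.text or "").strip() for child in invoke.find("parameters")
    }
    return {
        "tool_name": invoke.findtext("tool_name").strip(),
        "parameters": parameters,
    }


# Hand-written sample of what the model might return (hypothetical)
sample_output = """<thoughts>I need the coordinates first.</thoughts>
<function_calls>
<invoke>
<tool_name>get_lat_long</tool_name>
<parameters>
<place>Las Vegas</place>
</parameters>
</invoke>
"""

parsed = parse_function_call(sample_output)
print(parsed)
```

The resulting dictionary has the same two keys as the one returned by `xml_to_dict`, so it could be unpacked into `call_function` in the same way.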

In [None]:
func_chain = (
    function_calling_prompt
    | llm.bind(stop=["</function_calls>", "</answer>"])
    | StrOutputParser()
)

---
## Executing the RAG Workflow

Armed with our prompt and structured tools, we can now write an orchestration loop which will iteratively step through the logical tasks to answer a user question. In the cell below we use `func_chain` to generate a response with Claude and `call_function` to invoke a tool whenever the LLM tells us we need to. The general flow works like this:

1. The user enters an input to the application
2. The user input is merged with the original prompt and sent to the LLM to determine the next step
3. If the LLM knows the answer, it will answer and we are done. If not, go to step 4.
4. The LLM will determine which tool to use to answer the question.
5. We will use the tool as directed by the LLM and retrieve the results.
6. We provide the results back into the original prompt as more context.
7. We ask the LLM for the next step or whether it knows the answer.
8. Return to step 3.

We can orchestrate the entire workflow through a simple while loop where we continuously invoke the LLM until it has enough information to provide the final answer.

In [None]:
query = "What is the weather in Las Vegas?"
# query = "Is it currently warmer in Las Vegas or New York?" # try to see what happens when you ask a more complex question

intermediate_function_results = ""
max_iter = 10

while True:
    output = func_chain.invoke(
        {
            "user_request": query,
            "function_results": intermediate_function_results,
        }
    )
    rprint(f"[green]{output}[/green]")

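    if ("<answer>" in output) or (max_iter == 0):
        # Keep only the text after <answer>. If the iteration limit was reached
        # without an answer, keep the raw output instead of slicing from index -1.
        answer_idx = output.find("<answer>")
        if answer_idx != -1:
            output = output[answer_idx:].replace("<answer>", "")
        break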

    else:

        calling_params = xml_to_dict(output)

        # Log the call before making it, so a failing tool call is still visible in the output
        rprint(f'[bold cyan]Calling function {calling_params["tool_name"]} with parameters {calling_params["parameters"]} [/bold cyan]')
        func_call_results = call_function(**calling_params)
        rprint(f'[bold cyan]Function call results: {func_call_results} [/bold cyan]')

        parameters_xml = "\n".join(
            [f"<{k}>{v}</{k}>" for k, v in calling_params["parameters"].items()]
        )

        intermediate_results = (
            "<result>\n<tool_name>"
            + calling_params["tool_name"]
            + "</tool_name>\n"
            + "<parameters>\n"
            + parameters_xml
            + "\n</parameters>\n<output>"
            + json.dumps(func_call_results)
            + "\n</output>\n</result>\n"
        )

        intermediate_function_results += intermediate_results
        max_iter -= 1

rprint(f"[#f77f00]==========FINAL OUTPUT==========\n{output}\n==============================[/#f77f00]")
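
The loop above requires Bedrock access and live API calls. To look at the control flow in isolation, the sketch below replaces the model with a scripted stand-in and the tools with stubs that return canned values. Everything here (`scripted_llm`, `run_loop`, the stub tools, and the canned numbers) is hypothetical and only mirrors the shape of the workflow described in the numbered steps.

```python
import json
import xml.etree.ElementTree as ET


def stub_get_lat_long(place):
    return {"latitude": "36.17", "longitude": "-115.14"}


def stub_get_weather(latitude, longitude):
    return {"current_weather": {"temperature": 30.0}}


TOOLS = {"get_lat_long": stub_get_lat_long, "get_weather": stub_get_weather}


def scripted_llm(user_request, function_results):
    """Stand-in for the model: picks the next step from what is already in function_results."""
    if "<tool_name>get_weather</tool_name>" in function_results:
        return "<answer>It is currently 30.0 degrees in Las Vegas."
    if "<tool_name>get_lat_long</tool_name>" in function_results:
        return (
            "<function_calls>\n<invoke>\n<tool_name>get_weather</tool_name>\n<parameters>\n"
            "<latitude>36.17</latitude>\n<longitude>-115.14</longitude>\n</parameters>\n</invoke>\n"
        )
    return (
        "<function_calls>\n<invoke>\n<tool_name>get_lat_long</tool_name>\n<parameters>\n"
        "<place>Las Vegas</place>\n</parameters>\n</invoke>\n"
    )


def parse_call(output):
    """Return (tool_name, parameters) from the text after <function_calls>."""
    invoke = ET.fromstring(output.split("<function_calls>", 1)[1].strip())
    params = {p.tag: p.text for p in invoke.find("parameters")}
    return invoke.findtext("tool_name"), params


def run_loop(query, max_iter=10):
    """Call the stand-in model until it answers or the iteration limit is reached."""
    results, calls = "", []
    for _ in range(max_iter):
        output = scripted_llm(query, results)
        if "<answer>" in output:
            return output.split("<answer>", 1)[1], calls
        tool_name, params = parse_call(output)
        tool_output = TOOLS[tool_name](**params)
        calls.append(tool_name)
        # Append this call and its output so the next model call can see it
        params_xml = "\n".join(f"<{k}>{v}</{k}>" for k, v in params.items())
        results += (
            f"<result>\n<tool_name>{tool_name}</tool_name>\n"
            f"<parameters>\n{params_xml}\n</parameters>\n"
            f"<output>{json.dumps(tool_output)}\n</output>\n</result>\n"
        )
    return None, calls


answer, calls = run_loop("What is the weather in Las Vegas?")
print(calls)
print(answer)
```

Using a bounded `for` loop instead of `while True` with a counter is a design choice for this sketch: the iteration limit is enforced by the loop itself, and the function returns `None` when no answer was produced in time.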

---
## Next steps

Now that you have used a few different retrieval systems, let's move on to the next notebook where you can apply the skills you've learned so far!