# Research Assistant Agent with Claude 3.5 Sonnet

In this notebook, we demonstrate how to build tools that call external APIs to retrieve information.

Ref: <https://github.com/anthropics/courses/blob/master/ToolUse/02_your_first_simple_tool.ipynb>

Install packages using uv, an extremely fast Python package installer.\
Read more about uv here: https://astral.sh/blog/uv

In [1]:
%%bash
pip install uv && uv pip install -U boto3 rich wikipedia

In [2]:
!python --version
%load_ext rich

In [3]:
import json
import types
from pathlib import Path

import boto3
import wikipedia
from rich import print

## List Anthropic Model IDs in Bedrock

In [4]:
session = boto3.Session()
region = session.region_name
bedrock = session.client(service_name="bedrock", region_name=region)
bedrock_runtime = session.client(service_name="bedrock-runtime", region_name=region)

# List Anthropic models in Bedrock
models = bedrock.list_foundation_models(
    byProvider="Anthropic", byOutputModality="TEXT"
)["modelSummaries"]

# Keep only Claude 3 models, skipping the provisioned-throughput variants
model_ids = [
    model["modelId"]
    for model in models
    if "claude-3" in model["modelId"] and "v1:0:" not in model["modelId"]
]

print("[b blue]Claude 3 ModelIDs")
print(model_ids)

print("===" * 10)
# change the strings in the filters below to select haiku, sonnet or opus
models = [m for m in model_ids if "sonnet" in m]
model_id = [m for m in models if "3-5-sonnet" in m][0]
print(f"Model ID: [b red]{model_id}")
print("===" * 10)
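
The `[0]` index in the cell above raises an `IndexError` when no matching model is enabled in the current region. A minimal sketch of a more defensive selection is shown below. `pick_model_id` is a hypothetical helper introduced here for illustration, not part of boto3.

```python
def pick_model_id(model_ids, family="3-5-sonnet"):
    """Return the first model ID containing `family`, or raise a clear error.

    Hypothetical helper: `family` is just a substring to look for.
    """
    matches = [m for m in model_ids if family in m]
    if not matches:
        raise ValueError(f"No model ID containing {family!r} in: {model_ids}")
    return matches[0]
```

With the list built above, `model_id = pick_model_id(model_ids)` would select the same model as the original cell, but fail with a readable message if nothing matches.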

## Define tools

Your task is to help build out a research assistant using Claude. A user can enter a topic that they want to research and get a list of Wikipedia article links saved to a markdown file for later reading. We could try asking Claude directly to generate a list of article URLs, but Claude is unreliable with URLs and may hallucinate article URLs. Also, legitimate articles might have moved to a new URL after Claude's training cutoff date. Instead, we're going to use a tool that connects to the real Wikipedia API to make this work!

We'll provide Claude with access to a tool that accepts a list of possible Wikipedia article titles that Claude has generated but could have hallucinated. We can use this tool to search Wikipedia to find the actual Wikipedia article titles and URLs to ensure that the final list consists of articles that all actually exist. We’ll then save these article URLs to a markdown file for later reading.

### Helper functions

The first function, `generate_wikipedia_reading_list`, expects a research topic such as "The history of Hawaii" or "Pirates across the world" and a list of potential Wikipedia article names that we will have Claude generate. The function uses the `wikipedia` package to search for corresponding real Wikipedia pages and builds a list of dictionaries, each containing an article's title and URL.

Then it calls `add_to_research_reading_file`, passing in the list of Wikipedia article data and the overall research topic. This function simply adds markdown links to each of the Wikipedia articles to a file called `output/research_reading.md`.

The filename is hardcoded for now. The function creates the `output` directory if it is missing, and because the file is opened in append mode, it is created on first use and extended on every later call.
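
To make the file format concrete, here is a minimal sketch of the markdown section written for one topic. `format_reading_list` is a hypothetical helper that builds the text in memory instead of writing to disk, and it omits the trailing spaces that the notebook's version adds before each newline.

```python
def format_reading_list(topic, articles):
    """Build the markdown section for one topic as a single string.

    `articles` is a list of dicts with "title" and "url" keys.
    """
    lines = [f"## {topic}"]
    for article in articles:
        # One markdown bullet with a link per verified article
        lines.append(f"* [{article['title']}]({article['url']})")
    # Blank line after the section so consecutive topics stay separated
    return "\n".join(lines) + "\n\n"
```

Keeping the formatting separate from the file write makes the output easy to inspect or test without touching the filesystem.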

### Goal

Our task is to implement a function called `get_research_help` that accepts a research topic and a desired number of articles. This function uses Claude to generate the list of candidate Wikipedia article titles, which are then passed to the `generate_wikipedia_reading_list` function described above.

In [5]:
def generate_wikipedia_reading_list(research_topic, article_titles):
    wikipedia_articles = []
    for t in article_titles:
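        try:
            # Resolve the candidate title to the top Wikipedia search hit
            results = wikipedia.search(t)
            page = wikipedia.page(results[0])
            wikipedia_articles.append({"title": page.title, "url": page.url})
        except Exception:
            # Skip titles with no results, disambiguation pages, or lookup errors
            continue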
    fpath = add_to_research_reading_file(wikipedia_articles, research_topic)
    return fpath


def add_to_research_reading_file(articles, topic):
    output_dir = Path("./output")
    research_filepath = output_dir.joinpath("research_reading.md")
    # Create the output directory if needed; no error if it already exists
    output_dir.mkdir(parents=True, exist_ok=True)
    with open(research_filepath, "a", encoding="utf-8") as file:
        file.write(f"## {topic} \n")
        for article in articles:
            title = article["title"]
            url = article["url"]
            file.write(f"* [{title}]({url}) \n")
        file.write("\n\n")
    return str(research_filepath)

In [6]:
def get_research_help(
    research_topic, num_of_articles, bedrock_runtime, model_id=model_id
):
    """
    Function to generate research titles for a given research topic.
    """
    system_prompt = """Act as an expert research assistant.
        Your task is to help me gather research titles on a specific topic.
        The titles should be diverse yet simple and must relate to the topic.
        The titles should not contain more than 5 words.
        The titles should be generated in JSON format as a list.
        Just output the research titles and nothing else.
    """.strip()

    user_prompt = (
        f"Please generate {num_of_articles} titles for the topic '{research_topic}'."
    )
    print(user_prompt)
    payload = {
        "max_tokens": 4096,
        "anthropic_version": "bedrock-2023-05-31",
        "system": system_prompt,
        "messages": [{"role": "user", "content": user_prompt}],
    }
    response = bedrock_runtime.invoke_model(body=json.dumps(payload), modelId=model_id)
    response_body = json.loads(response.get("body").read())
    response_body = response_body["content"][0]
    return response_body["text"]
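
`get_research_help` returns the raw text from Claude, and the caller later passes it straight to `json.loads`. That fails if the model wraps the list in a code fence or adds a sentence around it. A minimal sketch of a more tolerant parser follows; `parse_title_list` is a hypothetical helper, and it assumes the output contains exactly one JSON list.

```python
import json
import re


def parse_title_list(text):
    """Extract a JSON list of strings from model output.

    Tolerates surrounding prose or code fences by taking the span from
    the first '[' to the last ']'.
    """
    match = re.search(r"\[.*\]", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("No JSON list found in model output")
    titles = json.loads(match.group(0))
    if not isinstance(titles, list) or not all(isinstance(t, str) for t in titles):
        raise ValueError("Expected a JSON list of strings")
    return titles
```

Swapping this in for the bare `json.loads(tool_result)` call would be one way to harden the notebook against formatting drift in the model's reply.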

### Create tool definition for claude

In [7]:
tools = [
    {
        "name": "get_research_help",
        "description": "Generates research titles for a given research topic.",
        "input_schema": {
            "type": "object",
            "properties": {
                "research_topic": {
                    "type": "string",
                    "description": "The topic for which research titles are to be generated",
                },
                "num_of_articles": {
                    "type": "integer",
                    "description": "The number of research titles to generate",
                },
            },
            # The Bedrock client and model ID are supplied by process_tool_call,
            # not by Claude, so they are not part of the tool's input schema
            "required": ["research_topic", "num_of_articles"],
        },
    }
]


def process_tool_call(tool_name, tool_input, bedrock_runtime):
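    if tool_name == "get_research_help":
        return get_research_help(
            tool_input["research_topic"], tool_input["num_of_articles"], bedrock_runtime
        )
    # Fail loudly instead of silently returning None for an unknown tool
    raise ValueError(f"Unknown tool: {tool_name}")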
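
Claude fills in the tool input from the schema, but nothing in the notebook checks that input before it is used. As a rough sketch, the check below compares a tool input against the `required` and `properties` entries of an `input_schema`. `validate_tool_input` is a hypothetical helper that covers only the three JSON types used in this notebook; it is not a full JSON Schema validator.

```python
def validate_tool_input(tool_input, schema):
    """Return a list of problems found in `tool_input` (empty if none)."""
    # Only the JSON types that appear in this notebook's schema
    type_map = {"string": str, "integer": int, "object": dict}
    problems = []
    for key in schema.get("required", []):
        if key not in tool_input:
            problems.append(f"missing required field: {key}")
    for key, value in tool_input.items():
        spec = schema.get("properties", {}).get(key)
        if spec is None:
            problems.append(f"unexpected field: {key}")
            continue
        expected = type_map.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{key} should be of type {spec['type']}")
    return problems
```

Calling `validate_tool_input(tool_input, tools[0]["input_schema"])` before dispatching would surface malformed inputs early.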

In [8]:
def chat_with_claude(
    prompt, MODEL_NAME=model_id, tools=tools, bedrock_runtime=bedrock_runtime
):
    print(f"\n{'='*50}\nUser Message: {prompt}\n{'='*50}")

    system_prompt = """Answer as many questions as you can using your existing knowledge.
    For generating research titles, always use the get_research_help tool."""

    payload = {
        "max_tokens": 4096,
        "anthropic_version": "bedrock-2023-05-31",
        "system": system_prompt,
        "messages": [{"role": "user", "content": f"{prompt}"}],
        "tools": tools,
    }

    # print(payload)

    response = bedrock_runtime.invoke_model(
        body=json.dumps(payload), modelId=MODEL_NAME
    )
    # read byte stream and load the response object (dict)
    response_body = json.loads(response.get("body").read())
    # SimpleNamespace to make dict dot accessible
    message = types.SimpleNamespace(**response_body)

    # print("\nInitial Response:")
    print(f"Stop Reason: {message.stop_reason}")
    # print(f"Content:\n{json.dumps(message.content, indent=2)}")

    while True:
        if message.stop_reason == "tool_use":
            tool_use = next(
                (obj for obj in message.content if obj["type"] == "tool_use"), None
            )
            tool_name = tool_use["name"]
            tool_input = tool_use["input"]
            # tool_use_id = tool_use["id"]

            print(f"\nTool Used: {tool_name}")
            print(f"Tool Input: {tool_input}")
            # First get research titles
            tool_result = process_tool_call(tool_name, tool_input, bedrock_runtime)
            article_list = json.loads(tool_result)
            # print(type(article_list))
            print(f"Tool Result:\n{tool_result}")
            # Next, call generate_wikipedia_reading_list
            output_filepath = generate_wikipedia_reading_list(
                tool_input["research_topic"], article_list
            )

            # # append the tool_result as a user response
            # messages = [
            #     {"role": "user", "content": prompt},
            #     {"role": "assistant", "content": message.content},
            #     {
            #         "role": "user",
            #         "content": [
            #             {
            #                 "type": "tool_result",
            #                 "tool_use_id": tool_use_id,
            #                 "content": json.dumps(tool_result),
            #             }
            #         ],
            #     },
            # ]
            # # update messages in payload with new messages object
            # payload_ns = types.SimpleNamespace(**payload)
            # payload_ns.messages = messages
            # # convert SimpleNamespace object back to dict
            # payload = vars(payload_ns)
            # response = bedrock_runtime.invoke_model(
            #     body=json.dumps(payload), modelId=MODEL_NAME
            # )
            # response = json.loads(response.get("body").read())
            # message = types.SimpleNamespace(**response)
            response = output_filepath
            break
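        else:
            # Claude answered without calling the tool: return its text and
            # leave the loop (otherwise `while True` would never terminate).
            # Note that the return value is then plain text, not a file path.
            response = next(
                (obj["text"] for obj in message.content if obj["type"] == "text"),
                "",
            )
            break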

    # final_response = next(
    #     (obj["text"] for obj in response.content if obj["type"] == "text"),
    #     response,
    # )
    return response
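
The commented-out section of `chat_with_claude` outlines the second half of a tool-use round trip: sending the tool's output back to Claude as a `tool_result` block that references the `id` of the original `tool_use` block. A minimal sketch of building that message list is shown below. `build_tool_result_messages` is a hypothetical helper, and the ID in the usage example is a made-up placeholder.

```python
import json


def build_tool_result_messages(prompt, assistant_content, tool_use_id, tool_result):
    """Return the message list for the follow-up request after a tool call."""
    # Tool results are sent as text; serialize anything that is not a string
    if not isinstance(tool_result, str):
        tool_result = json.dumps(tool_result)
    return [
        {"role": "user", "content": prompt},
        # Echo Claude's own turn, including its tool_use block
        {"role": "assistant", "content": assistant_content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use_id,
                    "content": tool_result,
                }
            ],
        },
    ]


# Usage with placeholder values
messages = build_tool_result_messages(
    prompt="Generate 3 research titles for topic Pirates.",
    assistant_content=[
        {
            "type": "tool_use",
            "id": "toolu_example",
            "name": "get_research_help",
            "input": {"research_topic": "Pirates", "num_of_articles": 3},
        }
    ],
    tool_use_id="toolu_example",
    tool_result=["Piracy", "Golden Age of Piracy", "Blackbeard"],
)
```

Placing this list in the payload's `messages` field and invoking the model again would let Claude write a final answer that uses the tool output.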

In [9]:
result_filepath = chat_with_claude(
    "Generate 3 research titles for topic Animal consciousness."
)
file_content = Path(result_filepath).read_text(encoding="utf-8")
print(f"\n{'='*50}\nFinal Response:\n\n{file_content}\n{'='*50}")

In [10]:
result_filepath = chat_with_claude(
    "Generate 5 research titles for topic Liquid Neural Networks."
)
file_content = Path(result_filepath).read_text(encoding="utf-8")
print(f"\n{'='*50}\nFinal Response:\n{file_content}\n{'='*50}")