# OpenAI Agent

## Setup

In [1]:
import openai

OPENAI_API_KEY = "YOUR OPENAI API KEY"
openai.api_key = OPENAI_API_KEY

In [2]:
import nest_asyncio
nest_asyncio.apply()

**Note**: The PDF files are included with this lesson. To access these papers, go to the `File` menu and select `Open...`.

In [3]:
import json
with open('../data/nhanes_data.json', 'r') as file:
    docs = json.load(file)
docs_names = [doc['text'] for doc in docs]
docs_names[0:5]

['ACQ_D', 'ACQ_E', 'ACQ', 'ACQ_C', 'ACQ_B']

In [4]:
from utils import get_doc_tools
from pathlib import Path

domo = ["DEMO","DEMO_B","DEMO_C","BMX","BMX_B","BMX_C","BPX","BPX_B","BPX_C"] 
docs_to_tools_dict = {}
for doc in domo:
    print(f"Getting tools for doc: {doc}")
    vector_tool, summary_tool = get_doc_tools(doc, Path(f"../data/metadata/{doc}.md"))
    docs_to_tools_dict[doc] = [vector_tool, summary_tool]

Getting tools for doc: DEMO
Getting tools for doc: DEMO_B
Getting tools for doc: DEMO_C
Getting tools for doc: BMX
Getting tools for doc: BMX_B
Getting tools for doc: BMX_C
Getting tools for doc: BPX
Getting tools for doc: BPX_B
Getting tools for doc: BPX_C


In [5]:
all_tools = [t for doc in domo for t in docs_to_tools_dict[doc]]

In [6]:
all_tools

[<llama_index.core.tools.function_tool.FunctionTool at 0x14c997650>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x14dfc3010>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x14caf9f90>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x14e027250>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x14cad42d0>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x14e2c5210>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x14d38bf50>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x14e2c4810>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x14e2f5010>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x14e902490>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x14eb38710>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x14eb430d0>,
 <llama_index.core.tools.function_tool.FunctionTool at 0x14e6448d0>,
 <llama_index.core.tools.query_engine.QueryEngineTool at 0x14edf6610>,
 <llama_index.core.t

In [7]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")

In [8]:
len(all_tools)

18
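The count of 18 follows from nine documents with two tools each. A minimal sketch of the same flattening, using placeholder strings instead of the real tool objects (the `vector_tool_*` names and the variables `doc_ids`, `tools_by_doc`, and `flat` are made up for illustration):

```python
# Placeholder strings stand in for the real FunctionTool / QueryEngineTool objects.
doc_ids = ["DEMO", "DEMO_B", "DEMO_C", "BMX", "BMX_B", "BMX_C", "BPX", "BPX_B", "BPX_C"]

# Two entries per document, mirroring the [vector_tool, summary_tool] pairs above.
tools_by_doc = {d: [f"vector_tool_{d}", f"summary_tool_{d}"] for d in doc_ids}

# Same nested comprehension as the notebook: outer loop over documents,
# inner loop over that document's tools, so pairs stay adjacent in the flat list.
flat = [t for d in doc_ids for t in tools_by_doc[d]]

print(len(flat))  # 18
print(flat[:2])   # ['vector_tool_DEMO', 'summary_tool_DEMO']
```

Because the outer loop runs over documents, the flat list alternates between the two tool types, which matches the alternating `FunctionTool` / `QueryEngineTool` pattern in the output above.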

In [9]:
# define an "object" index and retriever over these tools
from llama_index.core import VectorStoreIndex
from llama_index.core.objects import ObjectIndex

obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)

In [10]:
obj_retriever = obj_index.as_retriever(similarity_top_k=3)
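To build intuition for what `similarity_top_k=3` does, here is a toy, dependency-free sketch of top-k retrieval over tool descriptions. It is not how LlamaIndex is implemented: the real index uses learned embeddings, whereas this sketch assumes a crude word-count vector, and the descriptions and the `vector_tool_BMX` name are hypothetical.

```python
import math
from collections import Counter

# Hypothetical tool descriptions; the real ones are produced by get_doc_tools.
tool_descriptions = {
    "summary_tool_DEMO": "summary of demographics age gender income",
    "summary_tool_BMX": "summary of body measures weight height bmi",
    "summary_tool_BPX": "summary of blood pressure systolic diastolic pulse",
    "vector_tool_BMX": "specific questions about body measures bmi",
}

def bag_of_words(text):
    """Turn text into a word-count vector (a crude stand-in for an embedding)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve_top_k(query, descriptions, k=3):
    """Return the names of the k tools whose descriptions best match the query."""
    q = bag_of_words(query)
    ranked = sorted(
        descriptions,
        key=lambda name: cosine(q, bag_of_words(descriptions[name])),
        reverse=True,
    )
    return ranked[:k]

top = retrieve_top_k("blood pressure and pulse", tool_descriptions, k=2)
print(top[0])  # summary_tool_BPX
```

The agent only ever sees the retrieved subset, so with 18 tools and `similarity_top_k=3` each query is answered with at most three candidate tools. A larger `k` gives the agent more options at the cost of a longer tool list in the prompt.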

In [11]:
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent import AgentRunner

agent_worker = FunctionCallingAgentWorker.from_tools(
    tool_retriever=obj_retriever,
    llm=llm, 
    system_prompt=""" \
You are an agent designed to answer queries over a given set of epidemiology data and documents.
Please always use the tools provided to answer a question. Do not rely on prior knowledge.\

""",
    verbose=True
)
agent = AgentRunner(agent_worker)

In [12]:
response = agent.query(
    "Tell me about the relation between BMI and age "
    "How is the income presented in this data? and How can analyze it"
)
print(str(response))

Added user message to memory: Tell me about the relation between BMI and age How is the income presented in this data? and How can analyze it
=== Calling Function ===
Calling function: summary_tool_BMX with args: {"input": "relation between BMI and age"}
=== Function Output ===
There is a relationship between Body Mass Index (BMI) and age in the dataset provided. The BMI values are recorded for both males and females ranging from 2 years to 150 years. The BMI values are categorized based on age groups, and the BMI measurements are available for analysis across different age ranges as specified in the dataset.
=== Calling Function ===
Calling function: summary_tool_BMX_B with args: {"input": "income presentation in the data"}
=== Function Output ===
Income presentation is not included in the provided data. The data primarily focuses on body measurements and related comments, such as weight, height, circumferences, and skinfold measurements.
=== Calling Function ===
Calling function: sum

In [13]:
response = agent.query(
    "Show me the Pregnancy and Blood pressure. "
)
print(str(response))

Added user message to memory: Show me the Pregnancy and Blood pressure. 
=== Calling Function ===
Calling function: summary_tool_BPX with args: {"input": "Pregnancy and Blood Pressure"}
=== Function Output ===
Blood pressure measurements are taken on all examinees 8 years and older, regardless of pregnancy status. The data collected includes systolic and diastolic blood pressure readings, pulse measurements, and other related variables. The study does not exclude pregnant individuals from having their blood pressure measured as part of the examination protocol.
=== Calling Function ===
Calling function: summary_tool_BPX_B with args: {"input": "Pregnancy and Blood Pressure"}
=== Function Output ===
Pregnancy status is not explicitly mentioned in the provided context information related to blood pressure measurements. The data and variables discussed focus on blood pressure measurements, pulse rates, and related factors in individuals aged 8 years and older.
=== Calling Function ===
Call