### Document Agent
1. This code reads three PDF documents and creates a vector store and a retriever object for each
2. Creates three Tools that use the retriever objects to extract information from the vector stores
3. Creates an agent with the three Tools
4. Invokes the agent with a query; the agent should decide which Tool to execute based on the query

In [2]:
import os

from dotenv import load_dotenv
from llama_index.agent.openai import OpenAIAgent
from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    VectorStoreIndex,
    get_response_synthesizer,
)
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.llms.openai import OpenAI

In [3]:
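# Load the OpenAI API key from a .env file into the OPENAI_API_KEY environment variable
load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")
if api_key is None:
    # os.environ only accepts strings, so fail early with a clear message
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
os.environ["OPENAI_API_KEY"] = api_key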

In [4]:
# Set up model configuration
Settings.llm = OpenAI(temperature=0.5, model="gpt-3.5-turbo")

In [5]:
# Load 3 PDF documents on mental health from the data folder, using SimpleDirectoryReader

titles = [
    "EJ1154566",
    "sg-youth-mental-health-social-media-advisory",
    "what_is_mental_illness",
]

documents = {}
for title in titles:
    documents[title] = SimpleDirectoryReader(input_files=[f"data/{title}.pdf"]).load_data()
print(f"loaded documents with {len(documents)} documents")

loaded documents with 3 documents
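
`SimpleDirectoryReader` is given explicit file paths here, so a missing or misnamed PDF only surfaces once the loading loop reaches it. A minimal pre-flight check, assuming the same `data/{title}.pdf` layout as the cell above, could look like the sketch below; `missing_pdfs` is a hypothetical helper, not part of LlamaIndex.

```python
from pathlib import Path


def missing_pdfs(titles, data_dir="data"):
    """Return the expected PDF paths that do not exist on disk."""
    # Hypothetical helper: mirrors the f"data/{title}.pdf" pattern used above.
    expected = [Path(data_dir) / f"{title}.pdf" for title in titles]
    return [str(path) for path in expected if not path.is_file()]
```

Calling `missing_pdfs(titles)` before the loop and stopping when the list is non-empty reports every missing file at once instead of failing on the first one.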


In [7]:
documents["sg-youth-mental-health-social-media-advisory"]

[Document(id_='5c80c7dd-f9b2-4da4-880a-57455089dbfc', embedding=None, metadata={'page_label': '1', 'file_name': 'sg-youth-mental-health-social-media-advisory.pdf', 'file_path': 'data/sg-youth-mental-health-social-media-advisory.pdf', 'file_type': 'application/pdf', 'file_size': 1029053, 'creation_date': '2024-07-21', 'last_modified_date': '2023-06-20'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text='Social Media  \nand Y outh  Mental Health\nThe U.S. Surgeon General’s Advisory\n2023\n', mimetype='text/plain', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'),
 Document(id_='5b9cfb13-650c-443b-9d59-88e80283c796', embedding=None, metadata={'page_label': '2', 'f

In [8]:
# Build three Tools, one per document, each with its corresponding retriever object

query_engine_tools = []
response_synthesizer = get_response_synthesizer(response_mode=ResponseMode.COMPACT)
for title in titles:

    # build vector index
    vector_index = VectorStoreIndex.from_documents(documents[title])
    retriever = VectorIndexRetriever(index=vector_index, similarity_top_k=3)
    # define query engines
    query_engine = RetrieverQueryEngine(
        retriever=retriever,
        # Drop retrieved nodes whose similarity score is below the cutoff
        node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
        response_synthesizer=response_synthesizer,
    )

    # define tools
    query_engine_tools.append(
        QueryEngineTool(
            query_engine=query_engine,
            metadata=ToolMetadata(
                name=f"vector_tool_{title}",
                description=f"Useful for retrieving specific context related to {title}",
            ),
        )
    )

# build agent
agent = OpenAIAgent.from_tools(
    query_engine_tools,
    verbose=True,
)
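
The `similarity_cutoff=0.7` setting in the cell above drops retrieved nodes whose similarity score falls below 0.7, so a query can reach the synthesizer with fewer than `similarity_top_k` nodes, or with none. The sketch below illustrates that filtering rule on plain `(text, score)` pairs. It is a simplified stand-in, not LlamaIndex's implementation; `apply_cutoff` and the sample scores are made up for illustration.

```python
def apply_cutoff(scored_nodes, cutoff=0.7):
    """Keep only (text, score) pairs whose score is at least `cutoff`."""
    # Pairs without a score are dropped too, since they cannot be compared.
    return [
        (text, score)
        for text, score in scored_nodes
        if score is not None and score >= cutoff
    ]


# Made-up scores for three retrieved chunks (similarity_top_k=3).
retrieved = [("chunk A", 0.82), ("chunk B", 0.71), ("chunk C", 0.55)]
kept = apply_cutoff(retrieved)  # chunk C falls below the cutoff
```

When every retrieved node is filtered out, a tool has little or no context to synthesize from, which may be worth checking when a tool returns a generic answer.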

In [9]:
# Print the agent, the number of tools, and the tool names
print("Agent: ", agent)
print("Number of Tools: ", len(query_engine_tools))
print("Tool Names: ", [tool.metadata.name for tool in query_engine_tools])

Agent:  <llama_index.agent.openai.base.OpenAIAgent object at 0x11b712750>
Number of Tools:  3
Tool Names:  ['vector_tool_EJ1154566', 'vector_tool_sg-youth-mental-health-social-media-advisory', 'vector_tool_what_is_mental_illness']
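
The tool names are built directly from the file titles. OpenAI function names are limited to letters, digits, underscores, and dashes, with a maximum length of 64 characters, so the three titles here are valid, but a title containing spaces or dots would not be. A minimal sketch of a sanitizer follows; `safe_tool_name` is a hypothetical helper, and the `vector_tool_` prefix is taken from the cell that builds the tools.

```python
import re


def safe_tool_name(title, prefix="vector_tool_", max_len=64):
    """Build a tool name restricted to letters, digits, '_' and '-'."""
    # Replace every disallowed character with an underscore.
    cleaned = re.sub(r"[^a-zA-Z0-9_-]", "_", title)
    # Truncate so the full name stays within the length limit.
    return (prefix + cleaned)[:max_len]
```

Truncation can make two long titles collide, so a real pipeline would also want to check the resulting names for uniqueness.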


In [10]:
agent.query("What is mental health?")

Added user message to memory: What is mental health?
=== Calling Function ===
Calling function: vector_tool_what_is_mental_illness with args: {"input":"What is mental health?"}
Got output: Mental health refers to a person's emotional, psychological, and social well-being. It involves how individuals think, feel, and act, and encompasses aspects like handling stress, making choices, and relating to others.



Response(response="Mental health refers to a person's emotional, psychological, and social well-being. It involves how individuals think, feel, and act, and encompasses aspects like handling stress, making choices, and relating to others.", source_nodes=[NodeWithScore(node=TextNode(id_='25d15fa5-81b4-41aa-8986-ffcbb16b3897', embedding=None, metadata={'page_label': '2', 'file_name': 'what_is_mental_illness.pdf', 'file_path': 'data/what_is_mental_illness.pdf', 'file_type': 'application/pdf', 'file_size': 228227, 'creation_date': '2024-07-21', 'last_modified_date': '2020-01-15'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='e95d2015-6d0e-44c3-a1ba-9e57078606d5', node_type=<ObjectType.DOCUMENT: '4'>, metadat
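
The `Response` object printed above carries the retrieved chunks in `source_nodes`. Each entry wraps a `node` whose `metadata` includes `file_name` and `page_label`, as the truncated output shows, along with a similarity `score`. A compact way to see which pages backed an answer is sketched below with a hypothetical `summarize_sources` helper that relies only on those attributes; the stand-in object and its score are made up so the sketch runs without an API call.

```python
from types import SimpleNamespace


def summarize_sources(response):
    """Return (file_name, page_label, score) for each source node."""
    rows = []
    for item in response.source_nodes:
        meta = item.node.metadata
        rows.append((meta.get("file_name"), meta.get("page_label"), item.score))
    return rows


# Stand-in with the same attribute names as the real Response object.
fake = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            node=SimpleNamespace(
                metadata={"file_name": "what_is_mental_illness.pdf", "page_label": "2"}
            ),
            score=0.8,  # made-up score for illustration
        )
    ]
)
rows = summarize_sources(fake)
```

With the real agent, passing the result of `agent.query("What is mental health?")` to `summarize_sources` would list the pages the selected tool drew on.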

In [11]:
agent.query("What are the Potential Risk of Harm from Content Exposure?").response

Added user message to memory: What are the Potential Risk of Harm from Content Exposure?
=== Calling Function ===
Calling function: vector_tool_EJ1154566 with args: {"input": "Potential Risk of Harm from Content Exposure"}
Got output: The potential risk of harm from content exposure should be carefully considered, especially when dealing with sensitive topics related to mental health. It is important to assess the potential impact on individuals who may come across such content, as exposure to certain information could have negative effects on their well-being.

=== Calling Function ===
Calling function: vector_tool_sg-youth-mental-health-social-media-advisory with args: {"input": "Potential Risk of Harm from Content Exposure"}
Got output: Children and adolescents are at risk of being exposed to extreme, inappropriate, and harmful content on social media platforms. This content, which can include self-harm acts and risky challenges, may be easily accessible and can have serious consequ

'The potential risks of harm from content exposure include:\n\n1. **General Risk Assessment**: Careful consideration is needed when dealing with sensitive topics related to mental health to assess the potential impact on individuals who may come across such content. Exposure to certain information could have negative effects on their well-being.\n\n2. **Risk to Children and Adolescents**: Children and adolescents are at risk of being exposed to extreme, inappropriate, and harmful content on social media platforms. This content, including self-harm acts and risky challenges, may normalize harmful behaviors and lead to mental health difficulties. Social media can also contribute to body dissatisfaction, disordered eating behaviors, and low self-esteem, particularly affecting adolescent girls.\n\n3. **Misconceptions About Mental Illness**: People with mental illness are not typically dangerous, especially when they receive appropriate treatment and support. Most individuals with mental il