# File Knowledge Retrieval Agent

## 0. Import Libraries

Import and initialize the necessary libraries.

In [1]:
from openai.types.chat import ChatCompletion
from openai.types.chat.chat_completion import Choice
from openai.types.chat.chat_completion_message import ChatCompletionMessage
from openai.types.chat.chat_completion_message_tool_call import ChatCompletionMessageToolCall
from openai.types.chat.chat_completion_message_tool_call import Function
import time
from openai import OpenAI
import json
from instill.clients import init_pipeline_client
import os

pipeline = init_pipeline_client(api_token=os.environ["INSTILL_API_TOKEN"])
client = OpenAI()

## 1. Initialize Variables

Here we set the outputs from the executing agent, as well as the parameters defined by Agent-BE, user interaction and other add-on pipelines.

In [2]:
# Catalog and namespace for retrieval
catalog_name = "benchmark-s1-wework"
namespace = "george_strong"

# File summary from indexing-generate-summary pipeline
file_summary = """
This document contains the S-1 Registration Statement for WeWork Companies Inc., filed with the Securities and Exchange Commission on August 14, 2019. It provides a comprehensive overview of the company's business model, which focuses on offering flexible workspace solutions through a "space-as-a-service" membership model, catering to a diverse clientele that includes freelancers, startups, and large enterprises. The document highlights WeWork's rapid growth, showcasing a committed revenue backlog of $4.0 billion as of June 30, 2019, alongside significant increases in memberships and revenue.

Key sections include an analysis of risk factors, financial performance, and strategic growth plans, emphasizing market expansion and product enhancement. It details the company's capital structure, including various stock classes and their voting rights, as well as significant relationships with major investors like SoftBank. The financial statements reflect substantial increases in revenue and operating expenses, alongside notable net losses, while also addressing lease-related liabilities and assets in accordance with accounting standards.

Additionally, the document discusses WeWork's acquisitions, stock-based compensation, and related party transactions, providing insights into the company's operational strategies and financial health. Overall, this registration statement serves as a detailed prospectus for potential investors, outlining both the opportunities and risks associated with investing in WeWork.
"""

# Instruction from executing agent
instruction = "identify the main risks for WeWork"

# State context from executing agent
state_context = " "

# Recommended actions from executing agent
recommend_actions = " "

# User's follow-up query
user_query = "what are the main risks for WeWork?"

# User's chat history
chat_history = " "

# Input from user toggle for deep document analysis
deep_analysis = False

# Default relevance threshold for RAG mode
relevance_threshold = 0.1

In [3]:
# Replace double quotes with single quotes in the free-text inputs: unescaped double quotes
# would break the hand-built JSON arguments string used for the tool call
file_summary = file_summary.replace('"', "'")
chat_history = chat_history.replace('"', "'")
user_query = user_query.replace('"', "'")
instruction = instruction.replace('"', "'")
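
The quote replacement above is needed because the tool-call arguments are later assembled as a JSON string by hand. As a minimal sketch of an alternative (not what this notebook does), the standard-library `json.dumps` escapes quotes and newlines itself, so the original text survives a round trip. The sample values below are illustrative only.

```python
import json

# Illustrative values; the summary deliberately contains double quotes and a newline.
sample_summary = 'WeWork offers a "space-as-a-service" model.\nFiled in 2019.'
sample_args = {
    "catalog-name": "benchmark-s1-wework",
    "user-query": "what are the main risks for WeWork?",
    "file-summary": sample_summary,
}

# json.dumps escapes the embedded quotes and the newline.
arguments = json.dumps(sample_args)

# Parsing the string recovers the original values unchanged.
round_trip = json.loads(arguments)
print(round_trip["file-summary"] == sample_summary)
```

With this approach the inputs would not need to be altered before serialization, at the cost of departing from the f-string construction used in this notebook.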

## 2. Define Prompt and System Message

Here we define the prompt and system message templates for the agent.

In [4]:
SYSTEM_MESSAGE = """
Collect and aggregate relevant information based on the supervisor's context and queries.
In your sentences or paragraphs, provide properly formatted bracket citations (e.g., [1][2]) for each referenced source where applicable.

You are a professional information retrieval, extraction and synthesis expert.
Your job is to read the current state/context from your supervisor and do your best to complete the assigned tasks so that the user can gather the necessary information.
When returning your responses, you must include correct bracket citations in the form [1][5], etc., ensuring that each bracketed number corresponds exactly to the appropriate chunk source.
Carefully maintain citation order and do not mix them up.

User Past Conversation History:
{chat_history}

User Follow-up Query:
{user_query}
"""

PROMPT = """
Below is the background/context from your supervisor:
{state_context}

The supervisor's task description for you is as follows:
{instruction}

Recommended actions or parameters you may use:
{recommend_actions}

Please read the information above carefully. Decide how you will call either the retrieving-rag or retrieving-extract tools to gather relevant data.
Incorporate your newly collected information into a comprehensive response for your supervisor.
Remember to embed bracket citations (e.g., [3][5]) within the text to show exactly which file/chunk sources support each piece of information you provide.

# Steps

1. Review the given context, the user's previous conversation history, and the user's follow-up query.
2. Determine if additional data from files is required to fulfill the request comprehensively.
3. If necessary, formulate a suitable file retrieval query (or queries) to extract more information from available file sources, referencing that data in your response with bracketed citations.
4. After receiving the tool response, decide whether more iterative retrievals are needed for completeness.
5. Compile a coherent, thorough answer with citations pointing to the specific file sources/chunks of information.

# Output Format

• Provide your final answer in clear, well-structured English.
• Enclose each reference to an external file source in bracket citations corresponding to the retrieved file segments (e.g., [1]).
• Make sure the numbering of citations aligns with the order of retrieved sources; do not mix them up or leave them out.

# Examples

(If needed, you may include placeholders [FILE #1], [FILE #4] for demonstration. For instance, “Based on the information from [1], [4] …”)

# Notes

• Always confirm each piece of information against the cited file source.
• If you find incomplete or conflicting data, continue extracting iteratively until you have sufficient clarity.
• Keep the final message comprehensive and aligned with the supervisor's requirements.
"""

## 3. Handle Deep File Analysis

Here we handle the deep file analysis toggle. If the user has selected deep file analysis, we first obtain the number of chunks in the document. If there are 15 or fewer chunks, we set `deep_analysis` to `False` and `relevance_threshold` to `0`: a document this small can be retrieved in full, so we fall back to RAG mode, and a threshold of `0` ensures that no chunks are filtered out of the agent's response.

In [5]:
# Default to RAG mode if num_chunks is less than or equal to 15 as the whole document is retrieved
if deep_analysis:
    num_chunks = pipeline.trigger(
        namespace_id=namespace,
        pipeline_id="get-num-chunks",
        data=[{
            "catalog-name": catalog_name,
            "namespace": namespace
        }]
    )['outputs'][0]['num-chunks']

    if num_chunks <= 15:
        deep_analysis = False
        relevance_threshold = 0  # Falling back to RAG mode: a threshold of 0 keeps every chunk of the small document
print(deep_analysis)

False
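
The fallback rule can also be written as a small pure function, which makes it easy to check without calling the `get-num-chunks` pipeline. This is a sketch under assumptions: `resolve_mode` and `small_doc_limit` are hypothetical names, and the function takes the chunk count as an argument rather than fetching it.

```python
def resolve_mode(deep_analysis: bool, num_chunks: int,
                 relevance_threshold: float, small_doc_limit: int = 15):
    """Return (deep_analysis, relevance_threshold) after applying the small-document rule."""
    if deep_analysis and num_chunks <= small_doc_limit:
        # Small document: fall back to RAG mode and keep every chunk.
        return False, 0.0
    # Otherwise leave both settings as they were.
    return deep_analysis, relevance_threshold


print(resolve_mode(True, 10, 0.1))   # (False, 0.0)
print(resolve_mode(True, 16, 0.1))   # (True, 0.1)
print(resolve_mode(False, 5, 0.1))   # (False, 0.1)
```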


## 4. Define Tools

Here we define the tools that the agent will use. We have two tools, `retrieving-rag` and `retrieving-extract`. The `retrieving-extract` tool is used when the user has selected deep file analysis. The `retrieving-rag` tool is used when the user has not selected deep file analysis (or the number of chunks is less than or equal to 15).

#### **Important**
Because the user's toggle already decides which tool to use, we skip the LLM request that would normally select it and manually create the `ChatCompletionMessageToolCall` and `ChatCompletion` objects.

In [9]:
tools = [
{
    "type": "function",
    "function": {
        "name": "retrieving-rag",
        "description": "Retrieves chunks from the document that are relevant to the instruction, user query, and chat history.",
        "parameters": {
            "type": "object",
            "properties": {
                "catalog-name": {"type": "string"},
                "namespace": {"type": "string"},
                "instruction": {"type": "string"},
                "chat-history": {"type": "string"},
                "user-query": {"type": "string"},
                "relevance-threshold": {"type": "number"}
            },
            "required": ["catalog-name", "namespace", "instruction", "chat-history", "user-query", "relevance-threshold"],
            "additionalProperties": False
        },
        "strict": True
    }
},
{
    "type": "function",
    "function": {
        "name": "retrieving-extract",
        "description": "Extracts relevant information from the document that is relevant to the instruction, user query, chat history and file summary.",
        "parameters": {
            "type": "object",
            "properties": {
                "catalog-name": {"type": "string"},
                "namespace": {"type": "string"},
                "instruction": {"type": "string"},
                "chat-history": {"type": "string"},
                "user-query": {"type": "string"},
                "file-summary": {"type": "string"}
            },
            "required": ["catalog-name", "namespace", "instruction", "chat-history", "user-query", "file-summary"],
            "additionalProperties": False
        },
        "strict": True
    }
}]

# Manually create the tool call with a simple generated ID
if deep_analysis:
    print("Deep analysis")
    tool_call = ChatCompletionMessageToolCall(
        id=f"call_{int(time.time())}",  # Generate a simple unique ID
        type='function',
        function=Function(
            name='retrieving-extract',
            arguments=f'{{"catalog-name": "{catalog_name}", "namespace": "{namespace}", "instruction": "{instruction}", "chat-history": "{chat_history}", "user-query": "{user_query}", "file-summary": "{file_summary}"}}'
        )
    )
else:
    tool_call = ChatCompletionMessageToolCall(
        id=f"call_{int(time.time())}",  # Generate a simple unique ID
        type='function',
        function=Function(
            name='retrieving-rag',
            arguments=f'{{"catalog-name": "{catalog_name}", "namespace": "{namespace}", "instruction": "{instruction}", "chat-history": "{chat_history}", "user-query": "{user_query}", "relevance-threshold": {relevance_threshold}}}'
        )
    )

# Set up the messages with the system message and prompt
messages = [
    {
        "role": "system",
        "content": SYSTEM_MESSAGE.format(
            chat_history=chat_history,
            user_query=user_query
        )
    },
    {
        "role": "user",
        "content": PROMPT.format(
            state_context=state_context,
            instruction=instruction,
            recommend_actions=recommend_actions
        )
    }
]

# Manually create the message
message = ChatCompletionMessage(
    role='assistant',
    content=None,
    tool_calls=[tool_call]
)

# Manually create the choice
choice = Choice(
    finish_reason='tool_calls',
    index=0,
    logprobs=None,
    message=message
)

# Manually create minimal ChatCompletion
completion = ChatCompletion(
    id=f"chatcmpl_{int(time.time())}",  # Generate a simple unique ID
    choices=[choice],
    created=int(time.time()),  # Current timestamp
    model="gpt-4o",
    object="chat.completion"
)
print(completion)

ChatCompletion(id='chatcmpl_1739313763', choices=[Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_1739313763', function=Function(arguments='{"catalog-name": "benchmark-s1-wework", "namespace": "george_strong", "instruction": "identify the main risks for WeWork", "chat-history": " ", "user-query": "what are the main risks for WeWork?", "relevance-threshold": 0.1}', name='retrieving-rag'), type='function')]))], created=1739313763, model='gpt-4o', object='chat.completion', service_tier=None, system_fingerprint=None, usage=None)


## 5. Execute the Tool Call

We first extract the arguments from the tool call and then trigger the matching pipeline.


In [10]:
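tool_call = completion.choices[0].message.tool_calls[0]
# The arguments string was built by hand, so values such as the file summary may contain raw
# newlines or carriage returns, which JSON does not allow inside strings; escape them before parsing
clean_string = tool_call.function.arguments.replace("\n", "\\n").replace("\r", "\\r")
args = json.loads(clean_string)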
args

{'catalog-name': 'benchmark-s1-wework',
 'namespace': 'george_strong',
 'instruction': 'identify the main risks for WeWork',
 'chat-history': ' ',
 'user-query': 'what are the main risks for WeWork?',
 'relevance-threshold': 0.1}
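
Before triggering a pipeline, the parsed arguments could be compared with the `required` list in the tool definition. The helper below is a hypothetical sketch, not part of the notebook or of any library; `sample_tool` is a trimmed-down copy of the `retrieving-rag` definition, and `sample_args` deliberately omits one key.

```python
def missing_required(tool: dict, args: dict) -> list[str]:
    """Return the required parameter names that are absent from args."""
    required = tool["function"]["parameters"]["required"]
    return [name for name in required if name not in args]


# Trimmed-down copy of the retrieving-rag definition, for illustration only.
sample_tool = {
    "type": "function",
    "function": {
        "name": "retrieving-rag",
        "parameters": {
            "type": "object",
            "required": ["catalog-name", "namespace", "instruction",
                         "chat-history", "user-query", "relevance-threshold"],
        },
    },
}

# Arguments with 'relevance-threshold' left out on purpose.
sample_args = {
    "catalog-name": "benchmark-s1-wework",
    "namespace": "george_strong",
    "instruction": "identify the main risks for WeWork",
    "chat-history": " ",
    "user-query": "what are the main risks for WeWork?",
}

print(missing_required(sample_tool, sample_args))  # ['relevance-threshold']
```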

In [11]:
if tool_call.function.name == "retrieving-rag":
    tool_result = pipeline.trigger(
        namespace_id=namespace,
        pipeline_id="retrieving-rag",
        data=[args]
    )['outputs'][0]
    result = tool_result['chunks']
elif tool_call.function.name == "retrieving-extract":
    tool_result = pipeline.trigger(
        namespace_id=namespace,
        pipeline_id="retrieving-extract",
        data=[args]
    )['outputs'][0]
    result = tool_result
tool_result

{'citations': ['Source: S-1 Wework.md. Chunk UID: 1770466f-f586-4979-afdb-b065a04f793c.',
  'Source: S-1 Wework.md. Chunk UID: 682e4d61-4e8f-4a5c-86cf-5e8f0fffafe7.',
  'Source: S-1 Wework.md. Chunk UID: 4a8e511e-6bb6-425c-85c9-cfcac16fac8d.',
  'Source: S-1 Wework.md. Chunk UID: f28fe51d-65bc-46c8-8f02-3d1e328b7f36.',
  'Source: S-1 Wework.md. Chunk UID: 32f595d7-f289-4d91-9f10-787632263c3e.',
  'Source: S-1 Wework.md. Chunk UID: eca95b65-91b0-4734-b980-1d96be98a85f.',
  'Source: S-1 Wework.md. Chunk UID: 039ca476-a292-4b97-9e46-59b0394fde57.',
  'Source: S-1 Wework.md. Chunk UID: c58d1712-8ddf-4559-92df-90c26b9eb146.',
  'Source: S-1 Wework.md. Chunk UID: 8358af51-0220-45f2-84bf-c983eec9329e.',
  'Source: S-1 Wework.md. Chunk UID: adf38b59-4f30-4c8f-a415-2f5fa323018b.'],
 'scores': [0.9941347,
  0.9914887,
  0.98187524,
  0.97090924,
  0.96814114,
  0.93546456,
  0.92523,
  0.43578145,
  0.38931432,
  0.2810108],
 'chunks': ['[1] # Key Performance Indicators\n\nN/R = Not reported  \n
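
To make the effect of `relevance-threshold` concrete, the sketch below applies a cutoff to the scores shown above. It assumes that a chunk is kept when its score is at or above the threshold; the exact comparison used inside the `retrieving-rag` pipeline is not shown in this notebook, and `kept_positions` is a hypothetical helper.

```python
# Scores copied from the tool result above.
scores = [0.9941347, 0.9914887, 0.98187524, 0.97090924, 0.96814114,
          0.93546456, 0.92523, 0.43578145, 0.38931432, 0.2810108]


def kept_positions(scores: list[float], threshold: float) -> list[int]:
    """Return the 1-based positions of chunks whose score is at or above the threshold."""
    return [i for i, score in enumerate(scores, start=1) if score >= threshold]


print(len(kept_positions(scores, 0.1)))  # 10
print(kept_positions(scores, 0.5))       # [1, 2, 3, 4, 5, 6, 7]
print(len(kept_positions(scores, 0)))    # 10
```

Under this assumption the default threshold of `0.1` keeps all ten returned chunks, while a stricter value such as `0.5` would drop the three lowest-scoring ones.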

## 6. Formulate Agent Response

Here we append the tool call and its result to the messages, which are then used to create a new ChatCompletion object whose content is returned to the executing agent.
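
The system message requires bracket citations that correspond to chunk numbers. Once the response text is available, a check along the following lines could flag citations that point outside the retrieved chunks. This is a hypothetical sketch: both helpers are illustrative names, and it assumes chunks are numbered from 1 in the order returned.

```python
import re


def cited_indices(text: str) -> set[int]:
    """Return every number that appears as a bracket citation such as [3]."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", text)}


def invalid_citations(text: str, num_chunks: int) -> set[int]:
    """Return cited numbers that fall outside the range 1..num_chunks."""
    return {i for i in cited_indices(text) if not 1 <= i <= num_chunks}


sample = "Growth may not be sustainable [1][2]. Costs could outpace revenue [1][4]."
print(sorted(cited_indices(sample)))            # [1, 2, 4]
print(invalid_citations(sample, num_chunks=3))  # {4}
```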

In [12]:
messages.append(completion.choices[0].message)  # append model's function call message
messages.append({                               # append result message
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": str(result)
})

response_completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    temperature=0.0,
    top_p=0.95
)

response_completion.choices[0].message.content

2025-02-11 22:45:30,763.763 INFO     HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


"WeWork faces several significant risks that could adversely affect its business operations, financial condition, and growth prospects. Here are the main risks identified:\n\n1. **Sustainability of Rapid Growth**: WeWork has experienced rapid growth, but this may not be sustainable. The company may struggle to maintain its historical growth rates due to increased competition and market saturation. As it expands into new markets, the ability to find reasonably priced opportunities for new locations may become limited, which could hinder growth [1][2].\n\n2. **Operational Challenges**: The rapid expansion places a strain on WeWork's resources, potentially leading to operational inefficiencies. If the company fails to manage its growth effectively, it may face increased capital expenditures and operating costs that could outpace revenue growth, adversely impacting its financial results [1][4].\n\n3. **Profitability Concerns**: WeWork has a history of losses and may continue to face challe

In [13]:
print(response_completion.choices[0].message.content)

WeWork faces several significant risks that could adversely affect its business operations, financial condition, and growth prospects. Here are the main risks identified:

1. **Sustainability of Rapid Growth**: WeWork has experienced rapid growth, but this may not be sustainable. The company may struggle to maintain its historical growth rates due to increased competition and market saturation. As it expands into new markets, the ability to find reasonably priced opportunities for new locations may become limited, which could hinder growth [1][2].

2. **Operational Challenges**: The rapid expansion places a strain on WeWork's resources, potentially leading to operational inefficiencies. If the company fails to manage its growth effectively, it may face increased capital expenditures and operating costs that could outpace revenue growth, adversely impacting its financial results [1][4].

3. **Profitability Concerns**: WeWork has a history of losses and may continue to face challenges in