# This notebook compares ReAct with One Big Prompt

There are three variations to test:
* one big prompt
* ReAct using RAG as the search function
* ReAct with a search function that returns the whole PDF (loaded with `Part.from_data`)

In [78]:
import sys
import os
import importlib
import json
from vertexai.generative_models import GenerationConfig, GenerativeModel, Content, Part  
sys.path.append(os.path.abspath('../utils'))
from rich.markdown import Markdown as rich_Markdown

# convert tool
import tool_functions
importlib.reload(tool_functions)
from tool_functions import convert_to_tool # convert self-defined functions to Tool objects

# RAG class
import rag
importlib.reload(rag)
from rag import RAG # RAG search function

# ReAct Agent
import react_agent
importlib.reload(react_agent)
from react_agent import ReactAgent

# one big prompt Agent
import big_prompt_agent
importlib.reload(big_prompt_agent)
from big_prompt_agent import BigPromptAgent

# naf23.pdf

In [7]:
pdf_path='../data/naf23.pdf'

## initialize agents, and test them on a simple question

### one big prompt agent

In [2]:
agent_one_big_prompt=BigPromptAgent(pdf_path=pdf_path)
query="what is National Ataxia Foundation's total revenue in 2023?"


In [8]:
agent_one_big_prompt=BigPromptAgent(pdf_path=pdf_path)
answer = agent_one_big_prompt.run(query=query)
print(answer)

The National Ataxia Foundation's total revenue in 2023 was $4,184,787. This is the sum of total support ($3,205,136) and total revenue ($979,651), as shown on page 7 of the provided financial report.


### ReAct agent with RAG

In [6]:
# read the PDF and build a RAG instance on top of it
data_path='../data/naf23.pdf'
rag_instance=RAG(pdf_path= data_path, chunking_method='recursive')



In [37]:
def search1(query : str):
    """
    This is a search function on a pre-existing financial knowledge base. You can use it to search for financial information about specific companies you do not know; treat it as an internal wiki search.

    Args:
        query (str): input query            
    """
    results = rag_instance.search(query, method='ensemble')
    return json.dumps({"relevant information in the auditor notes": results})


# convert this search function to a Tool object
search_tool_rag_chunks = convert_to_tool(search1)
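
The conversion step above relies on the function's name, docstring, and type hints to describe the tool to the model. The sketch below shows one way such metadata could be collected with the standard `inspect` module. It is only an illustration: `describe_function` and `example_search` are hypothetical names, and this is not the actual implementation of `convert_to_tool` in `tool_functions`.

```python
import inspect
import json


def describe_function(func):
    """Hypothetical helper: collect the metadata a tool declaration would need."""
    signature = inspect.signature(func)
    parameters = {}
    for name, param in signature.parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = "any"  # no type hint was given
        else:
            # Fall back to str() for annotations without a __name__ (e.g. string hints)
            type_name = getattr(annotation, "__name__", str(annotation))
        parameters[name] = {"type": type_name}
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": parameters,
    }


def example_search(query: str):
    """Search a knowledge base for passages relevant to the query."""
    return json.dumps({"results": []})


declaration = describe_function(example_search)
print(json.dumps(declaration, indent=2))
```

Because the docstring becomes the description the model sees, its wording directly affects when and how the agent decides to call the tool.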

In [70]:
agent_react = ReactAgent(tools=[search_tool_rag_chunks])

In [23]:
agent_react.run(query=query)

---------------------------------------- iteration 0  ----------------------------------------


{"relevant information in the auditor notes": "Revenue\nConference income 264,816           -                        264,816           \nEarned income 570,833           -                        570,833           \nInvestment income 144,002           -                        144,002           \nTotal Revenue 979,651           -                        979,651           \nNet Assets Released from Restrictions 975,954           (975,954)          -                        \nTotal Support and Revenue 3,546,026       638,761           4,184,787       \nExpenses\nProgram services\nResearch 1,574,354       -                        1,574,354       \nEducation and service 1,146,980       -                        1,146,980       \nDrug Development Collaborative 1,188,427       -                        1,188,427       \nTotal Program Services 3,909,761       -                        3,909,761       \nSupporting services\nManagement and general 538,079           -                        538,079     

"The National Ataxia Foundation's total revenue in 2023 was $979,651."

### ReAct agent that returns the full pdf

In [16]:
auditor_notes_doc=None
with open(pdf_path, 'rb') as fp:
    auditor_notes_doc=Part.from_data(data=fp.read(),mime_type='application/pdf')

In [39]:
# define a search function that returns the entire pdf read from Part.from_data()
def search2(query : str):
    """
    This is a search function on a pre-existing financial knowledge base. You can use it to search for financial information about specific companies you do not know; treat it as an internal wiki search.

    Args:
        query (str): input query            
    """
    return auditor_notes_doc

# convert this search function to a Tool object
search_tool_rag_full_pdf = convert_to_tool(search2)

In [72]:
agent_react_full_pdf = ReactAgent(tools=[search_tool_rag_full_pdf])

In [41]:
# make sure they have different tools
print(agent_react.tools)
print(agent_react_full_pdf.tools)

[<tool_functions.Tool object at 0x7fe42a952620>]
[<tool_functions.Tool object at 0x7fe42a952950>]


In [31]:
agent_react_full_pdf.run(query=query)

---------------------------------------- iteration 0  ----------------------------------------


observation is non-printable Part object, probably the full pdf
---------------------------------------- iteration 1  ----------------------------------------


"The National Ataxia Foundation's total support and revenue for the year ended December 31, 2023, was $4,184,787."
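
The three agents do not disagree on the underlying numbers, only on which line item counts as "total revenue". A minimal check, using only the figures quoted in the outputs above (reading the two columns of the "Total Support and Revenue" row as without and with donor restrictions is an assumption based on the retrieved chunk):

```python
# Figures copied from the agent outputs above
total_support = 3_205_136          # quoted by the one big prompt agent
total_revenue = 979_651            # "Total Revenue" line in the RAG chunk
without_restrictions = 3_546_026   # assumed: column without donor restrictions
with_restrictions = 638_761        # assumed: column with donor restrictions
total_support_and_revenue = 4_184_787

# Both decompositions should add up to the same grand total
assert total_support + total_revenue == total_support_and_revenue
assert without_restrictions + with_restrictions == total_support_and_revenue

line_items = {
    total_revenue: "Total Revenue",
    total_support_and_revenue: "Total Support and Revenue",
}
answers = {
    "one big prompt": 4_184_787,
    "ReAct + RAG": 979_651,
    "ReAct + full PDF": 4_184_787,
}
for agent, value in answers.items():
    print(f"{agent}: ${value:,} -> {line_items[value]}")
```

So ReAct with RAG reported the narrower "Total Revenue" line, while the other two agents reported "Total Support and Revenue".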

### question 1

In [82]:
query_short="Cash and cash equivalents decreased from $1,969,164 in 2022 to $1,008,716 in 2023 for National Ataxia Foundation, what are the potential drivers"

In [83]:
rich_Markdown(agent_one_big_prompt.run(query=query_short))

In [84]:
agent_react.run(query=query_short)

---------------------------------------- iteration 0  ----------------------------------------


{"relevant information in the auditor notes": "appropriate in the circumstances, but not for the purpose of expressing an opinion on the effectiveness of the  \nFoundation\u2019s  internal control. Accordingly, no such opinion is expressed.  \n \n\u2022 Evaluate the appropriateness of accounting policies used and the reasonableness of significant accounting \nestimates made by management, as well as evaluate the overall presentation of the financial statements.  \n \n\u2022 Conclude whether, in our judgment, there are conditions or events, considered in the aggregate, that raise \nsubstantial doubt about the Foundation\u2019s  ability to continue as a going concern for a reasonable period of time.  \n \nWe are required to communicate with those charged with governance regarding, among other matters, the planned scope \nand timing of the audit, significant audit findings, and certain internal control related matters that we identified during t he \naudit.  \n \nAbdo  \nMinneapolis, Minn

{"relevant information in the auditor notes": "Annual Financial  \nReport  \nNational Ataxia Foundation   \nSt. Louis Park, Minnesota  \n \n \nFor the years ended December 31, 2023  and 2022  \n National Ataxia Foundation  \nTable of Contents  \nDecember 31, 2023  and 2022  \nPage No.  \nIndependent Auditor's Report  3 \nFinancial Statements  \nStatement s of Financial Position  6 \nStatement s of Activities  7 \nStatements of Functional Expenses  9 \nStateme nts of Cash Flows  11 \nNotes to the Financial Statements  12 \n2 \n  \n \n \n \n \n \nINDEPENDENT AUDITOR'S REPORT  \n \n \nBoard of Directors  \nNational Ataxia Foundation  \nSt. Louis Park , Minnesota  \n \nOpinion  \n \nWe have audited the accompanying financial statements of National Ataxia Foundation  (the Foundation ), which comprise \nthe statements of financial position as of December 31, 2023  and 2022 , and the related statements of activities, \nfunctional expenses and cash flows for the years then ended, and the relat

{"relevant information in the auditor notes": "Annual Financial  \nReport  \nNational Ataxia Foundation   \nSt. Louis Park, Minnesota  \n \n \nFor the years ended December 31, 2023  and 2022  \n National Ataxia Foundation  \nTable of Contents  \nDecember 31, 2023  and 2022  \nPage No.  \nIndependent Auditor's Report  3 \nFinancial Statements  \nStatement s of Financial Position  6 \nStatement s of Activities  7 \nStatements of Functional Expenses  9 \nStateme nts of Cash Flows  11 \nNotes to the Financial Statements  12 \n2 \n  \n \n \n \n \n \nINDEPENDENT AUDITOR'S REPORT  \n \n \nBoard of Directors  \nNational Ataxia Foundation  \nSt. Louis Park , Minnesota  \n \nOpinion  \n \nWe have audited the accompanying financial statements of National Ataxia Foundation  (the Foundation ), which comprise \nthe statements of financial position as of December 31, 2023  and 2022 , and the related statements of activities, \nfunctional expenses and cash flows for the years then ended, and the relat

'The decrease in cash and cash equivalents for the National Ataxia Foundation from 2022 to 2023 is primarily attributed to a negative cash flow from operating activities ($955,522). This indicates the foundation spent more than it brought in during 2023.  A significant realized and unrealized loss on investments further exacerbated the decrease.  While there was a small negative cash flow from investing activities, the operating loss was the primary driver.'

In [85]:
agent_react_full_pdf.run(query=query_short)

---------------------------------------- iteration 0  ----------------------------------------


observation is non-printable Part object, probably the full pdf
---------------------------------------- iteration 1  ----------------------------------------


observation is non-printable Part object, probably the full pdf
---------------------------------------- iteration 2  ----------------------------------------


"The National Ataxia Foundation's decrease in cash and cash equivalents from $1,969,164 in 2022 to $1,008,716 in 2023 was primarily due to an increase in program spending, outpacing the available revenue and resulting in negative cash flow from operations.  The largest increases in program spending were related to research, education and services, and the Drug Development Collaborative."

* <span style="color: blue; font-weight: bold;"> All three methods fail to check `"Note 10: Liquidity and Availability of Financial Assets"` </span>
* <span style="color: blue; font-weight: bold;"> With RAG in particular, the LLM never searches for 'liquidity', so it never reaches that note </span>
* <span style="color: blue; font-weight: bold;"> **ReAct with the full PDF even violates our system prompt's specification by outputting `<observation>`** </span>
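
As a rough sanity check on the answers above, the size of the cash decrease can be compared with the operating cash flow that the ReAct + RAG agent quoted. This is only a sketch: the operating cash flow figure is taken from the agent's answer and has not been verified against the PDF here.

```python
cash_2022 = 1_969_164                  # from the query
cash_2023 = 1_008_716                  # from the query
operating_cash_outflow = 955_522       # quoted by the ReAct + RAG agent (unverified)

decrease = cash_2022 - cash_2023
unexplained = decrease - operating_cash_outflow
share_from_operations = operating_cash_outflow / decrease

print(f"decrease in cash: ${decrease:,}")
print(f"not explained by operations: ${unexplained:,}")
print(f"share explained by operations: {share_from_operations:.1%}")
```

If the quoted figure is right, operations account for nearly all of the decrease, which is consistent with both ReAct answers even though neither consulted Note 10.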

### question 2

In [54]:
query_short="Why did deferred revenue increase significantly in 2023 for National Ataxia Foundation?"

In [55]:
agent_one_big_prompt.run(query=query_short)

"The auditor's report states that deferred revenue consists of payments received in advance that relate to the Collaborative and conferences. All deferred revenue will be recognized over the next year. Deferred conference revenue was $9,810 in 2023 and $22,290 in 2022. Deferred Drug Development Collaborative revenue was $100,000 in 2023 and $0 in 2022.\n\nTherefore, the significant increase in deferred revenue in 2023 is due to the $100,000 in deferred Drug Development Collaborative revenue received in advance."

In [56]:
agent_react.run(query=query_short)

---------------------------------------- iteration 0  ----------------------------------------


{"relevant information in the auditor notes": "the years ended December 31, 2023  and 2022 , respectively. Deferred Drug D evelopment Collaborative revenue \nwas $100,000  and $0 for the years ended December 31, 2023  and 2022 , respectively.  \n \nThe Foundation \u2019s other revenues are explicitly excluded from the scope of ASC Topic 606 and are not recorded in \naccordance with that standard.  All contributions are considered to be available for unrestricted use unless specifically \nrestricted by the donor. The Board\u2019s general policy is to allocate up to thirty percent of all restricted contributions for \nunrestricted use unless the contribution was solicited by the Foundation for a sp ecific purpose.  The board has discretion \nto use this allocation or waive it.  \n15 \n National Ataxia Foundation  \nNotes to the Financial Statements  \nDecember 31, 2023  and 2022  \n \nNote 1:  Summary of Significant Accounting Policies (Continued)  \n \nJ. Property and Equipment  \n \nPr

'The significant increase in deferred revenue for the National Ataxia Foundation in 2023 is primarily attributable to a rise in deferred Drug Development Collaborative revenue, from $0 in 2022 to $100,000 in 2023. This reflects a substantial increase in funding received in advance for collaborative drug development projects, which will be recognized as revenue over the duration of the collaborations.'

* <span style="color: blue; font-weight: bold;"> Both answers are correct </span>
* <span style="color: blue; font-weight: bold;"> ReAct needed just 1 iteration </span>

### question 3

In [57]:
query="What contributed to the significant increase in net assets with donor restrictions from 2022 to 2023 for National Ataxia Foundation"

In [58]:
agent_one_big_prompt.run(query=query)

"The provided auditor's report does not contain information regarding the drivers of the increase in net assets with donor restrictions from 2022 to 2023.  Therefore, I cannot answer your question.  The Statement of Activities shows that net assets with donor restrictions increased by $638,761, but the drivers of this change are not detailed in the provided notes."

In [73]:
agent_react.run(query=query)

---------------------------------------- iteration 0  ----------------------------------------


{"relevant information in the auditor notes": "and stored at the NINDS SCA -BRAC biorepository. Both de-identified clinical data and biofluids are available \nfor request from non -participating investigators for approved research projects.  \n  \n13 \n National Ataxia Foundation  \nNotes to the Financial Statements  \nDecember 31, 2023  and 2022  \n \nNote 1:  Summary of Significant Accounting Policies (Continued)  \n \nB. Basis of Accounting and Presentation  \n \nThe accompanying financial statements have been prepared using  the accrual basis of accounting in accordance with \naccounting principles generally accepted in the United States of America.   \n \nRevenues are recorded when earned and expenses are recorded when a liability is incurred. Contributions received are \nrecorded as an increase in non -donor -restricted or donor -restricted support depending on the existence or nature of any \ndonor restrictions.  Accordingly, net assets of the Foundation and changes therein are 

{"relevant information in the auditor notes": "and stored at the NINDS SCA -BRAC biorepository. Both de-identified clinical data and biofluids are available \nfor request from non -participating investigators for approved research projects.  \n  \n13 \n National Ataxia Foundation  \nNotes to the Financial Statements  \nDecember 31, 2023  and 2022  \n \nNote 1:  Summary of Significant Accounting Policies (Continued)  \n \nB. Basis of Accounting and Presentation  \n \nThe accompanying financial statements have been prepared using  the accrual basis of accounting in accordance with \naccounting principles generally accepted in the United States of America.   \n \nRevenues are recorded when earned and expenses are recorded when a liability is incurred. Contributions received are \nrecorded as an increase in non -donor -restricted or donor -restricted support depending on the existence or nature of any \ndonor restrictions.  Accordingly, net assets of the Foundation and changes therein are 

{"relevant information in the auditor notes": "and stored at the NINDS SCA -BRAC biorepository. Both de-identified clinical data and biofluids are available \nfor request from non -participating investigators for approved research projects.  \n  \n13 \n National Ataxia Foundation  \nNotes to the Financial Statements  \nDecember 31, 2023  and 2022  \n \nNote 1:  Summary of Significant Accounting Policies (Continued)  \n \nB. Basis of Accounting and Presentation  \n \nThe accompanying financial statements have been prepared using  the accrual basis of accounting in accordance with \naccounting principles generally accepted in the United States of America.   \n \nRevenues are recorded when earned and expenses are recorded when a liability is incurred. Contributions received are \nrecorded as an increase in non -donor -restricted or donor -restricted support depending on the existence or nature of any \ndonor restrictions.  Accordingly, net assets of the Foundation and changes therein are 

"Based on the available financial statements, the National Ataxia Foundation's net assets with donor restrictions increased by $638,761 from $2,041,806 in 2022 to $2,680,567 in 2023.  While the statements confirm this increase and show that $1,614,715 in contributions were received with donor restrictions in 2023, they do not provide a specific breakdown of the reasons for the change.  Further information, such as the organization's annual report or a more detailed financial review, would be needed to determine the specific factors contributing to this increase."

In [75]:
agent_react_full_pdf.run(query=query)

---------------------------------------- iteration 0  ----------------------------------------


observation is non-printable Part object, probably the full pdf
---------------------------------------- iteration 1  ----------------------------------------


observation is non-printable Part object, probably the full pdf
---------------------------------------- iteration 2  ----------------------------------------


"The National Ataxia Foundation's net assets with donor restrictions increased by $638,761 from 2022 to 2023. This was primarily due to contributions of $1,614,715 with donor restrictions, offset by $975,954 released from restrictions during 2023.  Several specific restricted funds saw increases, as detailed in Note 6 of the financial statements."

* <span style="color: blue; font-weight: bold;"> One big prompt cannot find the answer </span>
* <span style="color: blue; font-weight: bold;"> ReAct with RAG cannot either, because RAG keeps returning the incorrect chunks </span>
* <span style="color: blue; font-weight: bold;"> ReAct with full PDF is partially correct: `"as detailed in Note 6 of the financial statements."` </span>
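
The figures quoted by the two ReAct agents are at least internally consistent. A short roll-forward check, using only numbers that appear in the answers above:

```python
# Net assets with donor restrictions, as quoted in the agent answers
opening_balance = 2_041_806             # end of 2022
closing_balance = 2_680_567             # end of 2023
restricted_contributions = 1_614_715    # received in 2023
released_from_restrictions = 975_954    # released in 2023

net_change = restricted_contributions - released_from_restrictions

assert net_change == 638_761
assert opening_balance + net_change == closing_balance
print(f"net change: ${net_change:,}")
```

This reconciles the $638,761 increase but does not explain which restricted funds grew. That detail is what the pointer to Note 6 leaves unanswered.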

### question 4

In [87]:
query="What caused the significant decrease in the long-term operating lease liability from 2022 to 2023 for National Ataxia Foundation?"


In [88]:
agent_one_big_prompt.run(query=query)

'This document does not contain information regarding operating leases or the reason for the decrease in long-term operating lease liability.  Therefore, I cannot answer your question.'

In [90]:
agent_react.run(query=query)

---------------------------------------- iteration 0  ----------------------------------------


{"relevant information in the auditor notes": "according to the Foundation 's elected policy. The Foundation 's lease agreement does not contain any material residual \nvalue guarantees or material restrictive covenants.  \n \nAdditional information about the Foundation \u2019s lease for the year ended December 31, 2023 , is as follows:  \n \nLease expense (included in operating expenses)\nOperating lease expense 33,406 $          \nVariable lease expense 27,997             \nTotal Lease Expense: 61,403 $          \nOther Information\nCash paid for amounts included in the measurement of lease liabilities\nOperating cash flows from operating leases 33,240 $          \nWeighted-average remaining lease term in years for operating leases 1.75                 \nWeighted-average discount rate for operating leases 2.333%\nFuture minimum payments for leases are as follows:\nYear Ended December 31, Amount\n2024 33,693 $          \n2025 25,552             \nTotal undiscounted cash flows 59,245  

{"relevant information in the auditor notes": "according to the Foundation 's elected policy. The Foundation 's lease agreement does not contain any material residual \nvalue guarantees or material restrictive covenants.  \n \nAdditional information about the Foundation \u2019s lease for the year ended December 31, 2023 , is as follows:  \n \nLease expense (included in operating expenses)\nOperating lease expense 33,406 $          \nVariable lease expense 27,997             \nTotal Lease Expense: 61,403 $          \nOther Information\nCash paid for amounts included in the measurement of lease liabilities\nOperating cash flows from operating leases 33,240 $          \nWeighted-average remaining lease term in years for operating leases 1.75                 \nWeighted-average discount rate for operating leases 2.333%\nFuture minimum payments for leases are as follows:\nYear Ended December 31, Amount\n2024 33,693 $          \n2025 25,552             \nTotal undiscounted cash flows 59,245  

{"relevant information in the auditor notes": "according to the Foundation 's elected policy. The Foundation 's lease agreement does not contain any material residual \nvalue guarantees or material restrictive covenants.  \n \nAdditional information about the Foundation \u2019s lease for the year ended December 31, 2023 , is as follows:  \n \nLease expense (included in operating expenses)\nOperating lease expense 33,406 $          \nVariable lease expense 27,997             \nTotal Lease Expense: 61,403 $          \nOther Information\nCash paid for amounts included in the measurement of lease liabilities\nOperating cash flows from operating leases 33,240 $          \nWeighted-average remaining lease term in years for operating leases 1.75                 \nWeighted-average discount rate for operating leases 2.333%\nFuture minimum payments for leases are as follows:\nYear Ended December 31, Amount\n2024 33,693 $          \n2025 25,552             \nTotal undiscounted cash flows 59,245  

"I cannot determine the specific cause of the decrease in the National Ataxia Foundation's long-term operating lease liability from 2022 to 2023 with the information currently available to me.  Further research using their full financial statements would be required."

In [93]:
agent_react_full_pdf.run(query=query)

---------------------------------------- iteration 0  ----------------------------------------


observation is non-printable Part object, probably the full pdf
---------------------------------------- iteration 1  ----------------------------------------


observation is non-printable Part object, probably the full pdf
---------------------------------------- iteration 2  ----------------------------------------


'The decrease in the long-term operating lease liability from 2022 to 2023 is due to lease payments made by the National Ataxia Foundation during 2023.  The lease is for their office space and expires in September 2025.'

* <span style="color: blue; font-weight: bold;">one big prompt cannot answer</span>
* <span style="color: blue; font-weight: bold;">ReAct with RAG got it wrong, because RAG returns the incorrect chunks. The LLM cannot even tell whether a figure belongs to 2022 or 2023. For example: `<thought>The provided information shows the lease liability at the end of 2023 ($58,103) and details`. This $58,103 is actually the number for 2022.</span>
* <span style="color: blue; font-weight: bold;">ReAct with full PDF got it right.</span>
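
In the ReAct + RAG traces for questions 3 and 4, the same chunk is printed three times in a row, so the extra iterations add no new information. The sketch below shows one way such repeats could be flagged. `find_repeated_observations` is a hypothetical helper, not part of `react_agent`, and the observation strings are placeholders standing in for the retrieved chunks.

```python
import hashlib


def find_repeated_observations(observations):
    """Return (index, first_index) pairs for observations identical to an earlier one."""
    first_seen = {}
    repeats = []
    for index, text in enumerate(observations):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in first_seen:
            repeats.append((index, first_seen[digest]))
        else:
            first_seen[digest] = index
    return repeats


# Placeholder strings standing in for three identical retrieved chunks
observations = ["lease note chunk", "lease note chunk", "lease note chunk"]
repeats = find_repeated_observations(observations)
print(repeats)  # [(1, 0), (2, 0)]
```

A check like this could be used to stop the loop early or to prompt the model to rephrase its search query. Whether that would improve the answers here is untested.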
