In [1]:
%load_ext autoreload
%autoreload 2

In [None]:
# import other useful python libraries
import sys
import os
from openai import OpenAI
from IPython.display import display_markdown

# Add the parent directory (Auditbot_backend) to the system path
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))

# import custom modules
from utils.preprocessing import *
from utils.json_parser import *
from utils.content_page_parser import *
from utils.retriever import *
from utils.custom_print import *
from utils.prompt_engineering import *
from utils.initialisations import *

In [2]:
# RAG input parameters =======================================================

query = "What are the findings pertaining to grant?"

# HYPERPARAMETERS ============================================================

# preprocessing --------------------------------------------------------------

# Chunk into sentences ('s') or paragraphs ('p')
chunking = 's'

# Number of consecutive smaller chunks to group into one bigger chunk
grouping = 1

# Minimum chunk size
min_chunk_size = 100

# ranking --------------------------------------------------------------------

# Top k matches for ranking.
# Sparse and dense search each return top_k matches, so hybrid search returns
# at least top_k and at most 2 * top_k matches.
top_k = 15

# top n matches for reranking
top_n = 15

# Cross encoder model
model_name = "cross-encoder/stsb-roberta-base"
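The comment on `top_k` above gives bounds of `top_k` to `2 * top_k` matches for hybrid search. A minimal sketch of why those bounds hold, assuming the two result lists are simply unioned with duplicates dropped (the notebook's `ranking` helper is not shown, so `merge_hybrid` and the sample hits below are hypothetical):

```python
def merge_hybrid(sparse_hits, dense_hits):
    """Union two ranked lists, keeping first-seen order and dropping duplicates."""
    # dict.fromkeys preserves insertion order, so earlier hits stay first
    return list(dict.fromkeys(list(sparse_hits) + list(dense_hits)))


k = 3
demo_sparse = ["chunk_a", "chunk_b", "chunk_c"]  # hypothetical keyword-search hits
demo_dense = ["chunk_b", "chunk_d", "chunk_e"]   # hypothetical embedding-search hits

demo_merged = merge_hybrid(demo_sparse[:k], demo_dense[:k])
print(len(demo_merged))  # 5: one hit is shared, so the size falls between k and 2 * k
```

If both searches return exactly the same hits, the union has `k` items; if they share none, it has `2 * k`.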


In [3]:
# generate all required data structures

# generate chunks
generate_chunks(DOCUMENT_DIR,
                chunks_path,
                chunk_pageNum_pairs_path,
                s_p_pairs_path, 
                chunking, 
                grouping, 
                min_chunk_size,
                DOC_IDENTIFIER)


# generate inverted tree
has_content_page = True
generate_inverted_tree(chunk_pageNum_pairs_path, 
                       has_content_page, 
                       save_inverted_tree_path,
                       tree_path)

2.5234711170196533 seconds
number of chunks: 9127
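
For orientation, here is a rough sketch of how the three preprocessing hyperparameters (`chunking`, `grouping`, `min_chunk_size`) could interact for sentence chunking. It is not the code in `utils.preprocessing`: the function name, the naive regex sentence splitter, and the choice to merge short chunks forward (rather than drop them) are all assumptions made for illustration.

```python
import re


def chunk_sentences(text, grouping=1, min_chunk_size=100):
    """Split text into sentences, group them, and merge chunks that are too short."""
    # Naive split: break after '.', '!' or '?' followed by whitespace
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Join every `grouping` consecutive sentences into one candidate chunk
    groups = [
        " ".join(sentences[i:i + grouping])
        for i in range(0, len(sentences), grouping)
    ]

    # Accumulate candidates until the buffer reaches min_chunk_size characters
    result, buffer = [], ""
    for group in groups:
        buffer = f"{buffer} {group}".strip()
        if len(buffer) >= min_chunk_size:
            result.append(buffer)
            buffer = ""

    # Attach any short leftover to the last chunk (or keep it if it is the only one)
    if buffer:
        if result:
            result[-1] = f"{result[-1]} {buffer}"
        else:
            result.append(buffer)
    return result


demo_text = "One. Two. Three is a much longer sentence here."
print(chunk_sentences(demo_text, grouping=1, min_chunk_size=10))
```

With `min_chunk_size=10`, the two short sentences are folded into the third, so the demo produces a single chunk.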


In [4]:
# retrieve all required data structures

# load tree
inverted_tree = json_file_to_dict(save_inverted_tree_path)

# load chunks from tree's keys
chunks = list(inverted_tree.keys())
print("Number of unique chunks:", len(chunks))

# load sentence paragraph pairs
if (chunking == 's' or chunking == 'f') and grouping == 1:
    print("s_p_pairs will be filled")
    s_p_pairs = json_file_to_dict(s_p_pairs_path)

Number of unique chunks: 8210
s_p_pairs will be filled
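
The drop from 9127 generated chunks to 8210 unique chunks is what one would expect when chunk text is used as a dictionary key: identical text appearing in several places collapses into one entry. A small sketch of that effect, assuming a flat chunk-to-locations mapping (the real inverted tree probably carries more structure, and the sample pairs are made up):

```python
from collections import defaultdict


def build_inverted_index(chunk_location_pairs):
    """Map each distinct chunk text to the list of locations where it occurs."""
    inverted = defaultdict(list)
    for chunk_text, location in chunk_location_pairs:
        if location not in inverted[chunk_text]:
            inverted[chunk_text].append(location)
    return dict(inverted)


# Hypothetical (chunk, (report, page)) pairs; the first sentence repeats across reports
demo_pairs = [
    ("repeated boilerplate sentence", ("report_A", 3)),
    ("repeated boilerplate sentence", ("report_B", 3)),
    ("a sentence that occurs once", ("report_A", 54)),
]

demo_index = build_inverted_index(demo_pairs)
print(len(demo_pairs), "chunks ->", len(demo_index), "unique chunks")
```

Keeping every location per key means a repeated chunk can still be cited against each report it came from.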


In [5]:
# 1st rank
good_chunks, good_langchain_docs = ranking(chunks, query, top_k, OPENAI_API_KEY)

print("1ST RANKING---------------------------------------------------------\n")
pretty_print_list(good_chunks)

1ST RANKING---------------------------------------------------------

idx: 0

Details of the lapses pertaining to the enforcement of SDL collections are in the 
 
following paragraphs

----------------------------------------------------------

idx: 1

Stage 1: Grant Design and Setup
– whether processes were in place to ensure that grant programmes 
were authorised and reviewed for relevance
b

----------------------------------------------------------

idx: 2

Audit findings are conveyed by AGO to the ministries and statutory boards audited 
by way of “management letters”

----------------------------------------------------------

idx: 3

Stage 1: Grant Design and Setup
– whether there were processes and controls in place to ensure that 
grant programmes were authorised and administered in accordance 
with the objective(s) of the grant

----------------------------------------------------------

idx: 4

Audit findings are conveyed to the Government ministries, statutory boards and ot

In [6]:
# Reranking
best_chunks, scores = reranking(model_name, good_chunks, query, top_n)
print("RERANKING-----------------------------------------------------------\n")
pretty_print_rank(best_chunks, scores)

RERANKING-----------------------------------------------------------

RANK: 1

Stage 2: Grant Evaluation and Approval
– whether there were processes and controls in place to ensure that 
grant applications were properly evaluated and approved

SCORE: 0.34653497

----------------------------------------------------------

RANK: 2

Audit Observations
In this year’s audits, AGO uncovered a number of instances that indicated laxity in 
the administration of grants

SCORE: 0.34602296

----------------------------------------------------------

RANK: 3

Stage 2: Grant Evaluation and Approval
–	
Whether there were processes and controls in place to ensure 
that grant cases were properly evaluated and approved

SCORE: 0.34453192

----------------------------------------------------------

RANK: 4

Application, evaluation and award of grants 
– whether the processes to invite, receive, evaluate and approve 
proposals and contract with grant recipients2 were properly administered
b

SCORE: 0.321
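
The reranking step scores each (query, chunk) pair and keeps the `top_n` highest. The sketch below shows that pattern with a pluggable scorer. It is an assumption about what `reranking` does, not its source: `rerank`, `cross_encoder_scorer` and `overlap_scorer` are hypothetical names. `cross_encoder_scorer` relies on `CrossEncoder.predict` from the `sentence-transformers` package and is only imported when called, so the toy scorer runs without downloading a model.

```python
def rerank(query, candidates, score_fn, top_n):
    """Score (query, candidate) pairs and return the top_n candidates with scores."""
    scores = score_fn([(query, candidate) for candidate in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    ranked = ranked[:top_n]
    return [c for c, _ in ranked], [float(s) for _, s in ranked]


def cross_encoder_scorer(model_name):
    """Return a scorer backed by a sentence-transformers cross-encoder."""
    from sentence_transformers import CrossEncoder  # imported lazily; optional dependency

    return CrossEncoder(model_name).predict


def overlap_scorer(pairs):
    """Toy stand-in scorer: fraction of query words that appear in the candidate."""
    scores = []
    for query, candidate in pairs:
        query_words = set(query.lower().split())
        candidate_words = set(candidate.lower().split())
        scores.append(len(query_words & candidate_words) / len(query_words) if query_words else 0.0)
    return scores


demo_best, demo_scores = rerank(
    "grant findings",
    ["grant approval findings", "SDL collections", "grant design"],
    overlap_scorer,
    top_n=2,
)
print(demo_best, demo_scores)
```

Swapping in `cross_encoder_scorer("cross-encoder/stsb-roberta-base")` for `overlap_scorer` would use the model named in the hyperparameter cell.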

In [7]:
prompt = generate_prompt(query, 
                         inverted_tree, 
                         best_chunks, 
                         chunking, 
                         s_p_pairs)

print(prompt)

Role:
You are a specialist who uses the context provided to answer the query.

Instruction:
Your response should cite sources' year and page number.

Background:
The context is taken from audit reports from the Auditor-General's Office (AGO) of Singapore. 
AGO is an independent organ of state and the national auditor. They play an important role in enhancing public accountability in the management and use of public funds and resources through their audits.

They audit
    government ministries and departments
    organs of state
    statutory boards
    government funds
    other public authorities and bodies administering public funds (upon their request for audit), e.g. government-owned companies.

They report their audit observations to the President, Parliament and the public through the Annual Report of the Auditor-General management of the organisations audited through management letters.
Their observations include system weaknesses, non-compliance with control procedures or legi

In [7]:
if True:
    # get response
    client = OpenAI(api_key=OPENAI_API_KEY)  # same key variable as the ranking cell

    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ]
    )

    response = openai_completion_to_text(completion)
    display_markdown(response, raw=True)

The audit reports from the Auditor-General's Office (AGO) of Singapore over various years highlight several key findings concerning the administration and management of grants. Here are some pertinent points:

1. **Laxity in Administration**: 
   - The 2014/15 audit found instances indicating a laxity in the administration of grants. Public sector entities failed to ensure that the correct amount of grants were disbursed and that conditions for grants were adhered to (AGO, 2014/15, p. 3).

2. **Processes and Controls**:
   - The 2018/19 audit observed whether processes and controls were in place to ensure grant programs were authorized and administered according to their objectives (AGO, 2018/19, p. 54).
   - The 2022/23 audit specifically examined whether grant schemes related to COVID-19 were authorized and administered in accordance with the objectives of the schemes (AGO, 2022/23, p. 48).

3. **Grant Evaluation and Approval**:
   - The 2019/20 and 2022/23 audits examined the processes and controls in place to ensure grant applications were properly evaluated and approved, and whether proper agreements with grant recipients were established (AGO, 2019/20, p. 53; AGO, 2022/23, p. 48).

4. **Monitoring and Review**:
   - The 2022/23 audit also checked if there were processes and controls to ensure grants were managed per relevant terms and conditions and that the deliverables were achieved (AGO, 2022/23, p. 49).

5. **Oversight and Inconsistent Practices**:
   - The 2019/20 audit noted a need for better oversight of Programme Partners (PPs) administering grants, highlighting inconsistent practices across PPs in their stipulation of requirements to grant recipients and their application checks (AGO, 2019/20, p. 57).

6. **Evidence of Follow-up**:
   - The 2016/17 audit highlighted that for some projects, there was no evidence that the Economic Development Board (EDB) had followed up with grant recipients to determine that project conditions and milestones were met by stipulated due dates (AGO, 2016/17, p. 46).

7. **Disbursement**:
   - The 2017/18 audit looked into whether there were processes to ensure grants were disbursed accurately and timely (AGO, 2017/18, p. 45).

These findings underscore the need for robust controls, thorough processes, and consistent follow-up in the administration and management of grants to ensure accountability and effective use of public resources.
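
The OpenAI call needs an API key, and keeping it out of the notebook source avoids leaking it when the notebook is shared. A minimal sketch of reading it from an environment variable; `load_api_key` is a hypothetical helper, and the variable name `OPENAI_API_KEY` is the conventional one rather than something this project is known to require:

```python
import os


def load_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running this cell")
    return key


# Demonstration with an explicit mapping instead of the real environment
print(load_api_key({"OPENAI_API_KEY": " demo-key "}))  # prints demo-key
```

In the notebook this would be used as `client = OpenAI(api_key=load_api_key())`. The `openai` Python client also falls back to the `OPENAI_API_KEY` environment variable when `api_key` is omitted.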

In [8]:
print(response)
