# Agent CFO - Performance Optimization & Design

---
This is the starter notebook for your project. Follow the required structure below.


You will design and optimize an Agent CFO assistant for a listed company. The assistant should answer finance/operations questions using RAG (Retrieval-Augmented Generation) + agentic reasoning, with response time (latency) as the primary metric.

Your system must:
*   Ingest the company's public filings.
*   Retrieve relevant passages efficiently.
*   Compute ratios/trends via tool calls (calculator, table parsing).
*   Produce answers with valid citations to the correct page/table.


## 1. Config & Secrets

Fill in your API keys in secrets. **Do not hardcode keys** in cells.

In [1]:
import os
from dotenv import load_dotenv 

load_dotenv()  # take environment variables from .env file 

COMPANY_NAME = "Google"
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")



def generate_test_log_path_name(base_path: str) -> str: 
    """Return the next unused log path of the form <base_path>/test_<n>.json."""
    # create the directory if it does not exist 
    os.makedirs(base_path, exist_ok=True) 
    existing_files = [f for f in os.listdir(base_path) if f.startswith("test_") and f.endswith(".json")] 
    existing_indices = [int(f.split("_")[1].split(".")[0]) for f in existing_files if f.split("_")[1].split(".")[0].isdigit()] 
    next_index = max(existing_indices) + 1 if existing_indices else 1 

    # os.path.join avoids a doubled separator when base_path already ends with "/"
    return os.path.join(base_path, f"test_{next_index}.json")

 

## STAGE 1.5 Embedding Config (used by Data Ingestion and Retrieval)

In [2]:
from sentence_transformers import SentenceTransformer, util

# load E5-base-v2
model = SentenceTransformer("intfloat/e5-base-v2")

def embed_text_query(s):
    # E5 expects a "query: " prefix; strip and lowercase the text for consistency
    return model.encode(f"query: {s.strip().lower()}", normalize_embeddings=True)

def embed_text_passage(s):
    # `s` is a list of passage strings; E5 expects a "passage: " prefix on each one
    return model.encode([f"passage: {chunk_text.strip().lower()}" for chunk_text in s],
                        convert_to_numpy=True, 
                        normalize_embeddings=True,
                        show_progress_bar=True)

## 2. Data Download (Dropbox)

*   Annual Reports: last 3–5 years.
*   Quarterly Results Packs & MD&A (Management Discussion & Analysis).
*   Investor Presentations and Press Releases.
*   These files must be submitted later as a deliverable in the Dropbox data pack.
*   Upload them under `/content/data/`.

Scope limit: each team will ingest minimally 15 PDF files total.


## 3. System Requirements

**Retrieval & RAG**
*   Use a vector index (e.g., FAISS, LlamaIndex) + a keyword filter (BM25/ElasticSearch).
*   Citations must include: report name, year, page number, section/table.
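The citation requirement can be met by formatting chunk metadata at answer time. The sketch below assumes metadata keys like those stored with each chunk later in this notebook (`document`, `page_number`, `page_section`) and guesses the year from the file name with a regular expression. The helper name `build_citation` and the example file name are hypothetical.

```python
import re

def build_citation(meta):
    """Format a citation string from chunk metadata.

    Assumption: the filing year appears as a 4-digit number in the file name.
    """
    match = re.search(r"(19|20)\d{2}", meta["document"])
    year = match.group(0) if match else "year unknown"
    return f'{meta["document"]}, {year}, page {meta["page_number"]}, section {meta["page_section"]}'

# Hypothetical metadata, shaped like the chunk metadata built in Stage 2
example_meta = {
    "document": "example-10-k-2023.pdf",
    "page_number": "41",
    "page_section": "income_statement",
}
citation = build_citation(example_meta)
```

If the year cannot be read from the file name, storing it explicitly at ingestion time would be more reliable than this guess.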

**Agentic Reasoning**
*   Support at least 3 tool types: calculator, table extraction, multi-document compare.
*   Reasoning must follow a plan-then-act pattern (not a single unstructured call).
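One way to read the plan-then-act requirement is: first produce an explicit list of tool calls, then execute them in order while recording a trace. The sketch below is a minimal illustration with a hand-written plan and a single calculator tool. In a real agent the plan would come from an LLM call. The names `TOOLS`, `calculator` and `run_plan` are hypothetical and not part of the starter code.

```python
from typing import Callable, Dict, List

def calculator(op: str, a: float, b: float) -> float:
    """Apply one arithmetic operation; raises KeyError on an unknown op."""
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mul": lambda x, y: x * y,
        "div": lambda x, y: x / y,
    }
    return ops[op](a, b)

# Hypothetical tool registry: tool name -> callable
TOOLS: Dict[str, Callable[..., float]] = {"calculator": calculator}

def run_plan(plan: List[dict]) -> List[dict]:
    """Execute the plan steps in order and record which tool produced each result."""
    trace = []
    for step in plan:
        tool = TOOLS[step["tool"]]
        result = tool(**step["args"])
        trace.append({"tool": step["tool"], "args": step["args"], "result": result})
    return trace

# Plan first (hard-coded here, with illustrative numbers only), then act.
example_plan = [
    {"tool": "calculator", "args": {"op": "div", "a": 50.0, "b": 200.0}},
    {"tool": "calculator", "args": {"op": "mul", "a": 0.25, "b": 100.0}},
]
example_trace = run_plan(example_plan)
```

The trace doubles as the "tools invoked" log entry required under Instrumentation.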

**Instrumentation**
*   Log timings for: T_ingest, T_retrieve, T_rerank, T_reason, T_generate, T_total.
*   Log: tokens used, cache hits, tools invoked.
*   Record p50/p95 latencies.
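As a sketch of the p50/p95 requirement, the snippet below computes nearest-rank percentiles over a list of recorded latencies. The `percentile` helper is hypothetical and the sample values are made up. Other percentile definitions (for example, linear interpolation) give slightly different numbers on small samples.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Made-up T_total values in seconds, one per benchmark run
latencies_s = [0.8, 1.1, 0.9, 3.2, 1.0, 1.3, 0.7, 1.2, 1.0, 2.4]
p50 = percentile(latencies_s, 50)
p95 = percentile(latencies_s, 95)
```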

## STAGE 1 DATA INGESTION! 

In [None]:
SECTION_EXAMPLES = {
    # --- Cover / Administrative ---
    "cover_page": [
        "united states securities and exchange commission form 10 k annual report pursuant to section 13 or 15d",
        "united states securities and exchange commission form 10 q quarterly report pursuant to section 13 or 15d",
        "cover page showing registrant name commission file number and state of incorporation",
        "front page identifying registrant address telephone number and fiscal year end",
    ],

    # --- Management Discussion ---
    "mdna": [
        "managements discussion and analysis of financial condition and results of operations",
        "md&a explaining liquidity capital resources and operating performance",
        "discussion and analysis of results of operations comparing current and prior periods",
        "analysis of changes in revenues costs cash flows and capital expenditures",
    ],

    # --- Risk Factors ---
    "risk_factors": [
        "risk factors that may affect future financial performance or share price",
        "discussion of material risks and uncertainties facing the company",
        "factors that could cause actual results to differ materially from forward looking statements",
    ],

    # --- Financial Highlights / Summary Data ---
    "summary_financial_data": [
        "selected financial data summarizing key performance indicators for the past five years",
        "summary of consolidated financial information and operating results",
        "selected financial highlights including revenue net income and earnings per share",
    ],

    # --- Income Statement ---
    "income_statement": [
        "consolidated statements of income showing revenue expenses and net income",
        "statement of operations or profit and loss reporting revenues and operating income",
        "consolidated statements of comprehensive income including other comprehensive income items",
        "income statement presenting total revenues cost of goods sold gross profit and net earnings",
    ],

    # --- Balance Sheet ---
    "balance_sheet": [
        "consolidated balance sheets showing assets liabilities and shareholders equity",
        "statement of financial position listing current assets long term liabilities and total equity",
        "balance sheet detailing cash accounts receivable inventories property plant and equipment",
    ],

    # --- Cash Flow Statement ---
    "cash_flow": [
        "consolidated statements of cash flows showing cash inflows and outflows from operating investing and financing activities",
        "statement of cash flows reconciling net income to net cash provided by operating activities",
        "cash flow statement detailing capital expenditures debt repayment and dividend payments",
    ],

    # --- Shareholders' Equity ---
    "equity": [
        "consolidated statements of shareholders equity showing changes in retained earnings dividends and stock issuance",
        "statement of changes in stockholders equity presenting share repurchases and comprehensive income",
        "equity statement showing common stock treasury stock retained earnings and accumulated other comprehensive income",
    ],

    # --- Notes to Financial Statements ---
    "financial_statements": [
        "notes to consolidated financial statements providing accounting policies commitments contingencies and segment information",
        "footnotes accompanying consolidated financial statements describing significant accounting policies",
        "notes to financial statements detailing income taxes stock compensation and earnings per share",
        "supplementary information supporting consolidated financial statements",
    ],

    # --- Market Risk Disclosures ---
    "market_risk_disclosures": [
        "quantitative and qualitative disclosures about market risk",
        "discussion of exposure to interest rate foreign currency commodity and credit risk",
        "sensitivity analysis of market risk instruments",
    ],

    # --- Controls and Procedures ---
    "controls_procedures": [
        "controls and procedures section discussing disclosure controls and internal control over financial reporting",
        "evaluation of disclosure controls and procedures and changes in internal control",
        "managements report on internal control over financial reporting",
    ],

    # --- Legal Proceedings ---
    "legal_proceedings": [
        "description of material pending legal proceedings and litigation",
        "legal proceedings section detailing lawsuits claims and regulatory actions",
        "information about legal matters affecting the company",
    ],

    # --- Segment Information ---
    "segment_info": [
        "segment information describing operating segments geographic areas and major customers",
        "disclosure of business segments including revenue and profit by segment",
        "note providing details of segment performance and intersegment eliminations",
    ],

    # --- Signatures ---
    "signatures": [
        "signatures section signed on behalf of the registrant and principal officers",
        "signatures of directors executive officers and principal accounting officer",
        "signed by the registrant pursuant to the securities exchange act of 1934",
    ],

    # --- Exhibits ---
    "exhibits": [
        "exhibits and financial statement schedules",
        "list of exhibits and certifications required by form 10k or 10q",
        "exhibit index listing contracts and subsidiary information",
    ],

    # --- Fallback ---
    "other": [
        "miscellaneous sections not classified elsewhere including general disclosures appendices or cover letters",
    ],
}


{

    # --- Exhibits ---
    "exhibits": [
        [1,2,4,67,454,734],
        [1,2,4,67,454,734],
        [1,2,4,67,454,734],
    ],

    # --- Fallback ---
    "other": [
        [1,2,4,67,454,734],
        [1,2,4,67,454,734],
        [1,2,4,67,454,734],
    ],
}



In [None]:
# --- Helpers ---
import re

def clean_table(table):
    """Clean raw Camelot table output."""
    print ("Raw table:", table) 
    return [
        [(cell or "").strip().replace("\n", " ") for cell in row]
        for row in table
    ]

def _normalize(s: str) -> str:
    s = (s or "").lower()
    # unify whitespace & quotes
    s = s.replace("\n", " ").replace("\u2019", "'").replace("\u2013", "-").replace("\u2014", "-")
    s = " ".join(s.split())
    return s


def is_valid_table(table, numeric_threshold: float = 0.25) -> bool:
    """Return True if the table has enough numeric-looking cells to be considered real data."""
    if not table or not table[0]:
        return False
    
    cells = sum(len(r) for r in table)
    numeric_cells = 0
    num_pattern = re.compile(r"^\(?[+-]?\d[\d,\.]*\)?$")  # matches 5,439 or (1,200) etc.

    for row in table:
        for cell in row:
            cell = str(cell).strip().replace("$", "").replace("%", "")
            if num_pattern.match(cell):
                numeric_cells += 1

    return (numeric_cells / cells) >= numeric_threshold
    


In [5]:

SECTION_EMBS = {
    sec: [embed_text_query(ex) for ex in examples]
    for sec, examples in SECTION_EXAMPLES.items()
}


def classify_section(text, table):
    page_text = _normalize(text)
    headers = _normalize(" ".join(table[0])) if table else ""
    first_col = _normalize(" ".join(row[0] for row in table[1:])) if table else ""

    combined = f"{page_text} {headers} {first_col}"
    emb = embed_text_query(combined)

    scores = {
        sec: max(util.cos_sim(emb, e).item() for e in embs)
        for sec, embs in SECTION_EMBS.items()
    }

    best = max(scores, key=scores.get)
    return best if scores[best] > 0.35 else "other"

In [None]:
# TODO: Implement ingestion pipeline

import pdfplumber
import camelot
import json
import os


output = {}
documents_base_dir = "Google"
pdf_path = [os.path.join(documents_base_dir, f) for f in os.listdir(documents_base_dir) if f.lower().endswith(".pdf")] 

print ("Processing PDFs:", pdf_path) 

# keep track sections 
sections = {} 

for pdfFile in pdf_path: 
    pdf_name = os.path.basename(pdfFile) 
    output[pdf_name] = {} 
    
    # Step 1: extract raw text with pdfplumber
    with pdfplumber.open(pdfFile) as pdf:
        for i, page in enumerate(pdf.pages, start=1):
            text = page.extract_text() or "" 
            section_text = classify_section(text, [[]])  # classifying page text only without tables 
            sections[section_text] = sections.get(section_text, 0) + 1 

            # Step 2: extract tables with Camelot (prefer stream results, fall back to lattice)
            tables_stream = camelot.read_pdf(
                pdfFile, 
                pages=str(i), 
                flavor="stream", 
                row_tol=15, 
                column_tol=8,
                strip_text ='\n' 
                )
            tables_lattice = camelot.read_pdf(pdfFile, pages=str(i), flavor="lattice")
           
            candidate_tables = tables_stream if tables_stream.n > 0 else tables_lattice 
            tables = []
            for t in candidate_tables: 
                raw_table = t.df.values.tolist() 


                if not is_valid_table(raw_table):
                    # Skip tables that are mostly text, like footnotes or headers
                    print(f"[SKIP] Page {i} - Non-numeric table filtered out")
                    continue
                
                cleaned_table = clean_table(raw_table)
                section_table = classify_section(text, cleaned_table)

                # Track section counts 
                sections[section_table] = sections.get(section_table, 0) + 1 

                # Skip noise like signatures
                if section_table == "other" and "signature" in text.lower():
                    continue
                    
                tables.append({ 
                    "section": section_table,
                    "header" : cleaned_table[0] if cleaned_table else [], 
                    "rows" : cleaned_table[1:] if len(cleaned_table) > 1 else [] 
                })

            

            print(f"Page {i} -> Text length: {len(text) if text else 0}, Tables Kept: {len(tables)}")

            output[pdf_name][i] = {
                "page_section": section_text,   
                "text": text,
                "tables": tables
            }

print ("Section distribution:", sections) 

# Step 3: Create directory if it doesn't exist and dump to JSON
output_path = "Google/data/test.json"
os.makedirs(os.path.dirname(output_path), exist_ok=True)

with open(output_path, "w") as f: 
    json.dump(output, f, indent=4)

## 4. Baseline Pipeline

**Baseline (starting point)**
*   Naive chunking.
*   Single-pass vector search.
*   One LLM call, no caching.
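One possible form of naive chunking is a fixed-size word window with a small overlap, sketched below. The helper name `chunk_words` and the default sizes are illustrative assumptions, not values taken from the starter code.

```python
def chunk_words(text, size=200, overlap=40):
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks

# Tiny example: 10 words, windows of 4 words overlapping by 1
sample = " ".join(f"w{i}" for i in range(10))
windows = chunk_words(sample, size=4, overlap=1)
```

Smaller windows help keep each chunk within the embedding model's input limit, at the cost of more vectors in the index.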

## STAGE 2 DATA PROCESSING INTO FAISS !

In [6]:
# TODO: Implement baseline retrieval + generation

# ! CHUNK ! 

import json
import numpy as np
import faiss


# load the json file 
with open("Google/data/test.json", "r") as f:
    doc = json.load(f)

chunks = [] 

for fileDoc , docContent in doc.items(): 
    for page_num, content in docContent.items(): 
        page_section = content.get("page_section", "unknown")
        text = content.get("text", "")
        tables = content.get("tables", [])

        if text.strip(): 
            chunks.append({
                "id": f"{fileDoc}-page-{page_num}-text",
                "text": f"Financial filing text section: {text}",
                "metadata": {"document": fileDoc, "page_number": page_num, "page_section": page_section, "chunk_type": "prose"}
            })

        if tables: 
            for t_index, table in enumerate(tables):
                    table_text = "\n".join([", ".join(row) for row in table.get("rows", [])]) 
                    chunks.append({
                        "id": f"{fileDoc}-page-{page_num}-table-{t_index}",
                        "text": f"Financial statement table: {table_text}",
                        "metadata": {
                            "document": fileDoc,
                            "page_number": page_num,
                            "page_section": page_section,
                            "chunk_type": "table",
                            "table_index": t_index
                            }
                    }) 

In [None]:
# LOAD intfloat/e5-base-v2
import os 


#! EMBED THE CHUNKS
#! Embeddings are returned in the same order as `chunks`, so a position returned by
#! a FAISS search can be used directly to look up the original chunk and its metadata.


text = [ chunk["text"] for chunk in chunks ] 


embeddings = embed_text_passage (text)

print (f"Embeddings shape: {embeddings.shape}")   

# Create a FAISS index
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)
print (f"FAISS index contains {index.ntotal} vectors.") 

# save it locally 
os.makedirs("Google/base", exist_ok=True) 
# storing the index 
faiss.write_index(index, "Google/base/base.index") 


# store the chunks 
with open("Google/base/chunks.json", "w") as f: 
    json.dump(chunks, f, indent=4) 

Embeddings shape: (958, 768)
FAISS index contains 958 vectors.
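Because the embeddings are L2-normalized, the squared Euclidean distance that `IndexFlatL2` reports and the cosine similarity are tied together by d² = 2 − 2·cos. L2 and inner-product search therefore rank chunks identically, but a lower score is better for L2 and a higher score is better for inner product. The plain-Python sketch below checks the identity on two small made-up vectors. It does not call FAISS.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = normalize([1.0, 2.0, 2.0])   # stand-in for a query embedding
p = normalize([2.0, 1.0, 2.0])   # stand-in for a passage embedding
cos_sim = dot(q, p)              # cosine similarity of unit vectors
dist_sq = squared_l2(q, p)       # squared Euclidean distance
```

This matters when comparing logged scores between a baseline built on `IndexFlatL2` and an index built on `IndexFlatIP`: the two numbers are on different scales and point in opposite directions.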


## STAGE 3 : RETRIEVAL

In [3]:
#! query -> embed query -> retrieve top k chunks -> return chunks with metadata 
import json

import faiss
import numpy as np

def init_indexes(): 
    """Load the baseline FAISS index and its chunk list into the global `index` dict."""
    global index 
    index_path = "Google/base/base.index" 
    index["index"] = faiss.read_index(index_path) 
    with open("Google/base/chunks.json") as f:
        index["chunks"] = json.load(f)
    print (f"chunks type : {type(index['chunks'])}, length: {len(index['chunks'])}")
#! ======



def search_query(query, k=5):
    
    global index 
    query_embedding = embed_text_query(query) 

    D, I = index["index"].search(np.array([query_embedding]), k=k)

    print (f"Search distances: {D}") 
    print (f"Search indices: {I}") 
    results = [
        {
            "rank": rank + 1,
            "score": float(D[0][rank]),
            "text": index["chunks"][identified_chunk_idx]["text"], 
            "metadata": index["chunks"][identified_chunk_idx]["metadata"] 
        }
        for rank, identified_chunk_idx in enumerate(I[0])
    ]

    # retrieve the proper name for the logs 

    file_name = generate_test_log_path_name("Google/logs/base/")
    # add the query then save the results as json 
    with open(file_name, "w") as f: 
        json.dump({
            "query": query, 
            "results": results 
        }, f, indent=4) 


    return results 

In [4]:
def build_context_from_results(results):
    """
    Build a readable text context for LLM input from structured retrieval results.
    """
    context_parts = []
    for r in results:
        text = r["text"].strip()
        meta = r["metadata"]
        doc = meta.get("document", "unknown")
        page = meta.get("page_number", "?")
        context_parts.append(f"[{doc}, page {page}] {text}\n")

    context = "\n".join(context_parts)
    return context.strip()   

def chat(userQuery , context) : 
    messages = [ 
        { "role": "system", 
         "content":(
                    "You are a helpful financial analyst assistant. "
                    "When calculating financial ratios or margins, always use the correct definitions. "
                    "Identify the proper formula from the user query and context. "
                    "Use line items like 'Revenue', 'Cost of Revenue', 'Operating Income', or 'Interest Income/Expense' appropriately. "
                    "If data is missing, explain what's missing instead of guessing. "
                    "Always show step-by-step calculations and cite which values you used from the context."
         ) },
        { "role": "user", "content": f"Context:\n{context}\n\nUser Query:\n{userQuery}"}
    ] 

    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages,
        temperature=0.2,
        max_tokens=1600
    )

    return response.choices[0].message.content 

## 5. Benchmark Runner

Run the 3 standardized queries below. For each one, produce a JSON answer followed by a prose answer with citations.

*   Gross Margin Trend (or NIM if Bank)
    *   Query: "Report the Gross Margin (or Net Interest Margin, if a bank) over the last 5 quarters, with values."
    *   Expected Output: A quarterly table of Gross Margin % (or NIM % if bank).

*   Operating Expenses (Opex) YoY for 3 Years
    *   Query: "Show Operating Expenses for the last 3 fiscal years, year-on-year comparison."
    *   Expected Output: A 3-year Opex table (absolute numbers and % change).

*   Operating Efficiency Ratio
    *   Query: "Calculate the Operating Efficiency Ratio (Opex ÷ Operating Income) for the last 3 fiscal years, showing the working."
    *   Expected Output: Table with Opex, Operating Income, and calculated ratio for 3 years.
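The arithmetic behind the three benchmark queries can be written as small functions that a calculator tool might expose. The sketch below follows the ratio definition given in the query text (Opex ÷ Operating Income) and assumes the usual gross margin definition, (revenue − cost of revenue) ÷ revenue. The function names are hypothetical and the input numbers are placeholders, not figures from any filing.

```python
def gross_margin_pct(revenue, cost_of_revenue):
    """Gross margin as a percentage of revenue."""
    return (revenue - cost_of_revenue) / revenue * 100

def yoy_change_pct(current, previous):
    """Year-on-year change as a percentage of the previous year's value."""
    return (current - previous) / previous * 100

def operating_efficiency_ratio(opex, operating_income):
    """Opex divided by operating income, as defined in the benchmark query."""
    return opex / operating_income

# Placeholder inputs only
gm = gross_margin_pct(revenue=200.0, cost_of_revenue=80.0)
yoy = yoy_change_pct(current=110.0, previous=100.0)
ratio = operating_efficiency_ratio(opex=60.0, operating_income=40.0)
```

Doing these calculations in code rather than in the LLM response keeps the numbers reproducible and makes the working easy to cite.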





In [5]:
# TODO: Implement benchmark runner

import os 
import faiss
import json
import numpy as np 
from collections import defaultdict 
from openai import OpenAI 

#! workflow
index = {}  # filled by init_indexes() with the keys "index" and "chunks"
client = OpenAI(api_key=OPENAI_API_KEY)

#! init indexes 
#! depending on session you may need to run this 
init_indexes() 
print (f"Indexes loaded: {list(index.keys())}") 
#!

output = [] 
query =  ["Report the Gross Margin (or Net Interest Margin, if a bank) over the last 3 quarters, with values.",
          "Show Operating Expenses for the last 3 fiscal years, year-on-year comparison.",
          "Calculate the Operating Efficiency Ratio (Opex ÷ Operating Income) for the last 3 fiscal years, showing the working."]



for q in query[2:3]:  # runs only the third query; iterate over `query` to run all three

    relevant_result_context : list[dict] = search_query(query=q, k=10) 
    print (relevant_result_context)
    
    context = build_context_from_results(relevant_result_context)
    response = chat( userQuery= q , context = context) 

    output.append({
       "query": q,
       "response": response,
       "context": context
   })
    
print ("-" *50) 
# Display as formatted markdown
for item in output:
    print(f"""
    ## 🔍 Query
    {item['query']}

    ## 🤖 Response
    {item['response']}

    ## 📊 Retrieved Context Summary
    - top related in index searched: {len(relevant_result_context)}
    - Total context length: {len(item['context'])} characters
    """)

chunks type : <class 'list'>, length: 958
Indexes loaded: ['index', 'chunks']
Search distances: [[0.38375872 0.38482228 0.39624548 0.40480545 0.40700623 0.40870818
  0.4104779  0.41101027 0.41117346 0.412428  ]]
Search indices: [[520 760  62 700 281 927 767 526 503 673]]
[{'rank': 1, 'score': 0.3837587237358093, 'text': 'Financial filing text section: • third-party services fees, including audit, consulting, outside legal, and other outsourced administrative\nservices.\nOther Income (Expense), Net\nOI&E, net primarily consists of interest income (expense), the effect of foreign currency exchange gains\n(losses), net gains (losses) and impairment on our marketable and non-marketable securities, performance fees,\nand income (loss) and impairment from our equity method investments.\nFor additional information, including how we account for our investments and factors that can drive fluctuations\nin the value of our investments, see Note 1 of the Notes to Consolidated Financial Statement

## 6. Instrumentation

Log timings: T_ingest, T_retrieve, T_rerank, T_reason, T_generate, T_total. Log tokens, cache hits, tools.

In [None]:
# Example instrumentation schema
import pandas as pd
logs = pd.DataFrame(columns=['Query','T_ingest','T_retrieve','T_rerank','T_reason','T_generate','T_total','Tokens','CacheHits','Tools'])
logs
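One way to fill the timing columns is a small context manager around each pipeline stage, sketched below with `time.perf_counter`. The names `stage_timer` and `timings` are hypothetical, and the `time.sleep` calls stand in for the real retrieval and generation calls.

```python
import time
from contextlib import contextmanager

@contextmanager
def stage_timer(timings, stage):
    """Add the elapsed wall-clock seconds of the enclosed block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + time.perf_counter() - start

timings = {}
with stage_timer(timings, "T_total"):
    with stage_timer(timings, "T_retrieve"):
        time.sleep(0.01)   # stand-in for the retrieval call
    with stage_timer(timings, "T_generate"):
        time.sleep(0.01)   # stand-in for the LLM call
```

The resulting `timings` dict, together with the query text, could then be appended as one row of the `logs` DataFrame.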

## 7. Optimizations

**Required Optimizations**

Each team must implement at least:
*   2 retrieval optimizations (e.g., hybrid BM25+vector, smaller embeddings, dynamic k).
*   1 caching optimization (query cache or ratio cache).
*   1 agentic optimization (plan pruning, parallel sub-queries).
*   1 system optimization (async I/O, batch embedding, memory-mapped vectors).
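For the hybrid BM25 + vector option, one common way to combine the two result lists is reciprocal rank fusion (RRF), which needs only the ranks and not the raw scores. The sketch below is an illustration under that assumption. The two ranked lists are placeholders for real BM25 and FAISS output, `reciprocal_rank_fusion` is a hypothetical helper, and `k=60` is the conventional default constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of chunk IDs; each appearance contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)

bm25_ranking = ["c3", "c1", "c7"]      # placeholder keyword-search order
vector_ranking = ["c1", "c5", "c3"]    # placeholder vector-search order
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

Chunks that appear in both lists rise to the top, which is the intended effect of a hybrid retriever.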

In [None]:
# TODO: Implement optimizations


#*  Classify Chunks (Section/Hierarchical Chunking)

 # turn each section into a faiss index
indexes = {} 
sections = defaultdict(list) 

# --- Group chunks by section ---
for c in chunks: 
    section = c["metadata"].get("page_section", "unknown") 
    sections[section].append(c) 



# build per section index 
count = 0 
for section , chunk_list in sections.items(): 
    print(f"Building index for section: {section} ({len(chunk_list)} chunks)")
    text = [ chunk["text"] for chunk in chunk_list ] 
    embeddings = embed_text_passage (text) 

    idx = faiss.IndexFlatIP(embeddings.shape[1])
    idx.add(embeddings)


    os.makedirs(f"Google/sections/{section}", exist_ok=True)

    # storing the index 
    faiss.write_index(idx, f"Google/sections/{section}/faiss_index_{section}.index")

    # storing the chunks 
    with open(f"Google/sections/{section}/chunks_{section}.json", "w") as f:
        json.dump(chunk_list, f, indent=4)

    # storing the embeddings 
    np.save(f"Google/sections/{section}/embeddings_{section}.npy", embeddings) 
    count += 1
    
print(f"✅ Built {count} FAISS sub-indexes.") 

In [7]:
#! query -> embed query -> retrieve top k chunks -> return chunks with metadata 
from openai import OpenAI 
import os


indexes = {}


#! load up the indexes and the chunks metadata in case it is a new session 
def init_indexes(): 
    """Load every per-section FAISS index and its chunk list.

    Not needed again if the indexes are already loaded in the current session.
    """

    documents_base_dir = "Google/sections" 
    sections_path = [os.path.join(documents_base_dir, f) for f in os.listdir(documents_base_dir) if os.path.isdir(os.path.join(documents_base_dir, f)) ]
    print ("Sections found:", sections_path) 

    for sec_path in sections_path: 
        section = os.path.basename(sec_path) 

        # load the FAISS index and the chunk metadata saved next to it
        idx = faiss.read_index(os.path.join(sec_path, f"faiss_index_{section}.index")) 
        with open(os.path.join(sec_path, f"chunks_{section}.json")) as f:
            chunks_for_section = json.load(f)
        indexes[section] = {
            "index": idx,
            "chunks": chunks_for_section
        }

#! ======


client = OpenAI(api_key=OPENAI_API_KEY)


def choose_sections_for_query(query, available_sections : list): 
    section_list = ", ".join(available_sections) 

    prompt = f"""
    You are a financial data retrieval router.
    Given the user's question and the available 10-Q sections, 
    select the most relevant section(s) to search for an answer.
    
    Available sections: {section_list}

    User query: "{query}"

    Return a JSON array of section names from the list above. Strictly start with '[' and end with ']'.  
    Example output: ["income_statement", "balance_sheet"]
    """

    resp = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0
    )

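    try:
        import json
        selected = json.loads(resp.choices[0].message.content)
    except Exception:
        # fall back to treating the raw reply as a single section name
        selected = [resp.choices[0].message.content.strip()]

    # guard against valid JSON that is not a list (e.g. a bare string)
    if not isinstance(selected, list):
        selected = [str(selected)]

    return selected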

def expand_query_for_retrieval(query: str):
    """
    Expands a financial query into a richer, semantically broader form
    for better retrieval coverage (e.g., including synonyms and context terms).
    """
    prompt = f"""
Expand the financial query below into a richer, semantically broader form
for improved retrieval coverage from SEC filings (10-Ks, 10-Qs, etc.).

Instructions:
- Include synonyms and related terms (e.g., "operating expenses" -> "total expenses", "operating costs", "SG&A").
- Include both annual and quarterly phrasing (e.g., "fiscal year", "quarter ended").
- Add relevant accounting context (e.g., "Consolidated Statements of Income", "Statements of Operations").
- Return only the expanded query, with no explanations or meta text.
- Keep it concise (2-3 sentences max) but semantically rich and keyword-dense.

User query: "{query}"
    """

    resp = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3
    )

    expanded_query = resp.choices[0].message.content.strip()
    return expanded_query



def search_query(query, available_sections : list ,k=10):

    expanded_query = expand_query_for_retrieval(query) 

    sections = choose_sections_for_query(expanded_query, available_sections=available_sections) 
    print (f"Sections chosen for query: {sections}") 


    query_embedding = embed_text_query(expanded_query) 
    #D, I = index.search(np.array([query_embedding]), k=k)

    D_all, I_all = [], [] 
    results = []


    for sec in sections: 
        if sec in indexes: 
            idx = indexes[sec]["index"]
            D, I = idx.search(np.array([query_embedding]), k=min(k, idx.ntotal))
            D_all.append(D)
            I_all.append(I)

            results.append(
                {
                    "section": sec, 
                    "ranking" : [
                        {
                            "rank": rank + 1 , 
                            "score": float(D[0][rank]), 
                            "text": indexes[sec]["chunks"][identified_chunk_idx]["text"], 
                            "metadata": indexes[sec]["chunks"][identified_chunk_idx]["metadata"] 
                        }   for rank, identified_chunk_idx in enumerate(I[0])           
                    ]
                }
            )
        else:
            print(f"[WARN] Section '{sec}' not found in indexes.") 


    # store the search results with the query
    # create the directory if not exist 
    file_path = generate_test_log_path_name("Google/logs/multi_section/") 
    with open(file_path, "w") as f:
        json.dump({
            "query": query,
            "expanded_query": expanded_query,
            "results": results
        }, f, indent=4)

    return results
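The search path above makes two LLM calls per query (expansion and section routing), which makes it a natural target for the caching optimization. The sketch below shows a minimal in-memory cache keyed by a normalized query string, with hit and miss counters that can feed the `CacheHits` log column. `QueryCache` and `fake_expand` are hypothetical, and `fake_expand` stands in for an expensive call such as `expand_query_for_retrieval`.

```python
class QueryCache:
    """Tiny in-memory cache keyed by a normalized query string, with hit/miss counters."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query):
        # Lowercase and collapse whitespace so trivially different spellings share an entry
        return " ".join(query.lower().split())

    def get_or_compute(self, query, compute):
        key = self._key(query)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = compute(query)
        return self._store[key]

calls = []

def fake_expand(query):
    """Stand-in for an expensive LLM call such as query expansion."""
    calls.append(query)
    return f"expanded: {query}"

cache = QueryCache()
first = cache.get_or_compute("Show Operating Expenses", fake_expand)
second = cache.get_or_compute("  show operating   expenses ", fake_expand)
```

Since the expansion call uses a non-zero temperature, caching also freezes the first expansion for a given query, which makes repeated benchmark runs more comparable.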

In [None]:
def build_context_from_results(results, top_per_section=10):
    """
    Build a readable text context for LLM input from structured retrieval results.
    """
    context_parts = []
    for section_data in results:
        section = section_data.get("section", "unknown")
        context_parts.append(f"\n=== SECTION: {section.upper()} ===\n")

        for r in section_data.get("ranking", [])[:top_per_section]:
            text = r["text"].strip()
            meta = r["metadata"]
            doc = meta.get("document", "unknown")
            page = meta.get("page_number", "?")
            context_parts.append(f"[{doc}, page {page}] {text}\n")

    context = "\n".join(context_parts)
    return context.strip()



def chat(userQuery, context):
    messages = [
        {
            "role": "system",
            "content": (
                "You are a **financial analyst assistant** specializing in interpreting SEC filings. "
                "You must **base all answers strictly and only on the provided context** below; "
                "do not use any outside knowledge, even if you think you know the answer. "
                "If information required for a calculation or definition is missing from the context, clearly state what's missing. "
                "You must use available quarterly data to infer annual or year-over-year trends if full-year data is missing. "
                "If only quarterly values exist, clearly state you're annualizing or approximating based on those quarters. "
                "Always show your step-by-step reasoning and explicitly cite which line items or values you used from the context."
            ),
        },
        {
            "role": "user",
            "content": (
                f"### CONTEXT START ###\n{context}\n### CONTEXT END ###\n\n"
                f"### USER QUERY ###\n{userQuery}\n\n"
                "Now, based strictly on the context above, provide a structured answer."
            ),
        },
    ]

    response = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages,
        temperature=0.1,     # more deterministic
        max_tokens=1800
    )

    return response.choices[0].message.content


In [12]:
# TODO: Implement benchmark runner
#! workflow


#! init indexes 
#! depending on session you may need to run this 
init_indexes() 
print (f"Indexes loaded: {list(indexes.keys())}") 
#!

output = [] 
query =  ["Report the Gross Margin (or Net Interest Margin, if a bank) over the last 3 quarters, with values.",
          "Show Operating Expenses for the last 3 fiscal years, year-on-year comparison.",
          "Calculate the Operating Efficiency Ratio (Opex ÷ Operating Income) for the last 3 fiscal years, showing the working."]


for q in query[1:2]:  # runs only the second query; iterate over `query` to run all three
    relevant_result_context : list[dict] = search_query(q, available_sections=list(indexes.keys()))
    context = build_context_from_results(relevant_result_context) # uses the default top_per_section=10

    print ("-" *50) 
    print (f"Context : {context}")
    print ("-" *50) 
    response = chat( userQuery= q , context = context)
    
    output.append({
       "query": q,
       "response": response,
       "context": context
   })
    
# Display as formatted markdown
for item in output:
    print(f"""
    ## 🔍 Query
    {item['query']}

    ## 🤖 Response
    {item['response']}

    ## 📊 Retrieved Context Summary
    - Sections searched: {len(relevant_result_context)}
    - Total context length: {len(item['context'])} characters
    """)

Sections found: ['Google/sections\\balance_sheet', 'Google/sections\\cash_flow', 'Google/sections\\controls_procedures', 'Google/sections\\cover_page', 'Google/sections\\equity', 'Google/sections\\exhibits', 'Google/sections\\financial_statements', 'Google/sections\\income_statement', 'Google/sections\\legal_proceedings', 'Google/sections\\market_risk_disclosures', 'Google/sections\\mdna', 'Google/sections\\risk_factors', 'Google/sections\\segment_info', 'Google/sections\\signatures', 'Google/sections\\summary_financial_data']
Indexes loaded: ['balance_sheet', 'cash_flow', 'controls_procedures', 'cover_page', 'equity', 'exhibits', 'financial_statements', 'income_statement', 'legal_proceedings', 'market_risk_disclosures', 'mdna', 'risk_factors', 'segment_info', 'signatures', 'summary_financial_data']
Sections chosen for query: ['income_statement', 'financial_statements']
--------------------------------------------------
Context : === SECTION: INCOME_STATEMENT ===

[goog-10-q-q1-2025.pd

## 8. Results & Plots

Show baseline vs optimized. Include latency plots (p50/p95) and accuracy tables.

In [None]:
# TODO: Generate plots with matplotlib
