## Agentic Workflow with RAG
Author: **Peeyush Sharma**; Feedback: **PSharma3@gmail.com**

This notebook combines an agentic AI workflow with RAG (retrieval-augmented generation). To skip directly to the final results, compare the extractions at the bottom of this notebook with the major financial figures reported in the following 10-Q submission: https://www.wellsfargo.com/assets/pdf/about/investor-relations/sec-filings/2025/first-quarter-10q.pdf

Agentic AI is usually interpreted in two different ways:
1. An autonomous set of agents that proactively monitor a condition and take action without explicit human intervention.
2. A workflow of parallel, sequential, or iterative steps where each step is governed by a prompt and an LLM agent.

This notebook demonstrates the second interpretation in conjunction with RAG. A detailed sequence of prompts retrieves, cleanses, and formats data attributes one step at a time. This sample covers the quarterly filing of a public firm, but the approach can be extended to other data extraction use cases. The first prompt, which names commonly reported financial terms, is used as the query for a "similarity" search over the document embeddings. Each subsequent prompt is then applied to the output of the previous step, producing a table-like structure that can be imported into Excel as delimited CSV.
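
The chaining pattern described above can be sketched independently of any model. This is a minimal illustration, not the notebook's actual implementation: `run_chain` and `echo_llm` are hypothetical names, and `echo_llm` is a stand-in that only tags its input so the hand-off between steps is visible.

```python
from typing import Callable, List


def run_chain(first_input: str, prompts: List[str],
              call_llm: Callable[[str], str]) -> str:
    """Feed each prompt the previous step's output and return the last output."""
    result = first_input
    for prompt in prompts:
        result = call_llm(f"{prompt}\nInput: {result}")
    return result


def echo_llm(text: str) -> str:
    """Hypothetical stand-in for a model: append the prompt name to the payload."""
    prompt, _, payload = text.partition("\nInput: ")
    return f"{payload}[{prompt}]"


print(run_chain("raw", ["extract", "format"], echo_llm))
```

Passing the model call in as a parameter keeps the chain testable without a running LLM; in the notebook, that role is played by `llm_call`.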

In [22]:
import os
from uuid import uuid4

import chromadb
import torch
from langchain.chains import RetrievalQA
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings
from typing import List, Dict, Callable
from langchain_ollama import OllamaLLM
from util import llm_call, extract_xml

In [23]:
DOC_DIR = "../documents"
DOC_TYPE = "10-Q"
EMBEDDING_MODEL_NAME = "all-MiniLM-L6-v2"
LLM_MODEL_NAME = "llama3.2"
CHUNK_SIZE = 4096
CHUNK_OVERLAP = 512
PURGE_VECTOR_DB = True # Purge on every run for the demo environment

In [24]:
device = torch.device(
    "cuda:0" if torch.cuda.is_available()
    else "mps" if torch.backends.mps.is_available()
    else "cpu"
)
# Run the embedding model on the selected device
embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL_NAME,
                                   model_kwargs={"device": str(device)})
collection_name = "rag_collection"
chroma_client = chromadb.PersistentClient()

if PURGE_VECTOR_DB: # Keep option for hard reset
    try:
        chroma_client.delete_collection(name=collection_name)
    except Exception:
        # Nothing to purge if the collection does not exist yet
        # (older chromadb releases raise ValueError; the type may differ by version)
        pass

collection = chroma_client.get_or_create_collection(collection_name)

vector_store = Chroma(
    client=chroma_client,
    collection_name=collection_name,
    embedding_function=embeddings
)

In [25]:
def insert_document_to_vector_db(doc_path):
    """Load a PDF, split it into overlapping chunks, and store them in the vector DB."""
    loader = PyPDFLoader(doc_path)
    documents = loader.load()
    # Split on newlines, merging pieces into chunks of up to CHUNK_SIZE characters
    text_splitter = CharacterTextSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP, separator="\n")
    docs = text_splitter.split_documents(documents=documents)
    # One unique ID per chunk
    uuids = [str(uuid4()) for _ in docs]
    ids = vector_store.add_documents(documents=docs, ids=uuids)
    return ids
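
To make the effect of `CHUNK_SIZE` and `CHUNK_OVERLAP` concrete, here is a simplified sketch of overlapping chunking. It is an assumption-laden illustration, not how `CharacterTextSplitter` is implemented: the real splitter breaks on the separator and merges pieces, so its boundaries fall on newlines, whereas the hypothetical `sliding_chunks` below cuts fixed-size character windows.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows where neighbours share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reaches the end of the text
        start += step
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

With the settings used in this notebook (4096 and 512), consecutive windows in this simplified model would start 3584 characters apart, so a figure that straddles a boundary still appears whole in at least one chunk.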

In [30]:
dir_path = os.path.join(DOC_DIR, DOC_TYPE)
vector_indices = []
# Restrict ingestion to the files listed here;
# other files in the folder are ignored
file_list = ['wfc-first-quarter-10q.pdf',]
             #'wfc-first-quarter-10q-loan-section.pdf']
if os.path.exists(dir_path) and os.path.isdir(dir_path):
    temp_indices = [insert_document_to_vector_db(os.path.join(dir_path, doc))
                    for doc in os.listdir(dir_path)
                    if doc in file_list]
    vector_indices.extend(temp_indices)
# vector_indices

In [27]:
def workflow(input_text: str, prompts: List[str]) -> str:
    """Chain multiple LLM calls sequentially, passing results between steps."""
    result = input_text
    print(result)
    # Only used by the commented-out LangChain chain variant below
    llm = OllamaLLM(model=LLM_MODEL_NAME)
    for i, prompt in enumerate(prompts, 1):
        print(f"\nStep {i}:")
        if i == 1:
            # First step only: fetch the 20 chunks most similar to the prompt
            retriever = vector_store.as_retriever(search_type="similarity",
                                                  search_kwargs={'k': 20})
            docs = retriever.invoke(prompt)
            # Send the chunk text to the LLM rather than the Document repr
            result = "\n\n".join(doc.page_content for doc in docs)
        # prompt = ChatPromptTemplate.from_template(template=prompt)
        # chain = prompt | llm
        # result = chain.invoke({"documentation": [result], "question": prompt})
        result = llm_call(f"{prompt}\nInput: {result}")
        print(result)

    return result
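
The introduction notes that the final table can be imported into Excel as delimited CSV. The sketch below shows one way that conversion might look, assuming the last step returns a pipe-delimited markdown table; `markdown_table_to_csv` is a hypothetical helper, not part of this notebook or of any library, and the sample row reuses values from the prompt examples.

```python
import csv
import io


def markdown_table_to_csv(table: str) -> str:
    """Convert a pipe-delimited markdown table into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in table.splitlines():
        stripped = line.strip()
        if "|" not in stripped:
            continue  # skip any prose the model adds around the table
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        # Drop the alignment row, e.g. |--:|--:|--:|
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        writer.writerow(cells)
    return buffer.getvalue()


sample = (
    "Timestamp | Metric | Value |\n"
    "|--:|--:|--:|\n"
    "March 31st, 2025 | Cash and due from banks | 22,066 |"
)
print(markdown_table_to_csv(sample))
```

Using the `csv` module rather than a plain string join matters here because dates and amounts contain commas; the writer quotes those fields so that Excel keeps each one in a single cell.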

In [28]:
# Example 1: Chain workflow for structured data extraction and formatting
# Each step progressively transforms raw text into a formatted table
data_processing_steps = [
    """You are a financial analyst. You are given excerpts from a single 10-Q statement
     for a public company as the input below.

    Identify all major financial statements, such as the
    cash flow statement and the balance sheet, and
    return them to the user. Keep the originally reported format
    in your output as much as possible.
    Example format:
    (in millions, except share data)
    March 31, 2025
    December 31, 2024
    Assets
    Cash and due from banks
    $
    22,066
    $
    23,372
    Deposits with banks
    403,837
    445,945
    """,

    """Extract only the numerical values and their associated metrics from the text.
    Format each as 'value: metric: quarter' on a new line.
    Keep one number per line.
    Example format:
    22,066: Cash and due from banks: March 31st, 2025
    403,837: Deposits with banks: March 31st, 2025""",

    """Retrieve applicable unit (millions, thousands) for all numerical values
     wherever applicable
    Example format:
    22,066 million: Cash and due from banks: March 31st, 2025
    403,837 thousands: Deposits with banks: March 31st, 2025""",

    """Make units consistent so that all values share a single unit, preferably thousands.
    Rescale the numerical value whenever its unit changes.
    Example format:
    22,066,000 thousands: Cash and due from banks: March 31st, 2025
    403,837 thousands: Deposits with banks: March 31st, 2025""",

    """Convert all numerical values to percentages where possible.
    Leave currency amounts as they are; convert other values (e.g., 92 points -> 92%).
    Keep one number per line.
    Example format:
    45%: revenue growth
    33%: credit loss allocation""",

    """Sort all lines in descending order by numerical value.
    Keep the format 'value: metric: quarter' on each line.
    Example format:
    403,837: Deposits with banks: March 31st, 2025
    22,066: Cash and due from banks: March 31st, 2025""",

    """Format the sorted data as a markdown table with columns:
    Example format:
    | Timestamp | Metric | Value |
    |--:|--:|--:|
    | March 31st, 2025 | Customer Satisfaction | 92% |"""
]

formatted_result = workflow('', data_processing_steps)




Step 1:
Based on the provided 10-Q documentation, I can identify the following major financial reports:

1. Consolidated Statement of Income (Unaudited)
Quarter ended March 31, 2025 vs 2024
Key figures:
- Total revenue: $20,149M vs $20,863M
- Net income: $4,894M vs $4,619M
- Diluted earnings per share: $1.39 vs $1.20

2. Consolidated Statement of Comprehensive Income (Unaudited)
Quarter ended March 31, 2025 vs 2024
Key figures:
- Net income before noncontrolling interests: $4,804M vs $4,623M
- Total comprehensive income: $7,072M vs $3,653M

3. Consolidated Statement of Changes in Equity (Unaudited)
Quarter ended March 31, 2025 vs 2024
Key sections:
- Preferred stock
- Common stock
- Additional paid-in capital
- Retained earnings
- Accumulated other comprehensive income (loss)
- Treasury stock
- Noncontrolling interests

4. Consolidated Statement of Cash Flows (Unaudited)
Quarter ended March 31, 2025 vs 2024
Major sections:
- Cash flows from operating activities
- Cash flows from inve