## 🔍 RAG Pipeline Evaluation

### 📌 Objective
This notebook evaluates the performance of the Retrieval-Augmented Generation (RAG) system by running test queries and analyzing the quality of generated answers and supporting retrieved chunks.

---

### ⚙️ Setup

- **Pipeline Module**: `RAGPipeline` (imported from `rag_pipeline.py`)
- **Execution Style**: Interactive question/answer display using IPython Markdown for clarity.

---

### ❓ Test Questions

The following representative questions, three of the ten run in the evaluation cell, were chosen to cover different financial products:

1. **Why are customers unhappy with BNPL?**
2. **What issues are reported with credit card disputes?**
3. **Why do users complain about savings accounts?**

These questions reflect common concerns across key product categories and help assess both retrieval accuracy and generation relevance.

---

### 🛠️ Evaluation Flow

For each test question:

1. The RAG pipeline:
   - Embeds the question using the same embedding model that was used during indexing.
   - Performs vector similarity search in the FAISS index to retrieve top-matching complaint chunks.
   - Combines the question and retrieved context into a prompt.
   - Sends the prompt to a language model (LLM) to generate an answer.

2. The notebook then:
   - Displays the question and generated answer.
   - Lists the top retrieved text chunks used as context.
   - Uses `IPython.display.Markdown()` to present everything in a clean, readable format.
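
The flow above can be sketched in a few lines of plain Python. This is only an illustrative toy, not the actual internals of `RAGPipeline`: the bag-of-words `embed` function stands in for the real embedding model, the brute-force `retrieve` stands in for the FAISS index lookup, and `stub_llm` is a hypothetical placeholder for the language model. All function names and sample chunks here are assumptions made for the example.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Brute-force top-k similarity search (stand-in for a FAISS index)."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine the retrieved context and the question into a single prompt."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def generate_answer(question, chunks, llm, k=2):
    """Retrieve context, build the prompt, and ask the LLM for an answer."""
    top_chunks = retrieve(question, chunks, k)
    prompt = build_prompt(question, top_chunks)
    return llm(prompt), top_chunks


def stub_llm(prompt):
    """Hypothetical placeholder for the real language model call."""
    return "stub answer"


# Made-up complaint chunks, used only for this demonstration.
sample_chunks = [
    "credit card dispute was denied without investigation",
    "savings account fees were charged for a low balance",
    "money transfer was delayed for two weeks",
]

answer, top_chunks = generate_answer(
    "Why were savings account fees charged?", sample_chunks, stub_llm, k=2
)
print(answer)
for i, chunk in enumerate(top_chunks, 1):
    print(f"Chunk {i}: {chunk}")
```

Returning the retrieved chunks alongside the answer, as the notebook's `rag.generate_answer(q)` call does, lets the evaluator check whether the answer is actually supported by its context.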

---

### 📋 Output Format (per Question)

```markdown
## ❓ Question
<Your question here>

### Answer
<Generated answer from LLM>

### Top Retrieved Chunks:
**Chunk 1:**
<First retrieved chunk>

**Chunk 2:**
<Second retrieved chunk>

...
```


In [1]:
import sys
import os

# Go two levels up from the notebook to the project root
project_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))

# Join the path to 'src'
src_path = os.path.join(project_root, "src")

# Add 'src' to Python path
if src_path not in sys.path:
    sys.path.append(src_path)

# Confirm it's added
print("src path added:", src_path)

src path added: c:\Users\ABC\Desktop\10Acadamy\week_6\Intelligent-Complaint-Analysis-for-Financial-Services\src


In [2]:
import pandas as pd
from rag_pipeline import RAGPipeline

# Load RAG pipeline
rag = RAGPipeline()
evaluation_data = []

# Define your evaluation questions
questions = [
    "Why are customers unhappy with BNPL?",
    "What issues are reported with credit card disputes?",
    "Why do users complain about savings accounts?",
    "What kind of problems happen with money transfers?",
    "Are there frequent complaints about personal loans?",
    "What makes customers close their savings accounts?",
    "Why do credit card users mention fraud?",
    "What are some recurring problems with BNPL payments?",
    "Do people mention delays in personal loan disbursements?",
    "Why are money transfer services considered unreliable?"
]

# Run the evaluation
for q in questions:
    print(f"\n🔍 Question: {q}\n{'-'*80}")
    answer, chunks = rag.generate_answer(q)
    top_sources = "\n\n".join(chunks[:2])

    print(f"\n🧠 Generated Answer:\n{answer}\n")
    print("📚 Top 2 Retrieved Sources:\n")
    for i, chunk in enumerate(chunks[:2], 1):
        print(f"--- Source {i} ---\n{chunk[:500]}\n")

    # Manual input
    while True:
        try:
            quality_score = int(input("💯 Enter quality score (1–5): "))
            if 1 <= quality_score <= 5:
                break
            else:
                print("❌ Please enter a number between 1 and 5.")
        except ValueError:
            print("❌ Invalid input. Please enter a number.")

    comments = input("📝 Enter your comments/analysis: ")

    evaluation_data.append({
        "Question": q,
        "Generated Answer": answer,
        "Retrieved Sources (Top 2)": top_sources,
        "Quality Score (1-5)": quality_score,
        "Comments/Analysis": comments
    })

# Convert to DataFrame and save
eval_df = pd.DataFrame(evaluation_data)
eval_df.to_csv("rag_evaluation_results.csv", index=False)
print("\n✅ Evaluation completed and saved to rag_evaluation_results.csv")



Device set to use cpu



🔍 Question: Why are customers unhappy with BNPL?
--------------------------------------------------------------------------------


Both `max_new_tokens` (=256) and `max_length`(=256) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)



🧠 Generated Answer:
a lack of accountability and is neglecting their power as a company. saying everything is the consumers fault. industry, but less positive with customers.

📚 Top 2 Retrieved Sources:

--- Source 1 ---
not only contradict their own policies but also suggest a pattern of behavior that may mislead other consumers. my experience reflects significant procedural flaws that have led to undue financial stress and dissatisfaction. thank you for your prompt attention to this matter. i am ready to provide

--- Source 2 ---
customers do not face similar situations in the future.


🔍 Question: What issues are reported with credit card disputes?
--------------------------------------------------------------------------------



🧠 Generated Answer:
billing disputes involving unauthorized charges

📚 Top 2 Retrieved Sources:

--- Source 1 ---
navigating the world of credit card disputes can be an overwhelming experience, especially when dealing with a bank that seems more intent on putting up hurdles than providing solutions. let me share my ongoing ordeal with bank of america, a once reputable institution now leaving me in a constant

--- Source 2 ---
, card issuers are required to investigate billing disputes involving unauthorized charges when properly notified and must refrain from collecting disputed amounts or reporting them as delinquent while the investigation is pending. continuing to request payment on a fraudulent balance potentially


🔍 Question: Why do users complain about savings accounts?
--------------------------------------------------------------------------------



🧠 Generated Answer:
they think im joking. there are way better high yield savings accounts and checking accounts available

📚 Top 2 Retrieved Sources:

--- Source 1 ---
they think im joking. there are way better high yield savings accounts and checking accounts available. i will not deal with a company that has a one size fits all fraud dispute policy which renders your bank account unusable while they investigate the fraud. no way. i am already eyeing up xxxx and

--- Source 2 ---
because i was trying to fund my saving accounts. fortunately this was only a small portion of assets, and only impacted access to savings that i don t need to access. but theirs is either the worst fraud detection algorithm in history, or they are uninterested in consumers doing what seems to be


🔍 Question: What kind of problems happen with money transfers?
--------------------------------------------------------------------------------



🧠 Generated Answer:
fraud and scamming

📚 Top 2 Retrieved Sources:

--- Source 1 ---
to someone via phone, and what should have been a simple setup of transferring funds has become a nightmare. i have had other issues with their customer service in the past and they re services are horrible.

--- Source 2 ---
problems for me as i had things to do with the money and i wasn t being given access to my money.


🔍 Question: Are there frequent complaints about personal loans?
--------------------------------------------------------------------------------



🧠 Generated Answer:
no

📚 Top 2 Retrieved Sources:

--- Source 1 ---
including but not limited to, filing a complaint with the consumer financial protection bureau cfpb and seeking compensation in regarding violations of the truth in lending act. n ni trust that you will take this matter seriously and provide a prompt response. please contact me at your earliest

--- Source 2 ---
bureaus and creditors, as well as a copy of my credit report highlighting the disputed inquiries. thank you for your prompt attention to this serious matter.


🔍 Question: What makes customers close their savings accounts?
--------------------------------------------------------------------------------



🧠 Generated Answer:
fees for a low balance

📚 Top 2 Retrieved Sources:

--- Source 1 ---
fees. when i requested the accounts be closed, they did not close them until they saw fit. before the accounts were closed, they blocked my online access and i could not even monitor all the fees they were accessing. the savings account alone is negative over 200.00 from fees for a low balance,

--- Source 2 ---
being a customer . they are closing our accounts and claiming that it is a corporate decision. there is no other reason given or the methodology for the decision. i have had 40 years of seasoned accounts, service fees, as well as status. all being closed. there is no fraud on my account that i know


🔍 Question: Why do credit card users mention fraud?
--------------------------------------------------------------------------------



🧠 Generated Answer:
tolerable

📚 Top 2 Retrieved Sources:

--- Source 1 ---
firsthand the shortcomings in fraud prevention. now, as a customer, i have experienced them directly. this experience reinforces why i could no longer remain employed there. day after day, i took calls from frustrated and distressed customers, many of whom were victims of fraud. on the credit card

--- Source 2 ---
of whom were victims of fraud. on the credit card side, where the money is essentially borrowed, such issues might be slightly more tolerable. however, on the debit side where fraudulent activity directly affects a customers checking account the level of negligence i observed on a daily basis was


🔍 Question: What are some recurring problems with BNPL payments?
--------------------------------------------------------------------------------



🧠 Generated Answer:
how synchrony bank operates and how they treat their customers. or a problem with banking. i experienced both due to the seasonal nature of my work, i use credit during the winter, then pay it back during the year. this happened in xxxx when i have a seasonally high balance. i also needed to change banks that month noticed that my payment had not gone through. lieu of inconvenience and inability to access my funds. ultimately, i was informed that although they use xxxx s services, bmo can not offer me compensation of any kind as the transaction is being managed by xxxx

📚 Top 2 Retrieved Sources:

--- Source 1 ---
causing unnecessary credit restriction and feels deceptive. it appears to be an internal system delay or policy that unfairly penalizes the consumer, especially after a payment has clearly been accepted and posted. i am requesting that this issue be investigated and resolved, and that the financial

--- Source 2 ---
payments. this isn t the first time i v

Both `max_new_tokens` (=256) and `max_length`(=256) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)



🧠 Generated Answer:
yes

📚 Top 2 Retrieved Sources:

--- Source 1 ---
under ecoa and fcra. public acknowledgment of wrongdoing and assurance of policy changes to prevent future occurrences. honoring all loan request plus penalties for the intentional delay and time of delay commerce. we are calculating 1,000,000 in damages, finanical loss and unwarranted denied loans

--- Source 2 ---
lenders can not consider religious affiliation or purpose as a basis to deny or delay service. applies to all creditors, including banks, credit unions, and alternative lenders. requires a written explanation for any loan denial within xxxx days. equal credit opportunity act ecoa 15 u.s.c. 1691


🔍 Question: Why are money transfer services considered unreliable?
--------------------------------------------------------------------------------



🧠 Generated Answer:
big amounts and accounts that are consistently being reported for fraud and scamming. to someone via phone, and what should have been a simple setup of transferring funds has become a nightmare. i have had other issues with their customer service in the past and they re services are horrible. frustrating over the course of months and that s a huge understatement. the banks have made me feel like i m a worthless waste of their time and they clearly don t care at all about getting my money returned to me. i don t understand how a federally managed system of money transfer, like wiring can managed system of money transfer, like wiring can be so open to funds being lost and then nobody having to be accountable for it. payments. this isn t the first time i ve had significant problems with how synchrony bank operates and how they treat their customers.

📚 Top 2 Retrieved Sources:

--- Source 1 ---
and reliable when it comes to transferring money especially big amounts an

In [3]:
# Save or display as Markdown table
from tabulate import tabulate
md_table = tabulate(eval_df, headers="keys", tablefmt="github", showindex=False)
print(md_table)


| Question | Generated Answer | Retrieved Sources (Top 2) | Quality Score (1-5) | Comments/Analysis |

### 3. Retrieval-Augmented Generation Evaluation  
The table below summarizes four of the ten evaluation questions; each answer was graded 1-5 on relevance and use of retrieved evidence.

| Question | Score | Key Observations |
|----------|:-----:|------------------|
| Why are customers unhappy with BNPL? | **3 / 5** | Model captured “lack of accountability” but mixed in tangential complaints; one retrieved source was weak. |
| Credit-card dispute issues? | **4 / 5** | Strong alignment: “unauthorised/billing disputes” clearly supported by both sources. |
| Savings-account complaints? | **2 / 5** | Answer drifted to rate-shopping; only one source addressed fraud-hold issues explicitly. |
| Money-transfer problems? | **3 / 5** | Answer highlighted fraud/scams (relevant) but missed delays and fee disputes present in sources. |

**Take-away:** The pipeline performs well on high-frequency topics (credit-card disputes) but can under-represent niche pain points when the top-K retrieved chunks are not diverse. Future work will add a re-ranker and a larger K to improve coverage, especially for broad queries.

