# Task 3: Independent RAG - Full-Scale Evaluation

This notebook evaluates the **fully independent** RAG pipeline on the full 1.3M-complaint dataset, using a local LLM as the generator.

---

## 1. Initialize Pipeline
We instantiate `RAGPipeline`, which automatically detects and loads the full-scale index, its metadata, and the local model.

In [22]:
import sys
from pathlib import Path
import pandas as pd

# Make the project root (the parent of the notebook's working directory) importable
root_dir = Path.cwd().parent
if str(root_dir) not in sys.path:
    sys.path.append(str(root_dir))

from src.rag.pipeline import RAGPipeline

vector_store_dir = root_dir / "vector_store"

# This will load the index and the model
# Note: The first run will download the ~1GB model files automatically.
rag = RAGPipeline(vector_store_dir=str(vector_store_dir))

🔧 [Retriever] Loading model: all-MiniLM-L6-v2...
📂 [Retriever] Loading index from c:\Users\My Device\Desktop\week-7-rag-complaint-chatbot\vector_store\medium_faiss_index.index...
📄 [Retriever] Loading metadata...
✅ [Retriever] Ready!
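
Because `RAGPipeline` locates the index and metadata on its own, a missing or misplaced `vector_store` directory only surfaces once the constructor runs. The following is a minimal pre-flight sketch, assuming index files carry the `.index` suffix seen in the log above. The helper name `list_index_files` is hypothetical and is not part of `src.rag.pipeline`, which may apply different discovery rules.

```python
from pathlib import Path


def list_index_files(vector_store_dir):
    """Return the `.index` files found directly inside vector_store_dir, sorted by path.

    Hypothetical helper for illustration only: the `.index` suffix is taken
    from the retriever log, not from the RAGPipeline source.
    """
    directory = Path(vector_store_dir)
    if not directory.is_dir():
        raise FileNotFoundError(f"Vector store directory not found: {directory}")
    # Only regular files directly inside the directory are considered.
    return sorted(p for p in directory.glob("*.index") if p.is_file())
```

Calling `list_index_files(vector_store_dir)` before constructing the pipeline shows which index files are present, and an empty list is an early hint that the index has not been built or copied yet.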


## 2. Qualitative Evaluation
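We run the pipeline on eight representative questions and collect each generated answer and its retrieved chunk IDs in a table. The quality score and comments columns are left blank for manual review.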

In [23]:
questions = [
    "Why are consumers unhappy with Credit Cards?",
    "What are the main issues in Debt Collection?",
    "How do users describe identity theft problems?",
    "Common complaints about Mortgage foreclosures?",
    "What are the reported problems with Money Transfers?",
    "How do consumers feel about Personal Loans?",
    "Issues with Savings Account opening?",
    "Are there complaints about unauthorized bank transfers?"
]

eval_results = []

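for q in questions:
    print(f"🔍 Querying: {q}")
    result = rag.query(q)

    # Keep the answer and the IDs of the retrieved chunks for later inspection
    eval_results.append({
        "Question": q,
        "Generated Answer": result['answer'],
        "Retrieved Sources": [doc['chunk_id'] for doc in result['source_documents']],
        "Quality Score (1-5)": "",  # filled in manually after review
        "Comments": ""              # filled in manually after review
    })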

df_eval = pd.DataFrame(eval_results)
display(df_eval)

# Save Results
df_eval.to_csv("task_3_final_evaluation.csv", index=False)

🔍 Querying: Why are consumers unhappy with Credit Cards?
🔍 Querying: What are the main issues in Debt Collection?
🔍 Querying: How do users describe identity theft problems?
🔍 Querying: Common complaints about Mortgage foreclosures?
🔍 Querying: What are the reported problems with Money Transfers?
🔍 Querying: How do consumers feel about Personal Loans?
🔍 Querying: Issues with Savings Account opening?
🔍 Querying: Are there complaints about unauthorized bank transfers?


|   | Question | Generated Answer | Retrieved Sources | Quality Score (1-5) | Comments |
|---|---|---|---|---|---|
| 0 | Why are consumers unhappy with Credit Cards? | **Analysis based on 5 complaints:**\n\nRelevan... | [1886628_0, 1445765_0, 1660541_0, 524858_2, 27... |  |  |
| 1 | What are the main issues in Debt Collection? | **Analysis based on 5 complaints:**\n\nRelevan... | [1204839_0, 175492_10, 1736229_0, 438825_0, 15... |  |  |
| 2 | How do users describe identity theft problems? | **Analysis based on 5 complaints:**\n\nRelevan... | [68389_0, 1692325_0, 90505_0, 1799747_6, 35330... |  |  |
| 3 | Common complaints about Mortgage foreclosures? | **Analysis based on 5 complaints:**\n\nRelevan... | [1179901_66, 1179901_67, 826396_9, 532349_12, ... |  |  |
| 4 | What are the reported problems with Money Tran... | **Analysis based on 5 complaints:**\n\nRelevan... | [1374224_3, 1807419_0, 1387301_5, 154416_52, 8... |  |  |
| 5 | How do consumers feel about Personal Loans? | **Analysis based on 5 complaints:**\n\nRelevan... | [873119_4, 1100506_12, 1617835_11, 1273701_2, ... |  |  |
| 6 | Issues with Savings Account opening? | **Analysis based on 5 complaints:**\n\nRelevan... | [1139308_0, 720773_0, 157101_0, 1657447_3, 341... |  |  |
| 7 | Are there complaints about unauthorized bank t... | **Analysis based on 5 complaints:**\n\nRelevan... | [1766908_0, 1835847_2, 546_1, 582453_1, 576619_1] |  |  |
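
The retrieved chunk IDs appear to follow a `<complaint_id>_<chunk_index>` pattern. This is an inference from values such as `1179901_66` and `1179901_67`, not something the pipeline confirms here. Under that assumption, the sketch below counts how many distinct complaints back each answer, which can help spot questions whose top results come mostly from one long complaint. The function names are illustrative.

```python
from collections import Counter


def split_chunk_id(chunk_id):
    """Split 'complaintid_chunkindex' into (complaint_id, chunk_index).

    Assumes the last underscore separates the two parts; this format is
    inferred from the results table, not confirmed by the pipeline code.
    """
    complaint_id, _, chunk_index = chunk_id.rpartition("_")
    if not complaint_id or not chunk_index.isdigit():
        raise ValueError(f"Unexpected chunk id format: {chunk_id!r}")
    return complaint_id, int(chunk_index)


def complaint_counts(chunk_ids):
    """Count how many retrieved chunks come from each complaint."""
    return Counter(split_chunk_id(cid)[0] for cid in chunk_ids)


# The only source list shown in full in the results (the last question).
sources = ["1766908_0", "1835847_2", "546_1", "582453_1", "576619_1"]
counts = complaint_counts(sources)
print(len(counts), "distinct complaints among", len(sources), "chunks")
```

For the last question, all five chunks come from different complaints, whereas the mortgage foreclosure row contains at least two chunks from complaint `1179901`. Applied to the whole table, `df_eval["Retrieved Sources"].apply(lambda ids: len(complaint_counts(ids)))` would give the same count per question.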
