# Task 3: Building the RAG Core Logic and Evaluation

This task focuses on developing the core Retrieval-Augmented Generation (RAG) pipeline, combining vector retrieval and language model generation to answer user questions based on customer complaint narratives.

---

## Objectives:

- Implement a retriever to fetch the most relevant complaint chunks from the vector database.
- Design a prompt template to guide the language model for grounded, relevant answers.
- Build the generator that produces answers using the retrieved context.
- Combine retrieval and generation into a clean pipeline.
- Perform qualitative evaluation by running representative questions and analyzing outputs.

---

## Key Steps:

1. **Retriever**: Embed user query and retrieve top-k relevant complaint text chunks with metadata.
2. **Prompt Engineering**: Use a prompt template instructing the LLM to answer only from the retrieved context.
3. **Generator**: Send the prompt + context to a language model (LLM) and get the generated answer.
4. **RAG Pipeline**: Wrap these into a single function returning the answer and supporting evidence.
5. **Evaluation**: Test the system with sample questions and record the answers, retrieved sources, and quality scores.
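
The five steps above can be sketched end to end. This is a minimal illustration, not the project's `retriever.py` or `rag_pipeline.py`. The function names `embed`, `cosine`, `retrieve`, and `answer`, the prompt wording, and the `sources` key are all assumptions. Word counts stand in for a real embedding model, and the generator is passed in as a plain callable so that no particular LLM API is implied.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=5):
    """Step 1: rank chunks by similarity to the query and keep the top k."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(query_vec, embed(chunk["content"])),
        reverse=True,
    )
    return ranked[:k]


# Step 2: a prompt that restricts the model to the retrieved context.
# The wording is hypothetical, not the project's actual template.
PROMPT_TEMPLATE = (
    "Answer the question using only the complaint excerpts below. "
    "If they do not contain the answer, say so.\n\n"
    "Excerpts:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def answer(question, chunks, generate, k=5):
    """Steps 3 and 4: build the prompt, call the generator, return evidence."""
    retrieved = retrieve(question, chunks, k=k)
    context = "\n---\n".join(chunk["content"] for chunk in retrieved)
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    return {"answer": generate(prompt), "sources": retrieved}


# Toy data in the same shape as the retrieved chunks shown in this notebook.
toy_chunks = [
    {"content": "late fees were charged on my credit card bill",
     "metadata": {"product": "Credit card"}},
    {"content": "my savings account was frozen without notice",
     "metadata": {"product": "Savings account"}},
]


def stub_generate(prompt):
    """Stand-in for the LLM call; a real generator would send the prompt to a model."""
    return "stub answer"


result = answer("Why was my savings account frozen?", toy_chunks, stub_generate, k=1)
print(result["answer"], "|", result["sources"][0]["metadata"]["product"])
```

Passing the generator in as an argument keeps retrieval and prompt construction testable without loading a model, which fits the modular file layout listed under Deliverables.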

---

## Deliverables:

- Modular Python source files:
    - `retriever.py`
    - `prompt_template.py`
    - `generator.py`
    - `rag_pipeline.py`
- A Jupyter notebook `03_rag_pipeline_evaluation.ipynb` to demonstrate and qualitatively evaluate the system.


In [1]:
import sys
import os

# Adjust the path to point to your project root
sys.path.append(os.path.abspath(".."))  # if notebook is in notebooks/, one level below root


In [2]:
from src.rag.rag_pipeline import RAGPipeline

rag = RAGPipeline(k=5)

output = rag.answer("What complaints do customers have about BNPL?")
print(output["answer"])


Retrieved chunks: [{'content': 'and file formal complaints with the relevant consumer protection authorities xxxx xxxx', 'metadata': {'product': 'Credit card', 'complaint_id': 13831290}, 'chunk_id': np.int64(1743)}, {'content': 'helps us understand how companies are addressing concerns raised by consumers in their complaints we will also share your feedback with the company we have also shared your complaint with the federal trade commission which will add your complaint to its database for state and federal law enforcement agencies we appreciate your participation in the complaint process and your feedback on the companys response both are important to us and consumers who may have similar issues and concerns', 'metadata': {'product': 'Savings account', 'complaint_id': 13970077}, 'chunk_id': np.int64(1873)}, {'content': 'their practices that seem to disregard consumer rights and timely communication thank you for your attention to this matter i look forward to your response and any as

In [3]:
# Sample questions

sample_questions = [
    "Why are users unhappy with the Buy Now, Pay Later product?",
    "Are there complaints about credit card billing?",
    "What issues do customers report with savings accounts?",
    "What are the top complaints about money transfers?",
    "Do people have problems with loan repayment?"
]


In [5]:
# Run the RAG pipeline on the sample questions
evaluation_results = []

for question in sample_questions:
    output = rag.answer(question)

    # For display only: re-run retrieval to get the top 2 source snippets and
    # their product names. The pipeline itself was created with k=5, so the
    # answer may have drawn on more chunks than the two recorded here.
    retrieved = rag.retriever.retrieve(question, k=2)

    sources_with_meta = [
        f"[{item['metadata']['product']}] {item['content'][:150]}..."
        for item in retrieved
    ]

    evaluation_results.append({
        "Question": question,
        "Answer": output["answer"],
        "Sources": sources_with_meta,
        "Score (1-5)": "",  # fill in manually after review
        "Comments": ""
    })


Retrieved chunks: [{'content': 'card companies and banks have enormous power and consumers are left with situations like this they should be required to let you know at the point of sale what the problem is', 'metadata': {'product': 'Credit card', 'complaint_id': 13891123}, 'chunk_id': np.int64(260)}, {'content': 'to them the payment was made i explained it was their technology being used mobile app to make payment and asked them to look at that and they were unresponsive basically blowing me off', 'metadata': {'product': 'Credit card', 'complaint_id': 13895842}, 'chunk_id': np.int64(1038)}, {'content': 'at all this is frustrating because i can not see the payments i make i can only make payments as a guest payer and can easily fall behind in payments i believe this is intentional also i can not believe a company as large as this does not have a plan in place to rectify the problem if not they should be held liable for not keeping up their end of the company by upholding their policy o

In [7]:
import pandas as pd

eval_df = pd.DataFrame(evaluation_results)
print(eval_df.head())

# DataFrame.to_markdown relies on the optional `tabulate` package
eval_df.to_markdown("evaluation_table.md", index=False)


                                            Question  \
0  Why are users unhappy with the Buy Now, Pay La...   
1    Are there complaints about credit card billing?   
2  What issues do customers report with savings a...   
3  What are the top complaints about money transf...   
4       Do people have problems with loan repayment?   

                                              Answer  \
0                              They are unresponsive   
1                                                yes   
2  account access and customer service to whom it...   
3                                              scams   
4                                                 no   

                                             Sources Score (1-5) Comments  
0  [[Credit card] card companies and banks have e...                       
1  [[Credit card] subject deceptive billing pract...                       
2  [[Savings account] formal complaint regarding ...                       
3  [[Money transfers] 

## Create and Save Evaluation Table

We organize the results into a structured table using pandas. The table is exported to `evaluation_table.md` for inclusion in the final report and rendered below as Markdown. It helps us analyze the strengths and limitations of the RAG pipeline.

In [8]:
import pandas as pd
from IPython.display import Markdown, display

eval_df = pd.DataFrame(evaluation_results)

# Display nicely formatted markdown table in the notebook
display(Markdown(eval_df.to_markdown(index=False)))


| Question                                                   | Answer                                                     | Sources                                                                                                                                                                                                                                                                                                                                                        | Score (1-5)   | Comments   |
|:-----------------------------------------------------------|:-----------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------|:-----------|
| Why are users unhappy with the Buy Now, Pay Later product? | They are unresponsive                                      | ['[Credit card] card companies and banks have enormous power and consumers are left with situations like this they should be required to let you know at the point of ...', '[Credit card] to them the payment was made i explained it was their technology being used mobile app to make payment and asked them to look at that and they were un...']         |               |            |
| Are there complaints about credit card billing?            | yes                                                        | ['[Credit card] subject deceptive billing practices improper late fees and potential discriminatory conduct by citibankmacys credit services to whom it may concern i ...', '[Credit card] and civil rights laws particularly the equal credit opportunity act ecoa and doddfranks prohibition on unfair deceptive or abusive acts or practices u...']         |               |            |
| What issues do customers report with savings accounts?     | account access and customer service to whom it may concern | ['[Savings account] formal complaint regarding ongoing issues with account access and customer service to whom it may concern i am writing to formally express my frustrat...', '[Savings account] bureau in requiring truist bank to immediately release or provide access to my account provide a written explanation of the account freeze improve the...'] |               |            |
| What are the top complaints about money transfers?         | scams                                                      | ['[Money transfers] transfers it is near impossible to get through to someone via phone and what should have been a simple setup of transferring funds has become a nightm...', '[Money transfers] often in high or repeating amounts were used to fund cryptocurrency wallets later linked to an international fraud scheme involving platforms such as ...'] |               |            |
| Do people have problems with loan repayment?               | no                                                         | ['[Personal loan] i borrowed 100000 from a lender named lendumo in early 2025 around early xxxx i was told i would be repaying monthly and 36000 has been deducted from ...', '[Credit card] is holding me delinquent after rejecting my payments...']                                                                                                         |               |            |
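
The `Sources` column can also be checked mechanically before scoring by hand. In the first row above, for example, both snippets retrieved for the Buy Now, Pay Later question carry the `Credit card` tag. A small sketch, assuming each snippet keeps the `[product] text...` shape built in the evaluation loop (the helper name `source_products` is hypothetical, and the sample snippets are shortened copies of the first row):

```python
import re


def source_products(sources):
    """Return the product tag at the start of each '[product] text...' snippet."""
    tags = []
    for snippet in sources:
        match = re.match(r"\[([^\]]+)\]", snippet)
        if match:
            tags.append(match.group(1))
    return tags


# Shortened copies of the two snippets in the first table row.
first_row_sources = [
    "[Credit card] card companies and banks have enormous power...",
    "[Credit card] to them the payment was made...",
]

print(source_products(first_row_sources))  # ['Credit card', 'Credit card']
```

Applied to the whole table, for instance with `eval_df["Sources"].apply(source_products)`, this gives a quick view of which product categories each answer was grounded in, which can inform the manual score and comments.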