# Retriever Implementation

In [2]:
from sentence_transformers import SentenceTransformer
from chromadb import PersistentClient
import numpy as np

In [3]:
model = SentenceTransformer('all-MiniLM-L6-v2')
client = PersistentClient(path="../vector_store/")  # Adjust path
collection = client.get_collection(name='complaints')

## Query Embedding

In [4]:
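def embed_query(question: str) -> list:
    """Embed a single question and return the vector as a plain list of floats."""
    return model.encode([question])[0].tolist()

def retrieve_top_k(question: str, k: int = 5):
    """Return the k nearest complaint chunks and their distances for a question."""
    query_embedding = embed_query(question)
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=k
    )
    # Chroma returns one result list per query; only one query was sent.
    documents = results['documents'][0]
    distances = results['distances'][0]
    return documents, distances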
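
`retrieve_top_k` returns the raw distances alongside the documents, but nothing downstream uses them. One possible use is to drop weak matches before they reach the prompt. The sketch below is a hypothetical helper: `filter_by_distance` and the `max_distance` cutoff are not part of this notebook. It assumes that a smaller distance means a closer match, and the cutoff would need tuning against the actual collection.

```python
from typing import List, Tuple

def filter_by_distance(
    documents: List[str],
    distances: List[float],
    max_distance: float,
) -> Tuple[List[str], List[float]]:
    """Keep only the (document, distance) pairs at or below max_distance."""
    kept = [(doc, dist) for doc, dist in zip(documents, distances) if dist <= max_distance]
    docs = [doc for doc, _ in kept]
    dists = [dist for _, dist in kept]
    return docs, dists

# Illustrative values only, not real query results.
docs, dists = filter_by_distance(
    ["chunk a", "chunk b", "chunk c"], [0.42, 0.91, 1.37], max_distance=1.0
)
print(docs)  # ['chunk a', 'chunk b']
```

If every match is filtered out, the caller gets two empty lists and would need to decide whether to answer without context or report that nothing relevant was found.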

In [6]:
# --- Prompt Engineering ---
PROMPT_TEMPLATE = """
You are a financial analyst assistant for CrediTrust. Your task is to answer questions about customer complaints.
Use the following retrieved complaint excerpts to formulate your answer. If the context doesn't contain the answer, state that you don't have enough information.

Context:
{context}

Question: {question}
Answer:
"""

# --- Generator Implementation ---
from transformers import pipeline

# Load the model once rather than on every call.
# A small open-access LLM is used for demonstration; replace with your preferred model.
generator = pipeline("text2text-generation", model="google/flan-t5-base", device=-1)  # device=-1 runs on CPU

def rag_answer(question, k=5, max_context_length=1500):
    """Retrieve k chunks, build the prompt, and return (answer, retrieved docs)."""
    docs, _ = retrieve_top_k(question, k)
    # Concatenate the retrieved chunks and truncate to a character budget
    context = "\n\n".join(docs)
    if len(context) > max_context_length:
        context = context[:max_context_length]
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    # do_sample=True makes the output vary between runs
    response = generator(prompt, max_new_tokens=256, do_sample=True)[0]['generated_text']
    # Keep only the text after "Answer:" in case the model echoes the prompt
    answer = response.split("Answer:")[-1].strip()
    return answer, docs

# --- Example Usage ---
example_question = "What are common issues with credit card payments?"
answer, sources = rag_answer(example_question)
print("Generated Answer:", answer)
print("Retrieved Sources:", sources[:2])  # Show the first two sources

Device set to use cpu


Generated Answer: being charged more interest and having issues even making payments on the card because they keep canceling my cards many times cards on a daily basis with no issue as i believe most americans do as these are extremely common for every day use
Retrieved Sources: ['place and i have never had any issues like this with any credit cards ever before', 'when there is nothing wrong with the credit account']
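
`rag_answer` cuts the joined context at a fixed character count, which can slice the last chunk mid-sentence. An alternative is to add whole chunks in rank order until the budget is reached. The sketch below shows that idea under stated assumptions: `build_context` is a hypothetical helper, not something the notebook defines, and a character budget is only a rough proxy for the model's token limit.

```python
from typing import List

def build_context(docs: List[str], max_chars: int = 1500, sep: str = "\n\n") -> str:
    """Join docs in rank order, stopping before the character budget is exceeded."""
    parts: List[str] = []
    used = 0
    for doc in docs:
        # The separator only costs characters once there is a previous part.
        extra = len(doc) + (len(sep) if parts else 0)
        if used + extra > max_chars:
            break
        parts.append(doc)
        used += extra
    # If even the top-ranked document is too long, fall back to a hard cut.
    if not parts and docs:
        return docs[0][:max_chars]
    return sep.join(parts)

context = build_context(["a" * 10, "b" * 10, "c" * 10], max_chars=25)
print(len(context))  # 22
```

The trade-off is that a long chunk near the top of the ranking can crowd out several shorter ones behind it, so the effective number of sources may be smaller than `k`.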


## RAG Pipeline Evaluation Table

Below is the evaluation table for the RAG system. For each question it has columns for the generated answer, the top retrieved sources, a quality score (1-5), and comments; the answers and sources are produced by the evaluation run that follows.

| Question | Generated Answer | Retrieved Sources (1-2) | Quality Score (1-5) | Comments/Analysis |
|---|---|---|---|---|
| What are common issues with credit card payments? |  |  |  |  |
| How do consumers describe problems with money transfers? |  |  |  |  |
| Are there frequent complaints about personal loans? |  |  |  |  |
| What are typical reasons for savings account disputes? |  |  |  |  |
| How do customers feel about Buy Now, Pay Later services? |  |  |  |  |
| What is a common resolution for debt collection complaints? |  |  |  |  |
| Are there trends in complaints about mortgage services? |  |  |  |  |
| How quickly do companies respond to complaints? |  |  |  |  |
| What are the most cited issues with prepaid cards? |  |  |  |  |
| Do consumers report satisfaction after complaint resolution? |  |  |  |  |
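
Rows in this five-column layout can be generated from pipeline results instead of being pasted by hand. The following is a minimal sketch, not part of the original notebook: `to_markdown_row` and `max_source_chars` are hypothetical names, and using `<br>` to separate the two sources assumes a Markdown renderer that accepts inline HTML. The score and comments default to empty because they are human judgements.

```python
from typing import List

def to_markdown_row(question: str, answer: str, sources: List[str],
                    score: str = "", comments: str = "",
                    max_source_chars: int = 120) -> str:
    """Format one evaluation result as a row of the five-column table."""
    def clean(text: str) -> str:
        # Pipes and newlines would break the table layout.
        return text.replace("|", "\\|").replace("\n", " ").strip()

    # Show at most two sources, each shortened to max_source_chars.
    shown = [clean(s[:max_source_chars]) for s in sources[:2]]
    cells = [clean(question), clean(answer), "<br>".join(shown),
             clean(score), clean(comments)]
    return "| " + " | ".join(cells) + " |"

row = to_markdown_row(
    "Are there frequent complaints about personal loans?",
    "no",
    ["first | source", "second source"],  # placeholder strings, not real retrievals
)
print(row)
```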

In [7]:
# Run RAG pipeline for evaluation questions
questions = [
    "What are common issues with credit card payments?",
    "How do consumers describe problems with money transfers?",
    "Are there frequent complaints about personal loans?",
    "What are typical reasons for savings account disputes?",
    "How do customers feel about Buy Now, Pay Later services?",
    "What is a common resolution for debt collection complaints?",
    "Are there trends in complaints about mortgage services?",
    "How quickly do companies respond to complaints?",
    "What are the most cited issues with prepaid cards?",
    "Do consumers report satisfaction after complaint resolution?"
]

for i, q in enumerate(questions, 1):
    print(f"\nQuestion {i}: {q}")
    answer, sources = rag_answer(q)
    print("Generated Answer:", answer)
    print("Top 2 Retrieved Sources:")
    for s in sources[:2]:
        print("-", s[:300], "...\n")  # Print first 300 chars for brevity


Question 1: What are common issues with credit card payments?


Generated Answer: being charged more interest and having issues even making payments on the card because they keep canceling my cards many times cards on a daily basis with no issue as i believe most americans do as these are extremely common for every day use
Top 2 Retrieved Sources:
- place and i have never had any issues like this with any credit cards ever before ...

- when there is nothing wrong with the credit account ...


Question 2: How do consumers describe problems with money transfers?


Generated Answer: they are now showing it as a cash advance item which is subject to an additional 1000 fee and the much inflated interest rate for cash advances first no notice was sent that this was happening second it feels predatory on a population of consumers who tend to be the s accounts and holding money and then just expect consumers to have to call and work things out or whatever they may expect what it is it would be wrong a pattern of behavior that may mislead other consumers
Top 2 Retrieved Sources:
- causes a financial hardship for the customer and the recipient which further delays the necessity of the transfer of funds ...

- that they could not offer any protections for electronic money transfer this is unacceptable consumers need to have some sort of recourse in the event of fraud ...


Question 3: Are there frequent complaints about personal loans?


Generated Answer: no
Top 2 Retrieved Sources:
- from a business concern which you will see in the evidence then when i believe it can get worse i see on my credit monitor app which is xxxx xxxx one of them that they made 2 hard inquiries in my name its states that it was loans like i filled out 2 applications for loans or credit card but because ...

- of the xxxx of complaints and negative reviews across xxxx xxxx xxxx xxxx xxxx etc i ask for your help to ensure xxxx does not continue to favor xxxx when disputes are filed i hope with your assistance that xxxx will do the right thing since i have paid 4200000 on these loans to date this same ...


Question 4: What are typical reasons for savings account disputes?


Generated Answer: computer error
Top 2 Retrieved Sources:
- checking and savings accounts i was unaware these accounts were delinquent and had i know this was an issue i wouldve included it in my xxxx xxxx bankruptcy plan i was with citizens bank for nearly xxxx years into adulthood and the issue of the bank accounts was always a computer error on their ...

- initially i was going to file a complaint with the seller but realized this is for banking institutions only i have a spending account with xxxx whose banking is done by xxxx or the xxxx xxxx xxxx however they continuously mishandle disputes i have now filed atleast xxxx disputes for the same matter ...


Question 5: How do customers feel about Buy Now, Pay Later services?


Generated Answer: they are the worst in terms of being customerfriendly
Top 2 Retrieved Sources:
- would make it right for their customer that always pay more than due and early ...

- customers should make their own decisions as to whether a service is to their benefit or not by the way they are the worst in terms of being customerfriendly they sound exactly the opposite b of course it s possible to add money to wallet let s do this again then they take you thru a series of ...


Question 6: What is a common resolution for debt collection complaints?


Generated Answer: disputing this debt
Top 2 Retrieved Sources:
- debt department i am unsatisfied on how they have handled this issue ...

- surrounding the legal collection of consumer debt ...


Question 7: Are there trends in complaints about mortgage services?


Generated Answer: similar experiences shared by other dissatisfied customers
Top 2 Retrieved Sources:
- impact my ability to secure a mortgage and move forward with my home purchase thank you for your prompt attention to this matter please let me know if you require additional information to process this complaint ...

- that i can get a mortgage without having to pay a higher interest rate key bank customer service stated that this has been happening to other customers but it seems like they have done nothing to fix the problem of billing and causing hardship to customers ...


Question 8: How quickly do companies respond to complaints?


Generated Answer: 510 business days
Top 2 Retrieved Sources:
- of their reporting and their handling of consumer complaints ...

- complaints management office i have tried numerous times to contact this person via phone and email and have never received any response all that i received are emails stating that they need more time to research my complaint i have received these emails on xxxxxxxx stating will followup by ...


Question 9: What are the most cited issues with prepaid cards?


Generated Answer: unreliable experience and are not transparent in their practices
Top 2 Retrieved Sources:
- unreliable experience and are not transparent in their practices additionally my customer experience has been horrible i have not only paid thousands in interests and membership fees by having these cards i have consistently paid my off my charges in a timely manner only to be treated as if i have ...

- i was denied 3 different cards one of them was prepaid ...


Question 10: Do consumers report satisfaction after complaint resolution?


Generated Answer: no
Top 2 Retrieved Sources:
- of their reporting and their handling of consumer complaints ...

- information contained in the consumer report you receive is inaccurate or incomplete you have the right to dispute the matter directly with the reporting agency xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx ...
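
Several of the answers above, such as those for Questions 5 and 9, are word-for-word spans of a retrieved chunk. A quick check for this may help when filling in the Comments column. The sketch below is a hypothetical helper (`is_copied_from_sources` and `min_words` are assumptions, not part of the notebook) that tests whether an answer appears verbatim in any of the sources it is given. It only sees the sources passed in, so an answer copied from a chunk that was retrieved but not printed would be reported as not copied.

```python
from typing import List

def is_copied_from_sources(answer: str, sources: List[str], min_words: int = 3) -> bool:
    """Return True if the answer appears word-for-word inside any source."""
    answer_words = answer.lower().split()
    if len(answer_words) < min_words:
        return False  # too short to say anything meaningful about copying
    # Pad with spaces so matches start and end on word boundaries.
    needle = " " + " ".join(answer_words) + " "
    for source in sources:
        haystack = " " + " ".join(source.lower().split()) + " "
        if needle in haystack:
            return True
    return False

# Answer and (shortened) first source for Question 9 above.
q9 = is_copied_from_sources(
    "unreliable experience and are not transparent in their practices",
    ["unreliable experience and are not transparent in their practices "
     "additionally my customer experience has been horrible"],
)
# Answer and the two printed sources for Question 6 above.
q6 = is_copied_from_sources(
    "disputing this debt",
    ["debt department i am unsatisfied on how they have handled this issue",
     "surrounding the legal collection of consumer debt"],
)
print(q9, q6)  # True False
```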

