In [11]:
# Set up paths to import from src/
import sys
import os
import pandas as pd

# Add the project root to the system path
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.append(project_root)

from src.rag_pipeline import (
    evaluate_rag,
    get_llm,
    get_rag_chain,
    get_representative_questions,
    get_retriever,
    query_rag,
)

## 1. Initialize Components
This step loads the embedding model, the vector store (ChromaDB), and the LLM. In this run the LLM is placed on the GPU (`cuda`).

In [2]:
print("Initializing Retrieval System (ChromaDB + Embeddings)...")
try:
    retriever = get_retriever()
    print("✅ Retriever loaded successfully.")
except Exception as e:
    print(f"❌ Error loading retriever: {e}")

Initializing Retrieval System (ChromaDB + Embeddings)...
Loading embedding model: sentence-transformers/all-MiniLM-L6-v2
Loading vector store from: /home/marshy/FOSS/repos/tenx/w7/vector_store
✅ Retriever loaded successfully.


In [3]:
print("Initializing LLM (Qwen2.5-1.5B-Instruct)...")
try:
    llm = get_llm()
    print("✅ LLM loaded on GPU successfully.")
except Exception as e:
    print(f"❌ Error loading LLM: {e}")

Initializing LLM (Qwen2.5-1.5B-Instruct)...
Loading LLM: Qwen/Qwen2.5-1.5B-Instruct on cuda


`torch_dtype` is deprecated! Use `dtype` instead!
Device set to use cuda:0


✅ LLM loaded on GPU successfully.


In [12]:
# Combine into RAG Chain
rag_chain = get_rag_chain(retriever, llm)
print("✅ RAG Chain assembled.")

✅ RAG Chain assembled.


## 2. Standard Evaluation
We run the pipeline on the predefined set of representative questions to gauge overall performance.
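The internals of `evaluate_rag` live in `src/rag_pipeline.py` and are not shown in this notebook. As a rough sketch of what such a loop might look like, assuming the chain returns a dict with `answer` and `docs` keys (as the interactive cell in section 3 suggests) and that each row uses the column names from the export step, consider the following. `evaluate_rag_sketch`, `FakeDoc`, and `FakeChain` are hypothetical stand-ins, not part of the project.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class FakeChain:
    """Hypothetical stand-in for the RAG chain; returns a canned response."""

    def invoke(self, question):
        docs = [FakeDoc("example excerpt", {"complaint_id": "0000001"})]
        return {"answer": f"stub answer to: {question}", "docs": docs}


def evaluate_rag_sketch(chain, questions, max_sources=2):
    """Run each question through the chain and collect one row per question."""
    rows = []
    for question in questions:
        response = chain.invoke(question)
        # Keep only the IDs of the first few retrieved documents.
        source_ids = [
            str(doc.metadata.get("complaint_id"))
            for doc in response["docs"][:max_sources]
        ]
        rows.append({
            "Question": question,
            "Generated Answer": response["answer"],
            "Retrieved Sources": ", ".join(source_ids),
        })
    return rows


rows = evaluate_rag_sketch(FakeChain(), ["What are common complaints?"])
print(rows[0]["Retrieved Sources"])
```

A list of dicts in this shape can be passed straight to `pd.DataFrame` for display or export.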

In [14]:
questions = get_representative_questions()
print(f"Running evaluation on {len(questions)} questions... (This may take a moment)")

results = evaluate_rag(rag_chain, questions)

# Collect the results in a DataFrame (used by the export cell in section 4)
df_results = pd.DataFrame(results)
# Optional: show full column contents when displaying in the notebook
# pd.set_option('display.max_colwidth', None)
# display(df_results)

Running evaluation on 7 questions... (This may take a moment)

Starting evaluation...
Processing Question: What are the common complaints about student loans?
Answer generated. You are a financial analyst assistant for CrediTrust. Your task is to answer questions about customer complaints. Use the following retrieved complaint excerpts to formulate your answer. If the context doesn't contain the answer, state that you don't have enough information.

Context: 
[Source: Complaint ID 5916078] there are two parts to my complaint. first, i applied for a student loan on xx/xx/ because i was told by a representative with the company that satisfactory academic progress was not a consideration for approval. i got approved for the loan and when they did a certification with my school, they denied the loan because of those sap requirements. this caused an unnecessary inquiry on my credit because i was given misinformation and i had to call another lender and get the loan ( which was another

[Sou

## 3. Interactive Testing
Use the cell below to ask custom questions and inspect the retrieved sources.

In [15]:
custom_question = "What happens if they report incorrect information to credit bureaus?"

print(f"Question: {custom_question}\n")

# Run query
response = rag_chain.invoke(custom_question)

# Display Answer
print(f"🤖 **Answer:**\n{response['answer']}\n")

# Display Sources
print("📄 **Retrieved Context:**")
for i, doc in enumerate(response['docs']):
    meta = doc.metadata
    print(f"--- Source {i+1} (ID: {meta.get('complaint_id')}, Product: {meta.get('product')}) ---")
    print(f"{doc.page_content[:300]}...\n")

Question: What happens if they report incorrect information to credit bureaus?

🤖 **Answer:**
You are a financial analyst assistant for CrediTrust. Your task is to answer questions about customer complaints. Use the following retrieved complaint excerpts to formulate your answer. If the context doesn't contain the answer, state that you don't have enough information.

Context: 
[Source: Complaint ID 3175020] the incorrect information from my credit report altogether due ti inaccurate information.

[Source: Complaint ID 2554441] the company to correct my status with the credit bureaus to paid as agreed. also, this company has reps that provide incorrect information on a daily basis. please advise. thanks

[Source: Complaint ID 2794684] a consumer reporting agency may continue to report information it has verified as accurate. a credit report includes information on where you live, how you pay your bills, and whether you've been sued, arrested, or filed for bankruptcy. i've contacted 
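
In the output above, the answer begins by repeating the prompt template and the retrieved context, which suggests the generation step returns the prompt together with the completion. One possible post-processing step, assuming the exact prompt string sent to the model is available, is to strip it from the front of the generated text. `strip_prompt_echo` is a hypothetical helper, not part of `src/rag_pipeline.py`.

```python
def strip_prompt_echo(generated_text, prompt):
    """Return only the text after the prompt if the output starts with it."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Prompt not echoed verbatim: leave the text untouched.
    return generated_text.strip()


# Illustrative strings only; not taken from the pipeline.
prompt = "Context: ...\nQuestion: What happened?\nAnswer:"
raw = prompt + " The report was disputed."
print(strip_prompt_echo(raw, prompt))  # prints "The report was disputed."
```

If the LLM is wrapped in a Hugging Face `text-generation` pipeline, passing `return_full_text=False` may achieve the same effect at the source; whether that applies depends on how `get_llm` is implemented.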

## 4. Export for Report
Generate a Markdown table for the final report. Note that `DataFrame.to_markdown` requires the `tabulate` package.
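
Full generated answers can make the exported table very wide. Here is a minimal sketch of one way to keep it readable, assuming the rows are plain dicts keyed by the column names used in the export cell: shorten each cell and escape characters that would break a Markdown table. `shorten_cell` and `rows_to_markdown` are hypothetical helpers; `DataFrame.to_markdown` remains the simpler route when truncation is not needed.

```python
def shorten_cell(value, max_len=200):
    """Flatten whitespace, truncate long text, and escape pipes for a table cell."""
    text = " ".join(str(value).split())
    if len(text) > max_len:
        text = text[: max_len - 3].rstrip() + "..."
    # Escape after truncating so an escape sequence is never cut in half.
    return text.replace("|", "\\|")


def rows_to_markdown(rows, columns, max_len=200):
    """Build a Markdown table from a list of dicts, one dict per row."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(shorten_cell(row.get(col, ""), max_len) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


columns = ["Question", "Generated Answer", "Retrieved Sources"]
# Illustrative row only; not real evaluation output.
rows = [{
    "Question": "What is a | b?",
    "Generated Answer": "word " * 100,
    "Retrieved Sources": "0000001",
}]
table = rows_to_markdown(rows, columns, max_len=40)
print(table)
```

With a DataFrame, the rows could be obtained via `df_results.to_dict("records")`.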

In [16]:
print(df_results[['Question', 'Generated Answer', 'Retrieved Sources']].to_markdown(index=False))

| Question | Generated Answer