# RAG vs. LLM Comparison Notebook

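This notebook compares the Retrieval-Augmented Generation (RAG) pipeline against an LLM-only baseline (no retrieval) for the Jupiter FAQ Bot. It measures per-question latency and collects both sets of answers side by side, so that accuracy and answer quality can be reviewed by inspection.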

## Instructions
- Set your HuggingFace API key in the environment variable `HF_API_KEY`.
- Run all cells to compare answers for a set of test questions.
- The results table records the latency and the generated answer for both approaches.

In [None]:
import os
import time
import requests
import pandas as pd
from models.rag_pipeline import rag_answer
from models.llm_inference import query_huggingface_llm  


In [2]:
# Prompt for the API key if it is not set; getpass avoids echoing the secret
from getpass import getpass

if not os.environ.get("HF_API_KEY"):
    os.environ["HF_API_KEY"] = getpass("Enter your HuggingFace API key: ")

# Validate the API key with a quick test request
try:
    test_response = requests.get(
        "https://api-inference.huggingface.co/models/bert-base-uncased",
        headers={"Authorization": f"Bearer {os.environ['HF_API_KEY']}"},
        timeout=30,  # do not hang indefinitely if the API is unreachable
    )
except requests.RequestException as e:
    # Network-level failure: the key itself could not be checked
    raise RuntimeError(f"Error validating HuggingFace API key: {e}") from e

if test_response.status_code == 401:
    raise ValueError("Invalid HuggingFace API key. Please check and try again.")
elif test_response.status_code != 200:
    print(f"Warning: Unexpected response ({test_response.status_code}): {test_response.text}")
else:
    print("HuggingFace API key is valid.")

## Define Test Questions
You can edit or expand this list as needed.

In [3]:
test_questions = [
    "what is jupiter money",
    "how to do kyc",
    "how can I get a debit card",
    "how to transfer money",
    "is jupiter a bank",
    "how to get passbook",
    "how to set pin for debit card"
]

## Run RAG and LLM-only for Each Question
We will time each approach and collect the answers.
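
The timing pattern used in the cell below (record a start time, call the function, subtract) can be factored into a small helper. This is only a sketch: `timed` is a hypothetical name that does not exist in the project, and it uses `time.perf_counter()`, which is a monotonic clock intended for measuring intervals.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds).

    Hypothetical helper for illustration; not part of the project code.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

With such a helper, a measurement would read `rag_result, rag_time = timed(rag_answer, q)`.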

In [9]:
results = []
for q in test_questions:
    # RAG: retrieval plus generation (perf_counter is a monotonic clock suited to timing)
    start = time.perf_counter()
    rag_result = rag_answer(q)
    rag_time = time.perf_counter() - start

    # LLM only: the same question with no retrieved context
    llm_prompt = f"Answer the following user question in a friendly, helpful way. If you do not know, say so.\nUser question: {q}"
    start = time.perf_counter()
    llm_only_answer = query_huggingface_llm(llm_prompt)
    llm_time = time.perf_counter() - start

    results.append({
        "question": q,
        "rag_answer": rag_result["llm_response"],
        "faq_match": rag_result["retrieved_faq"]["answer"],
        "rag_time": rag_time,
        "llm_only_answer": llm_only_answer,
        "llm_time": llm_time
    })

df = pd.DataFrame(results)
df

| | question | rag_answer | faq_match | rag_time | llm_only_answer | llm_time |
|---|---|---|---|---|---|---|
| 0 | what is jupiter money | Jupiter Money is an all-in-one mobile app that... | Jupiter is the 1-app for everything money that... | 1.420948 | Answer the following user question in a friend... | 0.695624 |
| 1 | how to do kyc | Great question! To complete your KYC (Know You... | To open a free Savings or Salary Bank Account ... | 0.806431 | Answer the following user question in a friend... | 0.845341 |
| 2 | how can I get a debit card | You can order a new physical Debit Card by tap... | You can order a new physical Debit Card by tap... | 0.863916 | Answer the following user question in a friend... | 0.366537 |
| 3 | how to transfer money | There are many ways to transfer money from Jup... | There are many ways to transfer money from Jup... | 11.187971 | Answer the following user question in a friend... | 18.817686 |
| 4 | is jupiter a bank | Jupiter is itself not a bank and doesn’t hold ... | Jupiter is itself not a bank and doesn’t hold ... | 1.819305 | Answer the following user question in a friend... | 5.346659 |
| 5 | how to get passbook | You can request for a passbook by visiting you... | You can request for a passbook by visiting you... | 6.703232 | Answer the following user question in a friend... | 20.653836 |
| 6 | how to set pin for debit card | You can set/ reset your Debit Card PIN by tapp... | You can set/ reset your Debit Card PIN by tapp... | 13.705848 | Answer the following user question in a friend... | 16.326591 |
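
In the output above, every `llm_only_answer` begins with the instruction text ("Answer the following user question in a friend..."), which suggests that `query_huggingface_llm` returns the prompt followed by the completion. Its implementation is not shown here, so this is an assumption. If that is the case, the echoed prompt could be removed before comparing answers. The helper below is a hypothetical sketch, not part of the project; if the function calls a Hugging Face text-generation endpoint, the `return_full_text` parameter may also be worth checking.

```python
def strip_echoed_prompt(prompt: str, generated: str) -> str:
    """Remove a leading copy of the prompt from the generated text, if present.

    Hypothetical helper: assumes the model output may start with the exact
    prompt string. Output that does not start with the prompt is only trimmed.
    """
    if generated.startswith(prompt):
        return generated[len(prompt):].strip()
    return generated.strip()
```

It would be applied as `strip_echoed_prompt(llm_prompt, llm_only_answer)` before the answer is stored in `results`.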


In [11]:
df.to_csv("rag_vs_llm_results.csv", index=False)


In [10]:
print("Average RAG latency:", df['rag_time'].mean())
print("Average LLM-only latency:", df['llm_time'].mean())

Average RAG latency: 5.215378761291504
Average LLM-only latency: 9.007467678615026
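
Both averages are pulled up by a few slow calls (several requests took between 11 and 21 seconds), so the median may give a better picture of typical latency. The sketch below uses only the standard library, with the timings copied from the results table above. In the notebook itself, `df['rag_time'].median()` and `df['llm_time'].median()` would give the same figures.

```python
import statistics

# Latencies in seconds, copied from the results table above
rag_times = [1.420948, 0.806431, 0.863916, 11.187971, 1.819305, 6.703232, 13.705848]
llm_times = [0.695624, 0.845341, 0.366537, 18.817686, 5.346659, 20.653836, 16.326591]

rag_median = statistics.median(rag_times)
llm_median = statistics.median(llm_times)

print("Median RAG latency:", rag_median)       # 1.819305
print("Median LLM-only latency:", llm_median)  # 5.346659
```

With only seven questions and a single run per question, both the mean and the median are rough indicators rather than a benchmark.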
