# Debug Drill: The Hallucinating RAG

**Scenario:**
Your RAG-powered support bot is giving confident but wrong answers.

"It told me I have 60 days for a refund, but your policy says 30 days!" a customer complains.

Even though the knowledge base contains the correct information, the bot is hallucinating details.

**Your Task:**
1. Identify where the hallucination is occurring
2. Implement faithfulness checking
3. Fix the prompt to reduce hallucination
4. Write a 3-bullet postmortem

---

In [None]:
import re
from typing import Dict, List, Tuple

# Note: In a real system, you'd use an actual LLM API

In [None]:
# Knowledge base with CORRECT information
knowledge_base = {
    "refund_policy": {
        "title": "Refund Policy",
        "content": "We offer full refunds within 30 days of purchase. After 30 days, only store credit is available. Items must be unused and in original packaging."
    },
    "shipping": {
        "title": "Shipping Information", 
        "content": "Standard shipping takes 5-7 business days. Express shipping takes 2-3 business days. Free shipping on orders over $50."
    },
    "return_process": {
        "title": "How to Return",
        "content": "Start a return from Order History. Select the item, click Return, and print the prepaid label. Drop off at any UPS location."
    }
}

print("=== Knowledge Base (Ground Truth) ===")
for key, doc in knowledge_base.items():
    print(f"\n{doc['title']}:")
    print(f"  {doc['content'][:80]}...")

In [None]:
# ===== COLLEAGUE'S CODE (BUG: Weak prompt allows hallucination) =====

def weak_rag_prompt(query: str, context: str) -> str:
    """Weak prompt that allows hallucination."""
    return f"""You are a helpful support agent.

Context: {context}

Question: {query}

Answer:"""

# Simulated LLM response (hallucinating!)
def simulate_weak_llm(query: str, context: str) -> str:
    """Simulate an LLM that hallucinates details not in the context."""
    # This simulates common hallucination patterns
    hallucinated_responses = {
        "refund": "You have 60 days to request a full refund. After that, you can still get 50% back for another 30 days. Just contact support!",  # WRONG: says 60 days
        "shipping": "Standard shipping is 3-5 days, express is next day. All orders over $25 get free shipping!",  # WRONG: different numbers
        "return": "You can return items by mailing them back or dropping at any FedEx or UPS location. We'll process your refund in 24 hours.",  # WRONG: adds FedEx, 24 hours
    }
    
    for key, response in hallucinated_responses.items():
        if key in query.lower():
            return response
    
    return "I'm not sure about that. Please contact support."

# Test the broken RAG
query = "What's your refund policy?"
context = knowledge_base['refund_policy']['content']

print("=== Broken RAG Response ===")
print(f"\nQuery: {query}")
print(f"\nContext (ground truth):")
print(f"  {context}")

response = simulate_weak_llm(query, context)
print(f"\nLLM Response (HALLUCINATED):")
print(f"  {response}")

print("\n❌ Response says 60 days, but policy says 30 days!")

---

## Your Investigation
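
Before fixing anything, it helps to decide whether the wrong answer comes from retrieval (the right document never reached the model) or from generation (the model ignored a correct context). The sketch below is one possible way to make that split. `locate_failure` and `expected_fact` are hypothetical names introduced here, and the number comparison is only a rough proxy for "unsupported claim".

```python
import re
from typing import Set


def extract_numbers(text: str) -> Set[str]:
    """Return the set of digit runs found in text."""
    return set(re.findall(r"\d+", text))


def locate_failure(expected_fact: str, retrieved_context: str, response: str) -> str:
    """Classify a wrong answer as a retrieval or a generation failure.

    expected_fact is a short ground-truth string (a hypothetical input,
    e.g. taken from a hand-labelled test case).
    """
    if expected_fact.lower() not in retrieved_context.lower():
        # The right document never reached the model.
        return "retrieval"
    unsupported = extract_numbers(response) - extract_numbers(retrieved_context)
    if unsupported:
        # The context was correct, but the answer contains numbers it lacks.
        return "generation"
    return "no_failure_detected"


demo_context = (
    "We offer full refunds within 30 days of purchase. "
    "After 30 days, only store credit is available."
)
demo_response = "You have 60 days to request a full refund."

print(locate_failure("within 30 days", demo_context, demo_response))  # generation
```

In this drill the context already contains the 30-day rule, so the failure is on the generation side.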

### Step 1: Implement faithfulness checking

In [None]:
def check_number_faithfulness(response: str, context: str) -> Tuple[bool, List[str]]:
    """
    Check if numbers in the response appear in the context.
    Common hallucination: making up specific numbers.
    """
    # Extract numbers from both
    response_numbers = set(re.findall(r'\d+', response))
    context_numbers = set(re.findall(r'\d+', context))
    
    # Find hallucinated numbers
    hallucinated = response_numbers - context_numbers
    
    is_faithful = len(hallucinated) == 0
    return is_faithful, sorted(hallucinated)  # sorted for deterministic output

# Test faithfulness checker
is_faithful, issues = check_number_faithfulness(response, context)

print("=== Faithfulness Check ===")
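# Extract outside the f-strings: backslashes inside f-string expressions fail before Python 3.12
context_numbers = set(re.findall(r'\d+', context))
response_numbers = set(re.findall(r'\d+', response))
print(f"\nContext numbers: {context_numbers}")
print(f"Response numbers: {response_numbers}")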
print(f"\nIs faithful: {is_faithful}")
if not is_faithful:
    print(f"❌ Hallucinated numbers: {issues}")
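
A bare digit comparison has a blind spot: a response can reuse a number from the context with a different meaning (for example "50 days" when the context only mentions "$50"). The sketch below is one possible refinement that compares quantities together with a small set of time units. The unit list, the handling of "business", and the name `check_quantity_faithfulness` are assumptions made for illustration, not part of the original drill.

```python
import re
from typing import List, Set, Tuple

# Optional "$", a number or range such as "5-7", then an optional time unit.
# The unit list is a deliberately small, assumed vocabulary.
_QUANTITY = re.compile(
    r"(\$?)(\d+(?:-\d+)?)(?:\s+(?:business\s+)?(day|hour|week|month|year)s?\b)?"
)


def extract_quantities(text: str) -> Set[str]:
    """Return normalized quantities such as '30 day', '5-7 day', or '$50'."""
    quantities = set()
    for dollar, number, unit in _QUANTITY.findall(text.lower()):
        quantities.add(f"{dollar}{number} {unit}".strip())
    return quantities


def check_quantity_faithfulness(response: str, context: str) -> Tuple[bool, List[str]]:
    """Flag quantities in the response whose number AND unit are absent from the context."""
    unsupported = extract_quantities(response) - extract_quantities(context)
    return len(unsupported) == 0, sorted(unsupported)


shipping_context = (
    "Standard shipping takes 5-7 business days. Express shipping takes "
    "2-3 business days. Free shipping on orders over $50."
)

# "50" appears in the context, but only as a dollar amount.
print(check_quantity_faithfulness("Shipping takes 50 days.", shipping_context))
# (False, ['50 day'])
```

This check is stricter than the digit-only version, so it may flag harmless rephrasings (for example "one week" versus "7 days"); treat a failure as a signal to review, not as proof of hallucination.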

### Step 2: TODO - Create a stronger prompt

In [None]:
# TODO: Create a prompt that reduces hallucination

# Uncomment and complete:

# def strong_rag_prompt(query: str, context: str) -> str:
#     """Strong prompt that reduces hallucination."""
#     return f"""You are a support agent. Answer questions using ONLY the information provided below.
#     
# IMPORTANT RULES:
# 1. Only use facts from the CONTEXT below
# 2. Do NOT add information that isn't explicitly stated
# 3. If the context doesn't contain the answer, say "I don't have that information"
# 4. Quote specific numbers and policies exactly as stated
# 
# CONTEXT:
# {context}
# 
# QUESTION: {query}
# 
# ANSWER (using only the context above):"""
# 
# # Display the improved prompt
# print("=== Improved Prompt ===")
# print(strong_rag_prompt(query, context))

In [None]:
# TODO: Simulate a faithful response

# Uncomment:

# def simulate_faithful_llm(query: str, context: str) -> str:
#     """Simulate an LLM that only uses provided context."""
#     # Extract key facts from context
#     if "30 days" in context:
#         return "According to our policy, we offer full refunds within 30 days of purchase. After 30 days, only store credit is available. Items must be unused and in original packaging."
#     elif "5-7" in context:
#         return "Standard shipping takes 5-7 business days, and express shipping takes 2-3 business days. Free shipping is available on orders over $50."
#     else:
#         return "Based on the information provided: " + context
# 
# faithful_response = simulate_faithful_llm(query, context)
# 
# print("=== Faithful Response ===")
# print(f"\n{faithful_response}")
# 
# # Check faithfulness
# is_faithful, issues = check_number_faithfulness(faithful_response, context)
# print(f"\n✓ Is faithful: {is_faithful}")

In [None]:
# TODO: Implement response validation pipeline

# Uncomment:

# def validate_response(response: str, context: str) -> Dict:
#     """Validate that response is faithful to context."""
#     result = {
#         'response': response,
#         'checks': []
#     }
#     
#     # Check 1: Number faithfulness
#     is_faithful, hallucinated = check_number_faithfulness(response, context)
#     result['checks'].append({
#         'name': 'number_faithfulness',
#         'passed': is_faithful,
#         'issues': hallucinated
#     })
#     
#     # Check 2: No "I think" or uncertainty markers combined with specific claims
#     uncertain_specific = bool(re.search(r'\b(I think|probably|maybe|around)\b.*\d+', response, re.IGNORECASE))
#     result['checks'].append({
#         'name': 'no_uncertain_specifics',
#         'passed': not uncertain_specific,
#         'issues': ['Combines uncertainty with specific numbers'] if uncertain_specific else []
#     })
#     
#     result['all_passed'] = all(c['passed'] for c in result['checks'])
#     return result
# 
# # Test validation
# print("=== Response Validation ===")
# print("\nHallucinated response:")
# validation = validate_response(response, context)
# for check in validation['checks']:
#     status = '✓' if check['passed'] else '❌'
#     print(f"  {status} {check['name']}: {check['issues'] if check['issues'] else 'OK'}")
# 
# print("\nFaithful response:")
# validation_good = validate_response(faithful_response, context)
# for check in validation_good['checks']:
#     status = '✓' if check['passed'] else '❌'
#     print(f"  {status} {check['name']}: {check['issues'] if check['issues'] else 'OK'}")
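
Rule 4 of the stronger prompt asks the model to quote policies exactly, which suggests a further check: any text the response puts in double quotes should appear verbatim in the context. The following is a minimal sketch of such a check, assuming quotes are marked with plain `"` characters. `check_quote_faithfulness` is a hypothetical helper, not part of the pipeline above.

```python
import re
from typing import List, Tuple


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so line breaks don't cause false alarms."""
    return " ".join(text.lower().split())


def check_quote_faithfulness(response: str, context: str) -> Tuple[bool, List[str]]:
    """Return (passed, unsupported_quotes) for double-quoted spans in the response."""
    quotes = re.findall(r'"([^"]+)"', response)
    normalized_context = _normalize(context)
    unsupported = [q for q in quotes if _normalize(q) not in normalized_context]
    return len(unsupported) == 0, unsupported


policy = "We offer full refunds within 30 days of purchase."

good = 'Our policy says "full refunds within 30 days of purchase".'
bad = 'Our policy says "full refunds within 60 days of purchase".'

print(check_quote_faithfulness(good, policy))  # (True, [])
print(check_quote_faithfulness(bad, policy))
```

A response with no quoted spans passes trivially, so this check complements the number check and does not replace it.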

In [None]:
# ============================================
# SELF-CHECK
# ============================================

# Uncomment:

# assert callable(check_number_faithfulness), "Should have faithfulness checker"
# assert callable(strong_rag_prompt), "Should have improved prompt"
# assert validation_good['all_passed'], "Faithful response should pass all checks"
# 
# print("✓ Hallucination detection implemented!")
# print("✓ Strong prompt created")
# print("✓ Validation pipeline working")

### Step 3: Write your postmortem

In [None]:
postmortem = """
## Postmortem: The Hallucinating RAG

### What happened:
- (Your answer: What incorrect information did the bot give?)

### Root cause:
- (Your answer: Why did the LLM hallucinate despite having correct context?)

### How to prevent:
- (Your answer: What prompt techniques and validations reduce hallucination?)

"""

print(postmortem)

---

## ✅ Drill Complete!

**Key lessons:**

1. **RAG doesn't prevent hallucination automatically.** Even with the correct passage in the prompt, the LLM can still produce details the passage doesn't contain.

2. **Strong prompts constrain the model.** Explicitly tell it to only use provided context.

3. **Validate outputs.** Check that specific claims (numbers, dates) appear in context.

4. **Numbers are common hallucination targets.** Always verify numerical claims.
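
One way to act on these lessons is to put a validation step between the model and the customer, and fall back to a safe message when the check fails. The sketch below assumes a `generate(query, context)` callable standing in for the real LLM call. `answer_with_guard`, `FALLBACK_MESSAGE`, and the two stub generators are illustrative names, not part of the drill.

```python
import re
from typing import Callable

FALLBACK_MESSAGE = "I don't have that information. Please contact support."


def numbers_supported(response: str, context: str) -> bool:
    """True if every digit run in the response also appears in the context."""
    return set(re.findall(r"\d+", response)) <= set(re.findall(r"\d+", context))


def answer_with_guard(
    generate: Callable[[str, str], str], query: str, context: str
) -> str:
    """Return the model's draft only if it passes the number check."""
    draft = generate(query, context)
    if numbers_supported(draft, context):
        return draft
    return FALLBACK_MESSAGE


# Stub generators standing in for a real LLM call (assumption for this sketch).
def hallucinating_llm(query: str, context: str) -> str:
    return "You have 60 days to request a full refund."


def grounded_llm(query: str, context: str) -> str:
    return "We offer full refunds within 30 days of purchase."


refund_context = "We offer full refunds within 30 days of purchase."

print(answer_with_guard(hallucinating_llm, "Refund policy?", refund_context))
print(answer_with_guard(grounded_llm, "Refund policy?", refund_context))
```

The first call returns the fallback message and the second returns the grounded draft unchanged. In a real system, a failed check could also trigger a retry with a stricter prompt instead of an immediate fallback.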

---

## Anti-Hallucination Checklist

| Technique | Purpose |
|-----------|----------|
| "Only use provided context" | Constrains model |
| "Say 'I don't know' if unsure" | Prevents guessing |
| Number verification | Catches made-up stats |
| Quote exact policy text | Forces faithfulness |
| LLM-as-judge | Automated validation |
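
The last row, LLM-as-judge, can be sketched without committing to a particular provider: build a grading prompt, send it through whatever client is available, and parse a one-word verdict. In the sketch below, `call_llm` is a placeholder for a real API call, and the prompt wording and the SUPPORTED/UNSUPPORTED labels are assumptions chosen for illustration.

```python
from typing import Callable

JUDGE_TEMPLATE = """You are checking a support answer against a policy excerpt.

CONTEXT:
{context}

ANSWER:
{answer}

Does the ANSWER contain any claim that the CONTEXT does not support?
Reply with exactly one word: SUPPORTED or UNSUPPORTED."""


def build_judge_prompt(answer: str, context: str) -> str:
    """Fill the grading template with the answer and its source context."""
    return JUDGE_TEMPLATE.format(context=context, answer=answer)


def parse_verdict(raw: str) -> bool:
    """True only for a clear SUPPORTED; anything else counts as a failure."""
    return raw.strip().upper() == "SUPPORTED"


def judge_answer(call_llm: Callable[[str], str], answer: str, context: str) -> bool:
    """Ask the judge model whether the answer is grounded in the context."""
    return parse_verdict(call_llm(build_judge_prompt(answer, context)))


# Stand-in for a real model call, used only so the sketch runs offline.
def fake_judge(prompt: str) -> str:
    return "UNSUPPORTED" if "60 days" in prompt else "SUPPORTED"


judge_context = "We offer full refunds within 30 days of purchase."

print(judge_answer(fake_judge, "You have 60 days for a refund.", judge_context))  # False
print(judge_answer(fake_judge, "Refunds are available within 30 days.", judge_context))  # True
```

Treating any unexpected judge output as a failure keeps the check conservative: a malformed verdict blocks the answer instead of letting it through.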