# Week 4: Advanced Prompting III - RAG & Prompt Chaining

## MBA 590 - Advanced AI Strategy: Prompting and Agentic Frameworks

---

## Overview

This week focuses on two powerful advanced techniques: Retrieval-Augmented Generation (RAG) and Prompt Chaining. RAG enables LLMs to access and ground their responses in specific knowledge bases, while prompt chaining allows us to orchestrate complex multi-step workflows.

### Key Topics
- Retrieval-Augmented Generation (RAG) concepts and architecture
- Benefits of grounding LLMs in specific knowledge
- Prompt chaining techniques for multi-step tasks
- Combining RAG and prompt chaining for business workflows
- Practical implementation patterns

## Learning Objectives

By the end of this week, you will be able to:

1. Explain the architecture and benefits of Retrieval-Augmented Generation (RAG)
2. Identify business scenarios where RAG provides significant value
3. Design and implement prompt chaining workflows for multi-step tasks
4. Recognize the limitations and challenges of RAG systems
5. Combine RAG and prompt chaining to automate complex business processes
6. Evaluate trade-offs between different RAG implementation approaches

## Academic Readings

1. **Lewis, P., Perez, E., Piktus, A., et al. (2020).** *Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.* arXiv preprint arXiv:2005.11401.

2. **Wu, T., Terry, M., Cai, C. J. (2021).** *AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts.* arXiv preprint arXiv:2110.01691.

## 1. Understanding Retrieval-Augmented Generation (RAG)

### What is RAG?

Retrieval-Augmented Generation combines two key components:
1. **Retrieval System**: Searches and retrieves relevant information from a knowledge base
2. **Generation System**: Uses the retrieved information to generate accurate, grounded responses

### Why RAG Matters for Business

**Key Benefits:**
- **Factual Grounding**: Reduces (but does not eliminate) hallucinations by supplying source documents in the prompt
- **Up-to-date Information**: Access current data without retraining models
- **Domain Specificity**: Leverage proprietary or specialized knowledge
- **Transparency**: Responses can cite sources for verification
- **Cost Efficiency**: Avoid expensive fine-tuning for knowledge updates

### RAG Architecture Components

1. **Document Store**: Repository of knowledge (databases, documents, wikis)
2. **Embedding Model**: Converts text to vector representations
3. **Vector Database**: Stores and indexes document embeddings
4. **Retriever**: Finds relevant documents based on query similarity
5. **LLM**: Generates responses using retrieved context
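
To make the five components concrete, here is a minimal sketch of how they fit together. It rests on several assumptions: the "embedding model" is a toy word-count vector rather than a trained model, the "vector database" is a plain Python list, and the names `tokenize`, `embed`, `cosine`, and `retrieve` are illustrative rather than taken from any library.

```python
import math
import re
from collections import Counter

# 1. Document store: the raw knowledge (three toy policy snippets)
documents = [
    "Customers can return items within 30 days of purchase for a full refund.",
    "Standard shipping takes 5-7 business days and costs $5.99.",
    "All electronics come with a 1-year manufacturer warranty.",
]

def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

# 2. "Embedding model" (toy assumption): a sparse word-count vector.
def embed(text: str) -> Counter:
    return Counter(tokenize(text))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 3. "Vector database" (toy assumption): documents stored next to their vectors.
index = [(doc, embed(doc)) for doc in documents]

# 4. Retriever: rank stored documents by similarity to the query vector.
def retrieve(query: str, top_k: int = 1) -> list:
    query_vector = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(query_vector, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

# 5. LLM input: the retrieved text is placed in the prompt as context.
query = "How many days do I have to return an item?"
top_docs = retrieve(query, top_k=1)
prompt = f"Context:\n{top_docs[0]}\n\nQuestion: {query}"
print(prompt)
```

A real system would swap the word-count vectors for a trained embedding model and the list for a vector database, but the flow from query to ranked documents to prompt stays the same.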

In [None]:
# Setup: Import required libraries
import numpy as np
import pandas as pd
from typing import List, Dict, Tuple
import json

# Note: In production, you would use libraries like:
# - chromadb, pinecone, or faiss for vector databases
# - sentence-transformers for embeddings
# - langchain or llamaindex for RAG orchestration

print("Libraries imported successfully")

## 2. Simulating a Simple RAG System

Let's build a simplified RAG system to understand the core concepts.

In [None]:
# Simulated company knowledge base
knowledge_base = [
    {
        "id": "doc1",
        "title": "Return Policy",
        "content": "Customers can return items within 30 days of purchase for a full refund. Items must be in original condition with tags attached. Return shipping is free for defective items."
    },
    {
        "id": "doc2",
        "title": "Warranty Information",
        "content": "All electronics come with a 1-year manufacturer warranty. Extended warranties up to 3 years are available for purchase. Warranty covers manufacturing defects but not accidental damage."
    },
    {
        "id": "doc3",
        "title": "Shipping Policy",
        "content": "Standard shipping takes 5-7 business days and costs $5.99. Express shipping takes 2-3 business days and costs $15.99. Free shipping on orders over $50."
    },
    {
        "id": "doc4",
        "title": "Price Match Guarantee",
        "content": "We will match any competitor's price within 14 days of purchase. Bring proof of the lower price and we'll refund the difference. Applies to identical products only."
    }
]

print(f"Knowledge base loaded with {len(knowledge_base)} documents")
print("\nSample document:")
print(json.dumps(knowledge_base[0], indent=2))

In [None]:
# Simplified retrieval function (keyword-based)
# In production, this would use semantic search with embeddings
import re

def simple_retrieve(query: str, knowledge_base: List[Dict], top_k: int = 2) -> List[Dict]:
    """
    Simple keyword-based retrieval (simplified for demonstration).
    In production, use semantic similarity with embeddings.
    """
    def tokenize(text: str) -> set:
        # Keep only word characters so that "items?" matches "items"
        return set(re.findall(r"\w+", text.lower()))

    query_words = tokenize(query)

    # Score documents based on keyword overlap
    scored_docs = []
    for doc in knowledge_base:
        doc_words = tokenize(doc['title'] + ' ' + doc['content'])
        overlap = len(query_words.intersection(doc_words))
        scored_docs.append((overlap, doc))

    # Sort by score (highest first) and return the top-k documents,
    # skipping any document that shares no words with the query
    scored_docs.sort(reverse=True, key=lambda x: x[0])
    return [doc for score, doc in scored_docs[:top_k] if score > 0]

# Test the retrieval function
test_query = "What is your return policy for defective items?"
retrieved_docs = simple_retrieve(test_query, knowledge_base, top_k=2)

print(f"Query: {test_query}")
print(f"\nRetrieved {len(retrieved_docs)} documents:\n")
for i, doc in enumerate(retrieved_docs, 1):
    print(f"{i}. {doc['title']}")
    print(f"   {doc['content'][:100]}...\n")

In [None]:
# RAG Prompt Construction

def build_rag_prompt(query: str, retrieved_docs: List[Dict]) -> str:
    """
    Constructs a RAG prompt by combining retrieved context with the query.
    """
    context = "\n\n".join([
        f"Document {i+1}: {doc['title']}\n{doc['content']}"
        for i, doc in enumerate(retrieved_docs)
    ])
    
    prompt = f"""You are a helpful customer service assistant. Use the following information to answer the customer's question accurately.

CONTEXT:
{context}

CUSTOMER QUESTION:
{query}

INSTRUCTIONS:
- Base your answer on the provided context
- If the context doesn't contain the answer, say so clearly
- Be specific and cite relevant policy details
- Keep the response professional and helpful

ANSWER:"""
    
    return prompt

# Generate RAG prompt
rag_prompt = build_rag_prompt(test_query, retrieved_docs)
print("RAG PROMPT:")
print("="*70)
print(rag_prompt)
print("="*70)
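
The prompt above tells the model to say so when the context lacks an answer, but the pipeline can also guard against weak retrieval before any LLM call is made. The sketch below assumes the retriever returns `(score, document)` pairs with similarity scores between 0 and 1; the function name `prepare_rag_call`, the `min_score` threshold of 0.2, and the fallback wording are all hypothetical choices for illustration.

```python
from typing import Dict, List, Tuple

FALLBACK_MESSAGE = (
    "I could not find a policy that covers this question. "
    "Let me connect you with a team member."
)

def prepare_rag_call(query: str,
                     scored_docs: List[Tuple[float, Dict]],
                     min_score: float = 0.2) -> Dict:
    """Keep only documents at or above min_score; fall back if none remain."""
    relevant = [doc for score, doc in scored_docs if score >= min_score]
    if not relevant:
        # Nothing trustworthy to ground on: skip the LLM call entirely.
        return {"action": "fallback", "message": FALLBACK_MESSAGE, "sources": []}

    # Tag each document with its id so the answer can cite its sources.
    context = "\n\n".join(
        f"[{doc['id']}] {doc['title']}\n{doc['content']}" for doc in relevant
    )
    prompt = (
        "Answer using only the context below and cite document ids in brackets.\n\n"
        f"CONTEXT:\n{context}\n\nCUSTOMER QUESTION:\n{query}\n\nANSWER:"
    )
    return {"action": "call_llm", "prompt": prompt,
            "sources": [doc["id"] for doc in relevant]}

# Usage with made-up similarity scores
scored = [
    (0.82, {"id": "doc1", "title": "Return Policy",
            "content": "Returns are accepted within 30 days."}),
    (0.05, {"id": "doc3", "title": "Shipping Policy",
            "content": "Standard shipping takes 5-7 business days."}),
]
grounded = prepare_rag_call("Can I return a defective item?", scored)
ungrounded = prepare_rag_call("Do you sell gift cards?", [(0.03, scored[1][1])])
print(grounded["sources"])
print(ungrounded["action"])
```

Returning the source ids alongside the prompt keeps the citation trail available to whatever code displays or audits the final answer.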

## 3. RAG vs. Standard Prompting Comparison

In [None]:
# Comparison: Standard vs. RAG approach

comparison_data = {
    'Aspect': [
        'Knowledge Source',
        'Accuracy',
        'Hallucination Risk',
        'Update Frequency',
        'Domain Specificity',
        'Response Verification',
        'Implementation Complexity',
        'Latency'
    ],
    'Standard Prompting': [
        'Pre-trained knowledge only',
        'Variable, depends on training data',
        'Higher',
        'Only when model is retrained',
        'Limited to training data',
        'Difficult to verify sources',
        'Simple',
        'Lower'
    ],
    'RAG Approach': [
        'Pre-trained + retrieved documents',
        'Higher with relevant docs',
        'Lower (grounded in sources)',
        'As often as the knowledge base is updated',
        'High (custom knowledge base)',
        'Can cite source documents',
        'More complex',
        'Higher (retrieval step)'
    ]
}

df_comparison = pd.DataFrame(comparison_data)
print("Standard Prompting vs. RAG Comparison")
print("="*70)
print(df_comparison.to_string(index=False))

## 4. Introduction to Prompt Chaining

### What is Prompt Chaining?

Prompt chaining involves breaking down complex tasks into a sequence of simpler prompts, where the output of one prompt serves as input to the next.

### Benefits of Prompt Chaining:

1. **Modularity**: Each step focuses on a specific sub-task
2. **Transparency**: Clear visibility into multi-step reasoning
3. **Control**: Ability to intervene or modify between steps
4. **Quality**: Better results through specialized prompts
5. **Debugging**: Easier to identify and fix issues

### Common Chaining Patterns:

- **Sequential**: Linear flow (Step 1 → Step 2 → Step 3)
- **Conditional**: Branch based on intermediate results
- **Parallel**: Multiple chains running simultaneously
- **Iterative**: Loops with refinement
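
The sequential and conditional patterns can be sketched in a few lines. This is an illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns canned text instead of calling a real model, and the helper names `run_sequential`, `summarize`, `draft_reply`, and `run_conditional` are invented for this example.

```python
from typing import Callable, List

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call; returns canned text."""
    if prompt.startswith("Classify"):
        return "complaint" if "disappointed" in prompt.lower() else "praise"
    if prompt.startswith("Summarize"):
        return "Summary: " + prompt.split(":", 1)[1].strip()[:60]
    return "Draft reply for: " + prompt[:40]

def run_sequential(steps: List[Callable[[str], str]], initial_input: str) -> str:
    """Sequential pattern: each step's output becomes the next step's input."""
    result = initial_input
    for step in steps:
        result = step(result)
    return result

def summarize(text: str) -> str:
    return fake_llm(f"Summarize: {text}")

def draft_reply(summary: str) -> str:
    return fake_llm(f"Write a reply to: {summary}")

def run_conditional(feedback: str) -> str:
    """Conditional pattern: branch on an intermediate classification."""
    label = fake_llm(f"Classify as complaint or praise: {feedback}")
    if label == "complaint":
        # Complaints get the full two-step chain.
        return run_sequential([summarize, draft_reply], feedback)
    # Praise takes a short branch with no further LLM calls.
    return "Thank you for the kind words!"

print(run_conditional("I am disappointed with the battery."))
print(run_conditional("Love the screen!"))
```

Because each step is an ordinary function, a step can be replaced, logged, or reviewed by a person without touching the rest of the chain.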

In [None]:
# Example: Customer Feedback Analysis Chain

# Sample customer feedback
customer_feedback = """
I recently purchased your laptop model X500 and I'm extremely disappointed. 
The battery life is terrible - it only lasts 3 hours when you advertised 8 hours. 
The keyboard feels cheap and several keys are already sticking after just 2 weeks. 
On the positive side, the screen quality is excellent and the processing speed is good. 
Customer service was helpful when I called, but they couldn't resolve the battery issue.
I'm considering returning it if these problems aren't fixed.
"""

print("CUSTOMER FEEDBACK:")
print(customer_feedback)

### Prompt Chain Design: Multi-Step Feedback Analysis

In [None]:
# Step 1: Extract Key Issues

prompt_step1 = f"""
Analyze the following customer feedback and extract all specific issues mentioned.
Format your response as a JSON object with the following structure:
{{
    "issues": [
        {{
            "category": "category name",
            "severity": "low/medium/high",
            "description": "brief description"
        }}
    ],
    "positive_aspects": ["aspect1", "aspect2"]
}}

FEEDBACK:
{customer_feedback}

JSON OUTPUT:
"""

print("STEP 1 PROMPT: Extract Key Issues")
print("="*70)
print(prompt_step1)
print("="*70)

In [None]:
# Simulated Step 1 Output
step1_output = {
    "issues": [
        {
            "category": "Battery Performance",
            "severity": "high",
            "description": "Battery life only 3 hours vs advertised 8 hours"
        },
        {
            "category": "Hardware Quality",
            "severity": "medium",
            "description": "Keyboard feels cheap and keys sticking after 2 weeks"
        },
        {
            "category": "Customer Service",
            "severity": "medium",
            "description": "Unable to resolve battery issue"
        }
    ],
    "positive_aspects": [
        "Excellent screen quality",
        "Good processing speed",
        "Helpful customer service interaction"
    ]
}

print("STEP 1 OUTPUT (Simulated LLM Response):")
print(json.dumps(step1_output, indent=2))
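
In a live chain, Step 1 returns text rather than a ready-made dictionary, so it is worth validating that text before it reaches Step 2. The sketch below is one possible checkpoint, assuming the JSON structure requested in the Step 1 prompt; the function name `parse_step1_output` and its error messages are illustrative.

```python
import json

REQUIRED_ISSUE_KEYS = {"category", "severity", "description"}
ALLOWED_SEVERITIES = {"low", "medium", "high"}

def parse_step1_output(raw_text: str) -> dict:
    """Parse a Step 1 response and check it against the requested structure."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Step 1 did not return valid JSON: {exc}") from exc

    if not isinstance(data, dict) or not isinstance(data.get("issues"), list):
        raise ValueError("Expected a JSON object with an 'issues' list")

    for position, issue in enumerate(data["issues"], start=1):
        if not isinstance(issue, dict):
            raise ValueError(f"Issue {position} is not a JSON object")
        missing = REQUIRED_ISSUE_KEYS - set(issue)
        if missing:
            raise ValueError(f"Issue {position} is missing keys: {sorted(missing)}")
        if issue["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"Issue {position} has unknown severity: {issue['severity']!r}")

    # A response with no positives is acceptable; normalize to an empty list.
    data.setdefault("positive_aspects", [])
    return data

# Usage: a well-formed response passes, a malformed one is caught early
good_response = (
    '{"issues": [{"category": "Battery", "severity": "high", '
    '"description": "Lasts 3 hours"}]}'
)
validated = parse_step1_output(good_response)
print(validated["positive_aspects"])

try:
    parse_step1_output('{"issues": [{"category": "Battery", "severity": "urgent"}]}')
except ValueError as error:
    print(f"Rejected: {error}")
```

Failing loudly at this point stops a malformed Step 1 response from silently degrading every later step.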

In [None]:
# Step 2: Prioritize Issues

prompt_step2 = f"""
Based on the following extracted issues, prioritize them for action.
Consider severity, customer impact, and likelihood of return/churn.

EXTRACTED ISSUES:
{json.dumps(step1_output['issues'], indent=2)}

Provide:
1. Priority ranking (1 = highest priority)
2. Recommended action for each issue
3. Urgency level (immediate/this week/this month)

Format as JSON:
{{
    "prioritized_actions": [
        {{
            "priority": 1,
            "issue": "issue description",
            "recommended_action": "specific action",
            "urgency": "timeframe",
            "rationale": "why this priority"
        }}
    ]
}}
"""

print("STEP 2 PROMPT: Prioritize Issues")
print("="*70)
print(prompt_step2)
print("="*70)

In [None]:
# Simulated Step 2 Output
step2_output = {
    "prioritized_actions": [
        {
            "priority": 1,
            "issue": "Battery Performance",
            "recommended_action": "Immediate replacement or full refund; investigate if systemic issue with model X500",
            "urgency": "immediate",
            "rationale": "High severity, false advertising concern, customer considering return"
        },
        {
            "priority": 2,
            "issue": "Hardware Quality - Keyboard",
            "recommended_action": "Offer keyboard replacement or device exchange; quality control review",
            "urgency": "this week",
            "rationale": "Indicates potential manufacturing defect, affects daily use"
        },
        {
            "priority": 3,
            "issue": "Customer Service Resolution",
            "recommended_action": "Follow-up with customer service team training; empower with more resolution options",
            "urgency": "this month",
            "rationale": "Process improvement, prevents future escalations"
        }
    ]
}

print("STEP 2 OUTPUT (Simulated LLM Response):")
print(json.dumps(step2_output, indent=2))

In [None]:
# Step 3: Draft Response Plan

prompt_step3 = f"""
Create a customer response email and internal action plan based on the following:

ORIGINAL FEEDBACK:
{customer_feedback}

PRIORITIZED ACTIONS:
{json.dumps(step2_output, indent=2)}

POSITIVE ASPECTS TO ACKNOWLEDGE:
{json.dumps(step1_output['positive_aspects'], indent=2)}

Provide two outputs:

1. CUSTOMER EMAIL (200-250 words):
   - Acknowledge issues with empathy
   - Reference positive feedback
   - Propose concrete solutions
   - Professional and apologetic tone

2. INTERNAL ACTION PLAN:
   - Immediate actions (next 24 hours)
   - Short-term actions (this week)
   - Follow-up tasks
"""

print("STEP 3 PROMPT: Draft Response Plan")
print("="*70)
print(prompt_step3)
print("="*70)

### Visualizing the Chain

In [None]:
# Chain Visualization
chain_steps = {
    'Step': [1, 2, 3],
    'Task': [
        'Extract Issues & Positives',
        'Prioritize & Recommend Actions',
        'Draft Response & Action Plan'
    ],
    'Input': [
        'Raw customer feedback',
        'Extracted issues from Step 1',
        'Priorities + positive aspects + original feedback'
    ],
    'Output': [
        'Structured JSON of issues',
        'Prioritized action list',
        'Email + Internal plan'
    ],
    'Focus': [
        'Analysis',
        'Decision-making',
        'Communication'
    ]
}

df_chain = pd.DataFrame(chain_steps)
print("\nPROMPT CHAIN WORKFLOW")
print("="*70)
print(df_chain.to_string(index=False))
print("\nBenefits of this chain:")
print("- Each step has a clear, focused objective")
print("- Intermediate outputs can be validated")
print("- Easy to modify individual steps without rewriting entire prompt")
print("- Structured data flows naturally between steps")

## 5. Combining RAG and Prompt Chaining

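Combining the two techniques lets a workflow retrieve the relevant policy documents first and then pass them through a series of focused prompts, so every step works from the same grounded context.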

In [None]:
# Example: Policy-Compliant Customer Service Workflow

def rag_chain_workflow(customer_query: str, knowledge_base: List[Dict]) -> Dict:
    """
    Combines RAG and prompt chaining for customer service.
    
    Chain:
    1. Retrieve relevant policies (RAG)
    2. Analyze customer intent
    3. Generate policy-compliant response
    4. Add empathy and personalization
    """
    
    workflow = {
        "query": customer_query,
        "steps": []
    }
    
    # Step 1: RAG - Retrieve relevant policies
    retrieved_docs = simple_retrieve(customer_query, knowledge_base, top_k=2)
    workflow["steps"].append({
        "step": 1,
        "name": "Retrieve Policies",
        "docs_retrieved": [doc['title'] for doc in retrieved_docs]
    })
    
    # Step 2: Analyze intent (would be LLM call)
    intent_prompt = f"""Analyze customer intent from: {customer_query}
    Classify as: refund_request, warranty_question, shipping_inquiry, price_match, or other."""
    
    workflow["steps"].append({
        "step": 2,
        "name": "Analyze Intent",
        "prompt": intent_prompt[:100] + "..."
    })
    
    # Step 3: Generate policy-compliant response
    response_prompt = build_rag_prompt(customer_query, retrieved_docs)
    workflow["steps"].append({
        "step": 3,
        "name": "Generate Response",
        "prompt_length": len(response_prompt)
    })
    
    # Step 4: Add empathy layer
    # (In a live chain, the Step 3 draft would be appended to this prompt.)
    empathy_prompt = """Review the following customer service response and enhance it with:
    - Empathetic language
    - Personal touches
    - Professional warmth
    while maintaining all policy-accurate information."""
    
    workflow["steps"].append({
        "step": 4,
        "name": "Add Empathy",
        "prompt": empathy_prompt[:100] + "..."
    })
    
    return workflow

# Test the workflow
test_customer_query = "I bought a laptop 3 weeks ago but it has a defect. Can I return it?"
workflow_result = rag_chain_workflow(test_customer_query, knowledge_base)

print("RAG + CHAIN WORKFLOW EXAMPLE")
print("="*70)
print(f"Customer Query: {workflow_result['query']}\n")
print("Workflow Steps:")
for step_info in workflow_result['steps']:
    print(f"\nStep {step_info['step']}: {step_info['name']}")
    for key, value in step_info.items():
        if key not in ['step', 'name']:
            print(f"  - {key}: {value}")
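
The workflow above records each prompt but does not execute the chain. The following sketch shows one way the outputs could be passed from step to step once an LLM is connected. It assumes the model is available as a plain `llm(prompt) -> str` callable; `run_policy_chain` and `recording_llm` are hypothetical names, and the stub returns placeholder strings instead of real completions.

```python
from typing import Callable, Dict, List

def run_policy_chain(query: str, policy_docs: List[Dict],
                     llm: Callable[[str], str]) -> Dict[str, str]:
    """Run intent -> draft -> empathy, passing each output into the next prompt."""
    context = "\n\n".join(f"{doc['title']}\n{doc['content']}" for doc in policy_docs)

    intent = llm(
        "Classify the customer's intent as refund_request, warranty_question, "
        f"shipping_inquiry, price_match, or other.\n\nMESSAGE: {query}"
    )
    draft = llm(
        "Answer the customer using only the context below.\n\n"
        f"CONTEXT:\n{context}\n\nINTENT: {intent}\n\nQUESTION: {query}"
    )
    final = llm(
        "Rewrite the draft below with empathetic, professionally warm language. "
        f"Keep every policy detail unchanged.\n\nDRAFT:\n{draft}"
    )
    return {"intent": intent, "draft": draft, "final": final}

# Stub that records prompts so the data flow can be inspected without an API key
prompts_seen: List[str] = []

def recording_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return f"<output of call {len(prompts_seen)}>"

policies = [{"title": "Return Policy",
             "content": "Customers can return items within 30 days of purchase."}]
result = run_policy_chain("Can I return a defective laptop?", policies, recording_llm)

print(result)
# Check that the Step 3 draft was embedded in the empathy prompt
print(result["draft"] in prompts_seen[2])
```

Passing the model in as a parameter keeps the chain testable: the same function runs against a stub in a notebook and against a real client in production.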

## 6. Practical Business Applications

### Use Cases for RAG + Prompt Chaining

In [None]:
# Business applications matrix
use_cases = {
    'Business Function': [
        'Customer Support',
        'Legal/Compliance',
        'Sales Enablement',
        'HR/Recruiting',
        'Product Development',
        'Financial Analysis'
    ],
    'RAG Component': [
        'FAQs, policies, product docs',
        'Regulations, case law, contracts',
        'Product specs, case studies, pricing',
        'Job descriptions, benefits, policies',
        'Customer feedback, market research',
        'Financial statements, market data'
    ],
    'Chain Workflow': [
        'Analyze → Retrieve Policy → Draft Response → Quality Check',
        'Extract Terms → Check Compliance → Risk Assessment → Report',
        'Qualify Lead → Retrieve Materials → Customize Pitch → Follow-up',
        'Screen Resume → Match Requirements → Schedule → Notify',
        'Aggregate Feedback → Identify Themes → Prioritize → Roadmap',
        'Retrieve Data → Calculate Metrics → Trend Analysis → Insights'
    ],
    'Business Value': [
        'Faster, consistent responses',
        'Reduced risk, audit trails',
        'Higher conversion rates',
        'Improved candidate experience',
        'Data-driven decisions',
        'Faster insights, accuracy'
    ]
}

df_use_cases = pd.DataFrame(use_cases)
print("\nRAG + PROMPT CHAINING: BUSINESS APPLICATIONS")
print("="*70)
for idx, row in df_use_cases.iterrows():
    print(f"\n{row['Business Function']}:")
    print(f"  Knowledge Base: {row['RAG Component']}")
    print(f"  Workflow: {row['Chain Workflow']}")
    print(f"  Value: {row['Business Value']}")

## 7. Implementation Challenges and Considerations

In [None]:
# Challenges and mitigation strategies
challenges = {
    'Challenge': [
        'Retrieval Quality',
        'Context Window Limits',
        'Chain Complexity',
        'Error Propagation',
        'Latency/Performance',
        'Cost Management',
        'Knowledge Base Maintenance'
    ],
    'Impact': [
        'Irrelevant or missing information',
        'Cannot fit all retrieved docs',
        'Difficult to debug and maintain',
        'Early errors affect final output',
        'Multiple LLM calls increase latency',
        'Each chain step costs money',
        'Outdated/incorrect information'
    ],
    'Mitigation Strategy': [
        'Use semantic embeddings; tune retrieval parameters; hybrid search',
        'Rank and select most relevant chunks; use summarization',
        'Modular design; clear documentation; monitoring each step',
        'Validation checkpoints; confidence scoring; human-in-loop',
        'Parallel chains where possible; caching; optimize prompts',
        'Monitor token usage; optimize chain length; batch processing',
        'Automated updates; version control; regular audits'
    ]
}

df_challenges = pd.DataFrame(challenges)
print("\nIMPLEMENTATION CHALLENGES & MITIGATION")
print("="*70)
for idx, row in df_challenges.iterrows():
    print(f"\n{idx + 1}. {row['Challenge']}")
    print(f"   Impact: {row['Impact']}")
    print(f"   Mitigation: {row['Mitigation Strategy']}")
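
As one concrete example of the mitigation for context window limits, the sketch below selects the highest-ranked chunks that fit within a token budget. It assumes chunks arrive as `(score, text)` pairs and uses a rough heuristic of about four characters per token; a real system would count tokens with the model's own tokenizer. The names `estimate_tokens` and `select_chunks` are illustrative.

```python
from typing import List, Tuple

def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about one token per four characters."""
    return max(1, len(text) // 4)

def select_chunks(ranked_chunks: List[Tuple[float, str]],
                  token_budget: int) -> List[str]:
    """Greedily keep the highest-scoring chunks that fit within the budget."""
    selected: List[str] = []
    used = 0
    for score, chunk in sorted(ranked_chunks, key=lambda pair: pair[0], reverse=True):
        cost = estimate_tokens(chunk)
        if used + cost > token_budget:
            # Too large for the remaining budget; a smaller chunk may still fit.
            continue
        selected.append(chunk)
        used += cost
    return selected

# Usage with made-up scores: the long middle chunk is skipped
chunks = [
    (0.9, "A" * 40),   # about 10 tokens
    (0.8, "B" * 400),  # about 100 tokens
    (0.5, "C" * 40),   # about 10 tokens
]
kept = select_chunks(chunks, token_budget=25)
print([chunk[0] for chunk in kept])
```

The greedy rule is simple but can skip a highly relevant long chunk; summarizing oversized chunks before selection is one alternative.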

## 8. Hands-On Practice Activity

### Design Your Own RAG + Chain Workflow

Choose a business process from your organization that requires:
1. Access to specific knowledge/documentation
2. Multiple processing steps
3. A final output or action

In [None]:
# YOUR TURN: Define your business process

my_business_process = """
Business Process Description:
[Describe your chosen business process]

Example: Analyzing contract proposals for compliance and risk
"""

my_knowledge_sources = """
Knowledge Sources (for RAG):
[List the documents/data sources needed]

Example: 
- Company contract templates
- Legal compliance requirements
- Industry standard terms
- Historical contract issues
"""

my_chain_design = """
Chain Design (list steps):

Step 1: [Task]
  Input: 
  Output:

Step 2: [Task]
  Input:
  Output:

Step 3: [Task]
  Input:
  Output:
"""

print(my_business_process)
print(my_knowledge_sources)
print(my_chain_design)

In [None]:
# YOUR TURN: Write prompts for each chain step

my_step1_prompt = """
[Write your Step 1 prompt here]
"""

my_step2_prompt = """
[Write your Step 2 prompt here]
"""

my_step3_prompt = """
[Write your Step 3 prompt here]
"""

print("Step 1 Prompt:")
print(my_step1_prompt)
print("\nStep 2 Prompt:")
print(my_step2_prompt)
print("\nStep 3 Prompt:")
print(my_step3_prompt)

## 9. Discussion Questions

Consider and discuss the following:

1. **RAG Application**: Consider a business process requiring information retrieval and subsequent action (e.g., summarizing recent customer feedback and drafting a response plan). How could a combination of RAG and prompt chaining automate or assist this?

2. **Knowledge Base Design**: What are the key considerations when designing a knowledge base for RAG? How do you ensure information quality and relevance?

3. **Chain Optimization**: When would you choose a longer, more detailed chain versus a shorter, simpler one? What are the trade-offs?

4. **Error Handling**: In a multi-step chain, how would you handle situations where:
   - Retrieval returns no relevant documents?
   - An intermediate step produces poor output?
   - The final output doesn't meet quality standards?

5. **Human-in-the-Loop**: Which steps in your designed workflow would benefit most from human review or intervention? Why?

6. **Measurement**: How would you measure the success of a RAG + chain implementation? What metrics matter most for your use case?

### Your Reflections:

**Question 1 - RAG Application:**

[Your response]

**Question 2 - Knowledge Base Design:**

[Your response]

**Question 3 - Chain Optimization:**

[Your response]

**Question 4 - Error Handling:**

[Your response]

**Question 5 - Human-in-the-Loop:**

[Your response]

**Question 6 - Measurement:**

[Your response]

## 10. Key Takeaways

1. **RAG grounds LLMs in specific knowledge**, reducing hallucinations and enabling access to proprietary or current information

2. **Prompt chaining breaks complex tasks into manageable steps**, improving quality, transparency, and control

3. **Combining RAG and chaining creates powerful workflows** for complex business processes requiring both knowledge retrieval and multi-step reasoning

4. **Implementation requires careful consideration** of retrieval quality, chain design, error handling, and performance

5. **Different business functions benefit differently** - identify where document grounding and structured workflows add most value

6. **Maintenance is ongoing** - knowledge bases need updates, chains need optimization, and workflows need monitoring

## 11. Looking Ahead to Week 5

Next week, we'll focus on **Evaluating LLM Outputs: Metrics and Frameworks**.

We'll explore:
- Key evaluation metrics (BLEU, ROUGE, perplexity, F1-score)
- Frameworks for assessing relevance, coherence, fluency, safety, and bias
- Business-specific evaluation criteria
- Quantitative and qualitative assessment methods

**Preparation:** Consider how you would evaluate the quality of outputs from the RAG and chain systems we discussed this week. What criteria matter most for your use cases?

## Additional Resources

### RAG Resources:
- [LangChain RAG Documentation](https://python.langchain.com/docs/use_cases/question_answering/)
- [Building RAG Systems with LlamaIndex](https://docs.llamaindex.ai/en/stable/)
- [Pinecone Vector Database Guide](https://www.pinecone.io/learn/vector-database/)

### Prompt Chaining Resources:
- [LangChain Sequential Chains](https://python.langchain.com/docs/modules/chains/)
- [OpenAI Prompt Chaining Guide](https://platform.openai.com/docs/guides/prompt-engineering)
- [Anthropic Claude Chaining Patterns](https://docs.anthropic.com/claude/docs/)

### Tools and Libraries:
- **Vector Databases**: Pinecone, Weaviate, ChromaDB, Faiss
- **Embedding Models**: OpenAI Embeddings, Sentence-Transformers
- **Orchestration**: LangChain, LlamaIndex, Haystack

---

*End of Week 4 Notebook*