# Assignment 2: RAG-Enhanced Pricing Agent with Historical Knowledge

## Objective
Add factual grounding to our pricing agent using a **pricing knowledge base** with historical data, category benchmarks, and proven pricing strategies.

## Requirements
**RAG Knowledge Base Includes:**
- Category-level elasticity benchmarks
- Historical margins
- 20-50 sample past pricing decisions
- Competitor trends
- Price recommendation guidelines

**Agent Behavior:**
- Pulls past examples
- Justifies price using retrieved data
- Reduces hallucination compared to Iteration 1

## Setup & Dependencies

Install the packages required for the RAG implementation.

In [1]:
# Install required packages for RAG
!pip install -q langchain langchain-groq langchain-community langchain-text-splitters
!pip install -q chromadb sentence-transformers
!pip install -q pandas numpy

In [7]:
# Import required libraries
from langchain_groq import ChatGroq
from langchain_core.prompts import PromptTemplate
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_core.documents import Document
import os
import getpass
import pandas as pd
import json
from typing import List, Dict, Any, Sequence
from dataclasses import dataclass

In [11]:
# Set up your Groq API key
print("Please enter your Groq API key:")
print("(You can get one free at: https://console.groq.com/)")
groq_api_key = getpass.getpass("Groq API Key: ")
os.environ["GROQ_API_KEY"] = groq_api_key
print("API key set successfully!")

Please enter your Groq API key:
(You can get one free at: https://console.groq.com/)
API key set successfully!


## Create Pricing Knowledge Base

Create a comprehensive pricing knowledge base with historical data and benchmarks.

In [1]:
# TODO: Create pricing knowledge base data structures
# Create the following data structures:

# 1. Category-level elasticity benchmarks
# HINT: elasticity_benchmarks = [ {"category": "Electronics", "elasticity": "High", ...} ]
elasticity_benchmarks = [
    # TODO: Add at least 4 categories with elasticity data
    {"category": "Electronics", "elasticity": "High"},
    {"category": "Home & Garden", "elasticity": "Medium"},
    {"category": "Fashion", "elasticity": "High"},
    {"category": "Automotive", "elasticity": "Low"},
    {"category": "Groceries", "elasticity": "Low"}
]

# 2. Historical margin data by category  
# HINT: historical_margins = [ {"category": "Electronics", "avg_margin": "12-18%", ...} ]
historical_margins = [
    # TODO: Add margin data for each category
    {"category": "Electronics", "avg_margin": "15-20%"},
    {"category": "Home & Garden", "avg_margin": "25-35%"},
    {"category": "Fashion", "avg_margin": "40-60%"},
    {"category": "Automotive", "avg_margin": "30-45%"},
    {"category": "Groceries", "avg_margin": "5-10%"}
]

# 3. Sample past pricing decisions (20-50 examples)
# HINT: Include product, category, cost, recommended_price, margin, competitor_price, outcome, reasoning
pricing_decisions = [
    # TODO: 7 examples included so far; the requirements call for 20-50
    {
        "product": "Wireless Noise-Canceling Headphones", "category": "Electronics", "cost": 50.00, "recommended_price": 79.99, "margin": "37.5%", "competitor_price": 85.00, "outcome": "High Sales Volume", "reasoning": "Priced slightly below market leader to capture share; high elasticity category."
    },
    {
        "product": "Winter Parka Jacket", "category": "Fashion", "cost": 45.00, "recommended_price": 129.99, "margin": "65.4%", "competitor_price": 140.00, "outcome": "Successful Season", "reasoning": "Seasonal demand allows for higher margins; competitive pricing still maintained."
    },
    {
        "product": "Ceramic Brake Pads", "category": "Automotive", "cost": 18.00, "recommended_price": 49.99, "margin": "64%", "competitor_price": 45.00, "outcome": "Stable Sales", "reasoning": "Low elasticity; customers prioritize quality/safety over small price differences."
    },
    {
        "product": "Organic Coffee Beans (1lb)", "category": "Groceries", "cost": 8.50, "recommended_price": 14.99, "margin": "43.3%", "competitor_price": 15.50, "outcome": "High Turnover", "reasoning": "Competitive staple product; slight undercut drives volume."
    },
    {
        "product": "Smart LED Bulb", "category": "Home & Garden", "cost": 6.00, "recommended_price": 12.99, "margin": "53.8%", "competitor_price": 14.00, "outcome": "Moderate Growth", "reasoning": "Entry-level smart home product; pricing for adoption."
    },
    {
        "product": "4K Gaming Monitor", "category": "Electronics", "cost": 200.00, "recommended_price": 299.99, "margin": "33.3%", "competitor_price": 320.00, "outcome": "Best Seller", "reasoning": "Aggressive pricing in competitive segment to clear inventory."
    },
    {
        "product": "Leather Handbag", "category": "Fashion", "cost": 80.00, "recommended_price": 250.00, "margin": "68%", "competitor_price": 280.00, "outcome": "High Profitability", "reasoning": "Premium positioning; margin prioritized over volume."
    }
]

# 4. Price recommendation guidelines
# HINT: Include rules for high elasticity, competitive positioning, etc.
pricing_guidelines = [
    # TODO: Add pricing guidelines and rules
    "For High Elasticity products (e.g., Electronics, Fashion), price within 5% of key competitors to maintain market share.",
    "For Low Elasticity products (e.g., Automotive parts, Essentials), prioritize margin; pricing 10-15% above competitors is acceptable if quality is differentiated.",
    "Ensure a minimum gross margin of 20% on all hardware products unless in clearance.",
    "Fashion items should target 50%+ margin to account for end-of-season markdowns.",
    "If competitor stock is low (stockout), increase price by 5-10% to capture demand surplus.",
    "Avoid price wars on commoditized goods; focus on value-add or bundle pricing instead."
]

print("Created knowledge base with:")
print(f"- {len(elasticity_benchmarks)} category elasticity benchmarks")
print(f"- {len(historical_margins)} historical margin references")
print(f"- {len(pricing_decisions)} past pricing decisions")
print(f"- {len(pricing_guidelines)} pricing guidelines")

Created knowledge base with:
- 5 category elasticity benchmarks
- 5 historical margin references
- 7 past pricing decisions
- 6 pricing guidelines
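
The margin figures in `pricing_decisions` appear to follow gross margin on selling price, that is (price - cost) / price. The sketch below checks stated margins against that formula, using two entries copied from the list above. `gross_margin_pct` and `margin_mismatches` are hypothetical helpers, not part of the assignment, and the 0.1-point tolerance is an assumption to absorb rounding.

```python
def gross_margin_pct(cost: float, price: float) -> float:
    """Gross margin as a percentage of the selling price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - cost) / price * 100


def margin_mismatches(decisions, tolerance=0.1):
    """Return (product, stated, computed) for entries whose stated margin is off."""
    mismatches = []
    for d in decisions:
        stated = float(d["margin"].rstrip("%"))
        computed = gross_margin_pct(d["cost"], d["recommended_price"])
        if abs(stated - computed) > tolerance:
            mismatches.append((d["product"], stated, round(computed, 1)))
    return mismatches


# Two entries copied from pricing_decisions (only the fields needed here)
samples = [
    {"product": "Wireless Noise-Canceling Headphones", "cost": 50.00,
     "recommended_price": 79.99, "margin": "37.5%"},
    {"product": "Leather Handbag", "cost": 80.00,
     "recommended_price": 250.00, "margin": "68%"},
]

print(margin_mismatches(samples))
```

Running the same check over the full list before indexing helps ensure the agent does not retrieve examples whose stated margin contradicts their own cost and price.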


## Create Vector Store for RAG

Convert our knowledge base into searchable documents and create embeddings.

In [9]:
# Reuse the Document class imported from langchain_core.documents above.
# Redefining it here as a dataclass would shadow the LangChain class.

def create_pricing_documents() -> List[Document]:
    """Convert pricing knowledge into searchable documents"""
    documents = []
    
    # TODO: Add elasticity benchmarks to documents
    # HINT: Create Document objects with page_content and metadata
    for benchmark in elasticity_benchmarks:
        # TODO: Format content and create Document
        content = f"Category: {benchmark['category']}\nElasticity: {benchmark['elasticity']}"
        metadata = {"type": "elasticity_benchmark", "category": benchmark['category']}
        documents.append(Document(page_content=content, metadata=metadata))
    
    # TODO: Add historical margins to documents
    for margin in historical_margins:
        # TODO: Format content and create Document
        content = f"Category: {margin['category']}\nAverage Margin: {margin['avg_margin']}"
        metadata = {"type": "historical_margin", "category": margin['category']}
        documents.append(Document(page_content=content, metadata=metadata))
    
    # TODO: Add pricing decisions to documents
    for decision in pricing_decisions:
        # TODO: Format content and create Document with all fields
        content = (
            f"Product: {decision['product']}\n"
            f"Category: {decision['category']}\n"
            f"Cost: {decision['cost']}\n"
            f"Recommended Price: {decision['recommended_price']}\n"
            f"Margin: {decision['margin']}\n"
            f"Competitor Price: {decision['competitor_price']}\n"
            f"Outcome: {decision['outcome']}\n"
            f"Reasoning: {decision['reasoning']}"
        )
        metadata = {"type": "pricing_decision", "category": decision['category'], "product": decision['product']}
        documents.append(Document(page_content=content, metadata=metadata))
    
    # TODO: Add pricing guidelines to documents
    for guideline in pricing_guidelines:
        # TODO: Format content and create Document
        content = guideline
        metadata = {"type": "pricing_guideline"}
        documents.append(Document(page_content=content, metadata=metadata))
    
    return documents

# Create documents
pricing_docs = create_pricing_documents()
print(f"Created {len(pricing_docs)} searchable documents")

Created 23 searchable documents


In [10]:
# TODO: Initialize embeddings and vector store
# HINT: Use SentenceTransformerEmbeddings with "all-MiniLM-L6-v2" model
print("Setting up embeddings...")
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

print("Creating vector store...")
# TODO: Create Chroma vector store from documents
# HINT: Use Chroma.from_documents() with persist_directory
vectorstore = Chroma.from_documents(documents=pricing_docs, embedding=embeddings, persist_directory="./chroma_db")

print("Vector store created successfully!")
print(f"Indexed {len(pricing_docs)} documents for retrieval")

Setting up embeddings...
Creating vector store...
Vector store created successfully!
Indexed 23 documents for retrieval
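
Conceptually, the vector store ranks documents by how close their embedding vectors are to the query's embedding. The sketch below illustrates only that ranking step, with word-count vectors standing in for the `all-MiniLM-L6-v2` embeddings. It is a toy stand-in chosen for illustration, not how Chroma or the embedding model work internally. `bow_vector`, `cosine_similarity`, and `top_k` are hypothetical helper names.

```python
import math
import re
from collections import Counter


def bow_vector(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, texts: list, k: int = 2) -> list:
    """Return the k texts most similar to the query, best first."""
    q = bow_vector(query)
    ranked = sorted(texts, key=lambda t: cosine_similarity(q, bow_vector(t)),
                    reverse=True)
    return ranked[:k]


# Flattened versions of three knowledge-base documents
texts = [
    "Category: Electronics Elasticity: High",
    "Category: Groceries Average Margin: 5-10%",
    "Category: Fashion Elasticity: High",
]

print(top_k("electronics elasticity", texts, k=1))
```

A real embedding model also matches on meaning rather than exact words, which is why a query such as "sneakers" can still surface Fashion documents in the actual vector store.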


## RAG-Enhanced Pricing Agent

Create our enhanced pricing agent that uses retrieval to ground its recommendations.

In [12]:
class RAGPricingAgent:
    def __init__(self, vectorstore, embeddings):
        # TODO: Initialize the LLM
        self.llm = ChatGroq(
            model="llama-3.1-8b-instant",
            temperature=0,
            max_tokens=1000,
            api_key=groq_api_key
        )
        # Store the vector store and embeddings
        self.vectorstore = vectorstore
        self.embeddings = embeddings
        
        # TODO: Create system message for RAG-enhanced agent
        self.system_message = SystemMessage(content="""
        You are a retail pricing agent. Keep your reasoning short and easy to follow.
        Ground every recommendation in the knowledge provided: elasticity benchmarks,
        historical margins, pricing guidelines, and prior pricing decisions.
        If the provided knowledge does not cover the product, say so instead of guessing.
        """)
        
    def retrieve_relevant_knowledge(self, query: str, k: int = 5):
        """Retrieve relevant pricing knowledge for the query"""
        # TODO: Implement retrieval logic
        # HINT: Use self.vectorstore.as_retriever()
        retriever = self.vectorstore.as_retriever(search_kwargs={"k": k})
        relevant_docs = retriever.invoke(query)
        return relevant_docs
    
    def format_retrieved_context(self, docs):
        """Format retrieved documents into context string"""
        # TODO: Format documents into a readable context string
        # HINT: Include document content and metadata type
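        context_parts = []
        for i, doc in enumerate(docs, 1):
            # Number each document and label it with its knowledge type
            doc_type = doc.metadata.get("type", "unknown")
            context_parts.append(f"[{i}] ({doc_type})\n{doc.page_content}")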
        
        return "\n\n".join(context_parts)
    
    def get_rag_price_recommendation(self, product_name, category, cost_price, 
                                   current_price=None, target_margin=None, 
                                   competitor_price=None, price_elasticity=None):
        """Get price recommendation using RAG"""
        
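        # Build the search query from the inputs that were provided
        search_query = f"{product_name} {category} pricing"
        if price_elasticity:
            search_query += f" {price_elasticity} elasticity"

        # Retrieve relevant knowledge and format it as prompt context
        relevant_docs = self.retrieve_relevant_knowledge(search_query)
        context = self.format_retrieved_context(relevant_docs)

        # Create the prompt with retrieved context; None means "not provided"
        prompt = f"""
        Recommend a retail price for this product.

        Product details:
        - Product: {product_name}
        - Category: {category}
        - Cost price: {cost_price}
        - Current price: {current_price}
        - Target margin (%): {target_margin}
        - Competitor price: {competitor_price}
        - Price elasticity: {price_elasticity}

        Retrieved historical data and benchmarks:
        {context}

        Give: (1) a recommended price, (2) the resulting margin, and
        (3) the retrieved items that support it. Treat None as not provided.
        """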
        
        # Get response from LLM
        messages = [self.system_message, HumanMessage(content=prompt)]
        response = self.llm.invoke(messages)
        
        return response.content

# Initialize the RAG pricing agent
print("Initializing RAG Pricing Agent...")
rag_agent = RAGPricingAgent(vectorstore, embeddings)
print("RAG Pricing Agent Ready!")

Initializing RAG Pricing Agent...
RAG Pricing Agent Ready!


## Testing the RAG-Enhanced Agent

Test our RAG agent with the assignment example.

In [None]:
# Test with the assignment example: Puma sneakers
print("Testing Assignment Example: Puma Sneakers")
print("=" * 60)

# result = rag_agent.get_rag_price_recommendation(
#     product_name="Puma Sneakers",
#     category="Footwear", 
#     cost_price=1800,  # Note: High cost from assignment
#     target_margin=30,
#     price_elasticity="High"
# )

# print(result)
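
As a sanity check on whatever the agent returns, the target margin implies a minimum price. The sketch below assumes the target is a gross margin on selling price (the convention the knowledge-base margins appear to follow), so the floor is cost / (1 - margin). `price_for_target_margin` is a hypothetical helper, not part of the assignment.

```python
def price_for_target_margin(cost_price: float, target_margin_pct: float) -> float:
    """Lowest price at which (price - cost) / price reaches the target margin."""
    if not 0 <= target_margin_pct < 100:
        raise ValueError("target_margin_pct must be in [0, 100)")
    return cost_price / (1 - target_margin_pct / 100)


# Puma Sneakers inputs from the test above: cost 1800, target margin 30%
floor_price = price_for_target_margin(1800, 30)
print(round(floor_price, 2))  # 2571.43
```

A recommendation below this floor cannot meet the stated target, whatever the retrieved examples suggest.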

In [None]:
# TODO: Compare RAG vs Baseline approaches
# Create a function that gets both RAG and prompt-only recommendations
# and shows the difference

def compare_rag_vs_baseline(product_name, category, cost_price, target_margin=None, 
                           competitor_price=None, price_elasticity=None):
    """Compare RAG recommendation with baseline prompt-only approach"""
    
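    # Get RAG recommendation (assumes rag_agent has been initialized above)
    rag_result = rag_agent.get_rag_price_recommendation(
        product_name, category, cost_price, target_margin=target_margin,
        competitor_price=competitor_price, price_elasticity=price_elasticity
    )

    # Get baseline recommendation (prompt-only, no retrieved context)
    baseline_prompt = (
        f"Recommend a retail price for {product_name} ({category}). "
        f"Cost price: {cost_price}. Target margin (%): {target_margin}. "
        f"Competitor price: {competitor_price}. Price elasticity: {price_elasticity}."
    )
    baseline_response = rag_agent.llm.invoke([HumanMessage(content=baseline_prompt)])

    # Format and return comparison
    return "\n".join([
        "RAG RECOMMENDATION", "-" * 30, rag_result, "",
        "BASELINE (PROMPT-ONLY)", "-" * 30, baseline_response.content,
    ])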

# Test comparison
print("COMPARISON: RAG vs Baseline Approaches")
print("=" * 70)

# comparison = compare_rag_vs_baseline(
#     "Puma Sneakers",
#     "Footwear",
#     1800,
#     target_margin=30,
#     price_elasticity="High"
# )

# print(comparison)
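
One rough way to make the comparison less subjective is to count how many numeric figures in a response also appear in the retrieved context. The sketch below is a crude proxy for grounding under that assumption, not an established hallucination metric. `extract_numbers` and `grounded_fraction` are hypothetical helpers, and the sample strings are made up for illustration.

```python
import re


def extract_numbers(text: str) -> set:
    """Return the distinct numeric tokens in text, as strings."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))


def grounded_fraction(response: str, context: str) -> float:
    """Share of numbers in the response that also occur in the context."""
    response_numbers = extract_numbers(response)
    if not response_numbers:
        return 1.0  # nothing numeric to verify
    context_numbers = extract_numbers(context)
    return len(response_numbers & context_numbers) / len(response_numbers)


# Illustrative strings only; real inputs would be the agent's context and replies
context = "Recommended Price: 79.99\nCompetitor Price: 85.0\nMargin: 37.5%"
grounded_reply = "Price at 79.99, just under the competitor at 85.0."
ungrounded_reply = "Price at 99.00 based on a typical 45% margin."

print(grounded_fraction(grounded_reply, context))
print(grounded_fraction(ungrounded_reply, context))
```

The match is on exact strings, so `85` and `85.0` count as different figures, and numbers the agent legitimately computes (such as a new margin) will score as ungrounded. Treat the score as a prompt for manual review rather than a verdict.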

## Additional Test Cases

Test with various scenarios to see how RAG improves recommendations.

In [None]:
# TODO: Test Case 1: Electronics (High Elasticity)
print("Test Case 1: Electronics Category")
print("=" * 40)

# TODO: Test electronics pricing with RAG agent
# electronics_result = TODO: Get recommendation for electronics product

# print(electronics_result)

In [None]:
# TODO: Test Case 2: Luxury Goods (Low Elasticity)
print("Test Case 2: Luxury Goods Category")
print("=" * 40)

# TODO: Test luxury goods pricing with RAG agent
# luxury_result = TODO: Get recommendation for luxury product

# print(luxury_result)

## Exploring the Knowledge Base

See what the RAG system retrieves for different queries.

In [None]:
def explore_knowledge_retrieval(query, k=3):
    """Show what gets retrieved for a given query"""
    print(f"Query: '{query}'")
    print("=" * 50)
    
    # Query the vector store directly for the k most similar documents
    docs = vectorstore.similarity_search(query, k=k)

    # Display each retrieved document with its knowledge type
    for i, doc in enumerate(docs, 1):
        print(f"\n[{i}] type={doc.metadata.get('type', 'unknown')}")
        print(doc.page_content)

    return docs

# Test different retrieval queries
# explore_knowledge_retrieval("high elasticity pricing strategy")
# print("\n" + "=" * 70 + "\n")
# explore_knowledge_retrieval("footwear sneakers margin")

## Your Turn: Expand the Knowledge Base

- **Exercise 1:** Add 5 more pricing decisions to the knowledge base
- **Exercise 2:** Add a new category with appropriate benchmarks
- **Exercise 3:** Test how the new data affects recommendations

In [None]:
# Exercise 1: Add your pricing decisions here
additional_pricing_decisions = [
    # TODO: Add 5 more pricing decision examples
    # Follow the format from the original pricing_decisions list
]

# Exercise 2: Add a new category
new_category_data = {
    "elasticity_benchmark": {
        # TODO: Add elasticity data for your new category
    },
    "historical_margins": {
        # TODO: Add margin data for your new category
    }
}

# Exercise 3: Test with new category
# TODO: Test your new category with the agent

print("Exercises completed! Test your enhancements.")
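
Before extending `pricing_decisions` for Exercise 1, it may help to check that each new entry carries the same fields as the original examples, since the document builder reads every one of them. A minimal sketch: `REQUIRED_FIELDS` mirrors the keys used in the list above, while `missing_fields` and the placeholder entry are hypothetical.

```python
# Keys used by every entry in the original pricing_decisions list
REQUIRED_FIELDS = {
    "product", "category", "cost", "recommended_price",
    "margin", "competitor_price", "outcome", "reasoning",
}


def missing_fields(decision: dict) -> set:
    """Return the required keys that a pricing decision does not define."""
    return REQUIRED_FIELDS - set(decision)


# Placeholder entry that is deliberately incomplete
incomplete_entry = {
    "product": "Example Product",
    "category": "Electronics",
    "cost": 10.0,
    "recommended_price": 15.0,
}

print(sorted(missing_fields(incomplete_entry)))
```

An entry that passes this check can be appended to `pricing_decisions`. The documents and vector store then need to be rebuilt for the new data to be retrievable.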