# Assignment 2: RAG-Enhanced Pricing Agent with Historical Knowledge

## Objective
Add factual grounding to our pricing agent using a **pricing knowledge base** with historical data, category benchmarks, and proven pricing strategies.

## Requirements
**RAG Knowledge Base Includes:**
- Category-level elasticity benchmarks
- Historical margins
- 20-50 sample past pricing decisions
- Competitor trends
- Price recommendation guidelines

**Agent Behavior:**
- Retrieves relevant past pricing examples from the knowledge base
- Justifies the recommended price using the retrieved data
- Reduces hallucination compared to Iteration 1

## Setup & Dependencies

Install the required packages for RAG implementation.

In [None]:
# Install required packages for RAG
!pip install -q langchain langchain-groq langchain-community langchain-text-splitters
!pip install -q chromadb sentence-transformers
!pip install -q pandas numpy

In [None]:
# Import required libraries
from langchain_groq import ChatGroq
from langchain_core.prompts import PromptTemplate
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import SentenceTransformerEmbeddings
from langchain_text_splitters import CharacterTextSplitter
from langchain_core.documents import Document
import os
import getpass
import pandas as pd
import json
from typing import List, Dict

In [None]:
# Set up your Groq API key
print("Please enter your Groq API key:")
print("(You can get one free at: https://console.groq.com/)")
groq_api_key = getpass.getpass("Groq API Key: ")
os.environ["GROQ_API_KEY"] = groq_api_key
print("API key set successfully!")

## Create Pricing Knowledge Base

Create a comprehensive pricing knowledge base with historical data and benchmarks.
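
Before filling in the lists, it can help to pin down the record shape and what `margin` means. The sketch below assumes the eight fields named in the hint for past pricing decisions and uses the one sample record this notebook provides. `REQUIRED_FIELDS`, `missing_fields`, `markup_pct`, and `margin_pct` are hypothetical helper names, not part of the assignment.

```python
# Field names taken from the HINT comment for past pricing decisions.
REQUIRED_FIELDS = (
    "product", "category", "cost", "recommended_price",
    "margin", "competitor_price", "outcome", "reasoning",
)

def missing_fields(record: dict) -> list:
    """Return the required fields that a pricing decision record lacks."""
    return [name for name in REQUIRED_FIELDS if name not in record]

def markup_pct(cost: float, price: float) -> float:
    """Profit as a percentage of cost."""
    return (price - cost) / cost * 100

def margin_pct(cost: float, price: float) -> float:
    """Profit as a percentage of selling price."""
    return (price - cost) / price * 100

# The sample record from this notebook.
sample = {
    "product": "Samsung Galaxy Smartphone",
    "category": "Electronics",
    "cost": 450,
    "recommended_price": 499,
    "margin": "11%",
    "competitor_price": 529,
    "outcome": "Successful - gained 15% market share",
    "reasoning": "Aggressive pricing in high elasticity category drove volume",
}

print(missing_fields(sample))          # []
print(round(markup_pct(450, 499), 1))  # 10.9
print(round(margin_pct(450, 499), 1))  # 9.8
```

The sample's `11%` matches profit over cost (10.9%) rather than profit over price (9.8%). Whichever definition you pick, apply it consistently across all records so retrieved examples are comparable.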

In [None]:
# TODO: Create pricing knowledge base data structures
# Create the following data structures:

# 1. Category-level elasticity benchmarks
# HINT: elasticity_benchmarks = [ {"category": "Electronics", "elasticity": "High", ...} ]
elasticity_benchmarks = [
    # TODO: Add at least 4 categories with elasticity data
]

# 2. Historical margin data by category  
# HINT: historical_margins = [ {"category": "Electronics", "avg_margin": "12-18%", ...} ]
historical_margins = [
    # TODO: Add margin data for each category
]

# 3. Sample past pricing decisions (20-50 examples)
# HINT: Include product, category, cost, recommended_price, margin, competitor_price, outcome, reasoning
pricing_decisions = [
    # TODO: Add more pricing decision examples (the requirements call for 20-50 in total)
    {
        "product": "Samsung Galaxy Smartphone",
        "category": "Electronics",
        "cost": 450,
        "recommended_price": 499,
        "margin": "11%",  # (499 - 450) / 450, i.e. profit as a share of cost
        "competitor_price": 529,
        "outcome": "Successful - gained 15% market share",
        "reasoning": "Aggressive pricing in high elasticity category drove volume"
    },
    # TODO: Add more examples here
]

# 4. Price recommendation guidelines
# HINT: Include rules for high elasticity, competitive positioning, etc.
pricing_guidelines = [
    # TODO: Add pricing guidelines and rules
]

print("Created knowledge base with:")
print(f"- {len(elasticity_benchmarks)} category elasticity benchmarks")
print(f"- {len(historical_margins)} historical margin references") 
print(f"- {len(pricing_decisions)} past pricing decisions")
print(f"- {len(pricing_guidelines)} pricing guidelines")

## Create Vector Store for RAG

Convert our knowledge base into searchable documents and create embeddings.
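
One way to do the conversion is to flatten each record into a sentence-like string for `page_content` and keep a few plain fields as metadata for filtering and display. The sketch below is a minimal illustration, not the required solution. `decision_to_text` and `decision_metadata` are hypothetical helpers, the `type` metadata key is an assumed convention, and the results would still need to be wrapped as `Document(page_content=..., metadata=...)` in the notebook.

```python
def decision_to_text(decision: dict) -> str:
    """Flatten one past pricing decision into retrievable text."""
    return (
        f"Past pricing decision for {decision['product']} "
        f"(category: {decision['category']}). "
        f"Cost: {decision['cost']}, recommended price: {decision['recommended_price']}, "
        f"margin: {decision['margin']}, competitor price: {decision['competitor_price']}. "
        f"Outcome: {decision['outcome']}. Reasoning: {decision['reasoning']}."
    )

def decision_metadata(decision: dict) -> dict:
    """Keep only scalar values; the 'type' key is an assumed convention."""
    return {
        "type": "pricing_decision",
        "category": decision["category"],
        "product": decision["product"],
    }

# The sample record from this notebook.
sample = {
    "product": "Samsung Galaxy Smartphone",
    "category": "Electronics",
    "cost": 450,
    "recommended_price": 499,
    "margin": "11%",
    "competitor_price": 529,
    "outcome": "Successful - gained 15% market share",
    "reasoning": "Aggressive pricing in high elasticity category drove volume",
}

text = decision_to_text(sample)
meta = decision_metadata(sample)
print(text)
print(meta)
```

Keeping metadata to strings and numbers is the cautious choice, since vector stores such as Chroma generally expect scalar metadata values.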

In [None]:
def create_pricing_documents() -> List[Document]:
    """Convert pricing knowledge into searchable documents"""
    documents = []
    
    # TODO: Add elasticity benchmarks to documents
    # HINT: Create Document objects with page_content and metadata
    for benchmark in elasticity_benchmarks:
        # TODO: Format content and create Document
        pass
    
    # TODO: Add historical margins to documents
    for margin in historical_margins:
        # TODO: Format content and create Document
        pass
    
    # TODO: Add pricing decisions to documents
    for decision in pricing_decisions:
        # TODO: Format content and create Document with all fields
        pass
    
    # TODO: Add pricing guidelines to documents
    for guideline in pricing_guidelines:
        # TODO: Format content and create Document
        pass
    
    return documents

# Create documents
pricing_docs = create_pricing_documents()
print(f"Created {len(pricing_docs)} searchable documents")

In [None]:
# TODO: Initialize embeddings and vector store
# HINT: Use SentenceTransformerEmbeddings with "all-MiniLM-L6-v2" model
print("Setting up embeddings...")
# embeddings = TODO: Initialize embeddings

print("Creating vector store...")
# TODO: Create Chroma vector store from documents
# HINT: Use Chroma.from_documents() with persist_directory
# vectorstore = TODO: Create vectorstore

print("Vector store created successfully!")
print(f"Indexed {len(pricing_docs)} documents for retrieval")

## RAG-Enhanced Pricing Agent

Create our enhanced pricing agent that uses retrieval to ground its recommendations.
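
The grounding step amounts to turning the retrieved documents into a numbered context block and placing it in the prompt ahead of the request. The sketch below shows that flow with a stand-in `RetrievedDoc` dataclass that mimics the `page_content` and `metadata` attributes of a LangChain `Document`. The helper names, the `type` metadata key, and the prompt wording are assumptions for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class RetrievedDoc:
    """Stand-in exposing the two Document attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_context(docs) -> str:
    """Number each document and label it with its metadata type."""
    parts = []
    for i, doc in enumerate(docs, 1):
        doc_type = doc.metadata.get("type", "unknown")
        parts.append(f"[{i}] ({doc_type}) {doc.page_content}")
    return "\n\n".join(parts)

def build_prompt(product_name, category, cost_price, context, target_margin=None) -> str:
    """Combine product details and retrieved context into one prompt."""
    lines = [
        f"Product: {product_name}",
        f"Category: {category}",
        f"Cost price: {cost_price}",
    ]
    if target_margin is not None:
        lines.append(f"Target margin: {target_margin}%")
    return (
        "Use only the retrieved knowledge below to justify your recommendation, "
        "citing entries by their [number].\n\n"
        f"Retrieved knowledge:\n{context}\n\n"
        + "\n".join(lines)
        + "\n\nRecommend a price and explain which entries support it."
    )

docs = [
    RetrievedDoc(
        "Aggressive pricing in high elasticity category drove volume",
        {"type": "pricing_decision"},
    )
]
context = format_context(docs)
prompt = build_prompt("Puma Sneakers", "Footwear", 1800, context, target_margin=30)
print(prompt)
```

Numbering the entries gives the model something concrete to cite, which makes it easier to check afterwards whether a justification really came from the knowledge base.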

In [None]:
class RAGPricingAgent:
    def __init__(self, vectorstore, embeddings):
        # TODO: Initialize the LLM
        # self.llm = TODO: Initialize ChatGroq
        
        # Store the vector store and embeddings
        self.vectorstore = vectorstore
        self.embeddings = embeddings
        
        # TODO: Create system message for RAG-enhanced agent
        self.system_message = SystemMessage(content="""
        # TODO: Write system message that explains the agent's RAG capabilities
        # Mention access to historical data, category benchmarks, etc.
        """)
        
    def retrieve_relevant_knowledge(self, query: str, k: int = 5):
        """Retrieve relevant pricing knowledge for the query"""
        relevant_docs = []  # placeholder so the skeleton runs; overwrite it below
        # TODO: Implement retrieval logic
        # HINT: Use self.vectorstore.as_retriever()
        # retriever = TODO: Create retriever
        # relevant_docs = TODO: Get relevant documents
        return relevant_docs
    
    def format_retrieved_context(self, docs):
        """Format retrieved documents into context string"""
        # TODO: Format documents into a readable context string
        # HINT: Include document content and metadata type
        context_parts = []
        for i, doc in enumerate(docs, 1):
            # TODO: Format each document
            pass
        
        return "\n\n".join(context_parts)
    
    def get_rag_price_recommendation(self, product_name, category, cost_price, 
                                   current_price=None, target_margin=None, 
                                   competitor_price=None, price_elasticity=None):
        """Get price recommendation using RAG"""
        
        # TODO: Create search query from inputs
        # search_query = TODO: Build search query
            
        # TODO: Retrieve relevant knowledge
        # relevant_docs = TODO: Get relevant docs
        # context = TODO: Format context
        
        # TODO: Create the prompt with retrieved context
        prompt = f"""
        # TODO: Create comprehensive prompt that includes:
        # - Product details
        # - Retrieved historical data and benchmarks
        # - Request for structured analysis
        """
        
        # Get response from LLM
        messages = [self.system_message, HumanMessage(content=prompt)]
        response = self.llm.invoke(messages)
        
        return response.content

# Initialize the RAG pricing agent
print("Initializing RAG Pricing Agent...")
# rag_agent = RAGPricingAgent(vectorstore, embeddings)
print("RAG Pricing Agent Ready!")

## Testing the RAG-Enhanced Agent

Test our RAG agent with the assignment example.
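
It helps to know what price the inputs imply before reading the model's answer. The sketch below works out the Puma example under two common readings of a 30% target. Which reading the assignment intends is not stated, so both are shown, and the function names are hypothetical.

```python
def price_for_margin(cost: float, margin_pct: float) -> float:
    """Price at which profit is margin_pct percent of the selling price."""
    return cost / (1 - margin_pct / 100)

def price_for_markup(cost: float, markup_pct: float) -> float:
    """Price at which profit is markup_pct percent of the cost."""
    return cost * (1 + markup_pct / 100)

print(round(price_for_margin(1800, 30), 2))  # 2571.43
print(round(price_for_markup(1800, 30), 2))  # 2340.0
```

A recommendation far from both figures is worth checking against the retrieved context before trusting it.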

In [None]:
# Test with the assignment example: Puma sneakers
print("Testing Assignment Example: Puma Sneakers")
print("=" * 60)

# result = rag_agent.get_rag_price_recommendation(
#     product_name="Puma Sneakers",
#     category="Footwear", 
#     cost_price=1800,  # Note: High cost from assignment
#     target_margin=30,
#     price_elasticity="High"
# )

# print(result)

In [None]:
# TODO: Compare RAG vs Baseline approaches
# Create a function that gets both RAG and prompt-only recommendations
# and shows the difference

def compare_rag_vs_baseline(product_name, category, cost_price, target_margin=None, 
                           competitor_price=None, price_elasticity=None):
    """Compare RAG recommendation with baseline prompt-only approach"""
    
    # TODO: Get RAG recommendation
    # rag_result = TODO: Get RAG recommendation
    
    # TODO: Get baseline recommendation (prompt-only)
    # baseline_prompt = TODO: Create simple prompt without RAG
    # baseline_response = TODO: Get baseline response
    
    # TODO: Format and return comparison
    return f"""
    # TODO: Format comparison output
    """

# Test comparison
print("COMPARISON: RAG vs Baseline Approaches")
print("=" * 70)

# comparison = compare_rag_vs_baseline(
#     "Puma Sneakers",
#     "Footwear",
#     1800,
#     target_margin=30,
#     price_elasticity="High"
# )

# print(comparison)

## Additional Test Cases

Test with various scenarios to see how RAG improves recommendations.
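
One rough way to judge whether RAG is improving a recommendation is to check that the figures in the answer can be traced to the inputs or the retrieved context. The sketch below is a crude heuristic under that assumption, not a hallucination metric. `ungrounded_numbers` is a hypothetical helper that compares numeric tokens as plain strings, so derived values such as a newly computed price will be flagged and need a manual look, and numbers written with thousands separators are split apart.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")

def ungrounded_numbers(response: str, context: str, inputs: str = "") -> list:
    """Numbers in the response that appear in neither the context nor the inputs."""
    known = set(NUMBER.findall(context)) | set(NUMBER.findall(inputs))
    flagged = []
    for token in NUMBER.findall(response):
        if token not in known and token not in flagged:
            flagged.append(token)
    return flagged

# Context built from the notebook's sample record; the response is a made-up string.
context = "Samsung Galaxy Smartphone: cost 450, recommended price 499, competitor price 529."
inputs = "cost 450"
response = "Price at 499, below the competitor's 529, for a 25% margin."

print(ungrounded_numbers(response, context, inputs))  # ['25']
```

Running the same check on the baseline and RAG answers for each test case gives a simple side-by-side signal to go with your own reading of the outputs.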

In [None]:
# TODO: Test Case 1: Electronics (High Elasticity)
print("Test Case 1: Electronics Category")
print("=" * 40)

# TODO: Test electronics pricing with RAG agent
# electronics_result = TODO: Get recommendation for electronics product

# print(electronics_result)

In [None]:
# TODO: Test Case 2: Luxury Goods (Low Elasticity)
print("Test Case 2: Luxury Goods Category")
print("=" * 40)

# TODO: Test luxury goods pricing with RAG agent
# luxury_result = TODO: Get recommendation for luxury product

# print(luxury_result)

## Exploring the Knowledge Base

See what the RAG system retrieves for different queries.
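
Conceptually, retrieval ranks every document by similarity to the query and returns the top `k`. The sketch below imitates that ranking with word-count cosine similarity from the standard library. This is a deliberate stand-in: the notebook's vector store compares sentence embeddings, which can also match paraphrases that share no words. `toy_retrieve` and the three sample strings are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> Counter:
    """Lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def toy_retrieve(query: str, texts: list, k: int = 3) -> list:
    """Return the k (score, text) pairs most similar to the query."""
    q = tokenize(query)
    scored = [(cosine(q, tokenize(t)), t) for t in texts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Placeholder documents, not real knowledge base entries.
texts = [
    "Guideline: in high elasticity categories, price below competitors to drive volume",
    "Historical margin reference for the footwear category",
    "Elasticity benchmark for the electronics category",
]

for score, text in toy_retrieve("footwear margin", texts, k=2):
    print(f"{score:.2f}  {text}")
```

Printing the score next to each hit, as here, is also a useful habit when exploring the real vector store, because it shows how sharply relevance drops after the first few results.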

In [None]:
def explore_knowledge_retrieval(query, k=3):
    """Show what gets retrieved for a given query"""
    print(f"Query: '{query}'")
    print("=" * 50)
    
    # TODO: Implement retrieval exploration
    # Get documents for the query and display them
    # docs = TODO: Get relevant documents
    
    # for i, doc in enumerate(docs, 1):
    #     # TODO: Display each retrieved document
    #     pass
    
    # return docs
    pass

# Test different retrieval queries
# explore_knowledge_retrieval("high elasticity pricing strategy")
# print("\n" + "=" * 70 + "\n")
# explore_knowledge_retrieval("footwear sneakers margin")

## Your Turn: Expand the Knowledge Base

- **Exercise 1:** Add 5 more pricing decisions to the knowledge base.
- **Exercise 2:** Add a new category with appropriate benchmarks.
- **Exercise 3:** Test how the new data affects recommendations.
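
For new records to influence retrieval, they have to be merged into the knowledge base and the documents and vector store rebuilt afterwards. The sketch below covers only the merge step. `merge_decisions` is a hypothetical helper that skips entries whose product and category already exist, and the second record is a placeholder rather than real data.

```python
def merge_decisions(existing: list, additional: list) -> list:
    """Return existing records plus any additional ones not already present."""
    seen = {(d["product"], d["category"]) for d in existing}
    merged = list(existing)
    for record in additional:
        key = (record["product"], record["category"])
        if key not in seen:
            merged.append(record)
            seen.add(key)
    return merged

existing = [{"product": "Samsung Galaxy Smartphone", "category": "Electronics"}]
additional = [
    {"product": "Samsung Galaxy Smartphone", "category": "Electronics"},  # duplicate, skipped
    {"product": "Example Product", "category": "Footwear"},  # placeholder record
]

merged = merge_decisions(existing, additional)
print(len(merged))  # 2
```

After merging, rerun `create_pricing_documents()` and recreate the vector store so the new records are indexed before you test.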

In [None]:
# Exercise 1: Add your pricing decisions here
additional_pricing_decisions = [
    # TODO: Add 5 more pricing decision examples
    # Follow the format from the original pricing_decisions list
]

# Exercise 2: Add a new category
new_category_data = {
    "elasticity_benchmark": {
        # TODO: Add elasticity data for your new category
    },
    "historical_margins": {
        # TODO: Add margin data for your new category
    }
}

# Exercise 3: Test with new category
# TODO: Test your new category with the agent

print("Exercises completed! Test your enhancements.")