## Conversational Shopping Assistant with Reinforcement Learning

#### Objective

Build a conversational shopping assistant that:

- Recommends products based on user queries.
- Learns from user feedback over multiple conversation rounds.
- Adapts its recommendations dynamically to improve personalization.

#### Prerequisites
- Programming environment: Python 3.8+
- Libraries: transformers, Flask (or FastAPI), pandas, scikit-learn, numpy, gym, gradio (for the Step 5 demo).

#### Dataset

- Use the Amazon Product Dataset or a CSV file containing product details:
  - Columns: product_id, name, category, price, description.

#### Tools

- Pre-trained Language Models: For natural language understanding (NLU), e.g., Hugging Face’s DistilBERT.
- Content-Based Filtering: For initial product recommendations.
- RL Environment: Using gym to simulate the recommendation process.


#### Step 1: Define User Scenarios and Product Data

**Scenarios:**

- Product Recommendations: Input: "Recommend a laptop under $1000." Output: List of laptops within the price range.
- Product Details: Input: "Tell me about the Dell XPS 13." Output: Product description, price, and availability.

**Dataset:** A CSV file (`products.csv`) with the columns that the code in the following steps reads:

`product_name`, `category`, `price`, `description`, `availability`.
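
Before wiring up the assistant, it can help to confirm that the product file really has these columns. The following is a minimal sketch using only the standard library; the sample rows, `REQUIRED_COLUMNS`, and `load_products` are hypothetical names and data introduced for illustration, not part of the original notebook.

```python
import csv
import io

REQUIRED_COLUMNS = ["product_name", "category", "price", "description", "availability"]

# Hypothetical sample rows; real data would come from products.csv.
SAMPLE_CSV = """product_name,category,price,description,availability
Example Laptop A,laptop,899.00,13-inch ultrabook,in stock
Example Laptop B,laptop,549.99,15-inch budget laptop,in stock
Example Headphones C,headphones,199.99,Noise-cancelling headphones,out of stock
"""

def load_products(handle):
    """Read product rows and check that every required column is present."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"products file is missing columns: {missing}")
    rows = []
    for row in reader:
        row["price"] = float(row["price"])  # prices are compared numerically later
        rows.append(row)
    return rows

sample_products = load_products(io.StringIO(SAMPLE_CSV))
```

To check a real file, pass an open file handle instead, for example `load_products(open("products.csv", newline=""))`.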


#### Step 2: Build the Conversational Layer

**Intent Classification**

Use a zero-shot classifier (e.g., Hugging Face's `facebook/bart-large-mnli`) to map each user message to one of the intents `recommend`, `product_info`, or `greeting`.

**Entity Extraction**

Extract entities such as price range and category using keyword parsing, spaCy, or a prompted language model (as in `extract_entities_with_llm`).
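
The keyword-parsing option can be sketched with a regular expression and a small category vocabulary. This is an illustrative sketch under stated assumptions: `KNOWN_CATEGORIES` and `extract_entities_with_keywords` are hypothetical names, and the vocabulary would normally be derived from the product catalog.

```python
import re

# Hypothetical category vocabulary; in practice, derive it from the product catalog.
KNOWN_CATEGORIES = ["laptop", "phone", "headphones", "tablet"]

def extract_entities_with_keywords(user_input):
    """Extract a category and an upper price bound using keywords and a regex."""
    text = user_input.lower()

    # First known category that appears as a whole word (optionally plural).
    category = None
    for name in KNOWN_CATEGORIES:
        if re.search(rf"\b{re.escape(name)}s?\b", text):
            category = name
            break

    # A number following "under", "below", or "less than", with optional "$" and commas.
    price_range = None
    match = re.search(r"(?:under|below|less than)\s*\$?\s*(\d[\d,]*(?:\.\d+)?)", text)
    if match:
        price_range = float(match.group(1).replace(",", ""))

    return {"category": category, "price_range": price_range}
```

The function returns the same dictionary shape as the LLM-based extractor, so either one can feed the recommendation step. It only understands upper bounds phrased with "under", "below", or "less than"; other phrasings leave `price_range` as `None`.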

In [None]:
from transformers import pipeline

# Zero-shot intent classifier
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def classify_intent(user_input):
    intents = ["recommend", "product_info", "greeting"]
    result = classifier(user_input, intents)
    return result['labels'][0]  # Top intent

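import re

# Instruction-tuned text-to-text model from the Hugging Face Hub.
# Note: "gpt-3.5-turbo" is an OpenAI API model and cannot be loaded with pipeline().
# google/flan-t5-base is an assumed stand-in; swap in any instruction-following model.
llm_pipeline = pipeline("text2text-generation", model="google/flan-t5-base")

def extract_entities_with_llm(user_input):
    prompt = (
        "Extract the category and price range from the following text:\n"
        f'"{user_input}"\n'
        "Format your response as:\n"
        "Category: <category>\n"
        "Price Range: <price range>"
    )
    response = llm_pipeline(prompt, max_new_tokens=50)
    result = response[0]["generated_text"]

    # Parse results; the model may answer on one line or on two
    category = None
    price_range = None
    category_match = re.search(r"Category:\s*(.+?)\s*(?=Price Range:|\n|$)", result)
    if category_match:
        category = category_match.group(1).strip() or None
    price_match = re.search(r"Price Range:\s*([^\n]*)", result)
    if price_match:
        # Accept values such as "$1000", "under 1,000", or "500-1000" (upper bound is kept)
        numbers = re.findall(r"\d+(?:\.\d+)?", price_match.group(1).replace(",", ""))
        if numbers:
            value = float(numbers[-1])
            price_range = int(value) if value.is_integer() else value

    return {"category": category, "price_range": price_range}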


#### Step 3: Implement the Recommendation Engine Using an LLM

In [None]:
import pandas as pd
from transformers import pipeline

# Load product data
products = pd.read_csv("products.csv")

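# Optional LLM for phrasing recommendations in natural language (not used by the functions below).
# "gpt-3.5-turbo" is an OpenAI API model and cannot be loaded with pipeline();
# google/flan-t5-base is an assumed stand-in from the Hugging Face Hub.
llm = pipeline("text2text-generation", model="google/flan-t5-base")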

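def recommend_products(entities):
    category = entities.get("category")
    price_range = entities.get("price_range")

    # Filter products, skipping any constraint the user did not provide
    mask = pd.Series(True, index=products.index)
    if category:
        mask &= products["category"].str.lower() == category.lower()
    if price_range is not None:
        mask &= products["price"] <= price_range
    filtered_products = products[mask]

    if filtered_products.empty:
        return "No products found in the specified range."

    # Format recommendations (first five matches)
    recommendations = "\n".join(filtered_products["product_name"].head(5).tolist())
    label = f"{category}s" if category else "products"
    limit = f" under ${price_range:g}" if price_range is not None else ""
    return f"Here are some {label}{limit}:\n{recommendations}"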

def get_product_details(product_name):
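    # Literal, case-insensitive substring match; rows with a missing name never match
    product = products[products["product_name"].str.contains(product_name, case=False, regex=False, na=False)]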
    if product.empty:
        return "Product not found."
    
    details = product.iloc[0]
    return (f"Product: {details['product_name']}\n"
            f"Price: ${details['price']}\n"
            f"Description: {details['description']}\n"
            f"Availability: {details['availability']}")
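
The Tools section lists content-based filtering for initial recommendations, which the filter above does not cover. One simple form of it ranks products by the similarity between the user's query and each product's text. The sketch below uses term-count vectors and cosine similarity in plain Python; scikit-learn's `TfidfVectorizer` with `cosine_similarity` is a common alternative. The catalog rows and the function names `tokenize`, `cosine_similarity`, and `rank_by_description` are hypothetical and for illustration only.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_description(query, catalog, top_k=3):
    """Return up to top_k (product_name, score) pairs, most similar first."""
    query_vec = Counter(tokenize(query))
    scored = []
    for item in catalog:
        doc_vec = Counter(tokenize(item["product_name"] + " " + item["description"]))
        scored.append((item["product_name"], cosine_similarity(query_vec, doc_vec)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Drop products that share no terms with the query
    return [pair for pair in scored if pair[1] > 0][:top_k]

# Hypothetical catalog rows for illustration only.
catalog = [
    {"product_name": "Laptop A", "description": "lightweight laptop with long battery life"},
    {"product_name": "Laptop B", "description": "gaming laptop with dedicated graphics"},
    {"product_name": "Headphones C", "description": "wireless noise cancelling headphones"},
]
results = rank_by_description("lightweight laptop for travel", catalog)
```

Here `results` ranks "Laptop A" ahead of "Laptop B" and omits the headphones, because the query shares no terms with that product. Raw term counts treat every word equally; TF-IDF weighting would down-weight common words such as "with".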


#### Step 4: Integrate Reinforcement Learning for Multi-Turn Optimization

**RL Environment Setup**

Define the RL environment using gym. This environment simulates the recommendation process, where:

- State Space: User preferences, conversation history, previous recommendations.
- Action Space: Products available for recommendation.
- Reward Signal: User feedback (e.g., clicks, purchases).
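
To make the reward signal and state concrete, feedback events can be mapped to scalar rewards and the conversation encoded as a fixed-length vector. This is a sketch under assumptions: the event names, the numeric reward values, and the helpers `feedback_to_reward` and `encode_state` are hypothetical and should be tuned to the application.

```python
# Hypothetical reward values; tune them to reflect business priorities.
FEEDBACK_REWARDS = {"purchase": 1.0, "click": 0.5, "ignore": 0.0, "dismiss": -0.5}

def feedback_to_reward(event):
    """Map a user feedback event to a scalar reward (unknown events score 0)."""
    return FEEDBACK_REWARDS.get(event, 0.0)

def encode_state(recommended_ids, num_products):
    """Binary vector marking which product indices were already recommended."""
    state = [0.0] * num_products
    for idx in recommended_ids:
        state[idx] = 1.0
    return state
```

A binary "already recommended" vector captures only the previous recommendations; user preferences and conversation history would need additional features.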

In [None]:
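import gym
import numpy as np
from gym import spaces

class RecommendationEnv(gym.Env):
    """Toy environment: each action recommends one product by its row index.

    Uses the classic gym API: reset() returns the observation, step() returns a 4-tuple.
    """

    def __init__(self, max_turns=5):
        super().__init__()
        self.max_turns = max_turns
        self.action_space = spaces.Discrete(len(products))
        self.observation_space = spaces.Box(low=0, high=1, shape=(len(products),), dtype=np.float32)
        self.state = np.zeros(len(products), dtype=np.float32)
        self.turns = 0

    def reset(self):
        # State marks which products have been recommended so far in this conversation
        self.state = np.zeros(len(products), dtype=np.float32)
        self.turns = 0
        return self.state.copy()

    def step(self, action):
        # Simulate user feedback (placeholder): reward a product not yet shown,
        # give nothing for a repeat. Replace with real clicks or purchases.
        reward = 0 if self.state[action] else 1
        self.state[action] = 1.0
        self.turns += 1
        done = self.turns >= self.max_turns
        return self.state.copy(), reward, done, {}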
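
The environment defines states, actions, and rewards but no learner. As one possible starting point, the sketch below uses an epsilon-greedy bandit that keeps a running mean reward per product. This technique is an assumption, not something the original specifies, and it ignores the state entirely. `EpsilonGreedyRecommender`, `simulated_feedback`, and all numeric settings are hypothetical; the simulated user stands in for real click or purchase data.

```python
import random

class EpsilonGreedyRecommender:
    """Epsilon-greedy bandit that tracks a running mean reward per product."""

    def __init__(self, num_products, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * num_products
        self.values = [0.0] * num_products
        self.rng = random.Random(seed)

    def select_action(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, action, reward):
        # Incremental mean: new = old + (reward - old) / n
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

def simulated_feedback(action, rng):
    """Hypothetical user who likes product 2 most (click probability 0.8 vs 0.1)."""
    return 1.0 if rng.random() < (0.8 if action == 2 else 0.1) else 0.0

agent = EpsilonGreedyRecommender(num_products=4, epsilon=0.2, seed=0)
feedback_rng = random.Random(1)
for _ in range(2000):
    action = agent.select_action()
    agent.update(action, simulated_feedback(action, feedback_rng))
best_product = max(range(4), key=lambda i: agent.values[i])
```

To train against `RecommendationEnv` instead, call `env.step(action)` inside the loop and pass the returned reward to `agent.update`. A state-aware method such as Q-learning would be needed to exploit conversation history.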


#### Step 5: Build a Gradio Demo


In [None]:
import gradio as gr

def chat_with_ner(user_input, conversation_history):
    intent = classify_intent(user_input)

    if intent == "recommend":
        # LLM-based extraction; swap in an NER or keyword-based extractor if preferred
        entities = extract_entities_with_llm(user_input)
        response = recommend_products(entities)
    elif intent == "product_info":
        # Take the text after "about", then drop a leading "the" and trailing punctuation
        product_name = user_input.split("about")[-1].strip(" .?!")
        if product_name.lower().startswith("the "):
            product_name = product_name[4:]
        response = get_product_details(product_name)
    else:
        response = "Hello! How can I assist you today?"

    # Maintain conversation history
    conversation_history.append((user_input, response))
    return conversation_history, conversation_history

with gr.Blocks() as demo:
    gr.Markdown("### Conversational Shopping Assistant")
    
    chatbot = gr.Chatbot()
    user_input = gr.Textbox(placeholder="Type your query here...")
    state = gr.State([])

    user_input.submit(chat_with_ner, [user_input, state], [chatbot, state])

demo.launch()
