## Conversational Shopping Assistant with Reinforcement Learning

#### Objective

Build a conversational shopping assistant that:

- Recommends products based on user queries.
- Learns from user feedback over multiple conversation rounds.
- Adapts its recommendations dynamically to improve personalization.

#### Prerequisites

- Programming environment: Python 3.8+
- Libraries: transformers, Flask (or FastAPI), pandas, scikit-learn, numpy, gym

#### Dataset

- Use the Amazon Product Dataset or a CSV file containing product details.
  - Columns: product_id, name, category, price, description.
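
A minimal sketch of loading such a CSV with pandas and checking the expected columns. The sample rows, prices, and the `load_products` helper are hypothetical stand-ins, not part of any real dataset.

```python
import io

import pandas as pd

REQUIRED_COLUMNS = ["product_id", "name", "category", "price", "description"]

# Hypothetical sample rows standing in for a real products CSV file.
SAMPLE_CSV = """product_id,name,category,price,description
1,Dell XPS 13,laptop,999.00,13-inch ultrabook with long battery life
2,Acer Aspire 5,laptop,549.99,Budget 15-inch laptop for everyday work
3,Sony WH-1000XM4,headphones,348.00,Wireless noise-cancelling headphones
"""

def load_products(source):
    """Load the catalogue and check that the expected columns are present."""
    df = pd.read_csv(source)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    # Coerce price to numeric and drop rows where it cannot be parsed.
    df["price"] = pd.to_numeric(df["price"], errors="coerce")
    df = df.dropna(subset=["price"]).reset_index(drop=True)
    # Normalise category so later filters can compare lowercase strings.
    df["category"] = df["category"].str.strip().str.lower()
    return df

products = load_products(io.StringIO(SAMPLE_CSV))
print(products[["name", "category", "price"]])
```

Passing a file path instead of the `io.StringIO` buffer works the same way, since `pd.read_csv` accepts both.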

#### Tools

- Pre-trained Language Models: For natural language understanding (NLU), e.g., Hugging Face’s DistilBERT.
- Content-Based Filtering: For initial product recommendations.
- RL Environment: Using gym to simulate the recommendation process.


#### Define User Scenarios and Dialogue Flow

**Scenarios:**

- Product Recommendations: Input: "Recommend a laptop under $1000." Output: List of laptops within the price range.
- Product Details: Input: "Tell me about the Dell XPS 13." Output: Product description, price, and availability.

**Dialogue Flow:**

- Recognize user intent: recommend, product_info, or greeting.
- Extract entities (e.g., category, price range).
- Fetch and respond with relevant data.
- Maintain conversation state across turns.
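
One way to maintain conversation state across turns is a small per-user object that remembers the last known entities. The sketch below is only illustrative: the `ConversationState` class and its fields are assumptions, not part of the original design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class ConversationState:
    """Hypothetical per-user state carried across turns."""
    category: Optional[str] = None
    max_price: Optional[float] = None
    history: List[Dict[str, str]] = field(default_factory=list)

    def update(self, user_input, intent, entities):
        """Record the turn and merge newly extracted entities.

        Entities missing from this turn keep their previous value, so a
        follow-up such as "under $800" still applies to the earlier category.
        """
        self.history.append({"user": user_input, "intent": intent})
        if entities.get("category") is not None:
            self.category = entities["category"]
        if entities.get("max_price") is not None:
            self.max_price = entities["max_price"]

state = ConversationState()
state.update("Recommend a laptop", "recommend", {"category": "laptop"})
state.update("Something under $800", "recommend", {"max_price": 800.0})
print(state.category, state.max_price, len(state.history))
```

Keeping the state separate from intent classification and entity extraction means each piece can be replaced independently.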


#### Build the Conversational Layer

**Intent Classification**

Use a zero-shot classifier (e.g., `facebook/bart-large-mnli`, a BART model fine-tuned on MNLI and available through Hugging Face's `transformers`) to identify user intent.

**Entity Extraction**

Extract entities like price range and category using keyword parsing or spaCy.

In [None]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
intents = ["recommend", "product_info", "greeting"]

def classify_intent(user_input):
    # The pipeline returns labels sorted by score, highest first.
    result = classifier(user_input, intents)
    return result['labels'][0]

# Example usage
user_input = "Can you recommend a laptop under $1000?"
print(classify_intent(user_input))  # Expected top label: "recommend"
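
The keyword-parsing approach to entity extraction described above can be sketched with the standard library alone. The category list and the `extract_entities` helper are hypothetical; a real system would likely derive categories from the catalogue or use spaCy.

```python
import re

# Hypothetical category keywords; a real system would derive these from the
# catalogue's category column.
KNOWN_CATEGORIES = ["laptop", "phone", "headphones", "tablet"]

# Matches "$1000", "$1,000" or "$999.99".
PRICE_PATTERN = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)")

def extract_entities(user_input):
    """Return the category and maximum price mentioned in the text, if any."""
    text = user_input.lower()
    # Word boundaries stop "phone" from matching inside "headphones";
    # the optional "s" accepts simple plurals such as "laptops".
    category = next(
        (c for c in KNOWN_CATEGORIES
         if re.search(rf"\b{re.escape(c)}s?\b", text)),
        None,
    )
    match = PRICE_PATTERN.search(text)
    max_price = float(match.group(1).replace(",", "")) if match else None
    return {"category": category, "max_price": max_price}

print(extract_entities("Can you recommend a laptop under $1000?"))
```

The regular expression tolerates trailing punctuation after the amount, which a plain `int(text.split('$')[-1])` would not.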


#### Implement the Recommendation Engine

Use content-based filtering to provide initial recommendations.
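
The backend code later in this notebook calls `recommend_products(category, max_price)`. A minimal sketch of such a function, assuming TF-IDF over product descriptions as the content-based signal, might look like the following. The catalogue rows, prices, and the extra `query` and `top_n` parameters are hypothetical.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical catalogue; in practice this is the loaded product DataFrame.
products = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "name": ["Dell XPS 13", "Acer Aspire 5", "Lenovo Legion 5", "Sony WH-1000XM4"],
    "category": ["laptop", "laptop", "laptop", "headphones"],
    "price": [999.0, 549.99, 1299.0, 348.0],
    "description": [
        "Lightweight 13-inch ultrabook with long battery life",
        "Budget 15-inch laptop for everyday office work",
        "Gaming laptop with dedicated graphics card",
        "Wireless noise-cancelling over-ear headphones",
    ],
})

# Fit TF-IDF once on the product descriptions.
vectorizer = TfidfVectorizer(stop_words="english")
description_matrix = vectorizer.fit_transform(products["description"])

def recommend_products(category=None, max_price=None, query=None, top_n=3):
    """Filter by category and price, then rank by similarity to the query."""
    mask = pd.Series(True, index=products.index)
    if category is not None:
        mask &= products["category"].str.lower() == category.lower()
    if max_price is not None:
        mask &= products["price"] <= max_price
    candidates = products[mask]
    if candidates.empty:
        return candidates
    if query:
        # Compare the query against the descriptions of the remaining rows.
        query_vec = vectorizer.transform([query])
        rows = candidates.index.to_numpy()
        scores = cosine_similarity(query_vec, description_matrix[rows]).ravel()
        candidates = candidates.assign(score=scores).sort_values(
            "score", ascending=False
        )
    else:
        # Without a query there is nothing to compare against, so fall
        # back to cheapest-first.
        candidates = candidates.sort_values("price")
    return candidates.head(top_n)

print(recommend_products("laptop", 1000, query="lightweight ultrabook")[["name", "price"]])
```

Filtering before ranking keeps hard constraints (category, budget) separate from the softer similarity score.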

#### Integrate Reinforcement Learning for Multi-Turn Optimization

**RL Environment Setup**

Define the RL environment using gym. This environment simulates the recommendation process, where:

- State Space: User preferences, conversation history, previous recommendations.
- Action Space: Products available for recommendation.
- Reward Signal: User feedback (e.g., clicks, purchases).

In [None]:
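import gym
from gym import spaces

class RecommendationEnv(gym.Env):
    """Toy environment: the state is the index of the product the simulated user wants."""

    def __init__(self, products):
        super().__init__()
        self.products = products
        self.current_state = 0
        self.action_space = spaces.Discrete(len(products))
        self.observation_space = spaces.Discrete(len(products))

    def reset(self):
        self.current_state = 0
        # Return the integer state so it matches the Discrete observation space.
        return self.current_state

    def step(self, action):
        # Simulated feedback: reward 1 when the recommended product matches the state.
        reward = 1 if action == self.current_state else 0
        self.current_state = (self.current_state + 1) % len(self.products)
        # Classic gym 4-tuple (observation, reward, done, info);
        # gym >= 0.26 uses a 5-tuple with terminated/truncated instead.
        return self.current_state, reward, False, {}

# `products` is the product DataFrame (columns as listed under Dataset).
env = RecommendationEnv(products)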
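
The environment above simulates feedback with a fixed rule. With real users, the reward signal (clicks, purchases) has to be mapped onto numbers first. The sketch below shows one possible mapping; the event names and reward values are assumptions to be tuned, not measured quantities.

```python
# Hypothetical reward values; tune them to the business objective.
FEEDBACK_REWARDS = {
    "purchase": 1.0,
    "click": 0.2,
    "ignore": 0.0,
    "dismiss": -0.2,
}

def reward_from_feedback(events):
    """Return the highest-valued known signal observed for one recommendation.

    Unknown event names are skipped; no known events yields a neutral 0.0.
    """
    known = [FEEDBACK_REWARDS[e] for e in events if e in FEEDBACK_REWARDS]
    return max(known) if known else 0.0

print(reward_from_feedback(["click", "purchase"]))
print(reward_from_feedback(["dismiss"]))
```

Taking the maximum means a purchase is not diluted by the click that preceded it.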


**RL Agent**

Use Q-learning to optimize recommendations over multiple interactions.

In [None]:
import numpy as np

# Rows are states, columns are actions (product indices).
Q_table = np.zeros((len(products), len(products)))

def q_learning(state, action, reward, next_state, learning_rate=0.1, discount_factor=0.9):
    # Bootstrap from the best action available in the next state, not the current one.
    future_rewards = np.max(Q_table[next_state])
    Q_table[state, action] = Q_table[state, action] + learning_rate * (
        reward + discount_factor * future_rewards - Q_table[state, action]
    )

# Example interaction
state = 0
action = 1  # Recommended product index
reward = 1  # Positive user feedback
next_state = 1  # State returned by env.step(action)
q_learning(state, action, reward, next_state)
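
To optimize over multiple interactions, the single update above is typically run inside a loop with some exploration. The following is a self-contained sketch with epsilon-greedy action selection. The simulated user in `simulate_step`, the number of products, and the hyperparameters are all hypothetical and chosen only to make the example small.

```python
import numpy as np

rng = np.random.default_rng(0)
n_products = 5

def simulate_step(state, action):
    """Hypothetical user: in state s, only product (s + 1) % n is liked."""
    reward = 1.0 if action == (state + 1) % n_products else 0.0
    next_state = (state + 1) % n_products
    return next_state, reward

Q = np.zeros((n_products, n_products))
learning_rate, discount_factor, epsilon = 0.1, 0.9, 0.3

state = 0
for _ in range(20000):
    # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
    if rng.random() < epsilon:
        action = int(rng.integers(n_products))
    else:
        action = int(np.argmax(Q[state]))
    next_state, reward = simulate_step(state, action)
    # Q-learning update: bootstrap from the best action in the next state.
    td_target = reward + discount_factor * np.max(Q[next_state])
    Q[state, action] += learning_rate * (td_target - Q[state, action])
    state = next_state

# Greedy policy: the product the agent would recommend in each state.
greedy_policy = np.argmax(Q, axis=1)
print(greedy_policy)
```

Without the exploration branch the agent would keep recommending whichever product it tried first, since every Q-value starts at zero.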


#### Backend Integration

Use Flask or FastAPI to handle user inputs and manage multi-turn conversations.

In [None]:
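import re

from flask import Flask, request, jsonify

app = Flask(__name__)

# Assumes classify_intent, recommend_products and the `products` DataFrame
# are defined in earlier cells.

@app.route('/chat', methods=['POST'])
def chat():
    # Tolerate a missing or malformed JSON body.
    payload = request.get_json(silent=True) or {}
    user_input = payload.get('message', '')
    intent = classify_intent(user_input)
    if intent == "recommend":
        category = "laptop" if "laptop" in user_input.lower() else None
        # Parse the first dollar amount, tolerating commas and trailing punctuation.
        price_match = re.search(r'\$\s*(\d[\d,]*(?:\.\d+)?)', user_input)
        max_price = float(price_match.group(1).replace(',', '')) if price_match else None
        recommendations = recommend_products(category, max_price)
        return jsonify(recommendations.to_dict(orient='records'))
    elif intent == "product_info":
        # Take the text after "about"; drop surrounding punctuation and a leading "the".
        product_name = user_input.split("about")[-1].strip(" ?.!")
        product_name = re.sub(r'^the\s+', '', product_name, flags=re.IGNORECASE)
        details = []
        if product_name:
            # regex=False treats the name literally, so characters like "+" are safe.
            matches = products[products['name'].str.contains(product_name, case=False, regex=False)]
            details = matches.to_dict(orient='records')
        return jsonify(details[0] if details else {"response": "Product not found."})
    else:
        return jsonify({"response": "Hello! How can I assist you?"})

if __name__ == '__main__':
    app.run(debug=True)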
