# Product Keyword Extraction Demo (Dialogflow Version)
This notebook previously demonstrated how to train and test a simple ML model for extracting product keywords from e-commerce queries.
**Note:** The project now uses Dialogflow for all intent detection and keyword extraction. The custom ML logic is deprecated and not used in production. You can use Dialogflow's built-in NLP and intent features for product search and recommendations.
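
Since the note above names Dialogflow as the replacement for the custom model, here is a minimal sketch of what a single detect-intent call might look like. It assumes a Dialogflow ES agent and the `google-cloud-dialogflow` Python client. The helper `detect_product`, the parameter name `product`, the intent name `product.search`, and the project and session IDs are all hypothetical placeholders, not taken from this project. A stub client with a canned response stands in for the real one so the sketch runs without credentials.

```python
from types import SimpleNamespace


def detect_product(client, project_id, session_id, text, language_code="en"):
    """Send one user utterance to a Dialogflow ES agent and return (intent, product)."""
    session = f"projects/{project_id}/agent/sessions/{session_id}"
    response = client.detect_intent(
        request={
            "session": session,
            "query_input": {"text": {"text": text, "language_code": language_code}},
        }
    )
    result = response.query_result
    # "product" is a hypothetical parameter name; use whatever your intent defines.
    return result.intent.display_name, result.parameters.get("product")


class StubSessionsClient:
    """Offline stand-in with a canned reply; not part of any library."""

    def detect_intent(self, request):
        query_result = SimpleNamespace(
            intent=SimpleNamespace(display_name="product.search"),  # placeholder
            parameters={"product": "running shoes"},  # placeholder
        )
        return SimpleNamespace(query_result=query_result)


# With the real library (assumed installed and authenticated) you would instead use:
#   from google.cloud import dialogflow
#   client = dialogflow.SessionsClient()
client = StubSessionsClient()
print(detect_product(client, "my-project", "session-1", "Need running shoes"))
```

Passing the client in as an argument keeps the helper easy to exercise offline; in a backend you would create one `SessionsClient` and reuse it across requests.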

## 1. Set Up Project Structure with ml_service Folder

We separate all ML logic into the `ml_service` folder. This keeps the notebook clean and allows you to reuse the ML code in your backend or other scripts.

- `ml_service/keyword_extractor.py`: Contains the keyword extraction model and helper functions.
- This notebook: For training, testing, and demo purposes only.

**Keeping the ML code in its own module makes it easier to maintain and to reuse outside this notebook.**

In [1]:
# 2. Import Required Libraries
import pandas as pd
from ml_service.keyword_extractor import train_keyword_extractor, test_keyword_extractor

In [2]:
# 3. Load and Preprocess Data
# Create a small hand-written sample dataset of e-commerce product names and categories
products = [
    {'name': 'Red Dress', 'category': 'Fashion'},
    {'name': 'Blue Jeans', 'category': 'Fashion'},
    {'name': 'Wireless Earbuds', 'category': 'Electronics'},
    {'name': 'Smartphone', 'category': 'Electronics'},
    {'name': 'Coffee Maker', 'category': 'Home Appliances'},
    {'name': 'Yoga Mat', 'category': 'Fitness'},
    {'name': 'Running Shoes', 'category': 'Footwear'},
    {'name': 'Leather Wallet', 'category': 'Accessories'},
    {'name': 'Sunglasses', 'category': 'Accessories'},
    {'name': 'Bluetooth Speaker', 'category': 'Electronics'}
]
df = pd.DataFrame(products)
product_names = df['name'].tolist()
df

| | name | category |
|---|---|---|
| 0 | Red Dress | Fashion |
| 1 | Blue Jeans | Fashion |
| 2 | Wireless Earbuds | Electronics |
| 3 | Smartphone | Electronics |
| 4 | Coffee Maker | Home Appliances |
| 5 | Yoga Mat | Fitness |
| 6 | Running Shoes | Footwear |
| 7 | Leather Wallet | Accessories |
| 8 | Sunglasses | Accessories |
| 9 | Bluetooth Speaker | Electronics |


## 4. Define and Save ML Model in ml_service

The ML model and helper functions are defined in `ml_service/keyword_extractor.py`. This file contains a simple TF-IDF + cosine similarity based extractor for product keywords. You can extend this with more advanced NLP or ML models as needed.
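
The contents of `ml_service/keyword_extractor.py` are not shown in this notebook, so the following is only an illustrative sketch of the TF-IDF plus cosine similarity idea described above, written with the standard library alone. The function names (`tokenize`, `fit_idf`, `tfidf_vector`, `best_match`) are hypothetical and are not the module's API. The weighting choices (smoothed IDF, L2 normalisation, tokens of two or more characters) follow the defaults of scikit-learn's `TfidfVectorizer`, which is an assumption about how the real module might behave.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"\b\w\w+\b")  # words of two or more characters


def tokenize(text):
    return TOKEN_RE.findall(text.lower())


def fit_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(tokenize(doc)))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """L2-normalised TF-IDF weights as {term: weight}; unknown terms are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def cosine(a, b):
    # Both vectors are unit length (or empty), so the dot product is the cosine.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def best_match(query, names, idf):
    """Return the (name, score) pair with the highest cosine similarity to the query."""
    query_vec = tfidf_vector(query, idf)
    scored = [(name, cosine(query_vec, tfidf_vector(name, idf))) for name in names]
    return max(scored, key=lambda pair: pair[1])


names = ["Red Dress", "Blue Jeans", "Wireless Earbuds", "Coffee Maker", "Running Shoes"]
idf = fit_idf(names)
name, score = best_match("I want wireless headphones", names, idf)
print(name, round(score, 2))  # Wireless Earbuds 0.71
```

Only `wireless` is shared between the query and the product name, and it is one of two equally weighted terms in `Wireless Earbuds`, which gives a cosine of 1/√2 ≈ 0.71.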

In [3]:
# 5. Train the Model Using ml_service
extractor = train_keyword_extractor(product_names)
print('Model trained on product names.')

Model trained on product names.


In [4]:
# 6. Test/Evaluate the Model Using ml_service
sample_queries = [
    'Show me some dresses',
    'I want wireless headphones',
    'Buy a coffee machine',
    'Need running shoes',
    'Looking for sunglasses',
    'Order a yoga mat',
    'Find a leather wallet',
    'Bluetooth music box',
    'Smart mobile phone',
    'Blue pants'
]
results = test_keyword_extractor(extractor, sample_queries)
for query, matches in results.items():
    print(f"Query: {query}")
    for name, score in matches:
        print(f"  Match: {name} (score: {score:.2f})")
    print()

Query: Show me some dresses
  Match: Bluetooth Speaker (score: 0.00)

Query: I want wireless headphones
  Match: Wireless Earbuds (score: 0.71)

Query: Buy a coffee machine
  Match: Coffee Maker (score: 0.71)

Query: Need running shoes
  Match: Running Shoes (score: 1.00)

Query: Looking for sunglasses
  Match: Sunglasses (score: 1.00)

Query: Order a yoga mat
  Match: Yoga Mat (score: 1.00)

Query: Find a leather wallet
  Match: Leather Wallet (score: 1.00)

Query: Bluetooth music box
  Match: Bluetooth Speaker (score: 0.71)

Query: Smart mobile phone
  Match: Bluetooth Speaker (score: 0.00)

Query: Blue pants
  Match: Blue Jeans (score: 0.71)
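
Two of the queries above (`Show me some dresses` and `Smart mobile phone`) come back with a score of 0.00. With TF-IDF and cosine similarity, a zero score means the query shares no vocabulary term with the product name (for example, `dresses` is a different token from `dress` unless stemming is applied), so the product listed for those queries is presumably just a tie-break among all-zero scores rather than a real match. One way to avoid surfacing such results is to drop matches below a minimum score. The sketch below assumes the `{query: [(name, score), ...]}` shape used by the printing loop; the helper `filter_matches` and the threshold of 0.3 are hypothetical choices for illustration.

```python
def filter_matches(results, min_score=0.3):
    """Drop matches below min_score; queries with no surviving match map to []."""
    return {
        query: [(name, score) for name, score in matches if score >= min_score]
        for query, matches in results.items()
    }


# Two entries copied from the output above, in the shape the printing loop expects.
results = {
    "Show me some dresses": [("Bluetooth Speaker", 0.0)],
    "Blue pants": [("Blue Jeans", 0.71)],
}
for query, matches in filter_matches(results).items():
    print(query, "->", matches or "no confident match")
```

The threshold is a tuning knob: a higher value trades fewer spurious matches for more queries with no answer.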

