### Faceted Product Search
This demo shows a product search scenario based on facets (topics) mined from Amazon reviews.<br/>

The demo applies topic modeling to a subset of the following dataset and indexes the result: <br/>
https://s3.amazonaws.com/amazon-reviews-pds/tsv/amazon_reviews_us_Electronics_v1_00.tsv.gz

The demo uses the metapy library to build, host, and query an inverted index: <br/>
[https://github.com/meta-toolkit/metapy](https://github.com/meta-toolkit/metapy)

In [43]:
import pandas as pd

import metapy
import pytoml
import re

### Build Index
- We index the mined topics rather than product titles, because the goal is faceted search instead of title search.
- Ranking uses metapy's built-in Okapi BM25 ranker with its default hyperparameters.

In [44]:
config_file = 'config.toml'
index = metapy.index.make_inverted_index(config_file)

# Use the BM25 ranker; hyperparameters are optional and fall back to defaults
ranker = metapy.index.OkapiBM25()
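
To make the ranking step more concrete, here is a minimal pure-Python sketch of one common BM25 variant. It is not metapy's implementation, which may differ in details such as the IDF form or query-term weighting. The function name `bm25_scores`, the toy documents, and the values `k1=1.2` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with a common BM25 variant."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # terms absent from the document contribute nothing
            idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


# Hypothetical toy "facet" documents, not taken from the dataset.
docs = [
    "durable charging cable".split(),
    "good battery life".split(),
    "cable broke quickly".split(),
]
scores = bm25_scores("durable cable".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

In this sketch `k1` controls how quickly repeated term occurrences saturate and `b` controls how strongly document length is normalized; a document that matches no query term scores zero.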

### Load Product metadata from sentiments feed

In [62]:
product_meta = pd.read_json('../../data/product_sentiments.product_aggregated.slim.json')

### Query Index
Issue a few queries against the index.

In [72]:
sample_query = 'durable charging cables'
#'music player with good battery'
#'best noise cancelling head phones'

#Build Index Query
index_query = metapy.index.Document()
index_query.content(sample_query)

In [73]:
# Query the index and get the top-ranked documents
top_docs = ranker.score(index, index_query, num_results=50)

# collect product_ids
result_productids = []
for d_id, _ in top_docs:
    content = index.metadata(d_id).get('content')
    if content is not None:
        product_id = re.split(r'\t+', content)[0]
        result_productids.append(product_id)
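
The ID extraction above can be tried in isolation. The sketch below assumes each indexed line starts with the product ID followed by one or more tabs and then the topic text, which is what the `re.split` call implies. The helper name `extract_product_ids` and the topic strings are hypothetical, and the deduplication step is an optional addition in case several facet documents map to the same product.

```python
import re


def extract_product_ids(contents):
    """Return the first tab-delimited field of each line, deduplicated in rank order."""
    seen = set()
    ids = []
    for content in contents:
        if not content:
            continue  # skip missing metadata, like the `is not None` guard
        product_id = re.split(r'\t+', content)[0].strip()
        if product_id and product_id not in seen:
            seen.add(product_id)
            ids.append(product_id)
    return ids


# Hypothetical metadata lines; only the ID-then-tab layout is assumed.
sample = [
    "B00CYXYC32\tcable durable charge",
    None,
    "B00O850BW0\t\tcord sync",
    "B00CYXYC32\tlightning cable",
]
ids = extract_product_ids(sample)
print(ids)
```

Keeping the first occurrence of each ID preserves the ranker's ordering, so the best-scoring facet decides where a product appears in the results.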

### Search Results for the given query

In [74]:
#Get product search results
pd.set_option('display.max_colwidth', None)  # None shows full column text; -1 is deprecated
search_results = pd.DataFrame(columns=['product_id','product_title'])

for product in result_productids:
    product_id = str(product)
    found_df = product_meta[product_meta['product_id'] == product_id][['product_id', 'product_title']]
    if found_df.shape[0] > 0:
        # pd.concat returns a new frame, so the result must be reassigned
        search_results = pd.concat([search_results, found_df])
    #print(found_df[['product_id', 'product_title']])

search_results

index,product_id,product_title
279,B00CYXYC32,"Apple USB Lightning Cable 6ft - iSmooth First Generation - Apple Lightning Compatible Cable Designed to Charge iPhone 5, iPad Mini, iPad 4th Generation"
429,B00O850BW0,"EMEMO® 3 pack! USB Lightning Cable for iPhone 6 5 5S 5C, iPad mini, iPod Nano (7th generation) iPod touch (5th Generation) - Compatible Charger Cord for Data and Syncing - 3ft Long!"
378,B00KBFZDI8,"Smart battery charger, iGrace NITECORE NEW i4 (2016 New Version) Universal Smart battery Charger for Li-ion/IMR/LiFeP04/Ni-MH/Ni-Cd with 12V Car Adapter and iGrace Battery Box"
384,B00KW2ZDHQ,Nitecore Charger with EASTSHINE EB182 Battery Box and Car Charger
162,B005ILYG5Q,Sanyo NEW 1500 eneloop AA Ni-MH Pre-Charged Rechargeable Batteries
350,B00IHT2AUE,EBL 4c 4d batteries with charger
252,B00A35KPQQ,"IOGEAR GearPower 11,000mAh Ultra Capacity Mobile Power Station for Smartphones/Tablets/Mobile Devices, White, GMP10K"
179,B006OSQALU,Tenergy Advanced Universal Charger TN190 (4 Channel AA/AAA/C/D/9V Ni-MH Charger with LCD Display and USB Power Outlet)
366,B00JHKSL1O,Panasonic eneloop pro NEW High Capacity Power Pack
142,B004NSUMTO,AmazonBasics Charger with International Plug Adapters (includes 4-AA Rechargeable Batteries)
