# E-Commerce Recommendation Workflow Demo

## Introduction

This is a very simple demonstration of a natural language enhanced personalized shopping experience. We will focus on how we can apply Large Language Models (LLMs) to enhance a recommender engine workflow for a e-commerce application. 

This is not a demonstration of end-to-end recommendation capability with LLMs. Rather this focuses on narrow application areas where user experience is enhanced by introducing a LLM workflow by enabling the user to ask queries in natural language to find most similar products based on reviews, summarize QnA threads for a product, and compare summarized reviews between two products.

The workflow is divided in these major sections:
* Dataset description
* Data processing modules
* Recommender Workflow
    * Generate Embeddings of catalogues
    * LLMs taking user queries



### Aggregate all imports needed for Project

In [1]:
import os
import json
import pandas as pd
import cohere
from llama_index.node_parser import SimpleNodeParser
from llama_index import (Document, 
                         VectorStoreIndex, 
                         ServiceContext, 
                         LLMPredictor, 
                         StorageContext, 
                         get_response_synthesizer,)
from llama_index.embeddings.cohereai import CohereEmbedding
from llama_index.indices.postprocessor import CohereRerank
# Finetuner
from llama_index.finetuning import CohereRerankerFinetuneEngine
from llama_index.finetuning.rerankers.dataset_gen import CohereRerankerFinetuneDataset
from langchain.llms import Cohere
import chromadb
from llama_index.vector_stores import ChromaVectorStore
# from llama_index.retrievers import VectorIndexRetriever
# from llama_index.query_engine import RetrieverQueryEngine
from sklearn.model_selection import train_test_split
from IPython.display import Markdown, display

### Some basic API Key setups

In [2]:
def load_credentials(pth):
    """
    Loads API credential keys of different services 
    :param pth: Path to credentials file
    :return: Dictionary of API credentials
    """
    cred_dict = {}
    with open(pth, 'r') as f:
        cred_dict = json.load(f)
    return cred_dict

In [3]:
parent_dir = os.path.dirname(os.getcwd())

In [4]:
cred_pth = os.path.join(parent_dir, 'cred', 'credentials.json')
cred_dict = load_credentials(cred_pth)
cohere_api_key = cred_dict.get("COHERE_API_KEY", '')
# openai_api_key = cred_dict.get("OPENAI_API_KEY", '')

os.environ["COHERE_API_KEY"] = cohere_api_key
# os.environ["OPENAI_API_KEY"] = openai_api_key

## Dataset Description

### WANDS - Wayfair Annotation Dataset

WANDS is a Wayfair product search relevance dataset that is published as a companion to the paper from ECIR 2022:

> WANDS: Dataset for Product Search Relevance Assessment  
> Yan Chen, Shujian Liu, Zheng Liu, Weiyi Sun, Linas Baltrunas and Benjamin Schroeder

The dataset allows objective benchmarking and evaluation of search engines on an E-Commerce dataset. Key features of this dataset includes:

1. 42,994 candidate products
2. 480 queries
3. 233,448 (query,product) relevance judgements

We can leverage this dataset to demonstrate the capability of LLMs in enhancing recommendations by surfacing the most relevant products through ranking or re-ranking to improve quality of recommendations. 

Some of the key aspects related to curation of this dataset:

> Specifically, we segmented search queries among several dimensions that are key indicators of customer behavior, such as:

> * On-site organic searches as compared to marketing-redirected searches
> * Searches that resulted in customer engagement (e.g., added products to cart) versus searches that didn’t result in customer engagement 
> * Product popularity over the past two years  

## High Level Demo: Run search and get response in Natural Language

The following section aggregates everything done in the exploration side of the notebook to deliver a query engine that is capable of retrievining the candidate products and geenrating a summarized recommendation given a query. If there are no candidate products that match the given descriptions, you will currently not get any useful recommendation. 

The overall workflow from ingesting data, defining the components and running short experiments will feature below this section for those who are interested in reviewing the deeper details. 

In [5]:
# Now we'll set up the cohere client.
co_client = cohere.Client(os.environ["COHERE_API_KEY"])
wands_embed_model = "embed-english-v3.0" # embed-english-light-v3.0
wands_embed_input_type = "search_document"
wands_embeds_query_type = "search_query"

# Parameterize Cohere configurations
wands_embed_model = "embed-english-v3.0" #1024 dimension embedding
wands_embed_input_type = "search_document" 
llm_model = "command" #Generation LLM
max_tokens = 256
temperature = 0.5 #0
rerank_finetuned_model_name = 'cohere_rerank_wands_chair' #'cohere_rerank_wands"
rerank_model_type = "RERANK"
rerank_base_model = 'english'
top_n_rerank = 5
similarity_top_k = 10

In [6]:
# Vector Database setups
chroma_db_name = 'chroma_db_dev' #'chroma_db_local'
chromadb_pth = os.path.join(parent_dir, chroma_db_name)
chroma_collection_name = 'recommender_dev' #"recommender_demo"

In [7]:
# llama index configurations
index_name = 'cohere_recommender_index' # "cohere_chroma_index"
query_response_mode = 'tree_summarize'

# Setup a Cohere Service Context for Llama Index
llm = Cohere(model=llm_model, 
             temperature=temperature, 
             cohere_api_key=os.environ['COHERE_API_KEY'], 
             max_tokens=max_tokens)

embed_model = CohereEmbedding(cohere_api_key=os.environ["COHERE_API_KEY"],
                              model_name=wands_embed_model,
                              input_type=wands_embed_input_type,)

# cohere_rerank_fine_tuned = finetuned_reranker_model.get_finetuned_model(top_n=3)
cohere_rerank = CohereRerank(api_key=os.environ["COHERE_API_KEY"], 
                             top_n=top_n_rerank, 
                             model=cred_dict.get("COHERE_MODEL_ID2"))

service_context = ServiceContext.from_defaults(llm=llm, 
                                               embed_model=embed_model)

In [8]:
# Load the Llama Index index persisted to disk to run queries 
db_loaded = chromadb.PersistentClient(path=chromadb_pth)
chroma_collection_loaded = db_loaded.get_or_create_collection(chroma_collection_name)
vector_store_loaded = ChromaVectorStore(chroma_collection=chroma_collection_loaded)

simple_index = VectorStoreIndex.from_vector_store(vector_store=vector_store_loaded, 
                                                  service_context=service_context,)

In [9]:
query_engine_rerank = simple_index.as_query_engine(similarity_top_k=similarity_top_k, 
                                                   node_postprocessors=[cohere_rerank], 
                                                   response_mode=query_response_mode,
                                                   verbose=True,)

### WANDS reference Query results

In their paper, WANDS released a [reference pdf](https://github.com/wayfair/WANDS/blob/main/Product%20Search%20Relevance%20Annotation%20Guidelines.pdf) where they have details of product images and the web URL to the actual product with reasoning behind choice of labels. This is a great initial test to see whether the model+emebdding out of the box can match or improve recommendations of those scenarios through natural language assistant capabilities. 

This also gives us an opportunity to apply the re-ranker module and work on optimizing existing recommender algorithms with a small amount of high quality labelled data

In [10]:
wands_ref_queries = ['wicker outdoor bar', 
                     'chair and a half recliner', 
                     'shamrock', 
                     'farmhouse cabinet', 
                     'wire basket with dividers', 
                     'kids chair', 
                     '48 in entry table with side by side drawer', 
                     'salon chair']

In [14]:
ref_query = wands_ref_queries[2]
rerank_response_fine_tuned = query_engine_rerank.query(ref_query)
print(rerank_response_fine_tuned, sep="\n")

Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.


I cannot find any results for a product with the name Shamrock. However, I did find several products that are decorative objects and a chandelier that you may be interested in. 

Would you like to know more about any of these products? 

Products:
- Fluorite crystals with minor Hemimorphite (product ID: 831)
- Vanadinite crystals on Barite (product ID: 832)
- Meinhardt 5-light shaded empire chandelier (product class: Chandeliers)

If you provide more information about what you are looking for, I can help you find other products that match your criteria. 

Would you like to know more about any of these products? 


## Building the LLM powered Search Workflow

This section covers prototyping of the LLM powered search workflow with various sections:

* Loading the datasets and inspecting the data features
* Testing Llama Index as a framework that implements search and retrieval tasks 
* Applying Cohere LMs through Llama Index to build embeddings, re-rank retrieved candidates and apply LLMs to generate response
    * Computing simple baselines through out of box cosine similarity retrieval using only embeddings
    * Comparing against out of box Llama Index query engine implementation with:
        *  Different retrieval techniques to generate responses from chunks
        * Applying re-ranking strategies to evaluate improval from candidate generation to response evaluation
        * Fine-tuning re-ranker with the domain specific data to customize performance on hard search queries

In [5]:
wands_data_pth = os.path.join(parent_dir, 'WANDS', 'dataset')

In [6]:
# get search queries
wands_query_df = pd.read_csv(os.path.join(wands_data_pth, "query.csv"), sep='\t')
wands_query_df.head()

Unnamed: 0,query_id,query,query_class
0,0,salon chair,Massage Chairs
1,1,smart coffee table,Coffee & Cocktail Tables
2,2,dinosaur,Kids Wall Décor
3,3,turquoise pillows,Accent Pillows
4,4,chair and a half recliner,Recliners


In [7]:
# get products
wands_product_df = pd.read_csv(os.path.join(wands_data_pth, "product.csv"), sep='\t')
wands_product_df.head()

Unnamed: 0,product_id,product_name,product_class,category hierarchy,product_description,product_features,rating_count,average_rating,review_count
0,0,solid wood platform bed,Beds,Furniture / Bedroom Furniture / Beds & Headboa...,"good , deep sleep can be quite difficult to ha...",overallwidth-sidetoside:64.7|dsprimaryproducts...,15.0,4.5,15.0
1,1,all-clad 7 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,"create delicious slow-cooked meals , from tend...",capacityquarts:7|producttype : slow cooker|pro...,100.0,2.0,98.0
2,2,all-clad electrics 6.5 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,prepare home-cooked meals on any schedule with...,features : keep warm setting|capacityquarts:6....,208.0,3.0,181.0
3,3,all-clad all professional tools pizza cutter,"Slicers, Peelers And Graters",Browse By Brand / All-Clad,this original stainless tool was designed to c...,overallwidth-sidetoside:3.5|warrantylength : l...,69.0,4.5,42.0
4,4,baldwin prestige alcott passage knob with roun...,Door Knobs,Home Improvement / Doors & Door Hardware / Doo...,the hardware has a rich heritage of delivering...,compatibledoorthickness:1.375 '' |countryofori...,70.0,5.0,42.0


In [9]:
wands_product_df['product_name'].isnull().values.any()

False

In [10]:
wands_product_df['product_description'].isnull().values.any()

True

In [8]:
def combine_product_texts(product_name, product_description):
    """
    Combine product name and description into a single piece of text for embedding
    :param product_name: Name of Product
    :param product_description: Description of Product
    :return: 
    """
    if pd.notnull(product_name) and pd.notnull(product_description):
        return product_name + " " + product_description
    elif pd.notnull(product_name):
        return product_name
    elif pd.notnull(product_description):
        return product_description
    else:
        return "no product description available"

In [9]:
# get manually labeled groundtruth lables
wands_label_df = pd.read_csv(os.path.join(wands_data_pth, "label.csv"), sep='\t')
wands_label_df.head()

Unnamed: 0,id,query_id,product_id,label
0,0,0,25434,Exact
1,1,0,12088,Irrelevant
2,2,0,42931,Exact
3,3,0,2636,Exact
4,4,0,42923,Exact


In [10]:
def assign_label_score(label):
    """
    Function to convert label texts to scores
    :param label: The label for product recommendation given a query
    :return: 
    """
    if pd.isna(label):
        # Assume not relevant if there is no human label assigned
        return 0
    elif label.lower() == "exact":
        return 1.0
    elif label.lower() == "irrelevant":
        return 0
    elif label.lower() == "partial":
        # Rate higher than 50% to make relevance matching higher quality
        return 0.6

In [11]:
wands_label_df['label_score'] = wands_label_df['label'].apply(lambda x: assign_label_score(x))

In [12]:
# Combine queries and labels to get a ground truth dataframe
wands_query_label_df = pd.merge(wands_query_df, wands_label_df, on="query_id")

In [13]:
# Aggregate the text in products to ensure there are no NAs for each product
wands_product_df['product_text'] = wands_product_df['product_name'].combine(wands_product_df['product_description'],  lambda x, y: combine_product_texts(x, y))

In [14]:
wands_product_df['product_text'][:4]

0    solid wood platform bed good , deep sleep can ...
1    all-clad 7 qt . slow cooker create delicious s...
2    all-clad electrics 6.5 qt . slow cooker prepar...
3    all-clad all professional tools pizza cutter t...
Name: product_text, dtype: object

### Simple Cohere Client Embedding test

In [14]:
# Now we'll set up the cohere client.
co_client = cohere.Client(os.environ["COHERE_API_KEY"])
wands_embed_model = "embed-english-v3.0" # embed-english-light-v3.0
wands_embed_input_type = "search_document"
wands_embeds_query_type = "search_query"

In [14]:
wands_sample_texts = list(wands_product_df['product_text'])[:4]
# Get the embeddings
wands_embeds_sample = co_client.embed(texts=wands_sample_texts,
                                      model=wands_embed_model,
                                      input_type=wands_embed_input_type).embeddings

### Simple LlamaIndex Pipeline

In [15]:
documents = []
for i, row in wands_product_df.iterrows():
    document = Document(
        text=row['product_text'],  # Product description data
        doc_id=row['product_id'],
        extra_info={"product_id": row['product_id'], 
                    "product_class": row['product_class'], 
                    "product_category": row['category hierarchy']}
    )
    documents.append(document)

### Parse Documents with Simple Parser

In [16]:
# Create a simple parser 
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)

### Vector Store Setup

Vector Stores are helpful in storing and retreiving embeddings at scale.
Let us choose a simple open source Vector database (Chroma) to store and retreive embeddings for this demonstration.
Using a Vector Store over a standard index data structure that Llama Index offers is helpful to transition to  database services for deployment in the cloud when there is a need to scale the prototype into a development solution for production.

In [17]:
chroma_db_name = 'chroma_db_dev' #'chroma_db_local'
chromadb_pth = os.path.join(parent_dir, chroma_db_name)
chroma_collection_name = 'recommender_dev' #"recommender_demo"

In [18]:
db = chromadb.PersistentClient(path=chromadb_pth)
chroma_collection = db.get_or_create_collection(chroma_collection_name)
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

### Create a simple Indexing Model

Once a vector store is setup, we can create a simple llama index object, that will help implement a great prototype for search and retreival capabilities of recommendation systems.

We will leverage Cohere's embedding, LLM and re-ranker models available through the Cohere API to add a natural language personalization component that will personalize user experience on the platform.

In [19]:
# Parameterize Cohere configurations
wands_embed_model = "embed-english-v3.0" #1024 dimension embedding
wands_embed_input_type = "search_document"
llm_model = "command" #Generation LLM
max_tokens = 256
temperature = 0.5 #0

In [20]:
# Setup a Cohere Service Context for Llama Index
llm = Cohere(model=llm_model, 
             temperature=temperature, 
             cohere_api_key=os.environ['COHERE_API_KEY'], 
             max_tokens=max_tokens)

embed_model = CohereEmbedding(cohere_api_key=os.environ["COHERE_API_KEY"],
                              model_name=wands_embed_model,
                              input_type=wands_embed_input_type,)

service_context = ServiceContext.from_defaults(llm=llm, 
                                               embed_model=embed_model)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

In [21]:
index_name = 'cohere_recommender_index' # "cohere_chroma_index"

In [22]:
# Build a simple index and persist it for future use
simple_index = VectorStoreIndex.from_documents(documents=documents, 
                                               service_context=service_context,
                                               storage_context=storage_context, 
                                               show_progress=True)
simple_index.storage_context.persist(persist_dir=os.path.join(parent_dir, index_name))

Parsing nodes:   0%|          | 0/42994 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/42994 [00:00<?, ?it/s]

### Load persisted Index

If index is already generated and stored, can be loaded back in memory easily.

In [23]:
db_loaded = chromadb.PersistentClient(path=chromadb_pth)
chroma_collection_loaded = db_loaded.get_or_create_collection(chroma_collection_name)
vector_store_loaded = ChromaVectorStore(chroma_collection=chroma_collection_loaded)

simple_index = VectorStoreIndex.from_vector_store(vector_store=vector_store_loaded, 
                                                  service_context=service_context,)

### WANDS reference Query results

In their paper, WANDS released a [reference pdf](https://github.com/wayfair/WANDS/blob/main/Product%20Search%20Relevance%20Annotation%20Guidelines.pdf) where they have details of product images and the web URL to the actual product with reasoning behind choice of labels. This is a great initial test to see whether the model+emebdding out of the box can match or improve recommendations of those scenarios through natural language assistant capabilities. 

This also gives us an opportunity to apply the re-ranker module and work on optimizing existing recommender algorithms with a small amount of high quality labelled data

In [23]:
wands_ref_queries = ['wicker outdoor bar', 
                     'chair and a half recliner', 
                     'shamrock', 
                     'farmhouse cabinet', 
                     'wire basket with dividers', 
                     'kids chair', 
                     '48 in entry table with side by side drawer', 
                     'salon chair']

### Computing relevance similarities through VectorDB

We know LLM calls are expensive and slow, can we get good results when doing cosine similarity search with embeddings alone?
This gives us a good baseline to understand pure embedding based ranking performance that can be compered to LLM based workflow with embeddings

In [26]:
chroma_collection.count()

42994

In [None]:
rec = chroma_collection.peek(1)
print('Metadatas:  ', rec['metadatas'])

In [None]:
print('Documents:  ', rec['documents'])

### Retrieve embeddings through query search with Vector DB

In [24]:
co_client = cohere.Client(os.environ["COHERE_API_KEY"])
wands_embed_model = "embed-english-v3.0" # embed-english-light-v3.0
wands_embeds_query_type = "search_query"

In [25]:
example = wands_ref_queries[1]
print(f"Search Query: {example}")

Search Query: chair and a half recliner


In [54]:
sample_query_embed = co_client.embed(texts=[example],
                                     model=wands_embed_model,
                                     input_type=wands_embeds_query_type).embeddings

In [55]:
sample_search_res = chroma_collection.query(query_embeddings=sample_query_embed,
                                            n_results=3,
                                            # where={"metadata_field": "is_equal_to_this"},
                                            # where_document={"$contains":"search_string"}
                                            )
sample_search_res

{'ids': [['9f6dce51-e212-4333-b0c2-677af2eb5b5c',
   'ae17bd92-2d91-40fe-a5ed-ba9a80de2568',
   'd38274a8-f534-44ec-bb86-305e85210443']],
 'distances': [[0.7982885837554932, 0.8617343306541443, 0.8665069937705994]],
 'metadatas': [[{'_node_content': '{"id_": "9f6dce51-e212-4333-b0c2-677af2eb5b5c", "embedding": null, "metadata": {"product_class": "Accent Chairs", "product_category": "Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs / Chair And A Half Accent Chairs"}, "excluded_embed_metadata_keys": [], "excluded_llm_metadata_keys": [], "relationships": {"1": {"node_id": "6458", "node_type": "4", "metadata": {"product_class": "Accent Chairs", "product_category": "Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs / Chair And A Half Accent Chairs"}, "hash": "44985d5d4f8241b849534c355a2679296c26d103e5e338eb44abf5b04ae3a23e", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "ab8de879-ad81-4aaf-81c7-96a26236f69b", "node_type": "1", "metadata": {"produc

In [56]:
sample_search_res['documents']

[["steel 43 '' wide top grain leather chair and a half this chair is architectural shaping and bold nail head detailing . this armchair has a casual appeal as well as amazing comfort with deep-seated there is plenty of room to relax . it features semi-attached backs , blends down seat cushions , and recesssed arms .",
  "st james 63 '' wide polyester chair and a half your search for the world 's greatest chair ends here . this oversize chair takes size and comfort to a new level . a chair that can easily fit two lovebirds or just a single bird who likes its space , it maintains its sleek lines and cool aesthetics while offering what many would consider a slice of heaven on earth .",
  "51 '' wide armchair this sofa bed armchair adds a custom touch to your living room or guest room . it 's made in the usa with a solid wood frame and foam-filled upholstery of your choice . this chair and a half features a traditional silhouette with a loose cushioned back and rolled arms with welted edge

In [57]:
inter_res = [meta['_node_content'] for meta in sample_search_res['metadatas'][0]]

In [58]:
inter_res

['{"id_": "9f6dce51-e212-4333-b0c2-677af2eb5b5c", "embedding": null, "metadata": {"product_class": "Accent Chairs", "product_category": "Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs / Chair And A Half Accent Chairs"}, "excluded_embed_metadata_keys": [], "excluded_llm_metadata_keys": [], "relationships": {"1": {"node_id": "6458", "node_type": "4", "metadata": {"product_class": "Accent Chairs", "product_category": "Furniture / Living Room Furniture / Chairs & Seating / Accent Chairs / Chair And A Half Accent Chairs"}, "hash": "44985d5d4f8241b849534c355a2679296c26d103e5e338eb44abf5b04ae3a23e", "class_name": "RelatedNodeInfo"}, "2": {"node_id": "ab8de879-ad81-4aaf-81c7-96a26236f69b", "node_type": "1", "metadata": {"product_class": NaN, "product_category": "Outdoor / Outdoor D\\u00e9cor / Outdoor Wall D\\u00e9cor"}, "hash": "06e130e677ebe7ba7e26b88ac94d0d03ac1d4bd99f608d904ff078112a67f5da", "class_name": "RelatedNodeInfo"}, "3": {"node_id": "c172f058-88cd-4e29-8e48-a

In [33]:
product_id_list = [6458, 18971, 8369]

### Inspect GroundTruth

Find which were the exact labelled matches for the search

In [36]:
wands_query_label_df.loc[(wands_query_label_df['product_id'].isin(product_id_list)) & (wands_query_label_df['query'] == example)]

Unnamed: 0,query_id,query,query_class,id,product_id,label,label_score
6059,4,chair and a half recliner,Recliners,61048,6458,Partial,0.6
6105,4,chair and a half recliner,Recliners,205426,18971,Partial,0.6


In this situation, we have matches that were identified as partial for chair and half recliner pop up in our embedding based recommendations!
While this is a good result for our baseline, we would like to see if there is better relevance in recommended results, and a better user experience through a conversational agent that can be concise and add a bit of personal touch, and this is what we can quickly hope to prototype with Cohere's LLMs :)

### Use Llama Index Query Engine

This will apply the LLMPredictor to search and retreive all relevant contexts, while providing the output in a more natural language oriented user experience for search.

In [23]:
# sampled_queries = wands_query_label_df.sample(n=10, random_state=420)
# check_index = 1
# example = sampled_queries['query'].values[check_index]

In [26]:
example_llm = wands_ref_queries[1]
print(f"Search Query: {example_llm}")

Search Query: chair and a half recliner


In [27]:
query_engine = simple_index.as_query_engine()

In [28]:
ex_response = query_engine.query(example_llm)
print(ex_response, sep="\n")

Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.


Here are a couple of options for recliner chairs: 

- The first recliner is 33.5" wide and is made with soft microfiber fabric and high-density sponge filling for added comfort. This chair is a power lift assist standard recliner, which can help elderly individuals with standing problems. 

- The second recliner is slightly wider at 31" and is made with engineered bonded leather. This chair includes a glider power recliner with chaise seating, as well as a swivel base with a 160-degree turning radius. 

Would you like me to narrow down the options based on additional specifications? 


In [34]:
ex_response.source_nodes

[NodeWithScore(node=TextNode(id_='917c1213-ea17-4304-9881-fbdf0876e6e5', embedding=None, metadata={'product_class': 'Recliners', 'product_category': 'Furniture / Living Room Furniture / Chairs & Seating / Recliners'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2277', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'product_class': 'Recliners', 'product_category': 'Furniture / Living Room Furniture / Chairs & Seating / Recliners'}, hash='71902313bce54534c7e3406e54599827bd59375bc4fd8f959e89bd503ac0be44'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='ded78c3e-b177-4103-8864-83d7e3527a80', node_type=<ObjectType.TEXT: '1'>, metadata={'product_class': 'Ottomans', 'product_category': 'Furniture / Living Room Furniture / Ottomans & Poufs'}, hash='3834b988015a1c4d9fcca7145632e66677a5946566157b026119384366a80675'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='fa1d8781-c165-42e6-96

#### Initial Observations

This sounds like a real in store assistant who gives you that customer experience feeling there are some details that will actually be more helpful coming from agents (such as size or other measurement characteristics). But it does not look like it was able to retrieve partial results just looking at the descriptions, as they seem to focus more on recliners

With that in mind, we can try different prompting techniques easily through a workflow like Llama Index to get the LLM to be more helpful in responses. One such approach is the [tree summarize](https://gpt-index.readthedocs.io/en/stable/module_guides/deploying/query_engine/response_modes.html), which is good at generating summary responses combining information across different chunks of a context fed to the LLM. Lets apply that and review the results.

In [29]:
query_engine_param = simple_index.as_query_engine(
    response_mode="tree_summarize",
    verbose=True,
)

In [31]:
ex_response_param = query_engine_param.query(example_llm)
print(ex_response_param, sep="\n")

Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.


Here are some recliner options for you: 

- The MCCandlish recliner is a chair and a half recliner with engineered bonded leather, a smooth and soft cover. It features a swivel base with a 160-degree turning radius and glider power recliner with chaise seating. The MCCandlish recliner is 31" wide and is a good option for those who prefer a wider recliner. 

- The recliner is 33.5" wide and is upholstered in microfiber fabric. It is filled with a high-density sponge, making it a comfortable option. This recliner also has lift assist, which can help those with standing problems. 

Both of these options are recliner chairs and a half, offering ample space for comfortable seating. 

Would you like to know more about any of these products? 


#### First Impressions

That was a similar response to what we got earlier, with some helpful indications. However, the text seems a little long winded in comparison and it did not really retrieve better recommendations either. Let us check if any of these above 2 responses fall under partial or exact matches!

In [None]:
ex_response_param.source_nodes

In [36]:
llama_match_product_ids = [2277, 32081]

In [38]:
wands_query_label_df[(wands_query_label_df['query'] == example_llm) & (wands_query_label_df['product_id'].isin(llama_match_product_ids))]

Unnamed: 0,query_id,query,query_class,id,product_id,label,label_score


### Enhancing candidate generation to recommendation with re-ranking

As our intuition served before, we found relevant matches were not recommended. 
Re-ranking can be employed to look past pure semantic search + LLMs solutions when candidates are already available for generation with some understanding of available labels (keyword based or in this case, manual annotation). Cohere re-ranking endpoints provide a very simple post-processing workflow to help re-rank the candidate results at the last stage of a search workflow.

Let us try the re-ranker module out of the box and see how it performs for our search queries.

In [39]:
cohere_rerank = CohereRerank(api_key=os.environ["COHERE_API_KEY"], top_n=5)

In [40]:
query_engine_rerank = simple_index.as_query_engine(similarity_top_k=10, 
                                                   node_postprocessors=[cohere_rerank],
                                                   response_mode="tree_summarize",
                                                   verbose=True,
)

In [41]:
rerank_response = query_engine.query(example_llm)
print(rerank_response, sep="\n")

Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.


Here are a couple of recliner options for you:

- The first recliner is 33.5" wide and made with soft microfiber fabric and high-density sponge filling for added comfort. This recliner can be adjusted to assist those with standing issues and is a great option for elderly persons. 

- The second recliner, called the mccandlish, is 31" wide with a swivel base that has a 160-degree turning radius. It features chaise seating and smooth bonded leather engineered for comfort. 

Would you like me to specify any other requirements when looking for a recliner? 


In [42]:
rerank_response.source_nodes

[NodeWithScore(node=TextNode(id_='917c1213-ea17-4304-9881-fbdf0876e6e5', embedding=None, metadata={'product_class': 'Recliners', 'product_category': 'Furniture / Living Room Furniture / Chairs & Seating / Recliners'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2277', node_type=<ObjectType.DOCUMENT: '4'>, metadata={'product_class': 'Recliners', 'product_category': 'Furniture / Living Room Furniture / Chairs & Seating / Recliners'}, hash='71902313bce54534c7e3406e54599827bd59375bc4fd8f959e89bd503ac0be44'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='ded78c3e-b177-4103-8864-83d7e3527a80', node_type=<ObjectType.TEXT: '1'>, metadata={'product_class': 'Ottomans', 'product_category': 'Furniture / Living Room Furniture / Ottomans & Poufs'}, hash='3834b988015a1c4d9fcca7145632e66677a5946566157b026119384366a80675'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='fa1d8781-c165-42e6-96

It looks like re-ranking didnt help move the needle on retrieving the most relevant recommendation for this query. We can try our luck with other queries as well and see what the average case is!

### Generate Query Response Pairs for all examples

Produce the assistive chat responses for all reference queries 

In [None]:
agent_res_list = []
for qr in wands_ref_queries:
    agent_res_list,append({f"{qr}": })

### Fine-tuning re-ranking model

The re-ranking model didnt really change the order of recommended products but provided a more concise summary of each, while making the overall response from the bot longer, which may not be the best for user experience. 

A quick experiment to understand its effectiveness better might be to use a subset of our labelled data to fine-tune the re-ranker model and deploy a fine-tuned model to re-rank the top results, which we will proceed to do below.

#### Fine-tuning Dataset Curation Process

The documentation mentions that Query + Relevant Passage/ Query + Hard Negatives should be less than 510 tokens.

Training Custom Re-ranker expects minimum 256 (Query + Relevant passage) pairs with or without hard negatives for training and 64 pairs for validation. Please note that the validation is optional.

**Training**: We will initially sample unique queries from WANDS dataset, and then further sample mix of context pairs that is representative of our full dataset, for creating training pairs.

**Validation**: We will sample remaining queries of the training stage sampling from above for validation. 

In both training and validation, we need to experiment with number of samples to hit the 256 and 64 unique requirements respectively.

In [66]:
wands_query_label_df['label'].value_counts()

label
Partial       146633
Irrelevant     61201
Exact          25614
Name: count, dtype: int64

In [None]:
# wands_ref_query_label_df = wands_query_label_df.loc[wands_query_label_df['query'].isin(wands_ref_queries)]
# wands_ref_query_label_df.head()

In [132]:
wands_query_df['query'].unique()

array(['salon chair', 'smart coffee table', 'dinosaur',
       'turquoise pillows', 'chair and a half recliner',
       'sofa with ottoman', 'acrylic clear chair', 'driftwood mirror',
       'home sweet home sign', 'coffee table fire pit', 'king poster bed',
       'ombre rug', 'large spoon and fork wall decor',
       'outdoor privacy wall', 'beds that have leds',
       'black 5 drawer dresser by guilford', 'blk 18x18 seat cushions',
       'closet storage with zipper',
       'chrome bathroom 4 light vanity light', 'gurney  slade 56',
       'foutains with brick look', 'living curtains pearl',
       'light and navy blue decorative pillow',
       'stoneford end tables white and wood',
       'wood coffee table set by storage', 'sunflower', 'leather chairs',
       'outdoor welcome rug', 'rooster decor', 'bathroom vanity knobs',
       '3 1/2 inch drawer pull', 'burnt orange curtains',
       'dark gray dresser', 'non slip shower floor tile',
       'bar stool with backrest', 'enclo

In [46]:
# Make sure to get unique training and validation queries to sample the datasets 
train_queries, val_queries = train_test_split(wands_query_df['query'].unique(), train_size=0.8, random_state=420)

In [146]:
train_queries.size

384

In [49]:
N = len(wands_ref_queries)

In [52]:
train_queries_total = train_queries.tolist()+wands_ref_queries[:N//2]

In [148]:
val_queries.size

96

In [53]:
# Mix and match numbers to find the ideal Sampling from subsets that satisfies 256 unique train examples and 64 unique val examples
n_train = 1028
n_test = 256

In [54]:
wands_query_label_train_df = wands_query_label_df[wands_query_label_df['query'].isin(train_queries_total)]
wands_rerank_train, _ = train_test_split(wands_query_label_train_df, train_size=n_train, random_state=420, stratify=wands_query_label_train_df['label'])

In [55]:
wands_query_label_test_df = wands_query_label_df[wands_query_label_df['query'].isin(val_queries)]
wands_rerank_test, _ = train_test_split(wands_query_label_test_df, train_size=n_test, random_state=420, stratify=wands_query_label_test_df['label'])

In [56]:
wands_rerank_train.head()

Unnamed: 0,query_id,query,query_class,id,product_id,label,label_score
132558,264,cortez pillow,Accent Pillows,124708,9055,Partial,0.6
143994,308,led 60,Light Bulbs,132144,5965,Partial,0.6
174604,387,self enclosed planters,Planters,40389,21199,Partial,0.6
71558,127,wine bar,Bars & Bar Sets,229140,8150,Irrelevant,0.0
181547,409,teal chair,Accent Chairs,161186,28924,Partial,0.6


In [58]:
wands_rerank_train['query'].unique()

array(['cortez pillow', 'led 60', 'self enclosed planters', 'wine bar',
       'teal chair', 'leather chair', 'ruckus chair',
       'above toilet cabinet', 'wood coffee table set by storage',
       'wayfair comforters', 'wishbone chair',
       'front door cabinet with doors', 'floating bed', 'card table',
       'turquoise chair', 'chrome bathroom 4 light vanity light',
       'kids chair', '30 inch bathroom vanity', 'nesting tray set',
       'sheffield home bath set', 'outdoor storage',
       'fortunat coffee table', 'midcentury tv unit',
       'benjiamino faux leather power lift chair', 'twin bed frame',
       'acrylic clear chair', 'gaia driftwood table',
       'bathroom wastebasket', 'storage dresser', '7 draw white dresser',
       'entertainment stand end table', 'orren ellis l shape desk',
       '46 inch closet door', 'carolyn console table',
       'kohen 5 drawer dresser', 'outdoor clock', 'entrance table',
       'small wardrobe grey', 'parsons chairs', 'aloe vera pl

In [61]:
wands_rerank_test.head()

Unnamed: 0,query_id,query,query_class,id,product_id,label,label_score
108636,211,memory foam rug galveston,Bath Rugs & Mats,107085,6472,Irrelevant,0.0
207524,434,bathroom lighting,Vanity Lighting,181858,34115,Partial,0.6
174277,386,alter furniture,Outdoor Conversation Sets,40283,30132,Partial,0.6
194273,422,ergonomic chair,Office Chairs,170780,42342,Partial,0.6
68958,124,unique coffee tables,Coffee & Cocktail Tables,228329,36193,Partial,0.6


In [62]:
wands_rerank_test['query'].unique()

array(['memory foam rug galveston', 'bathroom lighting',
       'alter furniture', 'ergonomic chair', 'unique coffee tables',
       'modern farmhouse lighting semi flush mount',
       'full mattress padded liner', 'attleboro drum coffee table',
       'bed risers', 'ombre rug', 'tressler rug', 'marble',
       'velvet chaise', 'bohemian', 'adjustable height artist stool',
       'industrial pipe dining  table', 'butcher block island',
       'zodiac pillow', 'trundle daybed', 'wire basket with dividers',
       'bathroom vanity with counter space', 'tall storage cabinet',
       'smart coffee table', 'hardwood beds',
       'betty resin free standing umbrella', 'upholstered girls bed',
       'marlon tufted queen bed', 'anchor decor', 'pedistole sink',
       'annex dresser', '3/4 size mattress', 'kitchen storage shelves',
       'living room coffee table sets', 'nectar queen mattress',
       'outdoor privacy wall', 'pool floats', 'gray dresser',
       'shoe bench entryway', 'seat 

In [63]:
def generate_hard_negatives(queries, query_df, product_df, n_context=3, negative_label_score=0.0):
    """
    This function extracts hard negatives for each search query from the labelled WANDS dataset
    :param queries: Hand picked queries demonstrated in WANDS paper
    :param query_df: WANDS dataset which labels relevance of returned product id by recommender given query
    :param product_df: WANDS dataset which holds product description information given product id
    :param n_context: How many contexts are considered relevant for a query? Default is 3
    :param negative_label_score: Default label score assigned to hard negative
    :return: List of hard negatives given each query
    """
    hard_negatives = []
    for query in queries:
        negative_df = query_df[(query_df['query'] == query) & (query_df['label_score'] == negative_label_score)]
        if negative_df.empty:
            hard_negatives.append([])
        else:
            negative_product_ids = negative_df['product_id'].values
            if len(negative_product_ids) > n_context:
                negative_product_ids = negative_product_ids[:n_context]
            negative_product_df = product_df[product_df['product_id'].isin(negative_product_ids)]
            hard_negatives.append(negative_product_df['product_text'].to_list())
    return hard_negatives

In [64]:
hard_negatives_train = generate_hard_negatives(queries=wands_rerank_train['query'].unique().tolist(), 
                                         query_df=wands_query_label_df, 
                                         product_df=wands_product_df)
hard_negatives_train[:3]

[["cortez kelay ergonomic gaming chair is your work or office chair causing your back pain ? sitting in a chair for extended periods can cause pain in the lower back . working on your desk for long hours also increases stress on the neck and shoulders , and adds considerable pressure on your spine and muscles . even impeccable posture may not save you from muscle strain . ordinary desk chairs do not provide the proper support needed to keep you from slouching or overstretching your spinal ligaments . don ’ t wait until your condition worsens . it ’ s time to choose comfort . with kelly 's mesh office chair , you get unprecedented and custom back support . our chair is designed for total comfort , from the headrest to the seat depth . you can also adjust the lumbar pillow forwards or backward and up or down for truly customized lower back support . there are 4 levels of lockable leaning angles . the height is fully adjustable , as well as the seat depth . kelay is the first in taiwan to

In [65]:
hard_negatives_test = generate_hard_negatives(queries=wands_rerank_test['query'].unique().tolist(),
                                               query_df=wands_query_label_df,
                                               product_df=wands_product_df)
hard_negatives_test[:3]

[["chasimi 10 '' medium memory foam mattress enjoy a memorable night 's sleep thanks to these memory foam mattresses with certipur-us certified foam , made up of memory foam and high-density foam . comfort and air circulation created to provide balanced support—no matter how much you toss and turn . perfectly 3d knitted fabric makes for a breathable and smooth comfort surface . revolutionary high-density foam brings you conforming comfort . and is shipped compressed , rolled , and vacuum-sealed for your convenience .",
  'nantucket memory foam 15 piece shower curtain set + hooks set the scene in your washroom ensemble with this stylish set keep rogue water splashes at bay in the shower with the geometric and flower designs -shower curtain comes with 12 piece metal hooks , and keep your feet dry after you get out of the shower with the matching color memory foam bath mat .',
  "gritton medium memory foam cooling body pillow did you know it 's recommended that you replace your pillows ev

In [66]:
def generate_query_context_pairs(queries, query_df, product_df, n_context=3, negative_label_score=0.0):
    """
    This function extracts exact and partial matches for each search query from the labelled WANDS dataset
    :param queries: Hand picked queries demonstrated in WANDS paper
    :param query_df: WANDS dataset which labels relevance of returned product id by recommender given query
    :param product_df: WANDS dataset which holds product description information given product id
    :param n_context: How many contexts are considered relevant for a query? Default is 3
    :param negative_label_score: Default label score assigned to hard negative
    :return: List of relevant product descriptions from recommender given each query
    """
    context_pairs = []
    for query in queries:
        relevant_df = query_df[(query_df['query'] == query) & (query_df['label_score'] > negative_label_score)]
        if relevant_df.empty:
            context_pairs.append([])
        else:
            relevant_df = relevant_df.sort_values(by=['label_score'], ascending=False)
            relevant_product_ids = relevant_df['product_id'].values
            if len(relevant_product_ids) > n_context:
                relevant_product_ids = relevant_product_ids[:n_context]
            relevant_product_df = product_df[product_df['product_id'].isin(relevant_product_ids)]
            context_pairs.append(relevant_product_df['product_text'].to_list())
    return context_pairs  

In [67]:
contexts_train = generate_query_context_pairs(queries=wands_rerank_train['query'].unique().tolist(), 
                                         query_df=wands_query_label_df, 
                                         product_df=wands_product_df)
contexts_train[:3]

[['cortez go dallas indoor/outdoor throw pillow show off your state pride and support your local football team with this throw pillow . for indoor and outdoor use , this square accent pillow features a double-sided print and is a stain , mildew , and water-resistant . makes a great gift for the football fan in your life .',
  'cortez vibe rectangular pillow cover & insert',
  'cortez rectangular pillow cover & insert'],
 ['60 led solar wall light w/motion sensor 270° super bright ( set of 2 ) description : upgraded 60 led solar lights : 60 led motion sensor solar lights are perfect for using on outdoor , garden , patio yard , deck garage , driveway porch . powerful sensor ball head offers strong motion sensitivity up to 9-17 feet and 270 degrees sensor angle . solar powered security lights : waterproof solar lights outdoor have passed the fcc certification . and our units are made of high-impact abs which can withstand snow , rain and other extreme weather conditions . intelligent auto

In [68]:
contexts_test = generate_query_context_pairs(queries=wands_rerank_test['query'].unique().tolist(), 
                                         query_df=wands_query_label_df, 
                                         product_df=wands_product_df)
contexts_test[:3]

[["27.5 '' x19.6 '' memory foam bath mat ultra soft non slip and absorbent bathroom rug , set of 2 net weight : 0.5kgproduct size : bath mat 50 × 40cm , u-shaped mat 70 × 50cmmaterial : flannel and memory cottonproduct features:1. the cushion is made of super soft microfiber . thick flannel and memory cotton help you avoid dripping water when you get out of the shower , bathtub or at the sink2 . anti slip and absorption : because the back of the hot adhesive anti-skid bottom , one-time molding has no gap , so it can effectively absorb water and prevent slip . the durable material and high-quality structure ensure that it will not deteriorate , flatten or scatter in use3 . our u-shaped cushion size is perfect for your toilet , while the other is a universal rectangle . it will be the perfect choice for master bathroom , children 's bathroom , toilet and suite,4 . high quality and comfortable . this super soft and super comfortable bath blanket will feel very comfortable when you step on

In [69]:
def generate_rerank_labelled_dataset(queries, contexts, hard_negatives, dataset_file_pth):
    """
    Generated labelled rerank dataset based on context pairs and hard negatives that were identified for a sampled subset of queries from the main dataset
    :param queries: The unique reference queries in the dataset
    :param contexts: The relevant contexts (Exact or Partial) corresponding to each reference query, if available
    :param hard_negatives: The irrelevant matches shown by recommender as labelled by humans 
    :param dataset_file_pth: Save generated dataset in this file path
    :return: 
    """
    with open(dataset_file_pth, "w") as outfile:
    # Iterate over the lists simultaneously using zip
        for query, context, hard_negative in zip(
            queries, contexts, hard_negatives
        ):
            # Instantiate a CohereRerankerFinetuneDataset object for the current entry
            entry = CohereRerankerFinetuneDataset(
                query=query, relevant_passages=context, hard_negatives=hard_negative
            )
            # Write the JSONL string to the file
            outfile.write(entry.to_jsonl())
        print(f"{dataset_file_pth} is generated and ready!")

### Generate Re-ranking Dataset for Fine-tuning

In [70]:
rerank_data_path = os.path.join(parent_dir, 'rerank-ft')
os.makedirs(rerank_data_path, exist_ok=True)
rerank_train_filename = 'train.jsonl'
rerank_val_filename = 'val.jsonl'
rerank_train_file_pth = os.path.join(rerank_data_path, rerank_train_filename)
rerank_val_file_pth = os.path.join(rerank_data_path, rerank_val_filename)

In [71]:
generate_rerank_labelled_dataset(queries=wands_rerank_train['query'].unique().tolist(),
                                 contexts=contexts_train,
                                 hard_negatives=hard_negatives_train,
                                 dataset_file_pth=rerank_train_file_pth)

/Users/karthiksubramanian/PycharmProjects/recommenderLLM/rerank-ft/train.jsonl is generated and ready!


In [72]:
generate_rerank_labelled_dataset(queries=wands_rerank_test['query'].unique().tolist(),
                                 contexts=contexts_test,
                                 hard_negatives=hard_negatives_test,
                                 dataset_file_pth=rerank_val_file_pth)

/Users/karthiksubramanian/PycharmProjects/recommenderLLM/rerank-ft/val.jsonl is generated and ready!


In [74]:
rerank_finetuned_model_name = 'cohere_rerank_wands_chair' #'cohere_rerank_wands"
rerank_model_type = "RERANK"
rerank_base_model = 'english'

# Reranker model with avg 3 hard negatives selected from labelled dataset
finetuned_reranker_model = CohereRerankerFinetuneEngine(train_file_name=rerank_train_file_pth,
                                                        val_file_name=rerank_val_file_pth,
                                                        model_name=rerank_finetuned_model_name,
                                                        model_type=rerank_model_type,
                                                        base_model=rerank_base_model,
                                                        )

In [75]:
finetuned_reranker_model.finetune()

In [44]:
# cohere_rerank_fine_tuned = finetuned_reranker_model.get_finetuned_model(top_n=3)
cohere_rerank_fine_tuned = CohereRerank(api_key=os.environ["COHERE_API_KEY"], 
                                        top_n=3, 
                                        model=cred_dict.get("COHERE_MODEL_ID"))

In [45]:
query_engine_rerank_fine_tuned = simple_index.as_query_engine(similarity_top_k=10, 
                                                              node_postprocessors=[cohere_rerank_fine_tuned], response_mode="tree_summarize",
                                                              verbose=True,)

In [187]:
rerank_response_fine_tuned = query_engine_rerank_fine_tuned.query(example_llm)
print(rerank_response_fine_tuned, sep="\n")

Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.


We have multiple options for recliner chairs, but none that are specifically described as a 'chair and a half.' 

If you have any more details about what you are looking for, please provide additional information and we can further refine our search to recommend options that might be a better fit. 

Would you like to know more about any of these products? 

- Dariya Recliner: A manual rocker recliner with a smooth bonded faux leather finish. It is 26.5" wide and ideal for a living room, executive lounge, or lobby.
- Aleid Recliner: A power lift assist recliner with a soft short plush fabric cover. It has a classic contemporary style and is 31" wide. It is suitable for living rooms, bedrooms, theater rooms, and media rooms. 
- Liam Recliner: A wider faux leather power lift assist recliner, suitable for relaxing or sleeping. It is 33.5" wide and made of high-quality fabric with a high-density sponge filling. 

Let me know if you have any other questions about these products or anything e

In [77]:
# cohere_rerank_fine_tuned = finetuned_reranker_model.get_finetuned_model(top_n=3)
cohere_rerank_fine_tuned = CohereRerank(api_key=os.environ["COHERE_API_KEY"], 
                                        top_n=3, 
                                        model=cred_dict.get("COHERE_MODEL_ID2"))

In [78]:
query_engine_rerank_fine_tuned = simple_index.as_query_engine(similarity_top_k=10, 
                                                              node_postprocessors=[cohere_rerank_fine_tuned], response_mode="tree_summarize",
                                                              verbose=True,)

In [82]:
rerank_response_fine_tuned = query_engine_rerank_fine_tuned.query("wire basket dividers")
print(rerank_response_fine_tuned, sep="\n")

Your text contains a trailing whitespace, which has been trimmed to ensure high quality generations.


I have found several examples of wire basket dividers:

- Dividers for wire bins (product category: Garage Shelving Accessories)
- Optional dividers for wire baskets (product category: Garage & Outdoor Storage & Organization)
- Divider for mesh wire basket metal/wire lid (product category: Storage Containers & Drawers)

Would you like to know more about any of these products? 

If none of these matches what you are looking for let me know and I will search for other products in the wire basket dividers category. 


## Search and Retrieval Engine in Action