In [None]:
!pip install typing-extensions==4.7.0
!pip install langchain==0.1.4
!pip install pinecone-client==3.0.0
!pip install konko
!pip install sentence-transformers

# RAG-Enhanced Review Insights with Konko, LangChain & Pinecone

## Introduction

This guide shows how to synthesize reviews at scale using an LLM (via Konko AI), a vector database (Pinecone), and LangChain.

Specifically, we will use Retrieval Augmented Generation (RAG) to give Llama 2 the context it needs to summarize customer reviews for two products from Amazon (2k+ reviews each).

## 🧠 The Problem: LLMs are not aware of recent or business-specific events

An LLM's understanding is anchored to the data it was last trained on.

Here are some nuances we need to be aware of:

1. **Business Context Blindness:** An LLM, out of the box, lacks the nuances of your specific business. It's like a fresh recruit on their first day; they don't inherently know the intricacies of your operations or the preferences of your users.
2. **Static Knowledge Base:** An LLM's strength is its extensive knowledge, but it's also its limitation. It's not inherently aware of evolving trends, recent events, or fresh data, which can be vital for many applications.



## The Solution: RAG (Retrieval Augmented Generation)

Enter the RAG framework. The essence of Retrieval Augmentation is to supplement LLMs with external, up-to-date information. This ensures that the insights and analyses are both deep and current.

**Advantages of RAG:**

1. **Dynamic Knowledge:** RAG ensures that the information LLMs work with is both vast (from its internal knowledge) and fresh (from external sources).
2. **Efficient Updates:** RAG lets you update the knowledge available to the model without exhaustive retraining or fine-tuning. This flexibility makes it adept at adapting to changing information landscapes.
3. **Contextual Business Relevance:** With the right sources, RAG can be tailored to provide business-specific context, making LLM outputs more pertinent to specific user needs and business scenarios.

**Once implemented, this is how RAG works:**

1. **User Prompt**: User provides an initial instruction or query to the agent.

2. **Contextual Search**: The agent searches Pinecone (a vector database) to gather related context for the prompt.

3. **Prompt Augmentation**: The agent enhances the original user prompt with the additional context.

4. **Inference**: Using the enriched prompt, the LLM processes the instruction or query, leveraging the added factual data.

5. **Action**: The agent takes action based on the LLM's response, which is now more informed and accurate.

The RAG framework elegantly addresses the data freshness challenge faced by LLMs. By fetching current, relevant data, and feeding it to LLMs, we ensure that our analyses remain both in-depth and contemporary.
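The five steps above can be sketched in a few lines of plain Python. This is a toy illustration, not the pipeline built later in this notebook: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for Pinecone, and `fake_llm` is a hypothetical placeholder for the hosted Llama 2 call.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Step 2: return the k documents most similar to the query."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def augment(query, context_docs):
    """Step 3: put the retrieved context in front of the user prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Context:\n{context}\n\nQuestion: {query}"

def fake_llm(prompt):
    """Step 4 placeholder: a real system would call the hosted LLM here."""
    return f"(model answer to a prompt of {len(prompt)} characters)"

reviews = [
    "the insoles give great arch support",
    "wallet is slim and holds all my cards",
    "arch support is too high for my feet",
]
question = "what do people say about arch support"  # Step 1: user prompt
top_docs = retrieve(question, reviews, k=2)
rag_prompt = augment(question, top_docs)
answer = fake_llm(rag_prompt)
print(rag_prompt)
print(answer)
```

Only the two reviews that mention arch support reach the prompt; the unrelated wallet review is filtered out by the retrieval step.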


![RAG Workflow](https://raw.githubusercontent.com/konko-ai/examples/main/img/Rag.png)

## How to Implement RAG (A Step-by-Step Guide Using Amazon Reviews)

In this notebook, you will experience firsthand the synergy of **Konko's hosted LLM**, **LangChain**, and **Pinecone**.
Equipped with these tools and techniques, businesses can gain a competitive edge by staying in tune with their customers' latest feedback. Ready to leverage this for your business?

**Overview of Steps**

0. Setup
1. Initialization & data loading
2. Narrowing down the dataset
3. Converting our review data to embeddings in preparation for storage
4. Storing our embeddings in Pinecone
5. Putting it all together with LangChain and Llama 2


Dive in and explore the code snippets provided. Happy coding!

### Step 0: Setup

1. Install Necessary Libraries: First up, we'll set up our environment.
2. Set Up Environment Variables: As a best practice, API keys and configuration are kept in environment variables. Make sure `KONKO_API_KEY`, `PINECONE_API_KEY`, and `PINECONE_ENVIRONMENT` are set.
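A quick check like the following can catch a missing variable before any API call fails. It is a minimal sketch; the helper name `missing_env_vars` is hypothetical, and the variable names are the ones read later in this notebook.

```python
import os

REQUIRED_VARS = ["KONKO_API_KEY", "PINECONE_API_KEY", "PINECONE_ENVIRONMENT"]

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```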

The dataset we'll be working with is available [here](https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/).



In [2]:
import os
import json
import gzip
import pandas as pd
from urllib.request import urlopen

### Step 1: Initialization & data loading

1. Extract and load Amazon reviews and associated metadata directly from compressed files to pandas dataframes.
2. Conduct a preliminary data cleanup, focusing on truncating lengthy reviews for more efficient processing.

In [4]:
# Extract data from files
# gzip.open reads bytes by default; json.loads accepts bytes as well as str
data = []
with gzip.open('data/AMAZON_FASHION.json.gz') as f:
    for line in f:
        data.append(json.loads(line.strip()))

metadata = []
with gzip.open('data/meta_AMAZON_FASHION.json.gz') as f:
    for line in f:
        metadata.append(json.loads(line.strip()))
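If memory becomes a concern with larger review files, one option is to stream the file and keep only the fields you need. The sketch below assumes the same one-JSON-object-per-line gzip format; `iter_reviews` and its `fields` parameter are hypothetical names, and the demo writes a tiny temporary file instead of touching the real dataset.

```python
import gzip
import json
import os
import tempfile

def iter_reviews(path, fields=("asin", "overall", "reviewText")):
    """Yield one dict per JSON line, keeping only the requested fields."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # Missing fields become None so every row has the same keys
            yield {key: record.get(key) for key in fields}

# Demo on a tiny temporary file rather than the real dataset
sample = [
    {"asin": "A1", "overall": 5.0, "reviewText": "Great", "reviewerName": "x"},
    {"asin": "A2", "overall": 2.0, "reviewerName": "y"},  # no reviewText
]
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.json.gz")
    with gzip.open(sample_path, "wt", encoding="utf-8") as f:
        for row in sample:
            f.write(json.dumps(row) + "\n")
    loaded = list(iter_reviews(sample_path))

print(loaded)
```

The resulting list of dicts can be passed to `pd.DataFrame` in the same way as `data` above.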

In [5]:
# Load the data to dataframes

df = pd.DataFrame.from_dict(data)
df = df[df['reviewText'].notna()]

df_meta=pd.DataFrame.from_dict(metadata)

In [6]:
# Truncate the reviewText

max_text_length = 400

def truncate_review(text):
    """Keep at most the first max_text_length characters of a review."""
    return text[:max_text_length]

df['truncated'] = df['reviewText'].apply(truncate_review)

**This is what the data looks like after cleaning.**

In [7]:
df

Unnamed: 0,overall,verified,reviewTime,reviewerID,asin,reviewerName,reviewText,summary,unixReviewTime,vote,style,image,truncated
0,5.0,True,"10 20, 2014",A1D4G1SNUZWQOT,7106116521,Tracy,Exactly what I needed.,perfect replacements!!,1413763200,,,,Exactly what I needed.
1,2.0,True,"09 28, 2014",A3DDWDH9PX2YX2,7106116521,Sonja Lau,"I agree with the other review, the opening is ...","I agree with the other review, the opening is ...",1411862400,3,,,"I agree with the other review, the opening is ..."
2,4.0,False,"08 25, 2014",A2MWC41EW7XL15,7106116521,Kathleen,Love these... I am going to order another pack...,My New 'Friends' !!,1408924800,,,,Love these... I am going to order another pack...
3,2.0,True,"08 24, 2014",A2UH2QQ275NV45,7106116521,Jodi Stoner,too tiny an opening,Two Stars,1408838400,,,,too tiny an opening
4,3.0,False,"07 27, 2014",A89F3LQADZBS5,7106116521,Alexander D.,Okay,Three Stars,1406419200,,,,Okay
...,...,...,...,...,...,...,...,...,...,...,...,...,...
883631,5.0,True,"02 21, 2017",A1ZSB2Q144UTEY,B01HJHTH5U,Amazon Customer,I absolutely love this dress!! It's sexy and ...,I absolutely love this dress,1487635200,,,,I absolutely love this dress!! It's sexy and ...
883632,5.0,True,"11 25, 2016",A2CCDV0J5VB6F2,B01HJHTH5U,Amazon Customer,I'm 5'6 175lbs. I'm on the tall side. I wear a...,I wear a large and ordered a large and it stil...,1480032000,2,,,I'm 5'6 175lbs. I'm on the tall side. I wear a...
883633,3.0,True,"11 10, 2016",A3O90PACS7B61K,B01HJHTH5U,Fabfifty,Too big in the chest area!,Three Stars,1478736000,,,,Too big in the chest area!
883634,3.0,True,"11 10, 2016",A2HO94I89U3LNH,B01HJHF97K,Mgomez,"Too clear in the back, needs lining",Three Stars,1478736000,,,,"Too clear in the back, needs lining"


### Step 2: Narrowing down the dataset


To showcase the power of LLMs in summarizing large quantities of data while keeping this guide simple, we will select only the two products with the most reviews.

In [8]:
# Look for productIds with the most reviews

df.groupby('asin').count().sort_values('overall').tail(20)

asin,overall,verified,reviewTime,reviewerID,reviewerName,reviewText,summary,unixReviewTime,vote,style,image,truncated
B00XTM0ZPG,1405,1405,1405,1405,1405,1405,1405,1405,33,1405,66,1405
B000GHMRLW,1415,1415,1415,1415,1414,1415,1415,1415,53,1391,3,1415
B000GHRZN2,1415,1415,1415,1415,1414,1415,1415,1415,0,0,3,1415
B00ZW3SCF0,1522,1522,1522,1522,1522,1522,1518,1522,142,1520,276,1522
B000JOOR7O,1584,1584,1584,1584,1584,1584,1583,1584,74,1538,28,1584
B009RUKQ2G,1590,1590,1590,1590,1590,1590,1590,1590,92,1590,27,1590
B000YFSR4W,1648,1648,1648,1648,1648,1648,1646,1648,44,1612,10,1648
B004HX6P1E,1671,1671,1671,1671,1671,1671,1670,1671,147,1670,81,1671
B005N7YWX6,1688,1688,1688,1688,1688,1688,1688,1688,101,1649,11,1688
B0017U1KBK,1826,1826,1826,1826,1826,1826,1824,1826,178,0,49,1826


After filtering by number of reviews, we are left with two products:

1. RFID Blocking Card Holder
2. PowerStep Pinnacle Orthotic Shoe Insoles


In [9]:
# Work on only a slice of the dataframe

df = df.loc[(df['asin'] == 'B00GXE331K') | (df['asin'] == 'B000KPIHQ4')].copy()
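The two ASINs in the filter above are hard-coded. As an alternative, the most-reviewed products could be selected programmatically. The sketch below runs on a toy DataFrame; `top_n_products` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd

def top_n_products(frame, n=2, id_column="asin"):
    """Return the n most frequent product IDs, most reviewed first."""
    return frame[id_column].value_counts().head(n).index.tolist()

toy = pd.DataFrame({
    "asin": ["A", "B", "A", "C", "B", "A"],
    "reviewText": ["r1", "r2", "r3", "r4", "r5", "r6"],
})
top_ids = top_n_products(toy, n=2)

# Keep only the reviews that belong to the selected products
subset = toy[toy["asin"].isin(top_ids)].copy()
print(top_ids, len(subset))
```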

**Below is a snapshot of the reviews dataset, specifically focusing on the two products we've chosen for this analysis: 'RFID Blocking Card Holder' and 'PowerStep Pinnacle Orthotic Shoe Insoles'.**

In [10]:
df

Unnamed: 0,overall,verified,reviewTime,reviewerID,asin,reviewerName,reviewText,summary,unixReviewTime,vote,style,image,truncated
11218,3.0,True,"09 26, 2007",A1CIM0XZ3UA926,B000KPIHQ4,M. Cane,"Good price, good product. Howver, it is generi...",Orthotics off the rack,1190764800,2,"{'Size Name:': ' Men's 5-5.5, Women's 7-7.5', ...",,"Good price, good product. Howver, it is generi..."
11219,5.0,True,"01 18, 2007",A1EVVPCWRW5YYZ,B000KPIHQ4,Deborah Morris,My husband rates these insoles a 5 for comfort...,Very comfortable,1169078400,3,"{'Size Name:': ' Men's 10-10.5, Women's 12', '...",,My husband rates these insoles a 5 for comfort...
11220,5.0,True,"05 18, 2018",A2P3NZ9H4PANK0,B000KPIHQ4,Stephanie,I have worn the Powerstep Pinnacle shoe insole...,... Pinnacle shoe insoles for the past 5 years...,1526601600,,"{'Size Name:': ' Men's 6-6.5, Women's 8-8.5', ...",,I have worn the Powerstep Pinnacle shoe insole...
11221,1.0,True,"05 18, 2018",A2975GY186VV7A,B000KPIHQ4,jessica etim,Very uncomfortable feel like I wasted my money!,Uncomfortable,1526601600,,"{'Size Name:': ' Men's 7-7.5, Women's 9-9.5', ...",,Very uncomfortable feel like I wasted my money!
11222,5.0,True,"05 17, 2018",A3U8E58RIKWDAW,B000KPIHQ4,Nancy Mazzuca,work perfect,Five Stars,1526515200,,"{'Size Name:': ' Men's 9-9.5, Women's 11-11.5'...",,work perfect
...,...,...,...,...,...,...,...,...,...,...,...,...,...
486369,2.0,True,"07 4, 2018",AQCHECTIUVKTV,B00GXE331K,Amazon Customer,I started switching my cards from my old walle...,I started switching my cards from my old walle...,1530662400,,{'Color:': ' Stainless Steel'},,I started switching my cards from my old walle...
486370,5.0,True,"07 4, 2018",A1LXAF4YMKSDEB,B00GXE331K,Amazon Customer,I really love the card holder case that I'm us...,I really love the card holder case that I'm us...,1530662400,,{'Color:': ' Black Stainless Steel'},,I really love the card holder case that I'm us...
486371,4.0,True,"07 3, 2018",A3USRXIGMZW02O,B00GXE331K,Dave Dettelbach,Fast shipping and product looks great.,Four Stars,1530576000,,{'Color:': ' Black Stainless Steel'},,Fast shipping and product looks great.
486372,5.0,True,"07 3, 2018",A1M00GF04C1TZK,B00GXE331K,xiiztec,"Love it, held it and didn't want to put it down.",Absolutely amazing,1530576000,,{'Color:': ' Black Stainless Steel'},,"Love it, held it and didn't want to put it down."


### Step 3: Converting our review data to embeddings in preparation for storage

In this step, we're converting our textual data into vectors. These vectors pack the essence of our product reviews into a format that's semantically rich and compact.

In subsequent steps we will use these embeddings to retrieve relevant context for the LLM. For this guide we use `HuggingFaceEmbeddings` to generate the embeddings, but stay tuned for Konko's unified embeddings API (coming soon!).

Remember to vectorize only the data relevant to your use case, since generating and storing embeddings can get costly at scale. It is best practice to narrow down your dataset before vectorizing it.
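One inexpensive way to narrow the input before vectorizing is to drop exact duplicates and near-empty reviews. The sketch below is only an illustration: `prepare_texts` is a hypothetical helper, the `min_chars` threshold is arbitrary, and whether very short reviews are worth keeping depends on your use case.

```python
def prepare_texts(texts, min_chars=10):
    """Drop very short texts and case-insensitive duplicates, keeping first-seen order."""
    seen = set()
    kept = []
    for text in texts:
        # Collapse whitespace and lowercase so trivial variants count as duplicates
        normalized = " ".join(text.split()).lower()
        if len(normalized) < min_chars or normalized in seen:
            continue
        seen.add(normalized)
        kept.append(text)
    return kept

raw = ["Okay", "Love these insoles!", "love these  insoles!", "too tiny an opening"]
cleaned = prepare_texts(raw)
print(cleaned)
```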


In [11]:
# Import and apply embeddings from HuggingFace
from langchain.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()

df['embeddings']=df.apply(lambda row: embeddings.embed_query(row['truncated']),axis=1)


In [47]:
len(df['embeddings'].iloc[0])

768

### Step 4: Storing our embeddings in Pinecone 🌲

To harness the true potential of our review embeddings, we're tapping into Pinecone, a vector database. By doing so, we're not only storing our data but also setting the stage for **Llama 2 13B** to derive meaningful insights from the reviews.

Now that the data is ready, we can set up our index to store it.

We begin by initializing our connection to Pinecone. To do this we need a [free API key](https://app.pinecone.io).

In [16]:
from pinecone import Pinecone
from getpass import getpass

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.getenv("PINECONE_API_KEY") or getpass("Enter your Pinecone API key: ")

# configure client
pc = Pinecone(api_key=api_key)
environment = os.environ.get('PINECONE_ENVIRONMENT') or 'PINECONE_ENVIRONMENT'

Enter your Pinecone API key: ··········


Now we set up our index specification, which lets us define the cloud provider and region where we want to deploy our index. You can find a list of all [available providers and regions here](https://docs.pinecone.io/docs/projects).

In [55]:
import os

use_serverless = os.environ.get("USE_SERVERLESS", "False").lower() == "true"

In [56]:
from pinecone import ServerlessSpec, PodSpec

if use_serverless:
    spec = ServerlessSpec(cloud='aws', region='us-west-2')
else:
    spec = PodSpec(environment=environment)

In [57]:
index_name = 'langchain-retrieval-augmentation-fast'

In [58]:
import time

if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

# we create a new index
pc.create_index(
        index_name,
        dimension=768,  # dimensionality of embedding
        metric='dotproduct',
        spec=spec
    )

# wait for index to be initialized
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)
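The index above uses the `dotproduct` metric. For unit-length vectors the dot product equals cosine similarity, so the two metrics rank results identically only if the embeddings are normalized. The toy check below illustrates this; whether a given embedding model returns normalized vectors is an assumption worth verifying for your model.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def cosine(a, b):
    """Cosine similarity: dot product divided by both lengths."""
    return dot(a, b) / (norm(a) * norm(b))

def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

u, v = [3.0, 4.0], [4.0, 3.0]
print(dot(u, v), cosine(u, v))  # the two scores differ for unnormalized vectors

u_hat, v_hat = normalize(u), normalize(v)
print(dot(u_hat, v_hat), cosine(u_hat, v_hat))  # the two scores agree for unit vectors
```

In this notebook, computing the norm of one stored embedding (for example `df['embeddings'].iloc[0]`) and checking that it is close to 1.0 would be one way to test the assumption.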

In [59]:
index = pc.Index(index_name)
# wait a moment for connection
time.sleep(1)

index.describe_index_stats()

{'dimension': 768,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

**Transform & Upload:** Convert the truncated reviews into a list, then embed them with HuggingFace and store them in Pinecone through the LangChain vector store's `add_texts` method.



In [60]:
# Create list with truncated review texts

texts=df['truncated'].tolist()

In [61]:
len(texts)

6396

In [62]:
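# Alias the LangChain wrapper so it does not shadow the `Pinecone` client class imported earlier
from langchain.vectorstores import Pinecone as PineconeVectorStore

# Reuse the index handle for the LangChain vector store
index = pc.Index(index_name)

# "text" is the metadata key under which the raw review text is stored
vstore = PineconeVectorStore(
    index, embeddings, "text"
)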


In [None]:
vstore.add_texts(texts)
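Uploading several thousand texts in one call gives little visibility into progress. One option is to split the list into fixed-size batches and upload them one at a time. The `batched` helper below is hypothetical and the batch size of 100 is an arbitrary choice.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements each."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

demo_texts = [f"review {i}" for i in range(250)]
batch_sizes = [len(batch) for batch in batched(demo_texts, 100)]
print(batch_sizes)
# In the notebook, each batch could be uploaded with: vstore.add_texts(batch)
```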

**Confirmation:** A quick glance at Pinecone's dashboard verifies the successful upload of our review vectors.

Before uploading:

![](https://raw.githubusercontent.com/konko-ai/examples/main/img/pinecone_before.png)


After uploading:

![](https://raw.githubusercontent.com/konko-ai/examples/main/img/pinecone_after.png)

### Step 5: Putting it all together with LangChain and Llama 2 🦙⛓🦜


In this step we will set up Llama 2 using Konko's unified API and LangChain's RetrievalQA.

Konko's unified API lets you query the LLM of your choice with a single API call.

Every time a user sends a prompt, RetrievalQA retrieves the relevant context from our vector store (Pinecone) and adds it to the user prompt before sending it to our LLM (Llama 2).


#### Step 5a:  🛠 Setting Up RetrievalQA chain

RetrievalQA provides a generic interface for answering questions over retrieved documents, in our case the reviews. The default `chain_type="stuff"` inserts the full text of every retrieved review into a single prompt.
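To make the "stuff" idea concrete, the sketch below concatenates retrieved reviews into one prompt. It illustrates the concept only and is not LangChain's implementation: the template wording, the `stuff_prompt` helper, and the `max_chars` budget (which the real chain does not apply) are all hypothetical.

```python
def stuff_prompt(question, documents, max_chars=1200):
    """Build one prompt from as many documents as fit within max_chars of context."""
    used = []
    total = 0
    for doc in documents:
        if total + len(doc) > max_chars:
            break  # stop before the context grows past the budget
        used.append(doc)
        total += len(doc)
    context = "\n\n".join(used)
    prompt = (
        "Use the following reviews to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return prompt, len(used)

# Four fake reviews of 400 characters each, matching the truncation length used earlier
docs = ["a" * 400, "b" * 400, "c" * 400, "d" * 400]
stuffed, n_used = stuff_prompt("What do customers like?", docs, max_chars=1200)
print(n_used, len(stuffed))
```

Because every retrieved review lands in the prompt, the number of retrieved documents multiplied by the 400-character truncation gives a rough upper bound on the context size.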

#### Set Konko API Key

In [65]:
os.environ['KONKO_API_KEY'] = 'your_konko_api_key'

In [66]:
# Import RetrievalQA and Konko API and define review_chain in order to have Llama 2 access the review data

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatKonko

chat = ChatKonko(model='meta-llama/llama-2-13b-chat', max_tokens = 2000)
review_chain = RetrievalQA.from_chain_type(llm=chat, chain_type="stuff", retriever=vstore.as_retriever())

#### Step 5b: Test your RetrievalQA chain and refine Llama 2's prompt and parameters

We're set to extract meaningful feedback and actionable recommendations.

Now all that's left is figuring out the right prompt and parameters for Llama 2 to return the best summaries for your use case.

The right answer will vary depending on your specific context and needs. This is an iterative process and it goes like this:

**Experimentation Phase**: Crafting the query. In this guide, we ask Llama 2 for an overall impression, detailed examples, and potential improvements. We can also adjust LLM parameters such as temperature, top-p, and top-k.

**Optimizing the Response**: Use system messages to steer Llama 2's responses toward clearer insights.


**Execution**: 🚀 From concept to reality: once the prompt works well, you could turn this into weekly summaries for teams so that feedback stays practical and actionable.

**Harness the reviews, guide the strategy!**
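Since the same question is asked for each product, the query can be generated from a template instead of being copied by hand. This is a small sketch; `QUERY_TEMPLATE` and `build_query` are hypothetical names, and the wording mirrors the queries used in the cells that follow.

```python
QUERY_TEMPLATE = """
The reviews you see are for a product called '{product_name}'.
What is the overall impression of these reviews? Give most prevalent examples in bullets.
What do you suggest we focus on improving?
"""

def build_query(product_name):
    """Fill the shared question template with a product name."""
    return QUERY_TEMPLATE.format(product_name=product_name)

products = [
    "Best RFID Blocking Card Holder Case for Men and Women Slim Stainless Steel Metal Wallet",
    "Powerstep Pinnacle Orthotic Shoe Insoles",
]
queries = [build_query(name) for name in products]
print(queries[1])
# In the notebook, each query could be passed to review_chain.run(query)
```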

In [67]:
# Define the task for Llama 2 and run the chain

q="""
The reviews you see are for a product called 'Best RFID Blocking Card Holder Case for Men and Women Slim Stainless Steel Metal Wallet'.
What is the overall impression of these reviews? Give most prevalent examples in bullets.
What do you suggest we focus on improving?
"""

result=review_chain.run(q)
print(result)


  Based on the reviews provided, here is the overall impression of the product:

Overall Impression:

* Good quality and sturdy product
* Satisfies the need for a slim and professional wallet
* Cards are elegantly separated and easy to grab
* Holds all necessary cards and cash
* Aluminum finish is beautiful and professional-looking
* The product is great for those who want a hard case without looking like a cigarette case

Most Prevalent Examples in Bullets:

* Good quality and sturdy product
* Satisfies the need for a slim and professional wallet
* Cards are elegantly separated and easy to grab
* Holds all necessary cards and cash
* Aluminum finish is beautiful and professional-looking

Based on the reviews, it seems that the product is well-liked for its sleek design, sturdiness, and ability to hold all necessary cards and cash. The aluminum finish is also praised for its professional look.

However, there are some suggestions for improvement:

* Some reviewers mentioned that the pro

In [68]:
# Define the task for Llama 2 and run the chain

q="""
The reviews you see are for a product called 'Powerstep Pinnacle Orthotic Shoe Insoles'.
What is the overall impression of these reviews? Give most prevalent examples in bullets.
What do you suggest we focus on improving?
"""

result=review_chain.run(q)
print(result)

  Based on the reviews provided, here is the overall impression of the Powerstep Pinnacle Orthotic Shoe Insoles:

Overall Impression:

* Good quality insoles with effective support for foot pain
* Some users experienced significant improvement in foot pain and discomfort
* The Orthotics-U Blue model is recommended for those with mild pronation
* The Superfeet insoles may be too high in arch support for some users

Most Prevalent Examples (in bullets):

* Many users reported improved foot comfort and reduced pain after using the insoles
* Some users experienced improved arch support and reduced foot fatigue
* The Orthotics-U Blue model was recommended for those with mild pronation
* A few users reported that the Superfeet insoles were too high in arch support and caused discomfort
* Many users were satisfied with the product and recommended it to others

What to Focus on Improving:

* Based on the reviews, it appears that the Powerstep Pinnacle Orthotic Shoe Insoles are effective in pro