## Build a RAG agent that reads internal data from a CSV file and queries an LLM to answer questions about that data

- Start by importing the basic and LangChain libraries.
- Use the `CSVLoader` provided by the LangChain framework to read the CSV.
- Create embeddings for the data using OpenAI.
- Use DocArray to create the vector store.
- Provide the vector index to the LLM to get answers to questions.
- Build a conversational agent using the `langchain.tools.Tool` class from the LangChain framework.
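
The retrieval steps listed above can be sketched without any external service. This is a minimal illustration only: the rows, the bag-of-words `embed` function, and the helper names `retrieve` and `build_prompt` are all hypothetical stand-ins for the CSV file, the OpenAI embeddings, and the LangChain components used in the cells below.

```python
import math
import re
from collections import Counter

# Hypothetical catalog rows standing in for the CSV file.
ROWS = [
    {"name": "Sun Shield Shirt", "description": "UPF 50+ rated shirt that blocks harmful sun rays."},
    {"name": "EcoFlex 3L Storm Pants", "description": "Waterproof pants for rain and storm."},
    {"name": "Recycled Waterhog Dog Mat", "description": "Protects floors from spills."},
]

def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, rows, k=2):
    """Return the k rows whose text is most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        rows,
        key=lambda row: cosine(query_vector, embed(f"{row['name']} {row['description']}")),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, docs):
    """Put the retrieved rows in front of the question, as a 'stuff'-style prompt."""
    context = "\n\n".join(
        f"name: {doc['name']}\ndescription: {doc['description']}" for doc in docs
    )
    return f"Use the context below to answer.\n\n{context}\n\nQuestion: {query}"

top = retrieve("sun protection shirt", ROWS, k=1)
prompt = build_prompt("sun protection shirt", top)
print(prompt)
```

In the notebook itself, the embedding, the similarity search, and the prompt assembly are handled by `OpenAIEmbeddings`, `DocArrayInMemorySearch`, and the LangChain chains respectively.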

In [43]:
import dotenv
import os
import openai
import warnings
warnings.filterwarnings('ignore')

In [44]:
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

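# Choose the chat model by date so that a deprecated OpenAI snapshot is not used:
# after 12 June 2024 use 'gpt-3.5-turbo', before that the 'gpt-3.5-turbo-0301' snapshot.
import datetime
current_date = datetime.datetime.now()
target_date = datetime.datetime(2024, 6, 12)

if current_date > target_date:
    llm_model = 'gpt-3.5-turbo'
else:
    llm_model = 'gpt-3.5-turbo-0301'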

In [45]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from langchain.llms import OpenAI
from IPython.display import Markdown, display

In [46]:
loader = CSVLoader(file_path="OutdoorClothingCatalog_1000.csv")

In [47]:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

from langchain.indexes import VectorstoreIndexCreator
#help(VectorstoreIndexCreator)

In [48]:
pip install --quiet -U docarray

Note: you may need to restart the kernel to use updated packages.


In [49]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])

In [50]:
query="Please list all your shirts with sun protection \
in a table in markdown and summarize each one."

In [51]:
llm_replacement_model = OpenAI(
    model="gpt-3.5-turbo-instruct",
    temperature=0.2
)

In [52]:
response = index.query(query, llm=llm_replacement_model)

In [53]:
display(Markdown(response))



| Name | Description | Sun Protection Rating |
| --- | --- | --- |
| Men's Tropical Plaid Short-Sleeve Shirt | Made of 100% polyester, UPF 50+ rating, front and back cape venting, two front bellows pockets, imported | SPF 50+, blocks 98% of harmful UV rays |
| Men's Plaid Tropic Shirt, Short-Sleeve | Made of 52% polyester and 48% nylon, UPF 50+ rating, front and back cape venting, two front bellows pockets, machine washable and dryable, imported | SPF 50+, blocks 98% of harmful UV rays |
| Men's TropicVibe Shirt, Short-Sleeve | Made of 71% nylon and 29% polyester, UPF 50+ rating, front and back cape venting, two front bellows pockets, machine washable and dryable, imported | SPF 50+, blocks 98% of harmful UV rays |
| Sun Shield Shirt | Made of 78% nylon and 22% Lycra Xtra Life fiber, UPF 50+ rating, wicks moisture, fits comfortably over swimsuit, abrasion resistant, imported | SPF 50

In [54]:
print(response)



| Name | Description | Sun Protection Rating |
| --- | --- | --- |
| Men's Tropical Plaid Short-Sleeve Shirt | Made of 100% polyester, UPF 50+ rating, front and back cape venting, two front bellows pockets, imported | SPF 50+, blocks 98% of harmful UV rays |
| Men's Plaid Tropic Shirt, Short-Sleeve | Made of 52% polyester and 48% nylon, UPF 50+ rating, front and back cape venting, two front bellows pockets, machine washable and dryable, imported | SPF 50+, blocks 98% of harmful UV rays |
| Men's TropicVibe Shirt, Short-Sleeve | Made of 71% nylon and 29% polyester, UPF 50+ rating, front and back cape venting, two front bellows pockets, machine washable and dryable, imported | SPF 50+, blocks 98% of harmful UV rays |
| Sun Shield Shirt | Made of 78% nylon and 22% Lycra Xtra Life fiber, UPF 50+ rating, wicks moisture, fits comfortably over swimsuit, abrasion resistant, imported | SPF 50


In [55]:
loader.file_path

'OutdoorClothingCatalog_1000.csv'

In [56]:
import pandas as pd

csv_data = pd.read_csv("OutdoorClothingCatalog_1000.csv")
csv_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   Unnamed: 0   1000 non-null   int64 
 1   name         1000 non-null   object
 2   description  1000 non-null   object
dtypes: int64(1), object(2)
memory usage: 23.6+ KB


In [57]:
csv_data.head()

| | Unnamed: 0 | name | description |
| --- | --- | --- | --- |
| 0 | 0 | Women's Campside Oxfords | This ultracomfortable lace-to-toe Oxford boast... |
| 1 | 1 | Recycled Waterhog Dog Mat, Chevron Weave | Protect your floors from spills and splashing ... |
| 2 | 2 | Infant and Toddler Girls' Coastal Chill Swimsu... | She'll love the bright colors, ruffles and exc... |
| 3 | 3 | Refresh Swimwear, V-Neck Tankini Contrasts | Whether you're going for a swim or heading out... |
| 4 | 4 | EcoFlex 3L Storm Pants | Our new TEK O2 technology makes our four-seaso... |


In [58]:
query = """What are the different products available based on the names?"""
response = index.query(query, llm=llm_replacement_model)

In [59]:
display(Markdown(response))

 The first product is called Microsilk and the second product is called NanoFoamPlus.

In [60]:
query = """How many products are there in the file in total and name them?"""
response = index.query(query, llm=llm_replacement_model)

In [61]:
display(Markdown(response))

 There are three products in the file: Microsilk, Microblocks, and Microbot.
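
Neither of the two answers above matches the 1000-row catalog reported by `csv_data.info()`. A likely reason (an assumption about the retriever's behavior, not something this notebook verifies) is that only a handful of the most similar rows are passed to the LLM, so the model never sees the whole file. Aggregate questions such as counts are usually better answered from the table directly. The sketch below uses the standard-library `csv` module on a small inline sample; the descriptions are hypothetical and `summarize_catalog` is an illustrative helper, not part of the notebook.

```python
import csv
import io

# Small inline sample that mimics the catalog's columns (hypothetical descriptions).
SAMPLE_CSV = """,name,description
0,Women's Campside Oxfords,Comfortable lace-to-toe Oxford.
1,EcoFlex 3L Storm Pants,Four-season waterproof pants.
2,Sun Shield Shirt,UPF 50+ rated sun shirt.
"""

def summarize_catalog(csv_text):
    """Count the rows and collect the product names without involving an LLM."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    names = [row["name"] for row in rows]
    return len(names), names

count, names = summarize_catalog(SAMPLE_CSV)
print(count, names)
```

With the DataFrame already loaded above, `len(csv_data)` and `csv_data["name"]` give the same information for the real file.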

In [62]:
doc = loader.load()

In [63]:
type(doc)

list

In [64]:
#library to create embeddings
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

embedded_vector = embeddings.embed_query("Sanakausar Kazi")

In [65]:
print(len(embedded_vector))  # dimensionality of the embedding vector

1536


In [66]:
db = DocArrayInMemorySearch.from_documents(doc, embeddings)

In [67]:
query = "Suggest me some sunblocking shirts."

result = db.similarity_search(query)

In [68]:
print(result)

[Document(metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 255}, page_content=': 255\nname: Sun Shield Shirt by\ndescription: "Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. \n\nSize & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.\n\nFabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry.\n\nAdditional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.\n\nSun Protection That Won\'t Wear Off\nOur high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun\'s harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.'), Document(metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 374}, page_content=": 374\nname: Men's Plaid Tropic Shirt, Short-Sleev

In [69]:
llm_db = ChatOpenAI(model=llm_model, temperature=0.2)

qdocs = "".join(d.page_content for d in result)
print(qdocs)

: 255
name: Sun Shield Shirt by
description: "Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. 

Size & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.

Fabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry.

Additional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.

Sun Protection That Won't Wear Off
Our high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun's harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.: 374
name: Men's Plaid Tropic Shirt, Short-Sleeve
description: Our Ultracomfortable sun protection is rated to UPF 50+, helping you stay cool and dry. Originally designed for fishing, this lightest hot-weather shirt offers UPF 50+ coverage and is gr

In [70]:
type(qdocs)

str
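
Joining with an empty string runs the documents together: in the output above, the first description ends in `protectant.: 374`, with the next row's index glued onto it. A visible separator keeps document boundaries clear for the LLM. The sketch below uses hypothetical strings in place of `result[i].page_content`, and `join_documents` is an illustrative helper name.

```python
# Hypothetical page contents standing in for the retrieved documents.
pages = [
    "name: Sun Shield Shirt\ndescription: Blocks UV rays.",
    "name: Men's Plaid Tropic Shirt, Short-Sleeve\ndescription: UPF 50+ coverage.",
]

def join_documents(pages, separator="\n\n---\n\n"):
    """Concatenate page contents with an explicit separator between documents."""
    return separator.join(pages)

glued = "".join(pages)             # boundaries are lost
separated = join_documents(pages)  # boundaries stay visible
print(separated)
```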

In [71]:
response = llm_db.call_as_llm(
    f"{qdocs} Question: Please list all the shirts from the list provided above in a table in markdown and summarize each one."
)

In [72]:
display(Markdown(response))

| Name | Description


In [73]:
retriever = db.as_retriever()
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm_db,
    retriever=retriever,
    chain_type="stuff",
    verbose=True
)

In [74]:
query = "List down all your shirts which provide sun blocking and format them in a tabular format in a markdown and summarize each item."

search_result = qa_stuff.run(query)

display(Markdown(search_result))

> Entering new RetrievalQA chain...

> Finished chain.

| **Name** | **Description** |
| --- | --- |
| Men's Plaid Tropic Shirt, Short-Sleeve | Ultracomfortable shirt with UPF 50+ sun protection, designed for fishing and extended travel. Made of 52% polyester and 48% nylon, wrinkle-free, and quick-drying. Features front and back cape venting, two front bellows pockets. Imported design. |
| Men's Tropical Plaid Short-Sleeve Shirt | Lightest hot-weather shirt with UPF 50+ protection. Made of 100% polyester, wrinkle-resistant, with front and back cape venting, and two front bellows pockets. Provides superior sun protection. Imported. |
| Sun Shield Shirt by | High-performance sun shirt with SPF 50+ sun protection, blocking 98% of harmful UV rays. Made of 78% nylon and 22% Lycra Xtra Life fiber. Handwash, line dry. Moisture-wicking, fits comfortably over swimsuit, abrasion-resistant. Recommended by The Skin Cancer Foundation. |
| Men's TropicVibe Shirt, Short-Sleeve | Sun-protection shirt with built-in UPF 50+, lightweight and comfortable. Made of 71% nylon and 29% polyester shell, 100% polyester knit mesh lining. Wrinkle-resistant, front and back cape venting, two front bellows pockets. Provides SPF 50+ sun protection. |


In [75]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings
).from_loaders([loader])

In [76]:
response = index.query(query, llm=llm_db)
display(Markdown(response))

| Name | Description
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   

None


#### Build a conversational agent to help users query the data

In [77]:
from langchain.tools import Tool
tools = [
    Tool(
        name="Knowledge Base",
        func=qa_stuff,
        description="Use this tool to answer questions about the listed products."
    )
]
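When a chain object such as `qa_stuff` is passed directly as `func`, the tool's observation is the chain's whole output dictionary (with `query` and `result` keys) rather than just the answer text. The following is a minimal sketch of a wrapper that returns only the answer string. The names `answer_only`, `fake_chain`, and `kb_func` are hypothetical, and `fake_chain` is a stand-in that assumes the chain accepts a `{"query": ...}` dictionary and returns a dictionary containing `result`.

```python
from typing import Callable, Dict


def answer_only(chain: Callable[[Dict[str, str]], Dict[str, str]]) -> Callable[[str], str]:
    """Wrap a chain-like callable so that it returns only the 'result' text."""
    def run(question: str) -> str:
        output = chain({"query": question})
        return output["result"]
    return run


# Hypothetical stand-in for qa_stuff, used only to keep this sketch self-contained.
def fake_chain(inputs: Dict[str, str]) -> Dict[str, str]:
    return {"query": inputs["query"], "result": "stub answer for: " + inputs["query"]}


kb_func = answer_only(fake_chain)
print(kb_func("Sun Shield Shirt UPF protection"))
```

Under these assumptions, `answer_only(qa_stuff)` could be passed as `func` so that the agent sees a plain string as its observation.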

In [78]:
from langchain.agents import initialize_agent
from langchain.chains.conversation.memory import ConversationBufferWindowMemory

conversation_memory = ConversationBufferWindowMemory(
    memory_key="chat_history",
    k=5,
    return_messages=True
)

agent = initialize_agent(
    tools=tools,
    agent='chat-conversational-react-description',
    llm=llm_db,
    verbose=True,
    max_iterations=3,
    early_stopping_method="generate",
    memory=conversation_memory
)
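`ConversationBufferWindowMemory` with `k=5` keeps only the five most recent exchanges in `chat_history`. The sketch below illustrates that sliding-window idea in plain Python. It is not LangChain's implementation, and the `WindowMemory` class is a hypothetical name used only for illustration.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowMemory:
    """Keep only the k most recent (human, ai) exchanges."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen drops the oldest entry automatically.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> List[Tuple[str, str]]:
        return list(self.turns)


memory = WindowMemory(k=2)
for i in range(1, 4):
    memory.save(f"question {i}", f"answer {i}")

# Only the last two exchanges remain; the first one has been dropped.
print(memory.history())
```

A small window keeps the prompt short, at the cost of the agent forgetting earlier turns.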

In [79]:
agent(query)



> Entering new AgentExecutor chain...
```json
{
    "action": "Knowledge Base",
    "action_input": "List of shirts with sun blocking feature"
}
```

> Entering new RetrievalQA chain...

> Finished chain.

Observation: {'query': 'List of shirts with sun blocking feature', 'result': "1. Sun Shield Shirt by [Brand Name]\n2. Men's Plaid Tropic Shirt, Short-Sleeve\n3. Men's TropicVibe Shirt, Short-Sleeve\n4. Men's Tropical Plaid Short-Sleeve Shirt"}
Thought:```json
{
    "action": "Final Answer",
    "action_input": "The shirts that provide sun blocking feature are: \n\n1. Sun Shield Shirt by [Brand Name] - This shirt offers excellent sun protection with its special fabric.\n\n2. Men's Plaid Tropic Shirt, Short-Sleeve - A stylish short-sleeve shirt that blocks harmful UV rays.\n\n3. Men's TropicVibe Shirt, Short-Sleeve - This shirt is designed to keep you protected from the sun while keeping you cool and comfortable.\

{'input': 'List down all your shirts which provide sun blocking and format them in a tabular format in a markdown and summarize each item.',
 'chat_history': [],
 'output': "The shirts that provide sun blocking feature are: \n\n1. Sun Shield Shirt by [Brand Name] - This shirt offers excellent sun protection with its special fabric.\n\n2. Men's Plaid Tropic Shirt, Short-Sleeve - A stylish short-sleeve shirt that blocks harmful UV rays.\n\n3. Men's TropicVibe Shirt, Short-Sleeve - This shirt is designed to keep you protected from the sun while keeping you cool and comfortable.\n\n4. Men's Tropical Plaid Short-Sleeve Shirt - A fashionable shirt that also provides sun protection."}

In [80]:
agent("can you list down only the names of shirts from the above?")



> Entering new AgentExecutor chain...
```json
{
    "action": "Final Answer",
    "action_input": "Sun Shield Shirt, Men's Plaid Tropic Shirt, Men's TropicVibe Shirt, Men's Tropical Plaid Short-Sleeve Shirt"
}
```

> Finished chain.


{'input': 'can you list down only the names of shirts from the above?',
 'chat_history': [HumanMessage(content='List down all your shirts which provide sun blocking and format them in a tabular format in a markdown and summarize each item.'),
  AIMessage(content="The shirts that provide sun blocking feature are: \n\n1. Sun Shield Shirt by [Brand Name] - This shirt offers excellent sun protection with its special fabric.\n\n2. Men's Plaid Tropic Shirt, Short-Sleeve - A stylish short-sleeve shirt that blocks harmful UV rays.\n\n3. Men's TropicVibe Shirt, Short-Sleeve - This shirt is designed to keep you protected from the sun while keeping you cool and comfortable.\n\n4. Men's Tropical Plaid Short-Sleeve Shirt - A fashionable shirt that also provides sun protection.")],
 'output': "Sun Shield Shirt, Men's Plaid Tropic Shirt, Men's TropicVibe Shirt, Men's Tropical Plaid Short-Sleeve Shirt"}
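In the traces, each model reply is a fenced JSON blob with an `action` (either a tool name or `Final Answer`) and an `action_input`. The sketch below shows one way such a blob could be parsed. It is an illustration, not LangChain's own output parser, and `parse_action` and `FENCE` are hypothetical names.

```python
import json
import re
from typing import Tuple

# Built programmatically so this example contains no literal code fence.
FENCE = "`" * 3


def parse_action(reply: str) -> Tuple[str, str]:
    """Extract (action, action_input) from a reply wrapped in a JSON fence."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, reply, flags=re.DOTALL)
    # Fall back to treating the whole reply as JSON if no fence is found.
    payload = match.group(1) if match else reply
    data = json.loads(payload)
    return data["action"], data["action_input"]


reply = (
    FENCE + "json\n"
    '{\n    "action": "Knowledge Base",\n'
    '    "action_input": "List of shirts with sun blocking feature"\n}\n'
    + FENCE
)
action, action_input = parse_action(reply)
print(action)
print(action_input)
```

An `action` of `Final Answer` ends the loop, which is why the second question above was answered from `chat_history` without a call to the `Knowledge Base` tool.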

In [82]:
response = agent("which shirt provides the maximum UPF protection?")



> Entering new AgentExecutor chain...
```json
{
    "action": "Knowledge Base",
    "action_input": "Sun Shield Shirt UPF protection"
}
```

> Entering new RetrievalQA chain...

> Finished chain.

Observation: {'query': 'Sun Shield Shirt UPF protection', 'result': "The Sun Shield Shirt provides UPF 50+ sun protection, which blocks 98% of the sun's harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant."}
Thought:```json
{
    "action": "Final Answer",
    "action_input": "The Sun Shield Shirt provides UPF 50+ sun protection, which blocks 98% of the sun's harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant."
}
```

> Finished chain.


In [87]:
from pprint import pprint
pprint(response['chat_history'])

[HumanMessage(content='List down all your shirts which provide sun blocking and format them in a tabular format in a markdown and summarize each item.'),
 AIMessage(content="The shirts that provide sun blocking feature are: \n\n1. Sun Shield Shirt by [Brand Name] - This shirt offers excellent sun protection with its special fabric.\n\n2. Men's Plaid Tropic Shirt, Short-Sleeve - A stylish short-sleeve shirt that blocks harmful UV rays.\n\n3. Men's TropicVibe Shirt, Short-Sleeve - This shirt is designed to keep you protected from the sun while keeping you cool and comfortable.\n\n4. Men's Tropical Plaid Short-Sleeve Shirt - A fashionable shirt that also provides sun protection."),
 HumanMessage(content='can you list down only the names of shirts from the above?'),
 AIMessage(content="Sun Shield Shirt, Men's Plaid Tropic Shirt, Men's TropicVibe Shirt, Men's Tropical Plaid Short-Sleeve Shirt"),
 HumanMessage(content='which shirt provides the maximum UPF protection?'),
 AIMessage(content="T

In [89]:
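# Iterating over the response dict yields its keys.
for item in response:
    # Reusing single quotes inside a single-quoted f-string is a syntax error before Python 3.12.
    print('-' * 50)
    print(item)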

--------------------------------------------------
input
--------------------------------------------------
chat_history
--------------------------------------------------
output


In [90]:
print(response['output'])

The Sun Shield Shirt provides UPF 50+ sun protection, which blocks 98% of the sun's harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.
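The `chat_history` entry of the response is a list of message objects, each with a `content` attribute. The sketch below renders such a list as a readable transcript. The `HumanMessage` and `AIMessage` dataclasses here are hypothetical stand-ins for LangChain's message classes so that the example runs on its own, and `format_transcript` is an illustrative helper name.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HumanMessage:  # hypothetical stand-in for LangChain's HumanMessage
    content: str


@dataclass
class AIMessage:  # hypothetical stand-in for LangChain's AIMessage
    content: str


def format_transcript(messages: List[object]) -> str:
    """Render messages as 'User: ...' / 'Assistant: ...' lines."""
    lines = []
    for message in messages:
        role = "User" if type(message).__name__ == "HumanMessage" else "Assistant"
        lines.append(f"{role}: {message.content}")
    return "\n".join(lines)


history = [
    HumanMessage(content="can you list down only the names of shirts from the above?"),
    AIMessage(content="Sun Shield Shirt, Men's Plaid Tropic Shirt, Men's TropicVibe Shirt, "
                      "Men's Tropical Plaid Short-Sleeve Shirt"),
]
print(format_transcript(history))
```

Because the helper dispatches on the class name only, it should work the same way on `response['chat_history']`, assuming the real message objects expose `content` as the output above suggests.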
