# LangChain: Q&A over Documents

This notebook builds a question-answering tool over a product catalog, so that the catalog can be queried in natural language for items of interest.

In [8]:
#pip install --upgrade langchain

In [9]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [10]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

In [15]:
file = 'myntra_products_catalog.csv'
loader = CSVLoader(file_path=file)

In [16]:
from langchain.indexes import VectorstoreIndexCreator

In [17]:
#pip install docarray

In [18]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

In [19]:
import pandas as pd

In [20]:
df = pd.read_csv('myntra_products_catalog.csv')

In [21]:
df.head()

| | ProductID | ProductName | ProductBrand | Gender | Price (INR) | NumImages | Description | PrimaryColor |
|---|-----------|-------------|--------------|--------|-------------|-----------|-------------|--------------|
| 0 | 10017413 | DKNY Unisex Black & Grey Printed Medium Trolle... | DKNY | Unisex | 11745 | 7 | Black and grey printed medium trolley bag, sec... | Black |
| 1 | 10016283 | EthnoVogue Women Beige & Grey Made to Measure ... | EthnoVogue | Women | 5810 | 7 | Beige & Grey made to measure kurta with churid... | Beige |
| 2 | 10009781 | SPYKAR Women Pink Alexa Super Skinny Fit High-... | SPYKAR | Women | 899 | 7 | Pink coloured wash 5-pocket high-rise cropped ... | Pink |
| 3 | 10015921 | Raymond Men Blue Self-Design Single-Breasted B... | Raymond | Men | 5599 | 5 | Blue self-design bandhgala suitBlue self-desig... | Blue |
| 4 | 10017833 | Parx Men Brown & Off-White Slim Fit Printed Ca... | Parx | Men | 759 | 5 | Brown and off-white printed casual shirt, has ... | White |


In [22]:
df.columns

Index(['ProductID', 'ProductName', 'ProductBrand', 'Gender', 'Price (INR)',
       'NumImages', 'Description', 'PrimaryColor'],
      dtype='object')

In [23]:
df.dtypes

ProductID        int64
ProductName     object
ProductBrand    object
Gender          object
Price (INR)      int64
NumImages        int64
Description     object
PrimaryColor    object
dtype: object

In [24]:
df.shape

(12491, 8)

In [26]:
df = df.dropna()
df.shape

(11597, 8)

In [27]:
df.describe().transpose()

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|-------|------|-----|-----|-----|-----|-----|-----|
| ProductID | 11597.0 | 9962030.0 | 1297211.0 | 101206.0 | 10063875.0 | 10155387.0 | 10215659.0 | 10275139.0 |
| Price (INR) | 11597.0 | 1460.913 | 2159.003 | 153.0 | 649.0 | 939.0 | 1499.0 | 63090.0 |
| NumImages | 11597.0 | 4.967319 | 1.063547 | 1.0 | 5.0 | 5.0 | 5.0 | 10.0 |


In [28]:
df['Description']

0        Black and grey printed medium trolley bag, sec...
1        Beige & Grey made to measure kurta with churid...
2        Pink coloured wash 5-pocket high-rise cropped ...
3        Blue self-design bandhgala suitBlue self-desig...
4        Brown and off-white printed casual shirt, has ...
                               ...                        
12485    Black lace full-coverage Bralette bra Lightly ...
12486    Black dark wash 5-pocket low-rise jeans, clean...
12487    A pair of gold-toned open toe heels, has regul...
12488    Navy Blue and White printed mid-rise denim sho...
12490    Black and grey striped T-shirt, has a polo col...
Name: Description, Length: 11597, dtype: object

In [29]:
query = "Please list all your shirts with sun protection \
in a table in markdown and summarize each one."

In [30]:
response = index.query(query)

In [31]:
display(Markdown(response))

 I don't know.

In [32]:
query = "Please list all your shirts available in black \
in a table in markdown and summarize each one."

In [33]:
response = index.query(query)

In [34]:
display(Markdown(response))



| ProductID | ProductName | Price (INR) |
|-----------|------------|-------------|
| 10245285 | Basics Men Black & Grey Slim Fit Striped Casual Shirt | 1469 |
| 10245063 | Basics Men Black Slim Fit Solid Casual Shirt | 1424 |
| 10201151 | Mufti Men Black Regular Fit Printed Casual Shirt | 999 |

Basics Men Black & Grey Slim Fit Striped Casual Shirt: Black striped casual shirt, has a spread collar, long sleeves, button placket, and curved hem.

Basics Men Black Slim Fit Solid Casual Shirt: Black solid casual shirt, has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket.

Mufti Men Black Regular Fit Printed Casual Shirt: Black printed casual shirt, has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket.

In [35]:
loader = CSVLoader(file_path=file)

In [36]:
docs = loader.load()

In [37]:
docs[0]

Document(page_content='ProductID: 10017413\nProductName: DKNY Unisex Black & Grey Printed Medium Trolley Bag\nProductBrand: DKNY\nGender: Unisex\nPrice (INR): 11745\nNumImages: 7\nDescription: Black and grey printed medium trolley bag, secured with a TSA lockOne handle on the top and one on the side, has a trolley with a retractable handle on the top and four corner mounted inline skate wheelsOne main zip compartment, zip lining, two compression straps with click clasps, one zip compartment on the flap with three zip pocketsWarranty: 5 yearsWarranty provided by Brand Owner / Manufacturer\nPrimaryColor: Black', metadata={'source': 'myntra_products_catalog.csv', 'row': 0})
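
The `page_content` above suggests that each CSV row is flattened into one `column: value` line per field, with the file name and row number kept as metadata. The following is a minimal sketch of that flattening using only the standard library. `row_to_document` and `load_rows` are hypothetical helpers written for illustration, not LangChain's implementation, and the sample rows are a shortened subset of the catalog columns.

```python
import csv
import io

# Two shortened rows in the same style as the catalog (illustrative sample only).
SAMPLE_CSV = """ProductID,ProductName,ProductBrand,Gender,Price (INR),PrimaryColor
10017413,DKNY Unisex Black & Grey Printed Medium Trolley Bag,DKNY,Unisex,11745,Black
10201151,Mufti Men Black Regular Fit Printed Casual Shirt,Mufti,Men,999,Black
"""

def row_to_document(row, source, row_number):
    """Hypothetical helper: flatten one CSV row into text plus metadata."""
    page_content = "\n".join(f"{column}: {value}" for column, value in row.items())
    return {
        "page_content": page_content,
        "metadata": {"source": source, "row": row_number},
    }

def load_rows(csv_text, source):
    """Hypothetical helper: turn every row of a CSV string into a document dict."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_document(row, source, i) for i, row in enumerate(reader)]

documents = load_rows(SAMPLE_CSV, source="sample.csv")
print(documents[0]["page_content"])
print(documents[0]["metadata"])
```

Because each row becomes its own document, a later similarity search returns whole product records rather than fragments of the file.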

In [38]:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

In [39]:
embed = embeddings.embed_query("Hi my name is Harrison")

In [40]:
print(len(embed))

1536


In [41]:
print(embed[:5])

[-0.021913960576057434, 0.006774206645786762, -0.018190348520874977, -0.039148248732089996, -0.014089343138039112]


In [42]:
db = DocArrayInMemorySearch.from_documents(
    docs, 
    embeddings
)

In [43]:
query = "Please suggest a cotton shirt in black"

In [44]:
docs = db.similarity_search(query)

In [45]:
len(docs)

4

In [46]:
docs[0]

Document(page_content='ProductID: 10201151\nProductName: Mufti Men Black Regular Fit Printed Casual Shirt\nProductBrand: Mufti\nGender: Men\nPrice (INR): 999\nNumImages: 5\nDescription: Black printed casual shirt, has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket\nPrimaryColor: Black', metadata={'source': 'myntra_products_catalog.csv', 'row': 8738})
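
Conceptually, the similarity search above embeds the query and returns the stored documents whose vectors lie closest to it. The sketch below illustrates that idea with toy 3-dimensional vectors in place of the 1536-dimensional embeddings, assuming cosine similarity as the metric. `cosine_similarity`, `top_k`, and the toy vectors are illustrative assumptions, not the vector store's actual code; the default `k=4` simply mirrors the four documents returned above.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, indexed, k=4):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [
        (text, cosine_similarity(query_vector, vector))
        for text, vector in indexed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings (made-up values).
toy_index = [
    ("black cotton shirt", [0.9, 0.1, 0.0]),
    ("blue denim jeans", [0.1, 0.9, 0.0]),
    ("black trolley bag", [0.5, 0.0, 0.8]),
]
query_vector = [1.0, 0.0, 0.0]

results = top_k(query_vector, toy_index, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

A linear scan like this is only practical for small in-memory collections; it is meant to show the ranking step, not how a production index is organized.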

In [47]:
retriever = db.as_retriever()

In [48]:
llm = ChatOpenAI(temperature=0.0)


In [49]:
# Concatenate the text of every retrieved document into one context string
qdocs = "".join(doc.page_content for doc in docs)


In [50]:
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.") 


In [51]:
display(Markdown(response))

| ProductID  | ProductName                                                | ProductBrand   | Gender | Price (INR) | NumImages | Description                                                                                          | PrimaryColor |
|------------|------------------------------------------------------------|----------------|--------|-------------|-----------|------------------------------------------------------------------------------------------------------|--------------|
| 10201151   | Mufti Men Black Regular Fit Printed Casual Shirt            | Mufti          | Men    | 999         | 5         | Black printed casual shirt, has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket | Black        |
| 10038579   | IVOC Men Black Regular Fit Printed Casual Shirt             | IVOC           | Men    | 461         | 5         | Black printed casual shirt, has a spread collar, short sleeves, button placket, and curved hem         | Black        |
| 10153269   | Indian Terrain Men Black Slim Fit Printed Pure Cotton Shirt | Indian Terrain | Men    | 989         | 6         | Black printed casual shirt, has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket | Black        |
| 10201137   | Mufti Men Black & White Regular Fit Printed Casual Shirt    | Mufti          | Men    | 1074        | 5         | Black and White printed casual shirt, has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket | Black        |

Summary:
1. Mufti Men Black Regular Fit Printed Casual Shirt: This shirt is from the brand Mufti and is designed for men. It is priced at INR 999 and comes in a black color. It has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket.

2. IVOC Men Black Regular Fit Printed Casual Shirt: This shirt is from the brand IVOC and is designed for men. It is priced at INR 461 and comes in a black color. It has a spread collar, short sleeves, button placket, and curved hem.

3. Indian Terrain Men Black Slim Fit Printed Pure Cotton Shirt: This shirt is from the brand Indian Terrain and is designed for men. It is priced at INR 989 and comes in a black color. It has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket.

4. Mufti Men Black & White Regular Fit Printed Casual Shirt: This shirt is from the brand Mufti and is designed for men. It is priced at INR 1074 and comes in a black and white color. It has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket.

In [52]:
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever, 
    verbose=True
)
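
The `"stuff"` chain type places all retrieved documents into a single prompt and sends it to the model in one call, much like the manual `qdocs` concatenation earlier. Below is a minimal sketch of that prompt assembly. `build_stuff_prompt` is a hypothetical helper and the instruction wording is an assumption for illustration, not the template LangChain actually uses.

```python
def build_stuff_prompt(documents, question):
    """Hypothetical helper: put every retrieved document into one prompt."""
    # Separate documents with a blank line so records do not run together.
    context = "\n\n".join(documents)
    return (
        "Use the following pieces of context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )

# Shortened records in the style of the catalog documents (illustrative only).
retrieved = [
    "ProductName: Mufti Men Black Regular Fit Printed Casual Shirt\nPrice (INR): 999",
    "ProductName: IVOC Men Black Regular Fit Printed Casual Shirt\nPrice (INR): 461",
]

prompt = build_stuff_prompt(retrieved, "Which shirt is cheapest?")
print(prompt)
```

Since everything goes into one prompt, this approach only works while the retrieved documents fit in the model's context window.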

In [53]:
query = "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."

In [54]:
response = qa_stuff.run(query)



> Entering new RetrievalQA chain...

> Finished chain.


In [55]:
display(Markdown(response))

I'm sorry, but I don't have access to the information about sun protection for the shirts.

In [57]:
query = "Please list all your black cotton shirts in a table \
in markdown and summarize each one."

In [58]:
response = qa_stuff.run(query)



> Entering new RetrievalQA chain...

> Finished chain.


In [59]:
display(Markdown(response))

| ProductID  | ProductName                                                  | ProductBrand        | Gender | Price (INR) | NumImages | Description                                                                                                                                                    | PrimaryColor |
|------------|--------------------------------------------------------------|---------------------|--------|-------------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
| 10245285   | Basics Men Black & Grey Slim Fit Striped Casual Shirt        | Basics              | Men    | 1469        | 5         | Black striped casual shirt, has a spread collar, long sleeves, button placket, and curved hem                                                                 | Black        |
| 10074235   | Black coffee Men White & Blue Slim Fit Checked Formal Shirt  | Black coffee        | Men    | 1599        | 5         | White and Blue checked formal shirt, has a spread collar, long sleeves, button placket, straight hem, and 1 patch pocket                                      | Blue         |
| 10201151   | Mufti Men Black Regular Fit Printed Casual Shirt             | Mufti               | Men    | 999         | 5         | Black printed casual shirt, has a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket                                                  | Black        |
| 10267633   | Calvin Klein Jeans Men White & Black Colourblocked T-shirt   | Calvin Klein Jeans  | Men    | 1799        | 6         | White and black Tshirt for men   Colourblocked   Regular length   Round neck   Short,  regular sleeves   Knitted cotton fabric   The monochromatic trend is all about wearing the same hue in different textures for an overall tonal look. It allows you to add dimension and depth to your style affordably and has the power to accentuate or down-play certain body parts, making it the right fit for all. | Black        |

Summary:
1. Basics Men Black & Grey Slim Fit Striped Casual Shirt: A black striped casual shirt with a spread collar, long sleeves, button placket, and curved hem.
2. Black coffee Men White & Blue Slim Fit Checked Formal Shirt: A white and blue checked formal shirt with a spread collar, long sleeves, button placket, straight hem, and 1 patch pocket.
3. Mufti Men Black Regular Fit Printed Casual Shirt: A black printed casual shirt with a spread collar, long sleeves, button placket, curved hem, and 1 patch pocket.
4. Calvin Klein Jeans Men White & Black Colourblocked T-shirt: A white and black T-shirt for men with a colourblocked design, regular length, round neck, short regular sleeves, and knitted cotton fabric. The monochromatic trend allows for an overall tonal look and adds dimension and depth to your style.

In [60]:
response = index.query(query, llm=llm)

In [61]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])