<a href="https://colab.research.google.com/github/poojamahajan0712/Langchain/blob/main/langchain_q_a_over_doc.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LangChain: Q&A over Documents

This notebook uses LangChain to answer questions over a set of documents. An example would be a tool that lets you query a product catalog for items of interest.

In [None]:
!pip install --upgrade langchain
!pip install python-dotenv
!pip install openai

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file
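
After loading the `.env` file, it can help to fail early if the API key is missing rather than hitting an authentication error later. The sketch below is one way to do that; `require_env` is a hypothetical helper (not part of `python-dotenv` or LangChain), and `OPENAI_API_KEY` is assumed to be the variable the key is stored under.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or the environment."
        )
    return value


# Example usage (assumes the key is stored under OPENAI_API_KEY):
# api_key = require_env("OPENAI_API_KEY")
```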

In [None]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

In [None]:
import pandas as pd
file = 'adidas_usa.csv'
df1 = pd.read_csv(file)
df1 = df1[['name','description']]
df1.to_csv("product_des.csv",index=False)


In [None]:
df1.head()

| | name | description |
| --- | --- | --- |
| 0 | Beach Shorts | Splashing in the surf. Making memories with yo... |
| 1 | Five Ten Kestrel Lace Mountain Bike Shoes | Lace up and get after it. The Five Ten Kestrel... |
| 2 | Mexico Away Jersey | Clean and crisp, this adidas Mexico Away Jerse... |
| 3 | Five Ten Hiangle Pro Competition Climbing Shoes | The Hiangle Pro takes on the classic shape of ... |
| 4 | Mesh Broken-Stripe Polo Shirt | Step up to the tee relaxed. This adidas golf p... |


In [None]:
df1.tail(5)

| | name | description |
| --- | --- | --- |
| 840 | Supernova+ Shoes | Take off. Touch down. Repeat. These adidas run... |
| 841 | Choigo Shoes | If you want drama, the bold female track and f... |
| 842 | Daily 3.0 Shoes | The style is in the details of the Daily 3.0 S... |
| 843 | Daily 3.0 Shoes | The style is in the details of the Daily 3.0 S... |
| 844 | Choigo Shoes | Take your style to bold new heights. Throw in ... |


In [None]:
df1['description'][0]

'Splashing in the surf. Making memories with your friends. Beach days are the best days. These shorts are made of stretchy woven fabric. An elastic waistband that features the adidas logo brings a sporty look to your day at the beach.'

In [None]:
file_path = "product_des.csv"
loader = CSVLoader(file_path=file_path)

In [None]:
from langchain.indexes import VectorstoreIndexCreator

In [None]:
!pip install tiktoken

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tiktoken
  Downloading tiktoken-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB)
Installing collected packages: tiktoken
Successfully installed tiktoken-0.4.0


In [None]:
!pip install docarray

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting docarray
  Downloading docarray-0.33.0-py3-none-any.whl (220 kB)
Collecting orjson>=3.8.2 (from docarray)
  Downloading orjson-3.9.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (136 kB)
Collecting types-requests>=2.28.11.6 (from docarray)
  Downloading types_requests-2.31.0.1-py3-none-any.whl (14 kB)
Collecting types-urllib3 (from types-requests>=2.28.11.6->docarray)
  Downloading types_urllib3-1.26.25.13-py3-none-any.whl (15 kB)
Installing collected packages: types-urllib3, types-requests, orjson, docarray
Successfully installed docarray-0.33.0 orjson-3.9.1 types-requests-2.31.0.1 types-urllib3-1.26.25.13


In [None]:
index = VectorstoreIndexCreator(vectorstore_cls=DocArrayInMemorySearch).from_loaders([loader])

In [None]:
query = "Please list all your products that can be worn at the beach \
in a table in markdown and summarize each one."

In [None]:
response = index.query(query)

In [None]:
display(Markdown(response))

 

| Product | Description |
| --- | --- |
| Beach Shorts | Made of stretchy woven fabric with an elastic waistband featuring the adidas logo. |
| Relaxed Marble Wash Hat | Easy-wearing cotton comfort with a flowy marble wash. |
| Comfort Flip-Flops | Rugged durability with quick-drying step-in cushioning. |
| Classic 3-Stripes Swim Shorts | Very short length and inner briefs for full coverage. Made with recycled materials. |

In [None]:
loader = CSVLoader(file_path=file_path)
docs = loader.load()
docs[0]

Document(page_content='name: Beach Shorts\ndescription: Splashing in the surf. Making memories with your friends. Beach days are the best days. These shorts are made of stretchy woven fabric. An elastic waistband that features the adidas logo brings a sporty look to your day at the beach.', metadata={'source': 'product_des.csv', 'row': 0})
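
The `page_content` above shows each CSV row flattened into `column: value` lines. The sketch below reproduces that layout with the standard library only, to make the format concrete. It is an illustration of the output shape, not `CSVLoader`'s actual implementation, and the one-row sample CSV is a shortened, hypothetical stand-in for `product_des.csv`.

```python
import csv
import io


def rows_to_page_content(csv_text: str) -> list[str]:
    """Turn each CSV row into 'column: value' lines joined by newlines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        "\n".join(f"{column}: {value}" for column, value in row.items())
        for row in reader
    ]


# Hypothetical, shortened sample in the same two-column layout.
sample = "name,description\nBeach Shorts,Stretchy woven fabric.\n"
pages = rows_to_page_content(sample)
print(pages[0])
```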

In [None]:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

In [None]:
embed = embeddings.embed_query("Hi my name is Pooja")

In [None]:
print(len(embed))

1536


In [None]:
print(embed[:5])


[-0.0003221847000531852, -0.006914160680025816, -0.009502936154603958, -0.021342869848012924, -0.027759933844208717]
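
An embedding is just a list of floats, so two texts can be compared by comparing their vectors. Cosine similarity is a common choice for this; the sketch below implements it in plain Python. Whether the vector store used in this notebook scores with exactly this metric is an assumption, and `cosine_similarity` is a helper written for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

With real embeddings, the two arguments would be the results of two `embeddings.embed_query(...)` calls.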


In [None]:
db = DocArrayInMemorySearch.from_documents(docs,embeddings)

In [None]:
query = "Please suggest a shoe for running"

In [None]:
docs = db.similarity_search(query)

In [None]:
len(docs)

4

In [None]:
docs[0]

Document(page_content="name: Runfalcon 2.0 Shoes\ndescription: Put on these adidas shoes, and you're set for a run in the park followed by coffee with friends. With a mesh upper for added breathability, they're meant to deliver comfort all day long. A durable rubber outsole gives you a solid foundation no matter how busy your schedule.", metadata={'source': 'product_des.csv', 'row': 502})
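
Conceptually, a similarity search embeds the query, scores every stored document against it, and returns the best few. The sketch below shows that ranking step on toy two-dimensional vectors; the vectors, names, and the `top_k` helper are all made up for illustration, and `k=4` is chosen only to match the four documents returned above.

```python
import math


def top_k(query_vec, doc_vecs, k=4):
    """Return the names of the k documents most similar to query_vec."""

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))

    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy vectors standing in for real 1536-dimensional embeddings.
toy_index = {
    "running shoe": [0.9, 0.1],
    "beach shorts": [0.1, 0.9],
    "trail shoe": [0.8, 0.3],
}
ranked_names = top_k([1.0, 0.0], toy_index, k=2)
```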

In [None]:
retriever = db.as_retriever()

In [None]:
llm = ChatOpenAI(temperature = 0.0)


In [None]:
# Concatenate the text of the retrieved documents into one context string
qdocs = "".join(doc.page_content for doc in docs)


In [None]:
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shoes for running in a table in markdown and summarize each one.") 



In [None]:
display(Markdown(response))

| Shoe Name | Description |
| --- | --- |
| Runfalcon 2.0 Shoes | These adidas shoes are perfect for a run in the park or a casual day out with friends. The mesh upper provides breathability and comfort all day long, while the durable rubber outsole ensures a solid foundation. |
| Swift Run X Shoes | These adidas shoes are designed to keep up with your daily routine, whether you're rushing out the door, hitting the gym, or running errands. The soft cushioning and snug mesh support your every move, while the sleek black design keeps things cool and casual. |
| Kids' Runfalcon 2.0 Shoes | These adidas running shoes are perfect for young athletes who love to run and play. The breathable mesh upper and durable sole provide comfort and support, whether they're running laps or chasing friends. |

In [None]:
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever, 
    verbose=True
)

In [None]:
query =  "Please list all your shoes for running in a table \
in markdown and summarize each one."

In [None]:
response = qa_stuff.run(query)

> Entering new RetrievalQA chain...

> Finished chain.


In [None]:
display(Markdown(response))

| Shoe Name | Description |
| --- | --- |
| Tensor Run Shoes | A versatile pair of shoes for kids that can be worn for any activity. The shoes have a breathable upper and a lightweight unitsole for cushioning. They also have welded 3-Stripes on the sides. |
| Supernova Shoes | Running shoes designed to help you achieve your goals. They have responsive Boost in the forefoot and heel, as well as springy Bounce for a balanced and energized ride. |
| Swift Run X Shoes | These shoes are designed to keep up with your daily life, whether you're rushing out the door, going to the gym, or running errands. They have soft cushioning and snug mesh support, and a sleek black design. |

Tensor Run Shoes are a versatile pair of shoes for kids that can be worn for any activity. They have a breathable upper and a lightweight unitsole for cushioning. They also have welded 3-Stripes on the sides.

Supernova Shoes are running shoes designed to help you achieve your goals. They have responsive Boost in the forefoot and heel, as well as springy Bounce for a balanced and energized ride.

Swift Run X Shoes are designed to keep up with your daily life, whether you're rushing out the door, going to the gym, or running errands. They have soft cushioning and snug mesh support, and a sleek black design.

In [None]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])

In [None]:
response = index.query(query, llm=llm)

Video 6

### Coming up with test datapoints

In [None]:
loader = CSVLoader(file_path=file_path)
data = loader.load()
data[10]

Document(page_content='name: Formotion Sculpt Biker Short Tights\ndescription: Sometimes confidence comes in a surprising form. These adidas short tights have a unique sculpted shape to hold you in and targeted compression zones that support your muscles as you bend and stretch. If a compression look is not your thing, order a size up. An adaptive FORMOTION design follows your natural movement for a better fit and greater comfort in motion. A high-rise waist helps you focus, even before class begins.', metadata={'source': 'product_des.csv', 'row': 10})

In [None]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

In [None]:
llm = ChatOpenAI(temperature = 0.0)
qa = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=index.vectorstore.as_retriever(), 
    verbose=True,
    chain_type_kwargs = {
        "document_separator": "<<<<>>>>>"
    }
)
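
The "stuff" chain type places all retrieved documents into a single prompt, and `document_separator` controls the string placed between them. The sketch below shows the idea in plain Python. The separator value comes from the cell above, but both helper functions and the prompt wording are hypothetical and do not reproduce LangChain's actual template.

```python
def build_stuff_context(page_contents, separator="<<<<>>>>>"):
    """Join the text of all retrieved documents into one context string."""
    return separator.join(page_contents)


def build_prompt(context, question):
    """Hypothetical prompt layout: context first, then the question."""
    return f"Context:\n{context}\n\nQuestion: {question}"


context = build_stuff_context(
    ["name: Socks A\ndescription: ...", "name: Socks B\ndescription: ..."]
)
prompt = build_prompt(context, "Do the socks have arch support?")
```

A distinctive separator makes it easy to see where one document ends and the next begins when inspecting the prompt in debug output.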

In [None]:
data[1]

Document(page_content="name: Five Ten Kestrel Lace Mountain Bike Shoes\ndescription: Lace up and get after it. The Five Ten Kestrel Lace Mountain Bike Shoes offer efficient pedal power with low-profile style. The wide platform is compatible with all clipless pedals and offers high-friction grip on and off the bike. You'll find the find comfort and versatility for extended trail rides and afterwork hot laps alike.", metadata={'source': 'product_des.csv', 'row': 1})

In [None]:
data[3]

Document(page_content='name: Five Ten Hiangle Pro Competition Climbing Shoes\ndescription: The Hiangle Pro takes on the classic shape of the original Hiangle with the addition of a seamless outsole wrapping around the toes, allowing for maximum rubber contact when tackling the most challenging boulder problems.', metadata={'source': 'product_des.csv', 'row': 3})

In [None]:
data[8]

Document(page_content="name: Classic 3-Stripes Swimsuit\ndescription: You can show your concern for the health of the oceans while you increase your fitness in the pool. This adidas swimsuit is designed for comfort and support. Soft and fully lined, it's made with yarn spun from recycled materials. Iconic 3-Stripes tape on the sides gives it a sporty and classic look.", metadata={'source': 'product_des.csv', 'row': 8})

In [None]:
data[11]

Document(page_content="name: Athletic Cushioned Crew Socks 6 Pairs\ndescription: Stop searching for the lost match to your favorite socks. With six pairs of matching socks, you'll always have a set. These adidas crew socks support your feet with foot-hugging arch support. The stretch blend pulls moisture away from the skin to keep feet feeling dry.", metadata={'source': 'product_des.csv', 'row': 11})

### Hard-coded examples

In [None]:
examples = [
    {
        "query": "Do the socks have \
        arch support?",
        "answer": "Yes"
    },
    {
        "query": "Give example of bike shoes that provide \
        efficient pedal power?",
        "answer": "Five Ten Kestrel Lace Mountain Bike Shoes"
    }
]
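
Because the backslash line continuations above sit inside indented string literals, the indentation of the next line becomes part of each query (for example, `"Do the socks have         arch support?"`). If that is not wanted, the whitespace can be collapsed before the examples are used. A minimal sketch, with `normalize_query` as a hypothetical helper:

```python
def normalize_query(text: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return " ".join(text.split())


raw_query = "Do the socks have         arch support?"
clean_query = normalize_query(raw_query)
print(clean_query)
```

Applied to the list above, this would look like `example["query"] = normalize_query(example["query"])` for each entry.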

In [None]:
from langchain.evaluation.qa import QAGenerateChain


In [None]:
example_gen_chain = QAGenerateChain.from_llm(ChatOpenAI())

In [None]:
new_examples = example_gen_chain.apply_and_parse(
    [{"doc": t} for t in data[:5]]
)



In [None]:
new_examples[0]

{'query': 'What is the description of the Beach Shorts mentioned in the document?',
 'answer': 'The Beach Shorts are described as being perfect for beach days, made of stretchy woven fabric with an elastic waistband featuring the adidas logo for a sporty look.'}

In [None]:
new_examples[1]

{'query': 'What type of pedals are the Five Ten Kestrel Lace Mountain Bike Shoes compatible with?',
 'answer': 'The Five Ten Kestrel Lace Mountain Bike Shoes are compatible with all clipless pedals.'}

### Combine examples

In [None]:
examples += new_examples

In [None]:
qa.run(examples[0]["query"])

> Entering new RetrievalQA chain...

> Finished chain.


'Yes, the Sport Stripe High Quarter Socks, Athletic Cushioned Crew Socks, and 3-Stripes Crew Socks all have arch compression for support. However, the Cushioned Angle Stripe Low-Cut Socks do not mention arch support in their description.'

## Manual Evaluation

In [None]:
import langchain
langchain.debug = True

In [None]:
qa.run(examples[0]["query"])

[chain/start] [1:chain:RetrievalQA] Entering Chain run with input:
{
  "query": "Do the socks have         arch support?"
}
[chain/start] [1:chain:RetrievalQA > 2:chain:StuffDocumentsChain] Entering Chain run with input:
[inputs]
[chain/start] [1:chain:RetrievalQA > 2:chain:StuffDocumentsChain > 3:chain:LLMChain] Entering Chain run with input:
{
  "question": "Do the socks have         arch support?",
  "context": "name: Sport Stripe High Quarter Socks 3 Pairs\ndescription: When was the last time you said thanks to your feet? Show 'em some appreciation in these cushioned adidas socks. Arch compression and moisture-wicking yarn keep your feet comfy, so you can squat and lunge with ease. Who knows, at the end of the day, your feet might be the ones thanking you.<<<<>>>>>name: Athletic Cushioned Crew Socks 6 Pairs\ndescription: Stop searching for the lost match to your favorite socks. With six pairs of matching soc

'Yes, the Sport Stripe High Quarter Socks, Athletic Cushioned Crew Socks, and 3-Stripes Crew Socks all have arch compression for support. However, the Cushioned Angle Stripe Low-Cut Socks do not mention arch support in their description.'

In [None]:
from langchain.evaluation.qa import QAEvalChain

In [None]:
llm = ChatOpenAI(temperature=0)
eval_chain = QAEvalChain.from_llm(llm)

In [None]:
langchain.debug = False

In [None]:
predictions = qa.apply(examples)

> Entering new RetrievalQA chain...
> Finished chain.

> Entering new RetrievalQA chain...
> Finished chain.

> Entering new RetrievalQA chain...
> Finished chain.

> Entering new RetrievalQA chain...
> Finished chain.

> Entering new RetrievalQA chain...
> Finished chain.

> Entering new RetrievalQA chain...
> Finished chain.

> Entering new RetrievalQA chain...
> Finished chain.


In [None]:
graded_outputs = eval_chain.evaluate(examples, predictions)
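
Once the grades are available, a single accuracy figure can be easier to track than reading each result by hand. The sketch below tallies them, assuming each graded output is a dict whose `text` field holds `CORRECT` or `INCORRECT`, as the grades printed in this notebook suggest. `grade_accuracy` and the toy data are hypothetical.

```python
def grade_accuracy(graded_outputs, key="text"):
    """Fraction of graded outputs whose grade is exactly CORRECT."""
    if not graded_outputs:
        return 0.0
    correct = sum(
        1 for graded in graded_outputs
        if graded[key].strip().upper() == "CORRECT"
    )
    return correct / len(graded_outputs)


# Toy grades standing in for the real evaluation output.
toy_grades = [{"text": "CORRECT"}, {"text": " INCORRECT"}, {"text": "CORRECT"}]
accuracy = grade_accuracy(toy_grades)
```

An exact match is used rather than a substring check because `INCORRECT` contains `CORRECT`.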



In [None]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['text'])
    print()

Example 0:
Question: Do the socks have         arch support?
Real Answer: Yes
Predicted Answer: Yes, the Sport Stripe High Quarter Socks, Athletic Cushioned Crew Socks, and 3-Stripes Crew Socks all have arch compression for support. However, the Cushioned Angle Stripe Low-Cut Socks do not mention arch support in their description.
Predicted Grade: CORRECT

Example 1:
Question: Give example of bike shoes that provide         efficient pedal power?
Real Answer: Five Ten Kestrel Lace Mountain Bike Shoes
Predicted Answer: The Five Ten Kestrel Lace Mountain Bike Shoes offer efficient pedal power with low-profile style.
Predicted Grade: CORRECT

Example 2:
Question: What is the description of the Beach Shorts mentioned in the document?
Real Answer: The Beach Shorts are described as being perfect for beach days, made of stretchy woven fabric with an elastic waistband featuring the adidas logo for a sporty look.
Predicted Answer: The description of the Beach Shorts mentioned in the document is