## LangChain: Q&A over Documents

This notebook uses LangChain to answer questions over your own documents. For example, you could build a tool that lets you query a product catalog for items of interest.

In [1]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [2]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

In [6]:
file = 'OutdoorClothingCatalog_100.csv'
loader = CSVLoader(file_path=file)

In [7]:
from langchain.indexes import VectorstoreIndexCreator

In [5]:
# pip install docarray

In [8]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

In [9]:
query = "Please list all your shirts with sun protection \
in a table in markdown and summarize each one."

In [10]:
response = index.query(query)

In [11]:
display(Markdown(response))

 

| Name | Description | Sun Protection |
|------|-------------|---------------|
| Women's Tropical Plaid Shirt | Our lightest hot-weather shirt lets you beat the heat with a flattering fit. | UPF 50+ rated – the highest rated sun protection possible. |
| Performance Plus Woven Shirt | Perfect for trail or travel, this breathable summer shirt has the look and feel of cotton - but is packed with performance. | UPF 40+ rated to block the sun's UV rays. |
| Tropicview Baseball Cap | This sun-blocking baseball hat features a rear flap for extra coverage. | UPF 50+ rated, the highest possible. |

In [12]:
loader = CSVLoader(file_path=file)

In [13]:
docs = loader.load()

In [14]:
docs[0]

Document(page_content=": 0\nname: Women's Campside Oxfords\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \n\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \n\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \n\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \n\nQuestions? Please contact us for any inquiries.", metadata={'source': 'OutdoorClothingCatalog_100.csv', 'row': 0})
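
The `page_content` above suggests how each CSV row becomes text: one `column: value` pair per line, with the source file and row index kept as metadata. Here is a minimal sketch of that transformation using only the standard library. The helper `rows_to_documents`, the plain-dict document shape, and the toy data are hypothetical and only for illustration; this is not LangChain's actual implementation.

```python
import csv
import io


def rows_to_documents(csv_text, source):
    """Turn each CSV row into a dict with 'page_content' and 'metadata'.

    Hypothetical helper: it mimics the 'column: value' layout seen in the
    output above and is not the CSVLoader source code.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row_index, row in enumerate(reader):
        # One "column: value" line per field, in column order.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            {
                "page_content": content,
                "metadata": {"source": source, "row": row_index},
            }
        )
    return documents


# Toy data; the first column has an empty header, which would explain
# the leading ": 0" in the output above.
sample_csv = ",name,description\n0,Trail Hat,Wide brim hat.\n1,Sun Shirt,UPF 50+ shirt.\n"
sample_docs = rows_to_documents(sample_csv, "sample.csv")
print(sample_docs[0]["page_content"])
```

Keeping the row index in the metadata makes it possible to trace a retrieved document back to its row in the catalog file.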

In [15]:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

In [16]:
embed = embeddings.embed_query("Hi my name is Harrison")

In [17]:
print(len(embed))

1536


In [18]:
print(embed[:6])

[-0.02194717898964882, 0.006735079921782017, -0.01816144771873951, -0.03916534036397934, -0.014086442068219185, 0.016840843483805656]


In [19]:
db = DocArrayInMemorySearch.from_documents(
    docs, 
    embeddings
)

In [20]:
query = "Please suggest a shirt with sunblocking"

In [21]:
docs = db.similarity_search(query)

In [22]:
len(docs)

4

In [23]:
docs[0]

Document(page_content=": 87\nname: Women's Tropical Plaid Shirt\ndescription: Our lightest hot-weather shirt lets you beat the heat with a flattering fit.\n\nSize & Fit\n- Slightly Fitted: Softly shapes the body.\n- Falls at hip.\n\nFabric & Care\n- 52% polyester/ 48% nylon.\n- UPF 50+ rated – the highest rated sun protection possible.\n\nAdditional Features\n- Keeps you cool and comfortable by wicking perspiration away from your skin, then dries in minutes.\n- Smooth buttons are easy on your hands.\n- Wrinkle resistant.\n- Front and back cape venting for ventilation.\n- Low-profile pockets and side shaping offer a more flattering fit.\n- Two front pockets, tool tabs and eyewear loop.\n- Imported.\n\nQuestions?\nContact us for more information.", metadata={'source': 'OutdoorClothingCatalog_100.csv', 'row': 87})
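
Conceptually, the similarity search embeds the query and returns the stored documents whose vectors are closest to it. The sketch below shows that idea in plain Python, assuming cosine similarity as the metric and made-up 3-dimensional vectors. The function names and data are hypothetical, and the vector store's real internals may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed_vectors, k=4):
    """Return the k (name, score) pairs most similar to the query.

    k=4 mirrors the four documents returned above; here it is simply a
    parameter of this sketch.
    """
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in indexed_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Made-up 3-dimensional "embeddings"; the real ones above have 1536 dimensions.
toy_index = {
    "sun shirt": [0.9, 0.1, 0.0],
    "sun hat": [0.7, 0.3, 0.1],
    "rain boots": [0.0, 0.2, 0.9],
    "wool socks": [0.1, 0.9, 0.2],
    "tent": [0.05, 0.1, 1.0],
}
toy_query = [1.0, 0.0, 0.0]
results = top_k(toy_query, toy_index)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Note that the search ranks by closeness only, so a hat can rank highly for a shirt query if its description is similar enough.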

In [24]:
retriever = db.as_retriever()
llm = ChatOpenAI(temperature=0.0)
# Concatenate the retrieved documents into a single context string
qdocs = "".join(doc.page_content for doc in docs)

In [25]:
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.") 
display(Markdown(response))

| Shirt Name                      | Description                                                                                                 |
|---------------------------------|-------------------------------------------------------------------------------------------------------------|
| Women's Tropical Plaid Shirt    | A lightweight, hot-weather shirt with a flattering fit. UPF 50+ rated for sun protection.                     |
| Performance Plus Woven Shirt    | Breathable summer shirt with quick-dry fabric. UPF 40+ rated for sun protection.                               |
| Tropicview Baseball Cap         | Sun-blocking baseball hat with UPF 50+ rated sun protection. Features a rear flap for extra coverage.         |
| Smooth Comfort Check Shirt      | Men's check shirt with wrinkle-free performance and TrueCool® fabric for moisture-wicking and quick drying. |

Summary:
1. Women's Tropical Plaid Shirt: A lightweight shirt with a flattering fit, UPF 50+ sun protection, and quick-drying fabric.
2. Performance Plus Woven Shirt: A breathable summer shirt with quick-dry fabric, UPF 40+ sun protection, and exceptional durability.
3. Tropicview Baseball Cap: A sun-blocking baseball hat with UPF 50+ sun protection, rear flap for extra coverage, and moisture-wicking sweatband.
4. Smooth Comfort Check Shirt: A men's check shirt with wrinkle-free performance, TrueCool® fabric for moisture-wicking, and a relaxed fit.

In [26]:
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever, 
    verbose=True
)
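
The `"stuff"` chain type can be thought of as: retrieve the relevant documents, place all of them into a single prompt, and call the model once. The sketch below illustrates that flow with stand-in functions so it runs without an API key. The prompt wording, `build_stuff_prompt`, `answer_with_stuffing`, and the fake retriever and model are all hypothetical; LangChain's actual prompt template is different.

```python
def build_stuff_prompt(documents, question):
    """Concatenate every retrieved text into one prompt ("stuffing")."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


def answer_with_stuffing(question, retrieve, llm):
    """Retrieve documents, stuff them into one prompt, call the model once."""
    documents = retrieve(question)
    return llm(build_stuff_prompt(documents, question))


# Stand-ins so the sketch runs offline; they are not real LangChain objects.
def fake_retrieve(question):
    return ["name: Sun Shirt\nUPF 50+ rated.", "name: Sun Hat\nUPF 50+ rated."]


def fake_llm(prompt):
    return f"(model reply to a prompt of {len(prompt)} characters)"


prompt = build_stuff_prompt(fake_retrieve("any shirts?"), "any shirts?")
print(prompt)
print(answer_with_stuffing("any shirts?", fake_retrieve, fake_llm))
```

Because every retrieved document goes into one prompt, this approach is simple and needs a single model call, but the combined text must fit within the model's context window.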

In [27]:
query = "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."

In [28]:
response = qa_stuff.run(query)



> Entering new chain...

> Finished chain.


In [29]:
display(Markdown(response))

| Shirt Name                      | Description                                                                                                 |
|---------------------------------|-------------------------------------------------------------------------------------------------------------|
| Women's Tropical Plaid Shirt    | Our lightest hot-weather shirt with a flattering fit. UPF 50+ rated sun protection. Wicks away perspiration. |
| Performance Plus Woven Shirt    | Breathable summer shirt with quick-dry fabric. UPF 40+ rated sun protection. Wicks away moisture.            |
| Tropicview Baseball Cap         | Sun-blocking baseball hat with UPF 50+ rated sun protection. Rear flap for extra coverage.                    |
| Smooth Comfort Check Shirt      | Men's check shirt with TrueCool® fabric that wicks away moisture. Wrinkle-free performance.                    |

- Women's Tropical Plaid Shirt: This shirt is lightweight and designed for hot weather. It has a flattering fit and offers UPF 50+ rated sun protection. It also wicks away perspiration and dries quickly.

- Performance Plus Woven Shirt: This breathable summer shirt has the look and feel of cotton but is made with quick-dry fabric. It offers UPF 40+ rated sun protection and wicks away moisture. It is also abrasion-resistant for durability.

- Tropicview Baseball Cap: This sun-blocking baseball hat provides UPF 50+ rated sun protection. It has a rear flap for extra coverage and can be tucked away when not needed. The elastic cord allows for comfortable adjustment, and the coolmax sweatband wicks away moisture.

- Smooth Comfort Check Shirt: This men's check shirt is slightly fitted and made with TrueCool® fabric that wicks away moisture. It has wrinkle-free performance and offers a button-down collar and a single patch pocket.

In [30]:
response = index.query(query, llm=llm)

In [None]:
# Rebuild the index, this time passing the embedding model explicitly
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])