# **Q&A over Documents** in LangChain

This notebook builds a question-answering tool over a product catalog, so that you can ask in natural language for items of interest.

In [2]:
import os
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

In [3]:
from langchain.document_loaders import CSVLoader
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Load documents using CSVLoader.
loader = CSVLoader("OutdoorClothingCatalog_1000.csv")
docs = loader.load()

# Create embeddings using OpenAIEmbeddings.
embeddings = OpenAIEmbeddings()

# Create the vector database using FAISS vectorstore.
db = FAISS.from_documents(docs, embeddings)



In [4]:
doc_at_index_0 = docs[0]
print("Document content:", doc_at_index_0.page_content)
# Retrieve the vector for the document at index 0
vector = db.index.reconstruct(0)
print(len(vector))

Document content: : 0
name: Women's Campside Oxfords
description: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. 

Size & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. 

Specs: Approx. weight: 1 lb.1 oz. per pair. 

Construction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. 

Questions? Please contact us for any inquiries.
1536


In [5]:
print("Number of documents:", db.index.ntotal)

Number of documents: 1000
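
The index above stores one 1536-dimensional vector per catalog row, and a similarity search ranks those vectors by their distance to the query vector. The following is a minimal sketch of that ranking step, assuming a brute-force scan with Euclidean (L2) distance; the index type and metric in your setup may differ. The names `l2_distance`, `top_k`, and `toy_vectors` are hypothetical and the 3-dimensional vectors are made-up stand-ins for real embeddings.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, vectors, k=3):
    """Return the indices of the k stored vectors closest to query_vec."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: l2_distance(query_vec, vectors[i]),
    )
    return ranked[:k]


# Toy 3-dimensional "embeddings" (illustrative only; the real ones above have 1536 dimensions).
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

nearest = top_k([1.0, 0.05, 0.0], toy_vectors, k=2)
print(nearest)
```

A production index avoids sorting every vector on each query, but the ranking idea is the same.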


## Method 1: Call `similarity_search` directly and pass the retrieved documents to the LLM as a prompt
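
Passing retrieved documents "as a prompt" amounts to concatenating their text and appending the question. A minimal sketch of that assembly step follows, assuming the documents are already available as plain strings. The helper `build_prompt`, the separator, and the placeholder catalog entries are hypothetical and not part of LangChain.

```python
def build_prompt(documents, question, separator="\n\n---\n\n"):
    """Join retrieved document texts with a visible separator, then append the question."""
    context = separator.join(text.strip() for text in documents)
    return f"{context}\n\nQuestion: {question}"


# Placeholder strings standing in for the page_content of retrieved documents.
retrieved = [
    "name: Shirt A\ndescription: placeholder text",
    "name: Shirt B\ndescription: placeholder text",
]

prompt_text = build_prompt(retrieved, "Which shirts are listed?")
print(prompt_text)
```

An explicit separator keeps adjacent catalog entries from running together, which makes it easier for the model to tell where one item ends and the next begins.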

In [6]:
query = "Please suggest a shirt with sunblocking"

In [8]:
# Retrieve the 3 documents most similar to the search term "shirt".
# Note: this searches for "shirt", not for the `query` string defined above.
results = db.similarity_search("shirt", k=3)
for doc in results:
    print(doc.page_content)

: 345
name: T-Shirt

Classical Cotton/Modal Scoopneck, Short-Sleeve T-Shirt
description: Destined to be your favorite cotton t-shirt, with crave-worthy softness and perfect drape. Stand-out prints add even more beauty to the collar-bone-skimming scoopneck style.

Size & Fit
Slightly Fitted: Softly shapes the body. Falls at high hip.

Fabric & Care
The ultrasoft fabric fits perfectly, drapes beautifully and resists pilling. In 60% cotton and 40% modal. Machine wash and dry.

Additional Features
Open scoop neckline. Cuffed short sleeves. Softly rounded hem is slightly longer in the back. Imported.
: 650
name: Mountain Range Herringbone Shirt
description: This shirt is sure to impress with its soft touch and handsome look. It is also rugged and abrasion-resistant, so you don't have to worry about wear and tear on the trail. The 57% cotton, 43% polyester fabric is garment-washed for an extra-soft feel and it wicks moisture to keep you comfortable on the go. This shirt is slightly fitted wi

In [9]:
llm = ChatOpenAI(temperature = 0.0)



In [12]:
# Concatenate the retrieved documents, separated by blank lines so entries do not run together.
qdocs = "\n\n".join(doc.page_content for doc in results)


In [13]:
# invoke() returns a message object; .content holds the generated text.
response = llm.invoke(f"{qdocs}\n\nQuestion: Please list all your shirts in a table in markdown and summarize each one.").content
print(response)

| Name | Description |

## Method 2: Combine a retriever and a question-answering chain into a retrieval chain
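
Conceptually, a retrieval chain runs two steps in sequence: fetch documents for the question, then hand the question and the combined documents to an answering step. The sketch below illustrates that composition in plain Python under stated assumptions. `make_retrieval_chain`, `fake_retrieve`, and `fake_answer` are hypothetical stand-ins for the vector-store retriever and the LLM call, not LangChain APIs; the returned dictionary uses the keys `input`, `context`, and `answer`, mirroring how the chain output is read later in this notebook.

```python
def make_retrieval_chain(retrieve, answer):
    """Compose a retrieval step and an answering step into one callable.

    retrieve: maps a question string to a list of document strings.
    answer:   maps (question, context string) to an answer string.
    """
    def invoke(inputs):
        question = inputs["input"]
        docs = retrieve(question)
        # "Stuff" every retrieved document into a single context string.
        context = "\n\n".join(docs)
        return {
            "input": question,
            "context": docs,
            "answer": answer(question, context),
        }
    return invoke


# Hypothetical stand-in for a vector-store retriever.
def fake_retrieve(question):
    catalog = ["name: Shirt A", "name: Shirt B", "name: Boots C"]
    return [entry for entry in catalog if "Shirt" in entry]


# Hypothetical stand-in for the LLM call.
def fake_answer(question, context):
    return f"Found {context.count('name:')} matching items."


toy_chain = make_retrieval_chain(fake_retrieve, fake_answer)
result = toy_chain({"input": "List your shirts."})
print(result["answer"])
```

Keeping retrieval and answering as separate callables means either one can be swapped out, for example a different retriever or a different prompt, without touching the other.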

In [80]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Build a retriever from the existing vector store and define the query
retriever = db.as_retriever()
query = "Please list all your shirts in a table in markdown and summarize each one."

# Initialize the language model
llm = ChatOpenAI()

# Define the system prompt
system_prompt = (
    "Use the given context to answer the question. "
    "If you don't know the answer, say you don't know. "
    "Use three sentences maximum and keep the answer concise. "
    "Context: {context}"
)

# Create the prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

# Create the question-answer chain
question_answer_chain = create_stuff_documents_chain(llm, prompt)

# Create the retrieval chain
chain = create_retrieval_chain(retriever, question_answer_chain)

# Invoke the chain with the query
response = chain.invoke({"input": query})

# Display the response
display(Markdown(response['answer']))

| Shirt Name                                    | Description                                                    | Size & Fit                                              | Fabric & Care                        | Additional Features                                                            |
| --------------------------------------------- | -------------------------------------------------------------- | ------------------------------------------------------- | ------------------------------------ | ------------------------------------------------------------------------ |
| Classic Plaid Short-Sleeve Shirt              | Colorful linen shirt keeping you cool and comfortable         | Slightly Fitted, Relaxed through chest and sleeve      | 100% linen. Machine wash and dry.  | Single patch pocket. Shirttail hem. Imported.                              |
| T-Shirt                                       | Cotton/modal scoopneck t-shirt with stand-out prints           | Slightly Fitted, Softly shapes the body                 | 60% cotton, 40% modal. Machine wash and dry. | Open scoop neckline. Cuffed short sleeves. Slightly longer hem in back. Imported. |
| Linen Luxe Shirt, Slightly Tailored           | Lightweight, breathable shirt that's a warm-weather staple    | Slightly Fitted, Relaxed through chest and sleeve      | Lightweight 100% linen. Machine wash and dry. | Single patch pocket. Shirttail hem. Imported.                              |
| Northeast Coast Plaid Shirt, Slightly Fitted  | Softly textured, cool shirt in summery plaids                 | Slightly Fitted, Relaxed through chest and sleeve      | 100% organic cotton slub. Machine wash and dry.  | Chambray lining at back yoke. Triple needle stitching. Classic fish-eye buttons. Imported. |

Classics Plaid Short-Sleeve Shirt is a colorful linen shirt with a relaxed fit, T-Shirt is a cotton/modal scoopneck shirt with stand-out prints, Linen Luxe Shirt is a lightweight and breathable warm-weather staple, and Northeast Coast Plaid Shirt is an organic cotton shirt in summery plaids.