# LangChain: Q&A over Documents

Question answering over documents lets an LLM answer queries using your own data. An example is a tool that lets you query a product catalog for items of interest.

In [1]:
!pip install docarray

Successfully installed docarray-0.41.0 markdown-it-py-4.0.0 mdurl-0.1.2 rich-14.2.0 types-requests-2.32.4.20250913


We are going to load a CSV file containing a product catalog. To create the embeddings we use the `nomic-embed-text` model (which gives better embeddings than `qwen2.5:3b`), and from those embeddings we build a vector store index. Make sure that `nomic-embed-text` is installed in your Ollama environment.
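To make the loading step concrete, here is a minimal sketch of roughly what a CSV loader does before anything is embedded: each row becomes one text document made of `column: value` lines. It uses only the standard library. The function name `rows_to_documents` and the two sample rows are made up for illustration; they are not part of LangChain or of the catalog file.

```python
import csv
import io


def rows_to_documents(csv_text):
    """Turn each CSV row into one text block of 'column: value' lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    documents = []
    for row in reader:
        lines = [f"{column}: {value}" for column, value in row.items()]
        documents.append("\n".join(lines))
    return documents


# Hypothetical sample data, not taken from OutdoorClothingCatalog_1000.csv
sample = "name,description\nSun Tee,UPF 50+ shirt\nRain Jacket,Waterproof shell\n"
docs = rows_to_documents(sample)
print(len(docs))
print(docs[0])
```

Each of these text blocks is what gets turned into an embedding vector, so one catalog item corresponds to one entry in the vector store.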

In [18]:
from langchain.document_loaders import CSVLoader
from langchain.indexes import VectorstoreIndexCreator
from langchain_community.llms.ollama import Ollama
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

# Load LLM model
llm = Ollama(model="qwen2.5:3b", temperature=0)

# Load the catalog; CSVLoader creates one document per CSV row
file = "OutdoorClothingCatalog_1000.csv"
loader = CSVLoader(file_path=file, encoding="utf-8")

# Embed the documents with nomic-embed-text and index them in memory
embeddings = OllamaEmbeddings(model="nomic-embed-text")
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])

# Create a query
query = "Please list all your shirts with sun protection \
in a table in markdown and summarize each one."
response = index.query(query, llm=llm)
print(response)


Here is a summary of the two shirts with sun protection features listed in a Markdown format:

| Shirt Name | Description |
|-------------|--------------|
| Sun Shield Shirt by [Brand] | The high-performance sun shirt from [Brand], made of 78% nylon and 22% Lycra Xtra Life fiber, providing UPF 50+ rated protection. Features include quick-drying comfort, moisture-wicking, abrasion resistance, and a slightly fitted design that falls at the hip. |
| Women's Tropical Tee | A sleeveless button-up shirt with a fit to flatter and built-in SunSmart™ UPF 50+ protection. Made of 71% nylon and 29% polyester shell fabric with a cape lining made of 100% polyester, it features wrinkle resistance, low-profile pockets, side shaping for a more flattering fit, front and back cape venting, two front pockets, tool tabs, and an eyewear loop. |

Both shirts provide high-performance sun protection (UPF 50+) to block 98% of the sun's harmful rays. They are both machine washable and dryable, with wrinkle-resis

The next step is to expose the vector database as a retriever, an object that returns the `k` most similar documents for an input query. These documents are then fed to the LLM.
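The idea of "the `k` most similar documents" can be sketched without any library. The example below assumes cosine similarity as the metric and uses toy two-dimensional vectors in place of real embeddings; the helper names `cosine_similarity` and `top_k` are hypothetical and only illustrate the ranking, not how the vector store is actually implemented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, doc_vectors, k):
    """Return the indices of the k documents most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), index)
        for index, vector in enumerate(doc_vectors)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]


# Toy "embeddings"; real embedding vectors have hundreds of dimensions
doc_vectors = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]
query_vector = [1.0, 0.1]
print(top_k(query_vector, doc_vectors, k=2))
```

A larger `k` gives the LLM more context to work with, at the cost of a longer prompt and a higher chance of including loosely related items.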

In [21]:
from langchain.chains import RetrievalQA

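# Load the catalog rows and build the in-memory vector store directly
docs = loader.load()
db = DocArrayInMemorySearch.from_documents(docs, embeddings)

# Retrieve the 5 most similar documents for each query
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 5})

# "stuff" places all retrieved documents into a single prompt
retrieval_qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)
query = "List shirts with sun protection in markdown table"
# Use invoke(); calling the chain object directly is deprecated
result = retrieval_qa.invoke({"query": query})
print(result["result"])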


| Name | Sun Protection Rating |
| --- | --- |
| Women's Tropical Tee, Sleeveless | UPF 50+ |
| Sun Shield Shirt by | UPF 50+ |
| Sunrise Tee | UPF 50+ |
| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ |
| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ |

Note: The exact rating for "Sun Shield Shirt by" is not specified in the provided context. Therefore, I've used "UPF 50+" as a placeholder since it matches all other shirts listed.
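The `chain_type="stuff"` setting used in the cell above means that all retrieved documents are placed into one prompt together with the question. A minimal sketch of that idea follows. The function `build_stuff_prompt`, the prompt wording, and the sample snippets are assumptions for illustration; LangChain's actual prompt template is different.

```python
def build_stuff_prompt(documents, question):
    """Concatenate all retrieved documents into a single prompt string."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical retrieved snippets, for illustration only
retrieved = [
    "name: Sun Tee\ndescription: UPF 50+ shirt",
    "name: Trail Shirt\ndescription: UPF 50+ button-up",
]
prompt = build_stuff_prompt(retrieved, "List shirts with sun protection")
print(prompt)
```

Because the chain was created with `return_source_documents=True`, the retrieved documents are also available under `result["source_documents"]`. Reading them is a practical way to check a claim such as the placeholder rating mentioned in the model's note.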
