# LangChain: Q&A over Documents

This notebook uses LangChain to answer questions over your own documents. An example might be a tool that lets you query a product catalog for items of interest.

In [2]:
import os
import openai

# from dotenv import load_dotenv, find_dotenv
# _ = load_dotenv(find_dotenv()) # read local .env file
openai.api_key = os.environ.get('OPENAI_API_KEY')

In [3]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 10, 1)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [4]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

In [5]:
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file, encoding="utf8")

In [6]:
from langchain.indexes import VectorstoreIndexCreator

In [7]:
#pip install docarray

In [8]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])



In [9]:
query = "Please list all your shirts with sun protection \
in a table in markdown and summarize each one."

In [10]:
response = index.query(query)



In [11]:
print(response)



| Name | Description |
| --- | --- |
| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ rated, 100% polyester, wrinkle-resistant, front and back cape venting, two front bellows pockets |
| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ rated, 52% polyester and 48% nylon, machine washable and dryable, front and back cape venting, two front bellows pockets |
| Men's TropicVibe Shirt, Short-Sleeve | UPF 50+ rated, 71% Nylon, 29% Polyester, 100% Polyester knit mesh, wrinkle resistant, front and back cape venting, two front bellows pockets |
| Sun Shield Shirt by | UPF 50+ rated, 78% nylon, 22% Lycra Xtra Life fiber, wicks moisture, fits comfortably over swimsuit, abrasion resistant |

All four shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant. The Men's Plaid Trop


In [12]:
display(Markdown(response))



| Name | Description |
| --- | --- |
| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ rated, 100% polyester, wrinkle-resistant, front and back cape venting, two front bellows pockets |
| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ rated, 52% polyester and 48% nylon, machine washable and dryable, front and back cape venting, two front bellows pockets |
| Men's TropicVibe Shirt, Short-Sleeve | UPF 50+ rated, 71% Nylon, 29% Polyester, 100% Polyester knit mesh, wrinkle resistant, front and back cape venting, two front bellows pockets |
| Sun Shield Shirt by | UPF 50+ rated, 78% nylon, 22% Lycra Xtra Life fiber, wicks moisture, fits comfortably over swimsuit, abrasion resistant |

All four shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant. The Men's Plaid Trop

| LLMs on Documents | Embeddings |
| --- | --- |
| ![](images/llm_on_doc.png) | ![](images/embeddings.png) |

| Vector Database | Vector Database |
| --- | --- |
| ![](images/vector_database.png) | ![](images/vector_database2.png) |

## Step By Step

In [16]:
from langchain.document_loaders import CSVLoader
loader = CSVLoader(file_path=file, encoding="utf8")

In [17]:
docs = loader.load()

In [21]:
docs[0]

Document(page_content=": 0\nname: Women's Campside Oxfords\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \n\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \n\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \n\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \n\nQuestions? Please contact us for any inquiries.", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 0})

Because these documents are already so small, we don't need to do any chunking here. \
We can create embeddings for them directly.

In [22]:
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

In [23]:
embed = embeddings.embed_query("Hi, my name is Batuhan")



In [24]:
print(len(embed))

1536


In [25]:
print(embed[:5])

[-0.018903138310127413, -0.009929736246813908, -0.01784744318276605, -0.023796592152118178, -0.018903138310127413]


We want to create embeddings for all the pieces of text that we just loaded and store them in a vector store. \
We can do that by using the "from_documents" method on the vector store.

In [None]:
db = DocArrayInMemorySearch.from_documents(
    docs,
    embeddings
)

We can now use this vector store to find pieces of text similar to an incoming query.
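
To make "similar" concrete, here is a minimal sketch of what a similarity search does conceptually, assuming cosine similarity as the distance measure. The three-dimensional vectors and the names `toy_store`, `toy_query_vec`, and `similarity_search` are made up for illustration; real embeddings have 1536 dimensions, and the vector store's actual implementation may differ.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def similarity_search(query_vec, store, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Hypothetical toy "embeddings" (assumption: hand-picked, not produced by a model).
toy_store = [
    ("sun shirt with UPF 50+", [0.9, 0.1, 0.0]),
    ("waterproof hiking boots", [0.0, 0.2, 0.9]),
    ("lightweight sun hat", [0.7, 0.6, 0.1]),
]
toy_query_vec = [1.0, 0.2, 0.0]  # stands in for the embedded query

top_matches = similarity_search(toy_query_vec, toy_store, k=2)
print(top_matches)
```

The vector store does the same kind of ranking, except that the query is first embedded with the same embedding model used for the documents.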

In [None]:
query = "Please suggest a shirt with sunblocking"

In [None]:
docs = db.similarity_search(query)  # pass the query to search for

In [None]:
len(docs)

In [None]:
docs[0]

So, how do we use this to do question answering over our own documents?
- First we need to create a retriever from this vector store.
  - A retriever is a generic interface that can be underpinned by any method that takes in a query and returns documents.
- Vector stores and embeddings are one such method, although there are plenty of others, some less advanced, some more advanced.
- Next, because we want to do text generation and return a natural language response, we're going to import a language model. We'll use **ChatOpenAI**.
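
The idea of a retriever as a generic interface can be sketched as follows. This is an illustration only: the `Retriever` protocol and `KeywordRetriever` class are hypothetical and not part of LangChain, although the method name `get_relevant_documents` is chosen to mirror the one LangChain retrievers expose. The keyword-overlap ranking stands in for one of the "less advanced" methods mentioned above.

```python
from typing import List, Protocol

class Retriever(Protocol):
    """Hypothetical interface: anything that maps a query to documents."""
    def get_relevant_documents(self, query: str) -> List[str]: ...

class KeywordRetriever:
    """A simple retriever that ranks texts by words shared with the query."""
    def __init__(self, texts: List[str], k: int = 2):
        self.texts = texts
        self.k = k

    def get_relevant_documents(self, query: str) -> List[str]:
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(t.lower().split())), t) for t in self.texts]
        # Keep only texts with at least one matching word, best matches first.
        hits = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [t for _, t in hits[: self.k]]

def build_context(retriever: Retriever, query: str) -> str:
    """Works with any retriever, regardless of how it finds documents."""
    return "\n".join(retriever.get_relevant_documents(query))

keyword_retriever = KeywordRetriever([
    "sun protection shirt with UPF 50+",
    "insulated winter jacket",
    "sun hat with wide brim",
])
keyword_hits = keyword_retriever.get_relevant_documents("shirt with sun protection")
print(build_context(keyword_retriever, "shirt with sun protection"))
```

Code that only depends on the interface, like `build_context` here, does not need to change when the keyword retriever is swapped for one backed by a vector store.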

In [None]:
retriever = db.as_retriever()

In [None]:
llm = ChatOpenAI(temperature = 0.0, model=llm_model)

- If we were doing this by hand, we would first combine the documents into a single piece of text.
- So we'd join the page content of all the documents into one variable and then pass this variable, together with the question (or a variant of it), to the language model. For example:
  - Please list all your shirts with sun protection in a table in markdown and summarize each one.

In [None]:
qdocs = "".join(doc.page_content for doc in docs)

In [None]:
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.") 


In [None]:
display(Markdown(response))

All of those steps can be encapsulated in a LangChain chain. \
So here we can create a **RetrievalQA** chain.
- This does retrieval and then does question answering over the retrieved documents.
- To create such a chain, we'll pass in a few different things.
  - First, we'll pass in the language model. This will be used for doing the text generation at the end.
  - Next, we'll pass in the chain type. We're going to use **"stuff"**. 
    - This is the simplest method as it just stuffs all the documents into context and makes one call to a language model.
  - Third, we're going to pass in a retriever. The retriever we created above is just an interface for fetching documents. This will be used to fetch the documents and pass them to the language model.
  - And then finally, we're going to set "verbose=True".
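
Conceptually, retrieval followed by "stuff" question answering can be sketched in plain Python as below. This is not how **RetrievalQA** is implemented internally; `stuff_qa`, `fake_retrieve`, and `fake_llm` are hypothetical stand-ins, and the prompt wording is an assumption.

```python
def stuff_qa(question, retrieve, llm):
    """Retrieve documents, stuff them into one prompt, make one LLM call."""
    retrieved = retrieve(question)
    context = "\n\n".join(retrieved)
    prompt = (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)

# Hypothetical stand-in for a retriever: always returns the same two texts.
def fake_retrieve(question):
    return ["Sun Shield Shirt: UPF 50+", "Tropic Shirt: UPF 50+"]

# Hypothetical stand-in for a language model: records each prompt it receives.
fake_calls = []
def fake_llm(prompt):
    fake_calls.append(prompt)
    return f"(answer based on {len(prompt)} prompt characters)"

stuff_answer = stuff_qa("Which shirts have sun protection?", fake_retrieve, fake_llm)
print(len(fake_calls))  # number of language model calls made
```

However many documents are retrieved, the language model is called exactly once, which is what makes this method simple and also what limits it to contexts that fit in a single prompt.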

In [None]:
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True
)

Now, we can create a query and we can run the chain on this query.

In [None]:
query =  "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."

In [None]:
response = qa_stuff.run(query)

In [None]:
display(Markdown(response))

In [None]:
response = index.query(query, llm=llm)

In [None]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])

| Stuff          | Additional Methods                           |
| ----------------------------------------------- | ------------------------------------- |
| ![](images/stuff.png) | ![](images/additional_methods.png) |

- In this notebook we used the **"stuff" method**.
- The stuff method is really nice because it's pretty simple. You just put all of the documents into one prompt, send that to the language model, and get back one response. But that doesn't always work well.
- When we fetched the documents in the notebook, we only got 4 documents back and they were relatively small. But what if you wanted to do the same type of question answering over lots of different types of chunks? Then there are a few different methods that we can use.
- The first is **"Map_reduce"**.
  - This takes all the chunks, passes each one along with the question to a language model, gets back a response for each, and then uses another language model call to summarize all of the individual responses into a final answer.
    - This is really powerful because it can operate over any number of documents.
    - And it's also really powerful because you can do the individual questions in parallel.
    - But it does take a lot more calls.
    - And it does **treat all the documents as independent**, which may not always be the most desired thing.
- Another method is **"Refine"**.
  - This method also loops over many documents, but it does so iteratively.
  - It builds upon the answer from the previous document. So this is really good for **combining information and building up an answer over time**.
  - It will generally lead to longer answers.
  - It's also not as fast, because the calls are no longer independent: each one depends on the result of the previous call.
  - This means that it often takes a good while longer, while making roughly as many calls as "Map_reduce".
- **"Map_rerank"** is a pretty interesting and somewhat more experimental method: you make a single call to the language model for each document, ask it to also return a score, and then select the answer with the highest score.
  - This relies on the language model to know what the score should be. So you often have to tell it, "Hey, it should be a high score if it's relevant to the document," and really refine the instructions there.
  - Similar to "Map_reduce", **all the calls are independent**. So you can batch them and it's relatively fast.
  - But again, you're making a bunch of language model calls. So it will be a bit more expensive.
- **The most common of these methods is the "stuff" method**, which we used in the notebook to combine everything into a single prompt.
- **The second most common is the "Map_reduce" method**, which sends each chunk to the language model separately and then combines the responses.
- These methods here can also be used for lots of other chains besides just question answering.
  - For example, a really common use case of the **"Map_reduce"** chain is **summarization**, where you have a really long document and you want to recursively summarize pieces of information in it.
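
The three additional methods can be sketched in plain Python to show how their call patterns differ. This is a conceptual illustration, not LangChain's implementation: `CountingLLM` and `toy_score_llm` are hypothetical stand-ins for a language model, and the prompt strings are assumptions.

```python
def map_reduce(question, chunks, llm):
    # Map: one independent call per chunk (these could run in parallel).
    partials = [llm(f"Context: {c}\nQuestion: {question}") for c in chunks]
    # Reduce: one extra call that combines the partial answers.
    return llm("Combine these answers:\n" + "\n".join(partials))

def refine(question, chunks, llm):
    # Each call sees the previous answer, so the calls must run in sequence.
    answer = ""
    for c in chunks:
        answer = llm(f"Existing answer: {answer}\nNew context: {c}\nQuestion: {question}")
    return answer

def map_rerank(question, chunks, score_llm):
    # One independent call per chunk; each returns (answer, score).
    scored = [score_llm(f"Context: {c}\nQuestion: {question}") for c in chunks]
    best_answer, _ = max(scored, key=lambda pair: pair[1])
    return best_answer

class CountingLLM:
    """Hypothetical fake model that counts how often it is called."""
    def __init__(self):
        self.calls = 0
    def __call__(self, prompt):
        self.calls += 1
        return f"reply {self.calls}"

def toy_score_llm(prompt):
    """Hypothetical scorer: counts occurrences of 'sun' in the context line."""
    context_line = prompt.splitlines()[0]
    return (f"Answer from [{context_line}]", context_line.lower().count("sun"))

toy_chunks = ["Sun shirt, UPF 50+ sun protection", "Winter jacket", "Sun hat"]
toy_question = "Which items offer sun protection?"

mr_llm, refine_llm = CountingLLM(), CountingLLM()
mr_answer = map_reduce(toy_question, toy_chunks, mr_llm)
refine_answer = refine(toy_question, toy_chunks, refine_llm)
rerank_answer = map_rerank(toy_question, toy_chunks, toy_score_llm)
print(mr_llm.calls, refine_llm.calls, rerank_answer)
```

In this sketch, map-reduce makes one call per chunk plus one combining call, refine makes one call per chunk but strictly in order, and map-rerank keeps only the highest-scoring answer. In LangChain these strategies are selected through the `chain_type` argument of `RetrievalQA.from_chain_type`, in the same place where `"stuff"` was passed earlier.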

That's it for question answering over documents.
