# Querying Documents

In [1]:
from dotenv import load_dotenv
import os
import pandas as pd
import openai

from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from langchain.indexes import VectorstoreIndexCreator
from IPython.display import display, Markdown

In [2]:
_ = load_dotenv("config.env")
openai.api_key = os.environ['OPENAI_API_KEY']

model_name = "gpt-3.5-turbo-0301"

## Embedding and storing documents for querying

Documents may be too large to fit in the LLM's context window. Instead, we can store the embeddings of the documents in a vector database and retrieve only the relevant pieces. Here is how it works:

1. Embed the documents
    - Break the documents into smaller chunks
    - Embed the chunks
    - Save them in a vector database

    > Documents -> Chunks -> Embeddings -> Vector Database

2. Query the documents
    - Create a `query`
    - Embed the query
    - Compare it to the vectors in the database
    - Pick the N vectors most similar to the query

3. Send the chunks behind those N vectors, together with the prompt, to the LLM
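
The three steps above can be sketched without any external service. This is only an illustration: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the plain list stands in for the vector database, and the names `embed`, `cosine_similarity`, `top_n` and the sample chunks are all hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Embed the chunks and store them (a list plays the role of the vector database)
chunks = [
    "blue self-design kurta for women",
    "black printed trolley bag",
    "white casual shirt for men",
]
vector_store = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Embed the query and pick the N most similar chunks
def top_n(query, n=2):
    query_vector = embed(query)
    ranked = sorted(
        vector_store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:n]]

# 3. Send the retrieved chunks with the prompt to the LLM (here we only build the prompt)
query = "white shirt for men"
context = top_n(query, n=1)
prompt = f"Context: {context}\nQuestion: {query}"
```

A real embedding model captures meaning rather than exact word overlap, but the retrieval logic (embed, compare, keep the top N) has the same shape.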


<img src="imgs/vector-database.png" width="500" height="500" />

## Practice

We will embed the documents and store them in a vector store. Then we will query the documents to answer specific questions.

In [3]:
# Load the data
file = 'data/myntra_products_catalog.csv'
loader = CSVLoader(file_path=file)

# Create the vector store
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    # embedding=embeddings,            # We can also manually specify the embeddings
).from_loaders([loader])

In [4]:
df = pd.read_csv(file)
df.head(3)

Unnamed: 0,ProductID,ProductName,ProductBrand,Gender,Price (INR),NumImages,Description,PrimaryColor
0,10017413,DKNY Unisex Black & Grey Printed Medium Trolle...,DKNY,Unisex,11745,7,"Black and grey printed medium trolley bag, sec...",Black
1,10016283,EthnoVogue Women Beige & Grey Made to Measure ...,EthnoVogue,Women,5810,7,Beige & Grey made to measure kurta with churid...,Beige
2,10009781,SPYKAR Women Pink Alexa Super Skinny Fit High-...,SPYKAR,Women,899,7,Pink coloured wash 5-pocket high-rise cropped ...,Pink


**Ask questions about the documents**

In [5]:
query = "Are there self-designed blue clothes?"

response = index.query(query)
response

' Yes, there are self-designed blue clothes. All four of the products listed have a primary color of blue and a description that includes "self-design".'

In [6]:
query = "Please list all the beige and grey women's clothes \
in a table in markdown and summarize each one."

response = index.query(query)
display(Markdown(response))



| ProductID | ProductName | ProductBrand | Gender | Price (INR) | NumImages | Description | PrimaryColor |
|-----------|-------------|--------------|-------|-------------|-----------|-------------|--------------|
| 10207057 | W Women Beige & Grey Printed Straight Kurta | W | Women | 849 | 5 | Beige and Grey striped straight kurta with printed detail, has a mandarin collar with button closure, three-quarter sleeves, straight hem, and side slits | Beige |
| 10143047 | MANGO Women Beige & Grey Printed Round Neck T-shirt | MANGO | Women | 1990 | 5 | Beige and grey printed T-shirt, has a round neck, and short sleeves | Beige |
| 10234491 | W Women Beige & Taupe Woven Design Kurta with Palazzos & Ethnic Jacket | W | Women | 3149 | 8 | Beige, taupe and golden woven design kurta with palazzos and ethnic jacketBeige and golden woven design front open longline ethnic jacket, has a shirt collar and three-quarter sleevesTaupe and golden striped A-line calf length kurta, has a

## Digging Deeper

The above operation was quite easy. But what's happening under the hood? To answer this question, we will accomplish the same task step by step:
1. Create the documents
2. Break the documents into chunks (optional)
3. Embed the chunks
4. Store the embeddings in a vector database

**1. Creating documents**

In [None]:
from langchain.document_loaders import CSVLoader

# Create and load documents
loader = CSVLoader(file_path=file)
docs = loader.load()
docs[0]

**2 & 3. Embedding documents and storing them into a Vector Store**

The documents are small (one CSV row each), so we won't split them into chunks. We will create the embeddings directly.
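
For longer documents, the chunking step could look like the sketch below. It is a simplified, character-based splitter with overlapping windows; the function name `split_text` and its parameters are hypothetical, and LangChain ships its own text splitters (for example `CharacterTextSplitter`) that would normally be used instead.

```python
def split_text(text, chunk_size=40, overlap=10):
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining text is already covered by the previous chunk
    last_start = max(len(text) - overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]

text = "abcdefghij" * 5  # 50 characters
chunks = split_text(text, chunk_size=40, overlap=10)
```

The overlap keeps some shared context between neighbouring chunks, so a sentence cut at a boundary is still partly visible in both.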

In [None]:
from langchain.embeddings import OpenAIEmbeddings

# Get OpenAI embedding model
embeddings = OpenAIEmbeddings()

# Embed the docs and store them in a vector store
db = DocArrayInMemorySearch.from_documents(
    docs,
    embeddings
)

In [None]:
embed = embeddings.embed_query("Hi, my name is Becaye.")
print(len(embed))
print(embed[:4])

**4. Query the documents**

We can use the vector store to find pieces of text similar to the query.

In [None]:
query = "Suggest a white shirt for men"
response_docs = db.similarity_search(query)
response_docs[0].page_content

A retriever is an interface that takes in a query and returns documents.

In [24]:
retriever = db.as_retriever()
llm = ChatOpenAI(temperature=0.0)

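# Ask a question to the LLM
question = "List all the red women dresses in a table in markdown \
and summarize them in a hilarious way."

# Retrieve the documents most similar to the question and combine them into a single string
# (joining every row of the catalog would likely overflow the context window)
relevant_docs = retriever.get_relevant_documents(question)
qdocs = "\n".join(doc.page_content for doc in relevant_docs)

response = llm.call_as_llm(f"{qdocs} Question: {question}")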

# Display the response
display(Markdown(response))

### Retrieval QA Chain

We can also create a RetrievalQA chain, which takes the following arguments:
- `llm`: the LLM model.
- `chain_type`: "stuff" is the simplest method. It will, well, "stuff" all the documents into the context and make one call to the model.
- `retriever`: the retriever that fetches the documents to pass to the LLM.

In [None]:
# Create the RetrievalQA chain
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    verbose=True
)

# Run the chain
response = qa_stuff.run(query)

## Types of Retrieval QA Chains

There are several types of retrieval methods for question answering.

### Stuff
Stuff is the simplest method. It takes all the retrieved documents and stuffs them into a single prompt for the model.

![stuff](imgs/stuff.png)

**Pros**
- Simple.

**Cons**
- Large documents might not fit in the context window.
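
As a rough illustration of the stuff strategy, the sketch below builds one prompt from all the documents and makes a single call. `build_stuff_prompt` and `stub_llm` are hypothetical names; `stub_llm` only records the prompt and stands in for a real LLM call, and the prompt wording is not the one LangChain uses.

```python
def build_stuff_prompt(documents, question):
    """Concatenate every document into one prompt (the "stuff" strategy)."""
    context = "\n\n".join(documents)
    return f"Answer the question using only this context:\n\n{context}\n\nQuestion: {question}"

calls = []

def stub_llm(prompt):
    """Hypothetical stand-in for a real LLM call; it only records the prompt."""
    calls.append(prompt)
    return f"answer generated from a prompt of {len(prompt)} characters"

documents = ["Pink skinny fit jeans for women", "Black printed trolley bag"]
prompt = build_stuff_prompt(documents, "Which products are pink?")
answer = stub_llm(prompt)  # a single call, however many documents there are
```

The prompt grows with every document added, which is why this approach stops working once the documents no longer fit in the context window.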

### Map Reduce
Each chunk of the documents is passed to the LLM independently. The individual results are then summarized by another LLM call.
This can be useful for summarizing long documents.

![map-reduce](imgs/map-reduce.png)

**Pros**
- Can handle large documents

**Cons**
- Makes a lot more LLM calls
- Treats all documents independently, which might not be desirable.
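
A minimal sketch of the map-reduce flow, assuming a hypothetical `stub_llm` that records each prompt in place of a real model. The function name `map_reduce` and the prompt wording are illustrative only.

```python
calls = []

def stub_llm(prompt):
    """Hypothetical stand-in for a real LLM call; it records the prompt and returns a label."""
    calls.append(prompt)
    return f"response {len(calls)}"

def map_reduce(documents, question, llm):
    # Map: one independent call per document
    partial_answers = [llm(f"Context: {doc}\nQuestion: {question}") for doc in documents]
    # Reduce: one extra call that combines the partial answers
    summary_prompt = (
        "Combine these partial answers into one final answer:\n"
        + "\n".join(partial_answers)
        + f"\nQuestion: {question}"
    )
    return llm(summary_prompt)

documents = ["doc about red dresses", "doc about blue jeans", "doc about white shirts"]
final_answer = map_reduce(documents, "Which products are red?", stub_llm)
```

With three documents this makes four calls (three map calls plus one reduce call). Because the map calls do not depend on each other, they could be run in parallel.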

### Refine
Iterates through the documents one at a time, building upon the answer obtained from the previous documents.

![refine](imgs/refine.png)

**Pros**
- Builds consistent responses over time.

**Cons**
- Makes a lot more LLM calls
- Slower because it depends on the output of the previous call
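
The refine loop can be sketched as follows, again with a hypothetical `stub_llm` in place of a real model. The function name `refine` and the prompt wording are assumptions made for illustration.

```python
calls = []

def stub_llm(prompt):
    """Hypothetical stand-in for a real LLM call; it records the prompt and returns a draft label."""
    calls.append(prompt)
    return f"draft {len(calls)}"

def refine(documents, question, llm):
    answer = None
    for doc in documents:
        if answer is None:
            # First document: produce an initial answer
            prompt = f"Context: {doc}\nQuestion: {question}"
        else:
            # Later documents: refine the answer produced so far
            prompt = (
                f"Existing answer: {answer}\n"
                f"New context: {doc}\n"
                "Refine the existing answer if the new context helps.\n"
                f"Question: {question}"
            )
        answer = llm(prompt)
    return answer

documents = ["doc about red dresses", "doc about blue jeans", "doc about white shirts"]
final_answer = refine(documents, "Which products are red?", stub_llm)
```

Each call needs the answer from the previous one, so the calls have to run sequentially.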

### Map Rerank (experimental)
Makes a single call to the LLM for each document, asks it to return a score along with its answer, and selects the answer with the highest score.

**Pros**
- Fast

**Cons**
- Makes a lot more LLM calls
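
A sketch of the map-rerank idea under simplifying assumptions: the hypothetical `stub_llm` returns an `(answer, score)` pair directly and scores by word overlap, whereas a real chain would ask the model for a score in its text output and parse it.

```python
def stub_llm(document, question):
    """Hypothetical stand-in for an LLM that returns an answer and a confidence score (0-100)."""
    question_words = set(question.lower().split())
    overlap = set(document.lower().split()) & question_words
    score = round(100 * len(overlap) / len(question_words))
    return f"Based on: {document}", score

def map_rerank(documents, question, llm):
    # One independent call per document, each returning (answer, score)
    scored_answers = [llm(doc, question) for doc in documents]
    # Keep only the answer with the highest score
    best_answer, _ = max(scored_answers, key=lambda pair: pair[1])
    return best_answer

documents = ["red dress for women", "blue jeans for men", "red bag"]
best = map_rerank(documents, "red dress women", stub_llm)
```

Unlike map reduce, there is no final combining call: the other answers are simply discarded.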