## Question and Answer ##



- Given a piece of text, perhaps extracted from a PDF file, a webpage, or a company's internal document collection, can you use an LLM to answer questions about the content of those documents, helping users gain a deeper understanding and get the information they need?
    - This is really powerful because it starts to combine these language models with data that they weren't originally trained on.
    - It makes them much more flexible and adaptable to your use case.
    - It's also really exciting because we'll start to move beyond language models, prompts, and output parsers and start introducing some more of the key components of LangChain, such as **embedding models and vector stores.**
 
- **Embeddings and vector stores**
    - are some of the most powerful modern techniques.
    - To get started, we import a few things:
        - the retrieval QA chain. This will do retrieval over some documents.
        - our favorite language model, ChatOpenAI.
        - a document loader.
            - This is going to be used to load some proprietary data that we're going to combine with the language model.
        - a vector store.
            - There are many different types of vector stores; we're going to get started with the "DocArrayInMemorySearch" vector store.
                - It's an in-memory vector store and it doesn't require connecting to an external database of any kind, so it makes it really easy to get started.
        - display and Markdown, two common utilities for displaying information in Jupyter notebooks.
        - an index, the "VectorstoreIndexCreator".
            - This will help us create a vector store really easily.
            - To create it, we're going to specify two things.
                - First, we're going to specify the vector store class.
                - Second, after it's been created, we're going to call "from_loaders", which takes in a list of document loaders.
                - We've only got one loader that we really care about, so that's what we're passing in here.
    - It's now been created and we can start to ask questions about it.


    - We've gotten back a table in markdown with names and descriptions for all shirts with sun protection. 
    - We've also got a summary that the language model has provided us. 
- So we've gone over how to do question answering over your documents, but what exactly is going on underneath the hood? 
    - First, let's think about the general idea. 
        - We want to use language models and combine them with a lot of our documents.
        - But there's a key issue. 
            - Language models can only inspect a few thousand words at a time. 
                - So if we have really large documents, how can we get the language model to answer questions about everything that's in there? 
                - This is where embeddings and vector stores come into play. 
                    - First, let's talk about embeddings. 
                        - **Embeddings create numerical representations for pieces of text.** 
                        - This numerical representation captures the semantic meaning of the piece of text that it's been run over. 
                        - Pieces of text with similar content will have similar vectors. 
                            - This lets us compare pieces of text in the vector space. 
                            - For example, take three sentences: the first two are about pets, while the third is about a car.
                            - If we look at the representation in the numeric space, the two vectors for the sentences about pets are very similar, while the vector for the sentence about the car is not similar to either.

                        - This will let us easily figure out which pieces of text are like each other, which will be very useful as we think about which pieces of text we want to include when passing to the language model to answer a question. 
                    - The next component that we're going to cover is the vector database. 
                        - A vector database is a way to store these vector representations that we created in the previous step. 
                        - The way that we create this vector database is we populate it with chunks of text coming from incoming documents. 
                        - When we get a big incoming document, we're first going to break it up into smaller chunks. This helps create pieces of text that are smaller than the original document, which is useful because we may not be able to pass the whole document to the language model. 
                        - So we want to **create these small chunks so we can only pass the most relevant ones to the language model.** 
                        - We then create an embedding for each of these chunks, and then we store those in a vector database. That's what happens when we **create the index.**
                        - Now that we've got this index, we can use it during runtime to find the pieces of text most relevant to an incoming query. 
                        - When a query comes in, we 
                            - **first create an embedding for that query.** 
                            - We then **compare it to all the vectors in the vector database**, and we pick the n most similar. 
                            - These are then returned, and we can pass those in the prompt to the language model to get back a final answer. 
                    - So above, we created this chain in only a few lines of code. That's great for getting started quickly.
            - Let's now do it a bit more step-by-step and understand what exactly is going on under the hood. 
                - The first step is similar to above. 
                    - We're going to create a document loader, loading from that CSV with all the descriptions of the products that we want to do question answering over. We can then load documents from this document loader. 
                        - If we look at the individual documents, we can see that each document corresponds to one of the products in the CSV. 
                    - Because these documents are already so small, we actually don't need to do any chunking here. And so we can create embeddings directly. 
                - To create embeddings, we're going to use OpenAI's embedding class. 
                    - We can import it and initialize it here. 
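
The indexing and query steps described above can be sketched without any external service. This is a minimal sketch under loud assumptions, not LangChain's implementation: `toy_embed` is a hypothetical stand-in that counts words from a tiny made-up vocabulary (a real embedding model returns dense learned vectors), and `ToyVectorStore` is an illustrative name, not a library class.

```python
import math

# Hypothetical toy vocabulary; a real embedding model learns its own features.
VOCAB = ["dog", "cat", "pet", "car", "drive", "fast"]

def toy_embed(text):
    """Stand-in embedding: count how often each vocabulary word appears."""
    words = text.lower().replace(".", "").split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class ToyVectorStore:
    """Keeps (chunk, vector) pairs in memory and returns the n most similar."""

    def __init__(self):
        self.entries = []

    def add_chunks(self, chunks):
        # Indexing: embed every chunk once and store it next to its text.
        for chunk in chunks:
            self.entries.append((chunk, toy_embed(chunk)))

    def similarity_search(self, query, n=2):
        # Query time: embed the query, compare to every stored vector, keep top n.
        query_vector = toy_embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:n]]

store = ToyVectorStore()
store.add_chunks([
    "My dog is a friendly pet.",
    "My cat is a quiet pet.",
    "The car is fast to drive.",
])
top_chunks = store.similarity_search("Which pet is a dog", n=2)
print(top_chunks)
```

The two pet sentences share vocabulary with the query, so they rank ahead of the car sentence; those top chunks are what would be placed in the prompt for the language model.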
                    
- How do we use this to do question answering over our own documents?
    - First, we need to **create a retriever from this vector store**.
        - A retriever is **a generic interface** that can be **underpinned by any method** that takes in a query and returns documents. Vector stores and embeddings are one such method.
    - Next, because we want to do text generation and return a natural language response, we're going to import a language model and we're going to use ChatOpenAI.
        - If we were doing this by hand, we would combine the documents into a single piece of text: we **join all the page content in the documents into a variable**
        - and then pass this variable, together with the question (for example, "Please list all your shirts with sun protection in a table in markdown and summarize each one."), into the language model.
    - All of those steps can be encapsulated in a LangChain chain.
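
A minimal sketch of the by-hand version described above. The `Document` dataclass and `KeywordRetriever` are hypothetical stand-ins written for this illustration (they are not LangChain classes), and the prompt layout is an assumption. The point is only that a retriever maps a query to documents, and that the by-hand prompt is the joined page content followed by the question.

```python
from dataclasses import dataclass
from typing import List, Protocol

@dataclass
class Document:
    """Simplified stand-in for a document with a page_content field."""
    page_content: str

class Retriever(Protocol):
    """Anything that turns a query string into a list of documents."""
    def get_relevant_documents(self, query: str) -> List[Document]: ...

class KeywordRetriever:
    """Hypothetical retriever: returns documents sharing a word with the query."""

    def __init__(self, documents: List[Document]):
        self.documents = documents

    def get_relevant_documents(self, query: str) -> List[Document]:
        query_words = set(query.lower().split())
        return [
            doc for doc in self.documents
            if query_words & set(doc.page_content.lower().split())
        ]

def build_stuff_prompt(retriever: Retriever, question: str) -> str:
    """Join all retrieved page content and append the question."""
    docs = retriever.get_relevant_documents(question)
    context = "\n\n".join(doc.page_content for doc in docs)
    return f"{context}\n\nQuestion: {question}"

catalog = [
    Document("name: Sun Shield Shirt. UPF 50+ rated sun protection."),
    Document("name: Campside Oxfords. Soft canvas shoe."),
]
prompt = build_stuff_prompt(KeywordRetriever(catalog), "Which shirt has sun protection")
print(prompt)
```

Because `build_stuff_prompt` depends only on the `Retriever` interface, the keyword lookup could be swapped for a vector-store lookup without changing the prompt-building code.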
    
- We can **create a retrieval QA chain.**
    - This does retrieval and then does question answering over the retrieved documents. To create such a chain, we'll pass in a few different things. 
        - First, we'll pass in the language model. 
            - This will be used for doing the text generation at the end. 
        - Next, we'll **pass in the chain type**. We're going to use `"stuff"`. 
            - This is the simplest method as it just `stuffs all the documents into context and makes one call to a language model`. 
            - There are a few other methods that you can use to do question answering.
        - Third, we're going to pass in a retriever. 
            - The retriever we created above is just `an interface for fetching documents.` 
            - This will be used to fetch the documents and pass them to the language model.
        - Finally, we're going to set "verbose=True".
    - Now, we can create a query and we can run the chain on this query.
        - Remember, though, that we can still do it pretty easily with just the one line that we had up above.
 
- That's the interesting stuff about LangChain.
    - You can do it in one line,
    - or you can look at the individual things and break it down into five more detailed steps.
        - The five detailed steps let you control the specifics of what is going on, while the one-liner is the easiest way to get started.
        
- We can also **customize the index when we're creating it**. 
    - We can specify an embedding. This gives us flexibility over how the embeddings themselves are created.
    - We can also swap out the vector store here for a different type of vector store.
    - We use the "stuff" method in this notebook.
        - The stuff method is pretty simple. You just **put all of it into one prompt** and send that to the language model and get back one response. So it's quite simple to understand what's going on.
        - It's quite cheap and it works pretty well.
        - But it doesn't always work well.
            - When we fetched the documents in the notebook, we only got four documents back and they were relatively small.
            - What if you wanted to do the same type of question answering over lots of different types of chunks?
                - There are a few different methods that we can use.
                    - "map_reduce". This basically takes all the chunks, passes each one along with the question to a language model, gets back a response, and then uses another language model call to summarize all of the individual responses into a final answer.
                        - This is really powerful because it can operate over any number of documents.
                        - And it's also really powerful because you can do the individual questions in parallel.
                        - But it does take a lot more calls.
                        - And it does treat all the documents as independent, which may not always be the most desired thing.
                        - A really common use case of the "map_reduce" chain is summarization, where you have a really long document and you want to recursively summarize pieces of information in it.
                    - "refine", which is another method, is again used to loop over many documents, but it does so iteratively.
                        - It builds upon the answer from the previous document.
                        - So this is really good for combining information and building up an answer over time.
                        - It will generally lead to longer answers.
                        - It's also not as fast because now the calls aren't independent. They depend on the result of previous calls. This means that it often takes a good while longer and takes just as many calls as "map_reduce", basically.
                    - "map_rerank" is a pretty interesting and a bit more experimental one where you do a single call to the language model for each document, and you also ask it to return a score. Then you select the highest score.
                        - This relies on the language model to know what the score should be. So you often have to tell it, "Hey, it should be a high score if it's relevant to the document," and really refine the instructions there.
                        - Similar to "map_reduce", all the calls are independent. So you can batch them and it's relatively fast.
                        - But again, you're making a bunch of language model calls. So it will be a bit more expensive. 
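
The four strategies can be compared in a small sketch. Everything here is illustrative: `fake_llm` is a hypothetical stand-in that only records prompts and returns a canned string, the prompt wording is made up, and `score_fn` replaces the score that a real model would be asked to return. The sketch shows the call pattern (how many calls, and whether they depend on each other), not LangChain's actual prompts or internals.

```python
calls = []  # every prompt sent to the stand-in model is recorded here

def fake_llm(prompt):
    """Hypothetical stand-in for a language model call."""
    calls.append(prompt)
    return f"answer#{len(calls)}"

def stuff(docs, question):
    # One call: all documents go into a single prompt.
    return fake_llm("\n".join(docs) + f"\nQuestion: {question}")

def map_reduce(docs, question):
    # One independent call per document, then one call to combine the answers.
    partial = [fake_llm(f"{doc}\nQuestion: {question}") for doc in docs]
    return fake_llm("Combine these answers:\n" + "\n".join(partial))

def refine(docs, question):
    # Sequential calls: each one sees the answer built so far.
    answer = ""
    for doc in docs:
        answer = fake_llm(f"Existing answer: {answer}\n{doc}\nQuestion: {question}")
    return answer

def map_rerank(docs, question, score_fn):
    # One call per document plus a score; keep the highest-scoring answer.
    scored = [(score_fn(doc), fake_llm(f"{doc}\nQuestion: {question}")) for doc in docs]
    return max(scored, key=lambda pair: pair[0])[1]

docs = ["shirt A: UPF 50+", "shirt B: cotton", "shoe C: canvas"]
question = "Which items have sun protection?"

call_counts = {}
for name, run in [
    ("stuff", lambda: stuff(docs, question)),
    ("map_reduce", lambda: map_reduce(docs, question)),
    ("refine", lambda: refine(docs, question)),
    ("map_rerank", lambda: map_rerank(docs, question, lambda d: d.count("UPF"))),
]:
    calls.clear()
    run()
    call_counts[name] = len(calls)

print(call_counts)
```

In `map_reduce` and `map_rerank` the per-document calls do not read each other's output, which is why they can be batched or run in parallel; in `refine` each call needs the previous answer, so the calls must run one after another.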



### LangChain: Q&A over Documents ###
An example might be a tool that would allow you to query a product catalog for items of interest.

In [1]:
import os
# Do not hard-code a real API key in a notebook. Set OPENAI_API_KEY in your
# shell or in a local .env file, which the next cell loads.

In [2]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv()) # read local .env file

In [3]:
# account for deprecation of LLM model
import datetime
# Get the current date
current_date = datetime.datetime.now().date()

# Define the date after which the model should be set to "gpt-3.5-turbo"
target_date = datetime.date(2024, 6, 12)

# Set the model variable based on the current date
if current_date > target_date:
    llm_model = "gpt-3.5-turbo"
else:
    llm_model = "gpt-3.5-turbo-0301"

In [4]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

In [5]:
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)

In [6]:
from langchain.indexes import VectorstoreIndexCreator

In [8]:
#!pip install docarray

In [9]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch
).from_loaders([loader])

In [14]:
print(index)

vectorstore=<langchain.vectorstores.docarray.in_memory.DocArrayInMemorySearch object at 0x7fc11b499df0>


In [11]:
query ="Please list all your shirts with sun protection \
in a table in markdown and summarize each one."

In [12]:
response = index.query(query)

In [13]:
display(Markdown(response))



| Name | Description |
| --- | --- |
| Men's Tropical Plaid Short-Sleeve Shirt | UPF 50+ rated, 100% polyester, wrinkle-resistant, front and back cape venting, two front bellows pockets |
| Men's Plaid Tropic Shirt, Short-Sleeve | UPF 50+ rated, 52% polyester and 48% nylon, machine washable and dryable, front and back cape venting, two front bellows pockets |
| Men's TropicVibe Shirt, Short-Sleeve | UPF 50+ rated, 71% Nylon, 29% Polyester, 100% Polyester knit mesh, wrinkle resistant, front and back cape venting, two front bellows pockets |
| Sun Shield Shirt by | UPF 50+ rated, 78% nylon, 22% Lycra Xtra Life fiber, wicks moisture, fits comfortably over swimsuit, abrasion resistant |

All four shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant. The Men's Plaid Trop

### Step By Step ###

In [15]:
from langchain.document_loaders import CSVLoader
loader = CSVLoader(file_path=file)

In [16]:
docs = loader.load()

In [17]:
docs[0]

Document(page_content=": 0\nname: Women's Campside Oxfords\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \n\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \n\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \n\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \n\nQuestions? Please contact us for any inquiries.", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 0})

In [19]:
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()

In [20]:
embed = embeddings.embed_query("Hi my name is Jessica")

In [21]:
print(len(embed))

1536


In [22]:
print(embed[:5])

[-0.033419179780663916, -0.0012581591633956397, -0.012294453683891452, -0.026725462118644336, -0.02017994769318297]


In [23]:
db = DocArrayInMemorySearch.from_documents(
    docs, 
    embeddings
)

In [24]:
query = "Please suggest a shirt with sunblocking"

In [25]:
docs = db.similarity_search(query)

In [26]:
len(docs)

4

In [27]:
docs[0]

Document(page_content=': 255\nname: Sun Shield Shirt by\ndescription: "Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. \n\nSize & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.\n\nFabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry.\n\nAdditional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.\n\nSun Protection That Won\'t Wear Off\nOur high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun\'s harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.', metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 255})

In [28]:
retriever = db.as_retriever()

In [29]:
llm = ChatOpenAI(temperature = 0.0, model=llm_model)

In [30]:
qdocs = "".join(doc.page_content for doc in docs)

In [31]:
response = llm.call_as_llm(f"{qdocs} Question: Please list all your \
shirts with sun protection in a table in markdown and summarize each one.") 


In [32]:
display(Markdown(response))

| Name | Description |
| --- | --- |
| Sun Shield Shirt | High-performance sun shirt with UPF 50+ sun protection, moisture-wicking, and abrasion-resistant fabric. Fits comfortably over swimsuits. |
| Men's Plaid Tropic Shirt | Ultracomfortable shirt with UPF 50+ sun protection, wrinkle-free fabric, and front/back cape venting. Made with 52% polyester and 48% nylon. |
| Men's TropicVibe Shirt | Men's sun-protection shirt with built-in UPF 50+ and wrinkle-resistant fabric. Features front/back cape venting and two front bellows pockets. |
| Men's Tropical Plaid Short-Sleeve Shirt | Lightest hot-weather shirt with UPF 50+ sun protection, relaxed traditional fit, and front/back cape venting. Made with 100% polyester. |

All of these shirts provide UPF 50+ sun protection, blocking 98% of the sun's harmful rays. They also have additional features such as moisture-wicking, wrinkle-resistant, and venting for cool breezes. The Sun Shield Shirt is abrasion-resistant and fits comfortably over swimsuits. The Men's Plaid Tropic Shirt is made with a blend of polyester and nylon and is machine washable/dryable. The Men's TropicVibe Shirt is also wrinkle-resistant and has two front bellows pockets. The Men's Tropical Plaid Short-Sleeve Shirt has a relaxed traditional fit and is made with 100% polyester.

In [33]:
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm, 
    chain_type="stuff", 
    retriever=retriever, 
    verbose=True
)

In [34]:
query =  "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."

In [35]:
response = qa_stuff.run(query)

> Entering new RetrievalQA chain...

> Finished chain.


In [36]:
display(Markdown(response))

| Shirt ID | Name | Description |
| --- | --- | --- |
| 618 | Men's Tropical Plaid Short-Sleeve Shirt | Rated UPF 50+ for superior protection from the sun's UV rays. Made of 100% polyester and is wrinkle-resistant. With front and back cape venting that lets in cool breezes and two front bellows pockets. |
| 374 | Men's Plaid Tropic Shirt, Short-Sleeve | Rated to UPF 50+ and offers sun protection. Made with 52% polyester and 48% nylon, this shirt is machine washable and dryable. Additional features include front and back cape venting, two front bellows pockets. |
| 535 | Men's TropicVibe Shirt, Short-Sleeve | Built-in UPF 50+ has the lightweight feel you want and the coverage you need when the air is hot and the UV rays are strong. Made with 71% Nylon, 29% Polyester. Wrinkle-resistant. Front and back cape venting lets in cool breezes. Two front bellows pockets. |
| 255 | Sun Shield Shirt | High-performance sun shirt is guaranteed to protect from harmful UV rays. Made with 78% nylon, 22% Lycra Xtra Life fiber. Fits comfortably over your favorite swimsuit. Abrasion-resistant. |

The Men's Tropical Plaid Short-Sleeve Shirt is made of 100% polyester and is wrinkle-resistant. It is rated UPF 50+ for superior protection from the sun's UV rays. The Men's Plaid Tropic Shirt, Short-Sleeve is made with 52% polyester and 48% nylon, and is rated to UPF 50+. The Men's TropicVibe Shirt, Short-Sleeve has built-in UPF 50+ and is made with 71% Nylon, 29% Polyester. The Sun Shield Shirt is made with 78% nylon, 22% Lycra Xtra Life fiber and is guaranteed to protect from harmful UV rays.

- **When you run `index.query(query, llm=llm)`**
    - The query (which is usually a string of text) is first converted into an embedding.
    - Once the embedding for the query is generated, the vector store performs a similarity search.
        - This involves comparing the query's embedding to the embeddings stored in the vector store. The comparison is typically based on a similarity metric, such as cosine similarity or Euclidean distance.
        - The goal is to find the embeddings in the store that are most similar (or least distant, depending on the metric) to the query's embedding.
    - The search results, which are the items from the vector store that have the highest similarity scores with the query embedding, are then returned.
        - These results can be in various forms, such as documents, sentences, or any data type that was stored in the vector store alongside their embeddings.
    - Finally, the retrieved documents are passed to the language model (`llm`), which generates the answer that `index.query` returns.
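
To make the similarity metrics concrete, here is a small sketch with made-up three-dimensional vectors (the embeddings in this notebook have 1536 dimensions). It assumes nothing about how a particular vector store is implemented; it only shows that "most similar" means highest cosine similarity or smallest Euclidean distance, and that for unit-length vectors the two rankings agree.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Made-up vectors standing in for stored embeddings.
stored = {
    "sun shirt": normalize([0.9, 0.1, 0.0]),
    "rain jacket": normalize([0.5, 0.5, 0.1]),
    "hiking boots": normalize([0.0, 0.2, 0.9]),
}
query_vector = normalize([1.0, 0.2, 0.0])

# Highest similarity first versus smallest distance first.
by_cosine = sorted(
    stored, key=lambda name: cosine_similarity(query_vector, stored[name]), reverse=True
)
by_distance = sorted(
    stored, key=lambda name: euclidean_distance(query_vector, stored[name])
)
print(by_cosine)
print(by_distance)
```

For unit vectors the squared Euclidean distance equals 2 minus twice the cosine similarity, so sorting by either metric gives the same order.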

In [37]:
response = index.query(query, llm=llm)

- "VectorstoreIndexCreator" creates a vector store and also stores the embeddings in that vector store.
    - Here we also pass the embedding explicitly when creating the index.

In [38]:
index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])

In [39]:
print(index)

vectorstore=<langchain.vectorstores.docarray.in_memory.DocArrayInMemorySearch object at 0x7fc11b752430>
