# RAG implementation using LangChain
In this notebook we will explore a simple RAG pipeline using the LangChain framework. With the basic concepts you learned in the intro notebook, you should be able to fill in the gaps and run your first RAG pipeline. If you want to explore further, check out the LangChain documentation, which includes a [RAG Q&A example](https://python.langchain.com/docs/use_cases/question_answering).

## Loading documents
First, we need to load the relevant knowledge documents so the model can refer to them while answering questions. I have prepared a couple of files with customer support policies; they are located inside the `policies` directory. LangChain offers a large number of document loaders. For example, you can load the content of websites or of remote storage. For more details, refer to the [documentation](https://python.langchain.com/docs/modules/data_connection/document_loaders/).

In this exercise we are going to use `DirectoryLoader`, which scans a directory for files and, by default, uses `UnstructuredFileLoader` to load their textual content. Files are selected with a glob pattern, here `*.txt`, so only files with the `txt` extension are loaded.

In [1]:
from langchain_community.document_loaders import DirectoryLoader

loader = DirectoryLoader(
    'policies', 
    glob="*.txt", 
    show_progress=True,
)

docs = loader.load()

print("Documents loaded:", len(docs))

  0%|          | 0/4 [00:00<?, ?it/s][nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\arkad\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\arkad\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping taggers\averaged_perceptron_tagger.zip.
100%|██████████| 4/4 [00:07<00:00,  1.86s/it]

Documents loaded: 4
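
To make the loading step less of a black box, here is a minimal standard-library sketch of the same idea: find the files that match a glob pattern and wrap each file's text in a small document object. `SimpleDocument` and `load_text_files` are hypothetical names invented for this illustration. They are not part of LangChain, and the real `DirectoryLoader` does considerably more (file type detection, parsing, progress reporting).

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LangChain Document (text plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_files(directory, pattern="*.txt"):
    """Read every file in `directory` that matches `pattern` (non-recursive)."""
    documents = []
    # Sort the paths so the order of the result is deterministic.
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        # Keep track of where the text came from, as document loaders usually do.
        documents.append(SimpleDocument(text, {"source": str(path)}))
    return documents
```

Calling `load_text_files("policies")` would return one `SimpleDocument` per `.txt` file, assuming the files are UTF-8 encoded.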





In [2]:
# Preview content of a document
print(docs[0].page_content)

Item Change Policy**

Our goal is to ensure that you are completely satisfied with your purchase. If you need to request a change for an item you've received due to damage, size issues, or other concerns, please refer to the following policy:

1. **Damaged Items**: If you receive an item that is damaged upon arrival, please contact our Customer Support Team within 48 hours of receiving the item. Provide a detailed description of the damage and include photographic evidence if possible. We will arrange for a replacement or refund as quickly as possible.

**Procedure for Reporting Damaged Items**: - Immediately document the damage with clear photographs. - Contact our Customer Support Team with your order number, description, and photographs of the damage. - Follow any additional instructions provided by the support agent to facilitate the exchange or refund.

2. **Incorrect Size/Item**: If you've received an item in the wrong size or the wrong item altogether, please notify us within 14

## Splitting documents
Retrieval is most valuable for enormous knowledge bases that are hard for humans to traverse. For example, imagine thousands of pages of legal documentation: reading all of it would take one person many days. One limitation of LLMs is their limited context window, which stems from the quadratic complexity of the [transformer attention layer](https://nlp.seas.harvard.edu/2018/04/03/attention.html). Because of that, long documents should be split into smaller, meaningful chunks of text. The split can't be made at arbitrary positions, because that would break sentences apart and may cause loss of information. Thankfully, LangChain provides a library of [text splitters](https://python.langchain.com/docs/modules/data_connection/document_transformers/) that you can use.

In this exercise the policies are relatively short and easily fit into the context window, so the text splitter with its default settings leaves them undivided. However, you can experiment with `chunk_size` to see how the splitter slices a document into meaningful chunks of text.

In [3]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split texts into chunks. Our documents are quite short so they won't be split. 
# To experiment with different settings uncomment the arguments to override default settings.
text_splitter = RecursiveCharacterTextSplitter(
    # chunk_size=1000,
    # chunk_overlap=20,
    # length_function=len,
)

documents = text_splitter.split_documents(docs)
print("Number of chunks:", len(documents))

Number of chunks: 4
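
If you are curious how a recursive splitter keeps chunks meaningful, the sketch below shows the core idea in plain Python: try the coarsest separator first (paragraphs), pack pieces together while they fit into `chunk_size`, and fall back to finer separators only for pieces that are still too long. This is a simplified illustration written for this notebook, not LangChain's implementation. In particular it has no `chunk_overlap`, and the function name `recursive_split` is made up.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Simplified recursive splitting (illustration only, no chunk overlap)."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, *finer = separators
    if sep == "":
        # Last resort: hard cut every `chunk_size` characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # the piece still fits into the current chunk
            continue
        if current:
            chunks.append(current)  # close the chunk collected so far
        current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, tuple(finer)))
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Para one is here.\n\n"
    "Para two is a bit longer than the first.\n\n"
    "Para three closes the text."
)
sample_chunks = recursive_split(sample, chunk_size=50)
```

With `chunk_size=50` the sample text is cut at paragraph boundaries, while a text shorter than `chunk_size` comes back as a single chunk, which is what happens to the policy documents above.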


## Initialize vector store
LangChain supports a number of vector stores, ranging from a scikit-learn based implementation to cloud-hosted databases. For the full list of integrations, refer to the [documentation](https://python.langchain.com/docs/integrations/vectorstores). Here we are going to use [FAISS](https://python.langchain.com/docs/integrations/vectorstores/faiss) (Facebook AI Similarity Search), which is easy to install with the Python package manager. Create the vector store by passing the documents and an embedding model to `FAISS.from_documents`.

In [4]:
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(show_progress=True)

vector_store = FAISS.from_documents(documents, embeddings)

OllamaEmbeddings: 100%|██████████| 4/4 [01:19<00:00, 19.91s/it]


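The vector store provides a similarity search method out of the box, so retrieving related documents is straightforward. By default `similarity_search` returns the `k=4` most similar documents, so with only four chunks in the store all of them come back, ordered from most to least similar.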

In [5]:
retrieved = vector_store.similarity_search("I received wrong size of the item")
print("Retrieved documents:", len(retrieved))
print("Document content:", retrieved[0].page_content)

OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.59s/it]

Retrieved documents: 4
Document content: Item Change Policy**

Our goal is to ensure that you are completely satisfied with your purchase. If you need to request a change for an item you've received due to damage, size issues, or other concerns, please refer to the following policy:

1. **Damaged Items**: If you receive an item that is damaged upon arrival, please contact our Customer Support Team within 48 hours of receiving the item. Provide a detailed description of the damage and include photographic evidence if possible. We will arrange for a replacement or refund as quickly as possible.

**Procedure for Reporting Damaged Items**: - Immediately document the damage with clear photographs. - Contact our Customer Support Team with your order number, description, and photographs of the damage. - Follow any additional instructions provided by the support agent to facilitate the exchange or refund.

2. **Incorrect Size/Item**: If you've received an item in the wrong size or the wrong it
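
Conceptually, a vector store embeds every document once, embeds the query at search time, and returns the documents whose vectors are closest to the query vector. The toy sketch below illustrates that idea with word counts instead of a real embedding model and cosine similarity as the distance measure. `ToyVectorStore`, `embed`, and the sample sentences are invented for this illustration. FAISS and `OllamaEmbeddings` work with dense vectors and far more efficient indexes.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a real model returns a dense vector)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self, texts):
        # Embed every document once, when the store is built.
        self.entries = [(text, embed(text)) for text in texts]

    def similarity_search(self, query, k=4):
        """Return up to `k` texts, most similar to the query first."""
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vector, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


toy_store = ToyVectorStore([
    "Refunds are issued to the original payment method.",
    "Contact support if the item arrived in the wrong size.",
    "Lost packages are investigated with the carrier.",
])
top_hits = toy_store.similarity_search("I received the wrong size", k=1)
```

Here `top_hits` contains the sentence about the wrong size, because it shares the most words with the query. A real embedding model can also match texts that share meaning but not exact words.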




## RAG pipeline
With all the pieces of the pipeline in place, we can create a chain that takes a question and answers it using the knowledge from the policies. In the previous notebook you learned how to assemble components into a pipeline using the pipe operator `|`. Here we are going to use [helper functions](https://python.langchain.com/docs/modules/chains) provided by LangChain to compose a more complex RAG chain.

- [`create_stuff_documents_chain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.combine_documents.stuff.create_stuff_documents_chain.html#langchain.chains.combine_documents.stuff.create_stuff_documents_chain): This chain takes a list of documents and formats them all into a prompt, then passes that prompt to an LLM. It passes ALL documents, so you should make sure they fit within the context window of the LLM you are using.
- [`create_retrieval_chain`](https://api.python.langchain.com/en/latest/chains/langchain.chains.retrieval.create_retrieval_chain.html#langchain.chains.retrieval.create_retrieval_chain): This chain takes in a user inquiry, which is then passed to the retriever to fetch relevant documents. Those documents (and original inputs) are then passed to an LLM to generate a response.

Your chat prompt template should contain the `{context}` and `{input}` fields. With that in place, you can combine the prompt and the LLM using `create_stuff_documents_chain`.

In [17]:
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.llms import Ollama

llm = Ollama(model="llama2")

prompt = ChatPromptTemplate.from_template("""
   Answer the question based on the provided context:
   <context>
   {context}
   </context>
                                          
   Question: 
   {input}
""")

document_chain = create_stuff_documents_chain(llm, prompt)
document_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), config={'run_name': 'format_inputs'})
| ChatPromptTemplate(input_variables=['context', 'input'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'input'], template='\n   Answer the question based on the provided context:\n   <context>\n   {context}\n   </context>\n                                          \n   Question: \n   {input}\n'))])
| Ollama()
| StrOutputParser(), config={'run_name': 'stuff_documents_chain'})
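
The "stuffing" step is easy to picture without any framework: all document texts are joined into one string, which fills the `{context}` slot of the prompt, while the question fills `{input}`. The sketch below is a plain-Python illustration of that formatting step only. `stuff_documents`, the blank-line separator, and the two sample policy sentences are assumptions made for the example, and no LLM is called.

```python
PROMPT_TEMPLATE = (
    "Answer the question based on the provided context:\n"
    "<context>\n{context}\n</context>\n\n"
    "Question:\n{input}"
)


def stuff_documents(texts, question, template=PROMPT_TEMPLATE, separator="\n\n"):
    """Join every document into one context string and fill the template."""
    context = separator.join(texts)
    return template.format(context=context, input=question)


# Made-up policy snippets, used only to show the resulting prompt.
stuffed_prompt = stuff_documents(
    ["Policy A: refunds take 5 days.", "Policy B: exchanges are free."],
    "How long do refunds take?",
)
```

Printing `stuffed_prompt` shows both snippets inside the `<context>` tags, which is why the total length of the retrieved documents has to stay within the model's context window.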

Next, we chain together the retriever (which is simply a wrapper around the vector store) and the document chain you created above. The result is a chain that retrieves relevant documents from the vector store and generates an answer for a given query.

In [18]:
from langchain.chains import create_retrieval_chain

retriever = vector_store.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)
retrieval_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['FAISS', 'OllamaEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x000001AA39632290>), config={'run_name': 'retrieve_documents'})
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), config={'run_name': 'format_inputs'})
            | ChatPromptTemplate(input_variables=['context', 'input'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'input'], template='\n   Answer the question based on the provided context:\n   <context>\n   {context}\n   </context>\n                                          \n   Question: \n   {input}\n'))])
            | Ollama()
            | StrOutputParser(), config={'run_name': 'stuff_documents_chain'})
  }), conf
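
The printed structure is hard to read, so here is a rough plain-Python picture of the data flow it describes: the `input` is sent to the retriever, the retrieved documents are stored under `context`, and the document chain's result is stored under `answer`. The names `make_retrieval_chain`, `fake_retrieve`, and `fake_answer` are invented stand-ins for this sketch. The real chain is built from LangChain runnables and calls an actual retriever and LLM.

```python
def make_retrieval_chain(retrieve, answer_from_docs):
    """Compose a retriever and a document chain into a single callable."""
    def run(inputs):
        # Step 1: use the question to fetch relevant documents.
        docs = retrieve(inputs["input"])
        # Step 2: pass the question and the documents to the document chain.
        answer = answer_from_docs({**inputs, "context": docs})
        # Step 3: return the original input together with context and answer.
        return {**inputs, "context": docs, "answer": answer}
    return run


def fake_retrieve(query):
    """Stand-in retriever that always returns the same single document."""
    return ["Damaged items must be reported within 48 hours."]


def fake_answer(inputs):
    """Stand-in for the LLM step: just echoes the first retrieved document."""
    return f"Based on {len(inputs['context'])} document(s): {inputs['context'][0]}"


toy_chain = make_retrieval_chain(fake_retrieve, fake_answer)
toy_response = toy_chain({"input": "My item arrived damaged."})
```

The shape of `toy_response` is the point of the sketch: a dictionary with the keys `input`, `context`, and `answer`.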

## Running the chain
The final chain implements the Runnable interface as well, so all you need to do is provide your question as the `input`. The generated text is returned under the `answer` key of the response.

Some of the questions you can ask: 
- Accepted methods of payments
- Customer was charged twice
- Package was lost
- Order cancellation
- Item arrived damaged

In [19]:
response = retrieval_chain.invoke({"input": "I got charged too much for the item."})
print(response["answer"])

OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.57s/it]


Sure, here is an example of how you could answer the customer's question:

Thank you for reaching out to us regarding the charge on your order. We apologize for any inconvenience this may have caused and would be happy to assist you in resolving the issue.

To investigate this matter further, could you please provide us with some more details? For example, what is the item that was charged too much, and how much did you pay compared to the actual price of the item? Additionally, do you have any receipts or proof of purchase that could help us identify the issue more quickly?

Once we have this information, we can work on providing a resolution as soon as possible. In the meantime, please feel free to contact us again if you have any questions or concerns. We appreciate your patience and understanding in this matter.


## Further work
- Check whether the LLM is willing to give away your company secrets: ask it to tell you something confidential.
- Try using a system prompt, as in the intro notebook, to keep the model from going astray and restrict it to allowed actions (see `ChatPromptTemplate.from_messages`).
- To further improve the pipeline, you can implement a [memory mechanism](https://python.langchain.com/docs/use_cases/question_answering/chat_history) that holds the previous conversation so you can ask follow-up questions!

In [20]:
response = retrieval_chain.invoke({"input": "What is your policy concerning damaged items? Provide it in detail."})
print(response["answer"])

OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.72s/it]


Answer:
At our company, we take great care to ensure that our products reach our customers in the best possible condition. However, despite our efforts, damages may occur during transit or handling. In such cases, we have a clear policy in place to handle these situations.

1. **Damaged Items**: If your item is damaged upon delivery, please contact us immediately through our official customer support channels. Please ensure that you inspect the item within 48 hours of delivery and notify us promptly if any damage is observed.
2. **Verification Process**: Our Customer Support Team will ask for photographs of the damaged item from you to verify the damage. Please provide clear images of the damage, including any defective packaging or labeling issues.
3. **Replacement or Refund**: Depending on the extent of the damage, we will offer you a replacement or a full refund of your order. If the damage is minor and does not affect the item's performance or functionality, we may opt for a repair