## Introduction to LangChain Retrievers

A retriever is an interface that returns documents based on an unstructured query input from a user. 

It is more general than a vector store and does not need to store documents, only to retrieve them. Retrievers can be used to create retrieval chains that retrieve documents and then pass them on.

Vector stores can serve as the backbone of a retriever, but other types of retrievers exist as well. For our use case, we will use a vector-store-based retriever.
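To make the idea of "retrieve, but not necessarily store" concrete, here is a minimal sketch of a retriever that has no vector store behind it. The `Document` dataclass and `KeywordRetriever` class are hypothetical stand-ins written for this illustration, not LangChain classes; they only mimic the shape of the interface (a query string goes in, a list of documents comes out).

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a document: text plus optional metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class KeywordRetriever:
    """Toy retriever: ranks documents by how many query words they contain."""

    def __init__(self, documents, k=2):
        self.documents = documents
        self.k = k

    def invoke(self, query):
        query_words = set(query.lower().split())

        def overlap(doc):
            return len(query_words & set(doc.page_content.lower().split()))

        # Keep only documents sharing at least one word, best match first.
        matches = [doc for doc in self.documents if overlap(doc) > 0]
        matches.sort(key=overlap, reverse=True)
        return matches[: self.k]


docs = [
    Document("multicast vpn requirements and deployment"),
    Document("simplified bsd license text"),
    Document("vpn customers and service providers"),
]
toy_retriever = KeywordRetriever(docs, k=2)
results = toy_retriever.invoke("multicast vpn")
print([doc.page_content for doc in results])
```

Nothing here embeds or indexes anything, yet the object still behaves like a retriever, which is why the retriever interface is more general than a vector store.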

The different types of LangChain retrievers are listed here: [LangChain Retrievers](https://python.langchain.com/docs/modules/data_connection/retrievers)



In [2]:
#! pip install faiss-cpu
import os, getpass
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.text_splitter import CharacterTextSplitter

os.environ['OPENAI_API_KEY'] = getpass.getpass()


web_document = WebBaseLoader("https://www.rfc-editor.org/rfc/rfc6517.txt").load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
web_document_chunks = text_splitter.split_documents(web_document)
faiss_db = FAISS.from_documents(web_document_chunks, OpenAIEmbeddings())



Created a chunk of size 2262, which is longer than the specified 1000
Created a chunk of size 1080, which is longer than the specified 1000
Created a chunk of size 1031, which is longer than the specified 1000
Created a chunk of size 1047, which is longer than the specified 1000
Created a chunk of size 2554, which is longer than the specified 1000
Created a chunk of size 3139, which is longer than the specified 1000
Created a chunk of size 2846, which is longer than the specified 1000


In the cell above, we load a document from the web, split it into chunks, and index the chunks in a vector store. We covered these steps in detail in the previous Jupyter notebook, "Introduction to Vectorstores".

The only difference here is that we use another very popular vector store named FAISS.
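The "Created a chunk of size ..." messages in the cell output come from the way a separator-based splitter works: it cuts the text at a separator (a blank line by default for `CharacterTextSplitter`) and merges the pieces up to `chunk_size`, so a single piece that contains no separator can end up longer than the limit. The sketch below is a simplified illustration of that behavior; `split_on_separator` is a hypothetical helper, not the library's implementation.

```python
def split_on_separator(text, chunk_size, separator="\n\n"):
    """Toy splitter: split on a separator, then merge pieces up to chunk_size."""
    chunks, current = [], ""
    for piece in text.split(separator):
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single piece with no separator inside cannot be cut further here,
        # so it is kept whole even when it exceeds chunk_size.
        current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "short paragraph\n\n" + "x" * 50 + "\n\nanother short one"
chunks = split_on_separator(sample, chunk_size=30)
print([len(chunk) for chunk in chunks])
```

The middle chunk is 50 characters long even though the limit is 30. If stricter chunk sizes matter, a splitter that falls back to finer separators, such as LangChain's `RecursiveCharacterTextSplitter`, is one option to consider.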

With these steps completed, creating and using a retriever takes just two steps: call `as_retriever()` on the vector store, then invoke the retriever with a query.

In [3]:
retriever = faiss_db.as_retriever()
query = "What is the document about?"

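# Uncomment to fetch the chunks most similar to the query:
# retriever.invoke(query)

# Optional alternative: MultiQueryRetriever uses an LLM to rephrase the query
# several ways and merges the documents retrieved for each variant.
# from langchain.retrievers import MultiQueryRetriever
# from langchain_openai import ChatOpenAI

# chat_model = ChatOpenAI()
# multi_query_retriever = MultiQueryRetriever.from_llm(retriever=faiss_db.as_retriever(), llm=chat_model)
# docs = multi_query_retriever.invoke(query)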


One thing to note is that a retriever does not produce a response from an LLM. Its job is to retrieve the most relevant documents from a data source for the user's query.
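As a rough illustration of what "most relevant" means for a vector-store-backed retriever, the sketch below ranks texts by cosine similarity between a query vector and each stored vector. Everything in it is an assumption made for illustration: `embed` is a toy bag-of-words embedding (real embedding models produce dense vectors), and `ToyVectorStore` is a hypothetical class, not FAISS.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self, texts):
        # Embed every text once, at indexing time.
        self.entries = [(text, embed(text)) for text in texts]

    def similarity_search(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda entry: cosine(query_vec, entry[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


texts = [
    "multicast vpn deployment scenarios",
    "license and copyright notice",
    "vpn service provider requirements",
]
store = ToyVectorStore(texts)
top = store.similarity_search("multicast vpn scenarios", k=2)
print(top)
```

The result is a list of stored texts, not an answer: turning those texts into a response is the job of the LLM further down the chain.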

To generate an LLM response to a user query, we construct a prompt and pass it, along with the documents returned by the retriever, to an LLM. We do this by creating a "retrieval chain".

The entire flow, from ingesting a document, splitting it, and vectorizing it to passing the relevant data to an LLM, is known as "RAG - Retrieval Augmented Generation".

We have already seen in the "quick_introduction" notebook that LangChain uses LCEL (LangChain Expression Language) to create a chain.

chain = prompt | llm | output_parser

LCEL works well for simple chains like the one above. For retrieval chains, which add an extra piece (a retriever), LangChain offers higher-level constructor functions that simplify creating them. Under the hood, though, these functions still construct a chain with LCEL.
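The `|` syntax can look like magic at first, so here is a minimal sketch of how pipe-style composition can be built in plain Python by overloading `__or__`. The `Step` class and the three toy steps are hypothetical and written only for this illustration; LCEL's real runnables do much more (batching, streaming, async), but the left-to-right data flow is the same idea.

```python
class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for a prompt template, a chat model, and an output parser.
toy_prompt = Step(lambda inputs: f"Question: {inputs['input']}")
toy_llm = Step(lambda text: {"content": f"(model reply to) {text}"})
toy_parser = Step(lambda message: message["content"])

toy_chain = toy_prompt | toy_llm | toy_parser
result = toy_chain.invoke({"input": "What is RAG?"})
print(result)
```

The order matters: the prompt must come first because its output is what the model consumes, and the parser comes last because it consumes the model's output.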

For RAG systems, LangChain offers the `create_retrieval_chain` constructor. The resulting chain takes a user inquiry and passes it to the retriever to fetch relevant documents. Those documents (and the original inputs) are then passed to an LLM to generate a response.
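The flow just described can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not LangChain's implementation: `make_toy_retrieval_chain`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the fake functions stand in for a real retriever and a real LLM call.

```python
def make_toy_retrieval_chain(retrieve, generate):
    """Build a function that retrieves first, then generates from the results."""

    def invoke(inputs):
        documents = retrieve(inputs["input"])
        context = "\n\n".join(documents)  # "stuff" all documents into one string
        answer = generate(context=context, question=inputs["input"])
        # Keep the original inputs and add the retrieved context and the answer.
        return {**inputs, "context": documents, "answer": answer}

    return invoke


def fake_retrieve(query):
    """Stand-in retriever: always returns the same two chunks."""
    return ["chunk one about multicast VPNs", "chunk two about deployment"]


def fake_generate(context, question):
    """Stand-in LLM: reports how much context it was given."""
    return f"Answer to '{question}' drawn from {len(context.split())} context words"


toy_rag = make_toy_retrieval_chain(fake_retrieve, fake_generate)
toy_response = toy_rag({"input": "what is this about?"})
print(toy_response["answer"])
```

The returned dictionary carries the original `input` alongside `context` and `answer`, which mirrors why the notebook reads the final text from `response['answer']`.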

In [4]:
from langchain_openai import ChatOpenAI
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate


chat_model = ChatOpenAI()

prompt = ChatPromptTemplate.from_template(
    """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""  )

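# Another way to build the same prompt, from separate system and human messages:
# prompt = ChatPromptTemplate.from_messages([
#     SystemMessagePromptTemplate.from_template(
#         """Answer the following user question based only on the provided context:
# <context>
# {context}
# </context> """),
#     HumanMessagePromptTemplate.from_template("Question: {input}")])  # plain string, not an f-string

# Plain LCEL chain: the prompt's output feeds the model.
chat_model_chain = prompt | chat_model
# Stuffs the retrieved documents into the prompt's {context} slot and calls the model.
document_chain = create_stuff_documents_chain(chat_model, prompt)

# Retrieves documents for "input", then runs document_chain on them.
retrieval_chain = create_retrieval_chain(retriever, document_chain)

response = retrieval_chain.invoke({"input": "what document is this about?"})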

print(response['answer'])






This document is about multicast VPN proposals and implementations.


## A Conversational Retrieval Chain

The chain we created above can answer user questions related to the context of the data source provided. However, the chain does not track any history of prior conversations or questions.

Every question is treated as a new question and is answered without any context or knowledge of conversations that happened earlier with the user. This is a problem when you are trying to build a chatbot app. A chatbot needs the history of the previous conversation when answering a new user question in order to give accurate results.

The key is to save the user and LLM chat history as a variable in our prompt. The new chain then takes the most recent input (`input`) and the conversation history (`chat_history`) and passes both to the LLM.
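A minimal sketch of the bookkeeping involved, assuming a hypothetical `answer_with_history` function in place of a real LLM call: after every turn, both the user's question and the model's answer are appended to the history, so the next call can refer back to them.

```python
def answer_with_history(question, history):
    """Hypothetical stand-in for an LLM call that can see earlier turns."""
    previous_questions = [text for role, text in history if role == "human"]
    if "last question" in question and previous_questions:
        return f"Your last question was: {previous_questions[-1]}"
    return f"(answer to) {question}"


toy_history = []  # list of (role, text) pairs, oldest first


def ask(question):
    answer = answer_with_history(question, toy_history)
    # Record both sides of the turn so the next call can refer back to them.
    toy_history.append(("human", question))
    toy_history.append(("ai", answer))
    return answer


first = ask("what is the document about?")
second = ask("do you remember my last question?")
print(second)
```

Storing only the question and the answer text, rather than the whole response object, keeps the history short and readable as the conversation grows.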

In [5]:
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage, BaseMessage
from langchain_core.output_parsers import StrOutputParser
from typing import List

conversational_prompt = ChatPromptTemplate.from_messages([
    SystemMessagePromptTemplate.from_template(
        """Answer the following user question based only on the provided context:
<context>
{context}
</context> """),
    # Earlier turns are injected here, before the latest question.
    MessagesPlaceholder(variable_name="chat_history"),
    # Plain string, not an f-string, so {input} stays a template variable.
    HumanMessagePromptTemplate.from_template("Question: {input}"),
])

document_chain = create_stuff_documents_chain(chat_model, conversational_prompt)

chat_history: List[BaseMessage] = []

retrieval_chain = create_retrieval_chain(retriever, document_chain)

response = retrieval_chain.invoke({
    "input": "what is the document about?",
    "chat_history":chat_history})

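# Note: str(response) stores the whole response dict (input, context, answer) as the AI turn;
# storing only response["answer"] would keep the history compact.
chat_history = [AIMessage(content=str(response))]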

retrieval_chain.invoke({
    "input": "do you remember my last question?",
    "chat_history":chat_history})




{'input': 'do you remember my last question?',
 'chat_history': [AIMessage(content='{\'input\': \'what is the document about?\', \'chat_history\': [], \'context\': [Document(page_content="This document is subject to BCP 78 and the IETF Trust\'s Legal\\n   Provisions Relating to IETF Documents\\n   (http://trustee.ietf.org/license-info) in effect on the date of\\n   publication of this document.  Please review these documents\\n   carefully, as they describe your rights and restrictions with respect\\n   to this document.  Code Components extracted from this document must\\n   include Simplified BSD License text as described in Section 4.e of\\n   the Trust Legal Provisions and are provided without warranty as\\n   described in the Simplified BSD License.\\n\\nTable of Contents", metadata={\'source\': \'https://www.rfc-editor.org/rfc/rfc6517.txt\'}), Document(page_content=\'The aim of this document is to leverage the already expressed\\n   requirements [RFC4834] and study the properties