# Multi Query Retriever
https://python.langchain.com/v0.1/docs/modules/data_connection/retrievers/

https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html

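1. Create the LLM (Cohere)
2. Create the vector store to be used as the base retriever
3. Create the Multi Query Retriever (MQR)
4. Check out the behavior of the MQR
5. Create the retrieval chain with the MQR
6. Test the chain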

#### When to use?
* The chunks are small and several of them overlap in meaning
* Retrieval results are sensitive to how the query is worded
* End-user-facing chatbots, where questions tend to be vague or loosely phrased
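
As a rough mental model, a multi-query retriever asks the LLM for several rewordings of the question, runs each rewording against the base retriever, and returns the de-duplicated union of the hits. The sketch below mimics that flow in plain Python. `generate_variants`, `keyword_retriever`, and the toy corpus are hypothetical stand-ins for the LLM and the vector store; they are not LangChain APIs.

```python
from typing import Callable, List, Set

# Hypothetical toy corpus standing in for the chunks held in a vector store
CORPUS = [
    "RAG fetches relevant chunks before the model answers.",
    "Retrieval augmented generation grounds answers in external documents.",
    "Fine tuning updates model weights on task-specific data.",
]


def tokens(text: str) -> Set[str]:
    """Lower-cased words longer than two characters, punctuation stripped."""
    cleaned = (word.strip("?.,").lower() for word in text.split())
    return {word for word in cleaned if len(word) > 2}


def generate_variants(question: str) -> List[str]:
    """Stand-in for the LLM call that rewords the question."""
    return [
        question,
        question.replace("RAG", "retrieval augmented generation"),
        question.lower(),
    ]


def keyword_retriever(query: str) -> List[str]:
    """Stand-in for a vector search: chunks that share a word with the query."""
    return [chunk for chunk in CORPUS if tokens(query) & tokens(chunk)]


def multi_query_retrieve(
    question: str,
    rewrite: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Run every reworded query and merge the hits, keeping first-seen order."""
    seen, merged = set(), []
    for query in rewrite(question):
        for chunk in retrieve(query):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


hits = multi_query_retrieve("What is RAG?", generate_variants, keyword_retriever)
for hit in hits:
    print(hit)
```

In this toy setup the original wording only matches the first chunk, while the expanded wording also pulls in the second one, which is the effect the MQR is meant to produce on real embeddings.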


## 1. Create LLM

In [1]:
from dotenv import load_dotenv
import sys
import json

from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_community.document_loaders import DirectoryLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings.sentence_transformer import SentenceTransformerEmbeddings
import logging

from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

from langchain.prompts import PromptTemplate

# Load the .env file that contains the API keys for the LLM providers
load_dotenv('C:\\Users\\raj\\.jupyter\\.env')

# setting path
sys.path.append('../')

from utils.create_chat_llm import create_gpt_chat_llm, create_cohere_chat_llm

# Use the Cohere chat model; swap in create_gpt_chat_llm() to try GPT
llm = create_cohere_chat_llm()

## 2. Create the vector store and text splitter

In [2]:
# Create the Chroma vector store
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vector_store = Chroma(collection_name="full_documents", embedding_function=embedding_function) 

# Load the docs
loader = DirectoryLoader('./util', glob="**/*.txt")
docs = loader.load()

# Smaller chunks stored in the vector DB
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=20)
chunked_documents = text_splitter.split_documents(docs)

# Add the documents to vector store
vector_store.add_documents(chunked_documents)

print(vector_store)

<langchain_community.vectorstores.chroma.Chroma object at 0x0000023D38A45150>
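
To get a feel for what `chunk_size=300` and `chunk_overlap=20` mean, here is a deliberately simplified fixed-width chunker. It is an assumption-laden illustration only: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs and line breaks, so its real chunk boundaries will differ. `sliding_chunks` is a hypothetical helper, not a LangChain function.

```python
from typing import List


def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into fixed-width windows that share chunk_overlap characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if not text:
        return []

    step = chunk_size - chunk_overlap
    chunks, start = [], 0
    while True:
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


# 700 characters of synthetic text, chunked with the same settings as above
sample = "".join(chr(ord("a") + i % 26) for i in range(700))
chunks = sliding_chunks(sample, chunk_size=300, chunk_overlap=20)

for chunk in chunks:
    print(len(chunk), chunk[:10], "...")
```

Each chunk after the first starts with the last 20 characters of the previous one, so a sentence cut at a boundary still appears whole in at least part of a neighbouring chunk.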


## 3. Create the MultiQueryRetriever

https://python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever

In [3]:
# Create the retriever
retriever = vector_store.as_retriever()

# Create the MQR object
multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=retriever,
    llm = llm
)

# Log the queries generated by the LLM (they are emitted at INFO level)
logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)


## 4. Check out the behavior of MQR

In [4]:
# Test questions (named `questions` to avoid shadowing the built-in input())
questions = ["What is RAG?",
             "How is fine tuning different than RAG?",
             "What data is used to train ChatGPT?",
             "What are the benefits of generative AI?"]

# Change index to select the question
ndx = 2

print("Question :", questions[ndx])

# The MQR generates query variants, retrieves for each, and merges the results
results = multi_query_retriever.invoke(questions[ndx])
print(results)

Question : What data is used to train ChatGPT?


INFO:langchain.retrievers.multi_query:Generated queries: ["What kinds of data were used to train ChatGPT's language model? ", '', "Can you provide details on the datasets used to train ChatGPT's AI? ", '', 'Are there any specific data requirements for training large language models like ChatGPT, and if so, what are they?']


[Document(page_content='Title How ChatGPT and our language models are developed\n\nSource\n\nhttps://help.openai.com/en/articles/7842364\n\nhow\n\nchatgpt\n\nand\n\nour\n\nlanguage\n\nmodels\n\nare\n\ndeveloped', metadata={'source': 'util\\chatgpt-how-it-is-developed.txt'}), Document(page_content='As mentioned in the previous section, ChatGPT does not copy or store training information in a database. Instead, it learns about associations between words, and those learnings help the model update its numbers/weights. The model then uses those weights to predict and generate new words in', metadata={'source': 'util\\chatgpt-how-it-is-developed.txt'}), Document(page_content='How does the development of ChatGPT comply with privacy laws? We use training information lawfully. Large language models have many applications that provide significant benefits and are already helping people create content, improve customer service, develop software, customize education, support', metadata={'source': 
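
The log line above shows empty strings among the generated queries, most likely because the LLM separated its rewordings with blank lines. Those empty entries add nothing to retrieval. A minimal sketch of a line parser that drops blanks and duplicates is below. `parse_queries` is a hypothetical helper, not part of LangChain, and `raw_reply` is reconstructed from the logged queries rather than captured from the LLM; how such a parser would be wired into `MultiQueryRetriever` depends on the installed version.

```python
from typing import List


def parse_queries(llm_text: str) -> List[str]:
    """Split an LLM reply into one query per line, dropping blanks and duplicates."""
    queries: List[str] = []
    for line in llm_text.splitlines():
        query = line.strip()
        if query and query not in queries:
            queries.append(query)
    return queries


# Reply reconstructed from the logged queries (an assumption about the raw text)
raw_reply = (
    "What kinds of data were used to train ChatGPT's language model? \n"
    "\n"
    "Can you provide details on the datasets used to train ChatGPT's AI? \n"
    "\n"
    "Are there any specific data requirements for training large language "
    "models like ChatGPT, and if so, what are they?"
)

queries = parse_queries(raw_reply)
for query in queries:
    print(repr(query))
```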

## 5. Create the retrieval chain with MQR

In [5]:

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),

        ("human", "{input}"),
    ]
)

# Create Q&A chain
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

# Create the chain with MQR
rag_chain = create_retrieval_chain(multi_query_retriever, question_answer_chain)

## 6. Test the chain's performance

In [6]:
# A deliberately vague question (named `question` to avoid shadowing input())
question = "retrieval for context"

response = rag_chain.invoke({"input": question})

response['answer']

INFO:langchain.retrievers.multi_query:Generated queries: ['What information can be retrieved about the concept of "retrieval" within a specific context?', '', 'What are the relevant documents that discuss retrieval in a particular setting or scenario?', '', 'Are there any studies or reports that examine the process, methods, or applications of retrieval in a defined context?']


'RAG, or Retrieval Augmented Generation, is a process where AI searches for relevant information to answer a query, using a source of data, rather than just relying on its training.'
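
Besides `answer`, the dictionary returned by a chain built with `create_retrieval_chain` also carries the retrieved documents under `context`, which helps when checking where an answer came from. The sketch below lists the unique sources of those documents. `list_sources` and `FakeDocument` are hypothetical helpers, and the fake response only imitates the shape of the real output so that the example runs without an LLM.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document (page_content + metadata)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def list_sources(response: Dict[str, Any]) -> List[str]:
    """Return the unique 'source' values of the retrieved context, in order."""
    sources: List[str] = []
    for doc in response.get("context", []):
        source = doc.metadata.get("source", "unknown")
        if source not in sources:
            sources.append(source)
    return sources


# Fake response shaped like the chain output; texts and paths are illustrative
fake_response = {
    "input": "retrieval for context",
    "context": [
        FakeDocument("chunk one", {"source": "util/a.txt"}),
        FakeDocument("chunk two", {"source": "util/a.txt"}),
        FakeDocument("chunk three", {"source": "util/b.txt"}),
    ],
    "answer": "An illustrative answer.",
}

print(list_sources(fake_response))
```

With the real chain, `list_sources(response)` can be called on the `response` from the cell above.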