### Query Transformation
The main idea behind query transformation is to translate or transform the user query so that the retriever can find relevant documents and the LLM can answer the question correctly. For instance, if the user asks an ambiguous question, the RAG retriever may return documents whose embeddings are close to the query but that are not actually relevant to it, leading the LLM to hallucinate an answer. There are a few ways to tackle this problem, including:

Step-back prompting: This involves encouraging the LLM to take a step back from a given question or problem and pose a more abstract, higher-level question that encompasses the essence of the original inquiry.

Least-to-most prompting: This breaks a complex problem down into a series of simpler subproblems and then solves them in sequence. Solving each subproblem is facilitated by the answers to the previously solved subproblems.

Query re-writing (Multi-Query or RAG Fusion): This generates multiple versions of the original question with different wording and perspectives, retrieves documents for each version by similarity search against the vector store, and uses the combined results to answer the original question.
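The first two techniques can be sketched in plain Python. This is only a minimal illustration, not a reference implementation: `llm` is a hypothetical callable that takes a prompt string and returns the completion as a string, and the prompt wording is my own assumption.

```python
from typing import Callable, List

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def step_back_question(question: str, llm: LLM) -> str:
    """Ask the LLM for a more abstract, higher-level version of the question."""
    prompt = (
        "Rewrite the following question as a more general, higher-level question "
        "that captures its underlying concept.\n"
        f"Question: {question}\n"
        "Step-back question:"
    )
    return llm(prompt).strip()


def least_to_most(question: str, llm: LLM) -> str:
    """Decompose the question, then solve the sub-questions in sequence."""
    decomposition = llm(
        "Break the following question into simpler sub-questions, "
        "one per line, ordered from easiest to hardest.\n"
        f"Question: {question}"
    )
    sub_questions = [line.strip() for line in decomposition.splitlines() if line.strip()]

    solved: List[str] = []
    for sub in sub_questions:
        # Earlier question/answer pairs are prepended so each step can build on them.
        history = "\n".join(solved)
        answer = llm(f"{history}\nQ: {sub}\nA:").strip()
        solved.append(f"Q: {sub}\nA: {answer}")

    # Finally, answer the original question with all sub-answers as context.
    history = "\n".join(solved)
    return llm(f"{history}\nQ: {question}\nA:").strip()
```

In a RAG setting, the step-back question (or each sub-question) would typically be sent to the retriever in addition to, or instead of, the original query.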

A blog post about query transformations by LangChain can be found [here](https://blog.langchain.dev/query-transformations/).

In [4]:
#@ Importing libraries
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_community.document_loaders import PyPDFLoader
from langchain_core.runnables import RunnablePassthrough
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [11]:
#@ load the document
loader = PyPDFLoader("../Introduction/introduction-to-natural-language-processing.pdf")
docs = loader.load()

#@ split into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 20)
text_chunks = text_splitter.split_documents(docs)

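#@ create a vector store
#@ NOTE: `embeddings` must already be defined when this cell runs
#@ (it is the OllamaEmbeddings instance created in the model setup cell below)
vector_store = Chroma.from_documents(documents=text_chunks,
                                     embedding=embeddings,
                                     persist_directory="data/vectorstore"
                                     )

#@ newer Chroma versions persist automatically when persist_directory is set,
#@ so this call may only emit a deprecation warning
vector_store.persist()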


In [12]:
#@ Retrieve
retriever = vector_store.as_retriever(search_kwargs={"k": 5})

### Query Translation
- Multi-Query
In the multi-query approach, we first use an LLM (here I'm using ```llama3```) to generate 5 different questions based on the original question. To do that, we create a prompt and wrap it in a ```PromptTemplate```. Then we build a chain with LCEL that reads the user input, assigns it to the question placeholder of the prompt, sends the prompt to the LLM, and parses the output into the 5 questions by splitting it on newline characters.

In [16]:
# Using a local Llama model through Ollama instead of an OpenAI model
from langchain_core.prompts import PromptTemplate
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings

MODEL = "llama3"

model = Ollama(model=MODEL)
embeddings = OllamaEmbeddings(model=MODEL)

In [18]:
template = """You are an intelligent assistant. Your task is to generate 5 questions based on the provided question in different wording and different perspectives to retrieve relevant
 documents from a vector database. By generating multiple perspectives on the user question, your goal is to help the user overcome some of the limitations of the distance-based similarity 
 search. Provide these alternative questions separated by newlines. Original question: {question}
"""
# from_template infers the input variable ({question}) from the template string
prompt = PromptTemplate.from_template(template)

In [19]:
generate_queries = (
    {"question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
    | (lambda x: x.split("\n"))
)

We can check whether or not our query generation works by invoking the created chain with a query.

In [20]:
generate_queries.invoke("What are the brief history of Natural language processing?")

['Here are five alternative questions that capture different perspectives and wording:',
 '',
 'What is the evolution of human-computer interaction through natural language processing?',
 '',
 'How did NLP develop from its early beginnings to its current applications in text analysis and generation?',
 '',
 'Can you provide an overview of the milestones and advancements in the field of computational linguistics, particularly as it relates to natural language processing?',
 '',
 'What are some key events or breakthroughs that have shaped the course of natural language processing research over the years?',
 '',
 'How do historical developments in areas like machine learning, artificial intelligence, and computer science intersect with the emergence of natural language processing as a distinct field?']
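The list above also contains empty strings and the model's introductory sentence, and each of them would be sent to the retriever as if it were a query. A stricter parser can filter them out. The sketch below is only one possible heuristic: `parse_queries` is a hypothetical helper, and it assumes that every generated question ends with a question mark.

```python
import re
from typing import List


def parse_queries(raw: str, expected: int = 5) -> List[str]:
    """Extract the generated questions from the raw LLM output."""
    # Drop surrounding whitespace and blank lines.
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    # Remove list markers such as "1. ", "2) ", "- " or "* ".
    cleaned = [re.sub(r"^(?:\d+[.)]|[-*•])\s+", "", line) for line in lines]
    # Heuristic (assumption): real questions end with "?", preamble lines do not.
    questions = [line for line in cleaned if line.endswith("?")]
    return questions[:expected]
```

It could replace the `lambda` at the end of `generate_queries`, since LCEL accepts a plain function as a chain step.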

Once we have the 5 questions, we retrieve the 5 most relevant documents for each question in parallel (resulting in a list of lists) and build a new document list from the unique documents in the union of all retrieved documents. To do that, we create another chain, retrieval_chain, using LCEL.

In [21]:
from langchain.load import loads, dumps
from typing import List

In [23]:
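def get_context_union(docs: List[List]):
    """Flatten the per-question result lists and return the unique page contents."""
    # Serialize each Document to a string so that it can be hashed and de-duplicated
    all_docs = [dumps(d) for doc in docs for d in doc]
    # dict.fromkeys drops duplicates while keeping first-seen order (set() does not keep order)
    unique_docs = list(dict.fromkeys(all_docs))

    return [loads(doc).page_content for doc in unique_docs] # We only return page contents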


# generate_queries already maps its input to the "question" key,
# so it can start the chain directly without a second wrapper
retrieval_chain = (
    generate_queries
    | retriever.map()
    | get_context_union
)
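Taking the union treats every retrieved document equally. The RAG Fusion variant mentioned earlier instead re-ranks the documents, usually with reciprocal rank fusion (RRF), so that documents ranked highly for several questions come first. Here is a minimal sketch on plain strings. The function name is hypothetical, and `k = 60` is a commonly used smoothing constant rather than a value taken from this notebook.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(ranked_lists: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked result lists into one list, best document first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            # A document earns 1 / (k + rank) from every list it appears in.
            scores[doc] += 1.0 / (k + rank)
    # Sort by the accumulated score, highest first.
    return sorted(scores, key=scores.get, reverse=True)
```

To use it in place of `get_context_union`, the retrieved documents would first have to be mapped to a hashable key such as their serialized form or page content.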
    

In [24]:
retrieval_chain.invoke("What are the brief history of Natural language processing?")


["CO3354 Introduction to natural language processing\nCreate a variable soysents containing all sentences from reports concerning soy\nproducts.\nreuters.categories()\nPick out categories relating to soy:\nsoysents = reuters.sents(categories=['soy-meal', ...])\nDisplay the ﬁrst ten sentences in soysents .\nprint soysents[:10]\nCreate a variable metalwords containing all words from reports concerning\nmetals.\nmetalwords = reuters.words(categories = ['alum','copper','gold', ...])",
 'CNJ (2324) PRO (2243) ‘,’ (1913) ‘.’ (1892) ADV (1485) NP (1224) VN (952)\n. . .\n42',
 'Longmans and Chambers, and the British Library.\nCOBUILD (Bank of English) The Bank of EnglishTMforms part of the Collins\nCorpus, developed by Collins Dictionaries and the University of Birmingham,\nand contains 650 million words.\nGutenberg An archive of free electronic books in various formats hosted at\nhttp://www.gutenberg.org /\nPenn Treebank A corpus of reports from the Wall Street Journal and other sources\nin v

Finally, we put everything together by creating one final chain that reads the user query, gets the context from the documents retrieved for the 5 generated questions using retrieval_chain, adds both the question and the context to the prompt, sends it through the LLM, and formats the final output using the StrOutputParser.

In [25]:
template = """
    Answer the given question using the provided context. If you don't know the answer, just reply "I don't know".
    Context: {context} 
    Question: {question}

"""
# from_template infers the input variables ({context}, {question}) from the template string
prompt = PromptTemplate.from_template(template)

In [26]:
multi_chain_query = (
    {'context': retrieval_chain, 'question': RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
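Note that retrieval_chain returns a Python list of strings, so the `{context}` placeholder is filled with the printed form of that list, brackets and escaped newlines included. Joining the chunks into a single block of text first may give the LLM a cleaner prompt. A small sketch follows. `format_context` and its separator are hypothetical choices of mine.

```python
from typing import List


def format_context(chunks: List[str], separator: str = "\n\n---\n\n") -> str:
    """Join the retrieved chunks into one text block, skipping empty chunks."""
    return separator.join(chunk.strip() for chunk in chunks if chunk.strip())
```

In the chain above, this would be used as `'context': retrieval_chain | format_context`.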

In [27]:
multi_chain_query.invoke("What are the brief history of Natural language processing?")

"I don't know. The provided context does not mention the brief history of Natural Language Processing (NLP). It seems to focus on text analysis, corpora, and related topics. If you're looking for information on the history of NLP, I can try to provide a general overview or suggest some resources where you might find this information."