# Chapter 3: RAG Part II: Chatting with your Data
## Query transformation

One of the major problems with a basic RAG system is that it relies too heavily on the quality of the user’s query to generate an accurate output. In a production setting, users often write queries that are incomplete, ambiguous, or poorly worded, which can lead the model to hallucinate.

_Query transformation_ is a family of strategies that modify the user’s input in order to answer the first RAG problem question: _How do we handle the variability in the quality of a user’s input?_

### Hypothetical Document Embeddings (HyDE)

_Hypothetical Document Embeddings_ (HyDE) is a strategy that involves creating a hypothetical document based on the user’s query, embedding that document, and retrieving relevant documents based on vector similarity. The intuition behind HyDE is that an LLM-generated hypothetical document will be more similar to the most relevant documents than the original query is.
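
The intuition can be illustrated without any model at all. The following is a minimal sketch, assuming a toy word-count "embedding" in place of a real embedding model; the stored chunk, the query, and the hypothetical document are invented for illustration, and `embed` and `cosine_similarity` are hypothetical helper names. It compares how close a short query and a longer hypothetical answer each sit to a stored chunk:

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# A chunk as it might sit in the vector store (invented for illustration).
stored_chunk = "The mariner shot the albatross with his crossbow and the crew blamed him."

# A short user query, and a hypothetical answer an LLM might draft for it.
query = "What did the mariner do?"
hypothetical_doc = "The mariner shot the albatross with a crossbow, and the crew blamed him for it."

query_score = cosine_similarity(embed(query), embed(stored_chunk))
hyde_score = cosine_similarity(embed(hypothetical_doc), embed(stored_chunk))

print(f"query vs. chunk:            {query_score:.2f}")
print(f"hypothetical doc vs. chunk: {hyde_score:.2f}")
```

Word counts exaggerate the effect, since they reward shared vocabulary only. A real embedding model captures meaning as well, but the same tendency applies: a passage written in the style of an answer tends to land closer to answer-bearing chunks than a terse question does.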

1. Set up the vector store

**NOTE:** Launch a new pgvector Docker container before running this notebook by executing `docker compose up -d` in the terminal.

In [1]:
from langchain_community.document_loaders import TextLoader
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_postgres.vectorstores import PGVector
from dotenv import load_dotenv
import os

load_dotenv()

# load the document, split it into chunks
raw_documents = TextLoader("./rime.txt").load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(raw_documents)

# define embedding model
hf_embedding = HuggingFaceEmbeddings(
    model="sentence-transformers/all-mpnet-base-v2", # use this model to perform the embedding
    model_kwargs={"device": "cpu"},
    encode_kwargs={"normalize_embeddings": False},
)

# vector store credentials
connection_credentials = f"postgresql+psycopg://{os.getenv('POSTGRES_USER')}:{os.getenv('POSTGRES_PASSWORD')}@localhost:8888/{os.getenv('POSTGRES_DB')}"

# embed each chunk and insert it into the vector store
db = PGVector.from_documents(documents=documents, embedding=hf_embedding, connection=connection_credentials)


2. Set up the retriever and LLM

In [2]:
from langchain_deepseek import ChatDeepSeek
from langchain_core.prompts import ChatPromptTemplate

retriever = db.as_retriever(search_kwargs={"k": 4})

prompt = ChatPromptTemplate.from_template(
    template=
    """
    Answer the question based only on the following context:
    {context}

    Question: {question}
    """
)
llm = ChatDeepSeek(model="deepseek-chat", temperature=0.0)

3. Set up HyDE

First, we prompt the LLM to write a hypothetical document that answers the user’s question:

In [7]:
from langchain_core.output_parsers import StrOutputParser

prompt_hyde = ChatPromptTemplate.from_template("""Please write a passage to answer the following question:\nQuestion: {question}\nPassage:""")

generate_doc = prompt_hyde | llm | StrOutputParser()

Next, we take the hypothetical document and use it as input to the retriever, which will generate its embedding and search for similar documents in the vector store:

In [8]:
retrieval_chain = generate_doc | retriever

Finally, we take the retrieved documents, pass them as context to the final prompt, and instruct the model to generate an output:

In [9]:
from langchain_core.runnables import chain
from typing import Any

@chain
def hyde_qa(input: str) -> dict[str, Any]:
    # fetch relevant documents
    docs = retrieval_chain.invoke(input=input)
    # format prompt
    formatted_prompt = prompt.invoke(input={"context": docs, "question": input})
    # generate answer
    answer = llm.invoke(input=formatted_prompt)

    return {"answer": answer, "question": input, "docs": docs}
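
In `hyde_qa`, the `docs` list is passed to the prompt as is, so the `{context}` placeholder receives the string representation of the whole list, metadata included. One possible refinement is to join only the text of each document before formatting the prompt. The sketch below is an illustration under stated assumptions: `FakeDocument` is a stand-in class that mimics only the `page_content` and `metadata` attributes of a retrieved document, and `format_docs` is a hypothetical helper name, not part of the notebook.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Stand-in for a retrieved document; only the attributes used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Join the text of each document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


# Two invented documents with the same shape as retriever output.
docs = [
    FakeDocument("It is an ancyent Marinere,", {"source": "./rime.txt"}),
    FakeDocument("And he stoppeth one of three.", {"source": "./rime.txt"}),
]
context = format_docs(docs)
print(context)
```

With a helper like this, `hyde_qa` could pass `format_docs(docs)` as `context` while still returning the original `docs` list to the caller.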

4. Run the chain

In [10]:
response = hyde_qa.invoke(input="what are the main events described in the story of the ancyent marinere?")
print(f"question: {response['question']}\n\nanswer: {response['answer'].content}\n\ndocs: {response['docs']}")

question: what are the main events described in the story of the ancyent marinere?

answer: Based solely on the provided context, the main events described are:

1.  **The Killing of the Albatross:** The Ancient Mariner shoots the albatross with his crossbow after it had been following the ship and coming to his call for "food or play." The crew then condemns him for killing the bird that "made the Breeze to blow."

2.  **Supernatural Punishment and the Dead Crew:** A strong wind stops, and the ship is becalmed. The dead men on the ship groan, rise up, and begin to work the ropes without speaking or moving their eyes, creating a "ghastly crew." The Mariner is terrified, especially by the silent presence of his nephew's body.

3.  **The Sinking of the Ship and Rescue:** A loud, dreadful sound from under the water causes the ship to sink "like lead." The Mariner is found stunned and floating, but is swiftly pulled into a small boat (the Pilot's boat). The sight of him causes the Pilot to

To recap what we covered in this section, query transformation consists of taking the user’s original query and doing the following:
* Rewriting it into one or more queries.
* Combining the results of those queries into a single set of the most relevant results.
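
HyDE produces a single rewritten query, so the combining step is trivial in the example above. When a strategy produces several queries, their results have to be merged. The following is a minimal sketch of one simple way to do that, an order-preserving deduplicated union; `unique_union` is a hypothetical helper name, and the chunk IDs are placeholders rather than real retriever output.

```python
def unique_union(result_lists: list[list[str]]) -> list[str]:
    """Flatten several ranked result lists, keeping the first copy of each chunk."""
    seen: set[str] = set()
    merged: list[str] = []
    for results in result_lists:
        for chunk in results:
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


# Results for two rewritten queries (placeholder chunk IDs, not real output).
results_per_query = [
    ["chunk-3", "chunk-7", "chunk-1"],
    ["chunk-7", "chunk-2", "chunk-3"],
]
merged = unique_union(results_per_query)
print(merged)
```

This approach removes duplicates but does not re-rank the merged list; results from the first query always come first.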

**NOTE:** Remove the pgvector container when you are done with this notebook by executing `docker compose down --volumes` in the terminal.