Understand Query Translation through code

![Query Translation Pipeline](query_translation.jpg)

We cannot know in advance whether the wording of a query will match the wording of the documents.
A single query may be too narrow.
Generating several versions of it casts a wider net.
Even if some versions match the documents poorly, the others are more likely to hit the right context in the vector store.
It is like searching on Google:
one phrasing may miss the right page,
while synonyms or related phrasings give you a better chance of finding it.

In [27]:
# 1. Multi Query
"""
Instead of asking one question, we generate several variations of it and retrieve documents for each. The unique retrieved documents are combined and passed to the LLM to form the final answer.
"""

# 2. RAG Fusion
"""Like multi-query, we retrieve documents for several variations of the query. The ranked result lists are then merged with Reciprocal Rank Fusion (RRF), so documents ranked highly across many lists come first."""

# 3. Decomposition
"""
We break a complex question down into simpler sub-questions, then either:
1. Answer the sub-questions in sequence, passing each answer as context to the next sub-question (recursive answering).
2. Answer all sub-questions independently, then pass the combined answers to the LLM for the final answer.
"""

# 4. Step Back Prompting
"""
Instead of answering directly, we first ask a more general version of the question and use what it retrieves as background for the original question.
Few-shot prompting works well here: a handful of example pairs shows the LLM how to turn a specific question into a more general one.
Example:

Original: “What are the side effects of Paracetamol in children under 5?”

Step-back: “What are the general side effects of Paracetamol?”

"""

# 5. HyDE (Hypothetical Document Embeddings)
"""
We ask the LLM (Gemini in this notebook) to write a hypothetical (fake) answer to the query.

Then we embed this fake answer and use it to search in the database.

Example:

Query: “Explain photosynthesis.”

LLM writes a short fake doc: “Photosynthesis is how plants use sunlight…”

Embed that doc → search DB → find real docs.
"""




In [29]:
%pip install langchain langchain-google-genai sentence-transformers chromadb python-dotenv --quiet


Note: you may need to restart the kernel to use updated packages.


In [30]:
import os
from dotenv import load_dotenv

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from sentence_transformers import SentenceTransformer

load_dotenv()
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
LANGCHAIN_API_KEY = os.getenv("LANGCHAIN_API_KEY")


In [32]:
# Set up the Gemini LLM (API key loaded from .env)

llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-flash",
    google_api_key=GEMINI_API_KEY,
    temperature=0.3
)

# Local embedding model
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")


In [34]:
# 1. MULTI QUERY
def generate_multi_queries(question):
    """Generate multiple variations of a question using Gemini."""
    prompt = PromptTemplate(
        input_variables=["question"],
        template="Generate 3 different rephrasings of the question:\n{question}"
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    result = chain.run({"question": question})
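    # One rephrasing per non-empty line. strip("- ") removes dash bullets only,
    # so numbered prefixes such as "1." stay in the returned strings.
    queries = result.split("\n")
    return [q.strip("- ").strip() for q in queries if q.strip()]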

print("Multi-Query Example:")
print(generate_multi_queries("What is the capital of America?"))


Multi-Query Example:
['1. Which city serves as the capital of the United States?', '2. Where is the capital of America located?', "3. What's the name of the city that's the capital of the United States of America?"]
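
The output above still carries the "1.", "2." numbering, because `strip("- ")` only removes dashes and spaces. A minimal sketch of a parser that also drops numeric prefixes follows; `clean_llm_list` is a hypothetical helper name, not part of the notebook, and the regex assumes the LLM puts one item per line.

```python
import re

# Leading bullet ("-", "*", "•") or numbering ("1.", "2)") followed by whitespace
_LIST_MARKER = re.compile(r"^\s*(?:[-*•]+|\d+[.)])\s+")

def clean_llm_list(text):
    """Split LLM output into lines and strip bullet or number prefixes."""
    cleaned = []
    for line in text.splitlines():
        line = _LIST_MARKER.sub("", line).strip()
        if line:  # skip blank lines
            cleaned.append(line)
    return cleaned

sample = (
    "1. Which city serves as the capital of the United States?\n"
    "2. Where is the capital of America located?\n"
)
print(clean_llm_list(sample))
```

Feeding clean strings to the retriever keeps the list numbering out of the embedded query text.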


In [39]:
# 2. RAG-FUSION

def rag_fusion(question, retriever):
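    """Simplified RAG-Fusion: retrieve for each query variation and merge the docs.

    This version only deduplicates; it does not re-rank the results with
    Reciprocal Rank Fusion.
    """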
    queries = generate_multi_queries(question)
    all_docs = []
    for q in queries:
        docs = retriever(q)  # Assume retriever is a function returning docs
        all_docs.extend(docs)
    # Deduplicate by page content
    unique_docs = list({doc.page_content: doc for doc in all_docs}.values())
    return unique_docs
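
The function above merges and deduplicates results but does not re-rank them. RAG-Fusion is usually described with Reciprocal Rank Fusion (RRF) as the merge step, so here is a minimal sketch of it. It assumes each retrieval returns an ordered list of document IDs (plain strings here, rather than the notebook's document objects); `reciprocal_rank_fusion` and the toy rankings are illustrative names and data, and `k=60` is the commonly used constant, to be treated as tunable.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Toy rankings, one per query variation (illustrative data)
rankings = [
    ["doc_a", "doc_b", "doc_c"],
    ["doc_b", "doc_a", "doc_d"],
    ["doc_b", "doc_c", "doc_a"],
]
print(reciprocal_rank_fusion(rankings))
```

Documents that appear near the top of several lists accumulate the largest scores, which is why `doc_b` outranks `doc_a` in this toy example.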


In [40]:
# 3. DECOMPOSITION
def decompose_question(question):
    """Break a complex question into smaller sub-questions."""
    prompt = PromptTemplate(
        input_variables=["question"],
        template="Break this question into 3 smaller questions:\n{question}"
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    result = chain.run({"question": question})
    return [q.strip("- ").strip() for q in result.split("\n") if q.strip()]

print("\nDecomposition Example:")
print(decompose_question("How did World War II affect the economy of the US?"))




Decomposition Example:
['1. How did World War II impact US industrial production and employment?', '2. What were the long-term effects of WWII on US government spending and debt?', "3. How did World War II reshape the US's role in the global economy?"]
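
The cell above only produces the sub-questions. As a rough sketch of the first strategy described earlier (answering sub-questions in sequence and passing earlier answers forward), the loop below assumes a callable `answer_fn(question, context)` that wraps the LLM call. Both `answer_recursively` and `fake_answer_fn` are hypothetical names; the fake function stands in for Gemini so the example runs offline.

```python
def answer_recursively(sub_questions, answer_fn):
    """Answer sub-questions in order, feeding earlier Q&A pairs forward as context."""
    qa_pairs = []
    for sub_q in sub_questions:
        # Context is every question/answer pair produced so far
        context = "\n".join(f"Q: {q}\nA: {a}" for q, a in qa_pairs)
        answer = answer_fn(sub_q, context)
        qa_pairs.append((sub_q, answer))
    return qa_pairs

# Stand-in for an LLM call (hypothetical): reports how much context it received
def fake_answer_fn(question, context):
    return f"answer to '{question}' using {context.count('Q:')} earlier pairs"

for q, a in answer_recursively(["What is X?", "How does X affect Y?"], fake_answer_fn):
    print(q, "->", a)
```

For the second strategy, the same loop would call `answer_fn` with an empty context for every sub-question and hand all the pairs to the LLM in one final prompt.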


In [42]:
# 4. STEP-BACK
def step_back_question(question):
    """Generate a more general version of a question."""
    prompt = PromptTemplate(
        input_variables=["question"],
        template="Rewrite this question in 3 more general ways:\n{question}"
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    return chain.run({"question": question})

print("\nStep-Back Example:")
print(step_back_question("What are the side effects of Paracetamol in children under 5?"))



Step-Back Example:
1. What are the risks associated with using over-the-counter pain relievers in young children?
2. What are the potential adverse effects of common pediatric medications?
3. What safety considerations are important when administering medication to infants and toddlers?
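
The notes earlier mention few-shot prompting for step-back questions, while the prompt in this cell is zero-shot. A minimal sketch of a few-shot prompt builder is shown below. The first example pair comes from the notes above; the second pair and the names `FEW_SHOT_EXAMPLES` and `build_step_back_prompt` are made up for illustration.

```python
# (specific question, more general step-back question)
FEW_SHOT_EXAMPLES = [
    ("What are the side effects of Paracetamol in children under 5?",
     "What are the general side effects of Paracetamol?"),
    # Hypothetical extra pair, not from the notebook
    ("Which city hosted the 2012 Summer Olympics opening ceremony?",
     "Where were the 2012 Summer Olympics held?"),
]

def build_step_back_prompt(question, examples=FEW_SHOT_EXAMPLES):
    """Build a few-shot prompt that ends where the LLM should write the step-back question."""
    lines = ["Rewrite the question as one more general step-back question.", ""]
    for original, step_back in examples:
        lines += [f"Original: {original}", f"Step-back: {step_back}", ""]
    lines += [f"Original: {question}", "Step-back:"]
    return "\n".join(lines)

print(build_step_back_prompt("How does caffeine affect sleep in teenagers?"))
```

The resulting string could be sent to the LLM in place of the zero-shot template; the example pairs nudge it toward one focused generalization rather than three loosely related questions.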


In [48]:
# 5. HyDE
def hyde_query(question):
    """Generate a hypothetical answer and embed it."""
    prompt = PromptTemplate(
        input_variables=["question"],
        template="Write a short hypothetical answer to this question:\n{question}"
    )
    chain = LLMChain(llm=llm, prompt=prompt)
    fake_answer = chain.run({"question": question})
    # Embed the fake answer with local embeddings
    embedding = embedding_model.encode(fake_answer)
    return embedding, fake_answer

print("\nHyDE Example:")
emb, fake_doc = hyde_query("Explain photosynthesis.")
print("Fake Doc:", fake_doc[:100], "...")


HyDE Example:
Fake Doc: Photosynthesis is how plants use sunlight, water, and carbon dioxide to create their own food (sugar ...
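
The HyDE cell stops at the embedding. The remaining step is to compare that embedding with the document embeddings and keep the closest matches. Below is a minimal sketch using cosine similarity in plain Python, assuming the document embeddings are already available as lists of floats. The toy 3-dimensional vectors and the names `cosine` and `top_k` are illustrative only; in practice a vector store such as Chroma would run this search.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k document vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Toy vectors standing in for real embeddings (illustrative data)
doc_vecs = [
    [0.9, 0.1, 0.0],  # doc 0
    [0.0, 1.0, 0.0],  # doc 1
    [0.7, 0.3, 0.1],  # doc 2
]
hyde_vec = [1.0, 0.2, 0.0]  # would come from embedding_model.encode(fake_answer)
print(top_k(hyde_vec, doc_vecs))
```

The key point is that the search uses the embedding of the hypothetical answer, not of the original query, on the assumption that an answer-shaped text sits closer to real answer passages in embedding space.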
