# 06 - HyDe (Hypothetical Document Embeddings)

**Complexity:** ⭐⭐⭐

**Use Cases:** Ambiguous queries, domain jargon, queries with abbreviations

**Key Feature:** Generates a hypothetical "perfect answer" document, embeds it, and uses that embedding for retrieval.

**Example:**
```
Query: "How does MMR work?"

Hypothetical Doc:
"MMR (Maximal Marginal Relevance) balances relevance with diversity by
iteratively selecting documents that are relevant to query AND dissimilar
to already selected documents..."

→ Embedding this detailed description often finds better semantic matches than embedding the short query alone
```
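The intuition can be illustrated without any API calls. The sketch below is a toy: it assumes a bag-of-words count vector as a stand-in for a real embedding model, and the `embed` and `cosine` helpers, the two-sentence corpus, and the hard-coded hypothetical document are all illustrative assumptions rather than part of this notebook's pipeline. A real embedding model would place the abbreviation "MMR" somewhere meaningful on its own, so the gap would likely be smaller than in this word-overlap caricature.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical mini-corpus; the first entry is the document we hope to retrieve.
corpus = [
    "Maximal marginal relevance selects documents that are relevant and diverse.",
    "Text splitters break large documents into smaller chunks.",
]

query = "How does MMR work?"
# Hard-coded here; in the notebook this text would come from an LLM.
hypo_doc = (
    "MMR (Maximal Marginal Relevance) balances relevance with diversity by "
    "selecting documents that are relevant to the query and dissimilar to "
    "already selected documents."
)

target = embed(corpus[0])
query_score = cosine(embed(query), target)
hyde_score = cosine(embed(hypo_doc), target)

print(f"query vs. target doc:    {query_score:.2f}")
print(f"hypo doc vs. target doc: {hyde_score:.2f}")
```

The short query shares no words with the target document because it only uses the abbreviation, while the hypothetical document spells the term out and therefore overlaps with it.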

In [1]:
import sys
sys.path.append('../..')

from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from shared.config import OPENAI_VECTOR_STORE_PATH, DEFAULT_MODEL
from shared.utils import load_vector_store, print_section_header, format_docs
from shared.prompts import HYDE_PROMPT, RAG_PROMPT_TEMPLATE
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableLambda

print_section_header("Setup: HyDe")

embeddings = OpenAIEmbeddings()
vectorstore = load_vector_store(OPENAI_VECTOR_STORE_PATH, embeddings)
llm = ChatOpenAI(model=DEFAULT_MODEL, temperature=0)

print("✅ Setup complete!")


SETUP: HYDE

✓ Loaded vector store from /Users/gianlucamazza/Workspace/notebooks/llm_rag/notebooks/advanced_architectures/../../data/vector_stores/openai_embeddings
✅ Setup complete!


## 2. HyDe Document Generator

In [2]:
print_section_header("HyDe Generator")

# Create HyDe document generator
hyde_generator = HYDE_PROMPT | llm | StrOutputParser()

# Test
query = "What is semantic search?"
print(f"Query: '{query}'\n")

hypo_doc = hyde_generator.invoke({"question": query})
print("Generated Hypothetical Document:")
print("=" * 80)
print(hypo_doc)
print("=" * 80)


HYDE GENERATOR

Query: 'What is semantic search?'

Generated Hypothetical Document:
# Understanding Semantic Search

## Introduction
Semantic search is an advanced search technique that aims to improve search accuracy by understanding the intent and contextual meaning of search queries. Unlike traditional keyword-based search methods, which rely heavily on matching keywords in the query with those in the database, semantic search focuses on the relationships between words and the concepts they represent. This document provides a comprehensive overview of semantic search, its principles, technologies, applications, and benefits.

## Key Principles of Semantic Search

### 1. Contextual Understanding
Semantic search systems analyze the context in which words are used. This involves understanding synonyms, antonyms, and the nuances of language. For example, the word "bank" can refer to a financial institution or the side of a river, and semantic search helps determine the correct meaning 

## 3. HyDe Retrieval

In [3]:
from shared.utils import print_results

print_section_header("HyDe vs Standard Retrieval")

query = "How to improve retrieval quality?"

# Standard retrieval
print("[STANDARD RETRIEVAL]")
standard_docs = vectorstore.similarity_search(query, k=3)
print_results(standard_docs, max_docs=2, preview_length=120)

# HyDe retrieval
print("\n" + "=" * 80)
print("\n[HYDE RETRIEVAL]")
hypo_doc = hyde_generator.invoke({"question": query})
print(f"\nGenerated doc preview: {hypo_doc[:200]}...\n")
hyde_docs = vectorstore.similarity_search(hypo_doc, k=3)
print_results(hyde_docs, max_docs=2, preview_length=120)

print("\n💡 HyDe often finds more semantically relevant documents")


HYDE VS STANDARD RETRIEVAL

[STANDARD RETRIEVAL]

Retrieved Documents
--------------------------------------------------------------------------------

1. Source: https://python.langchain.com/docs/use_cases/question_answering/
   Type: web_documentation
   Date: 2025-11-12
   Content: One of the most powerful applications enabled by LLMs is sophisticated question-answering (Q&A) chatbots. These are appl...

2. Source: https://python.langchain.com/docs/use_cases/chatbots/
   Type: web_documentation
   Date: 2025-11-12
   Content: One of the most powerful applications enabled by LLMs is sophisticated question-answering (Q&A) chatbots. These are appl...

... and 1 more documents


[HYDE RETRIEVAL]

Generated doc preview: # Improving Retrieval Quality: A Comprehensive Guide

## Introduction
Retrieval quality refers to the effectiveness and accuracy with which information is retrieved from a database, search engine, or ...


Retrieved Documents
---------------------------------------------
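
Eyeballing two result lists only goes so far. As a rough aid, the sketch below quantifies how much the two retrieval strategies agree. The helper names `source_overlap` and `only_in_hyde` are hypothetical, and the identifiers in the example are placeholders, not results from this notebook.

```python
def source_overlap(standard_sources, hyde_sources):
    """Jaccard overlap between two collections of source identifiers."""
    a, b = set(standard_sources), set(hyde_sources)
    if not a and not b:
        return 1.0  # two empty result sets are trivially identical
    return len(a & b) / len(a | b)


def only_in_hyde(standard_sources, hyde_sources):
    """Sources that HyDe surfaced but standard retrieval did not, in rank order."""
    seen = set(standard_sources)
    return [s for s in hyde_sources if s not in seen]


# Placeholder identifiers for illustration only.
standard = ["doc-a", "doc-b", "doc-c"]
hyde = ["doc-b", "doc-c", "doc-d"]

overlap = source_overlap(standard, hyde)
new_sources = only_in_hyde(standard, hyde)
print(f"overlap: {overlap:.2f}, new from HyDe: {new_sources}")
```

In this notebook the inputs could plausibly be built from the retrieved documents, for example `[d.metadata.get("source") for d in standard_docs]`, assuming each document stores its origin under a `source` metadata key. A low overlap only shows that the strategies disagree; judging which list is more relevant still requires inspection or labeled data.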

## 4. HyDe RAG Chain

In [4]:
print_section_header("HyDe RAG Chain")

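def hyde_retrieve(query: str):
    """Retrieve documents by searching with a generated hypothetical answer."""
    # Generate the hypothetical document, then search with it instead of the raw query
    hypo_doc = hyde_generator.invoke({"question": query})
    docs = vectorstore.similarity_search(hypo_doc, k=4)
    return docs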

hyde_retriever = RunnableLambda(hyde_retrieve)

hyde_chain = (
    {"context": hyde_retriever | format_docs, "input": RunnablePassthrough()}
    | RAG_PROMPT_TEMPLATE
    | llm
    | StrOutputParser()
)

print("✓ HyDe RAG chain created")

# Test
query = "Best practices for chunk sizing?"
print(f"\nQuery: '{query}'\n")
print("=" * 80)

response = hyde_chain.invoke(query)
print(response)
print("\n" + "=" * 80)


HYDE RAG CHAIN

✓ HyDe RAG chain created

Query: 'Best practices for chunk sizing?'

The context provided does not contain specific information about best practices for chunk sizing. It only mentions that "Text splitters break large Documents into smaller chunks" and that "large chunks are harder to search over and won’t fit in a model’s finite context window."

For best practices on chunk sizing, you may want to consider factors such as the model's context window size, the nature of the data, and the intended use of the chunks, but this information is not detailed in the context provided.



## Summary

**Flow:**
```
Query → Generate Hypo Doc → Embed → Retrieve → LLM → Response
```

**Advantages:**
✅ Better for ambiguous queries  
✅ Handles jargon and abbreviations  
✅ Improves semantic matching  
✅ Works with specialized domains  

**Limitations:**
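- Extra LLM call per query (added cost and latency)
- The hypothetical document may contain hallucinated details, which can steer retrieval toward irrelevant documents
- Not always better than standard retrieval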
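
Since HyDe does not always beat standard retrieval, one possible hedge is to run both and merge the results. The sketch below is a minimal illustration, not something this notebook implements: `merge_results` is a hypothetical helper that interleaves the two ranked lists and drops duplicates, and the `key` argument is an assumed way to identify a document (for example by its source or content).

```python
from itertools import zip_longest


def merge_results(standard, hyde, key=lambda doc: doc, k=4):
    """Interleave two ranked lists, skipping duplicates, and keep the top k."""
    merged, seen = [], set()
    for pair in zip_longest(standard, hyde):
        for doc in pair:
            if doc is None:  # one list was shorter than the other
                continue
            ident = key(doc)
            if ident in seen:
                continue
            seen.add(ident)
            merged.append(doc)
    return merged[:k]


# Placeholder identifiers stand in for retrieved documents.
combined = merge_results(["a", "b", "c"], ["b", "d"])
print(combined)
```

Interleaving gives both strategies equal weight at each rank. Note that this does nothing about the extra LLM call, so the cost and latency limitation still applies.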

**When to Use:**
- Vague or ambiguous queries
- Technical jargon
- Queries with abbreviations

**Next:** [07_adaptive_rag.ipynb](07_adaptive_rag.ipynb) - Intelligent query routing