# What is HyDE?

Hypothetical Document Embeddings (HyDE) is an advanced RAG technique that retrieves with the embedding of a synthetic document generated from the query, rather than with the embedding of the query itself. An LLM first writes a "hypothetical" document that would ideally answer the query; that document is then embedded and used to retrieve the most similar actual documents from the database. Because a full answer-like passage tends to resemble stored passages more closely than a short question does, HyDE aims to bridge the gap between query intent and document content, especially for complex or nuanced queries.
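
The intuition can be illustrated with a toy example. The sketch below is not the embedding model used later in this notebook: it assumes a simple bag-of-words "embedding" and a hand-written passage standing in for the LLM output, purely to show why an answer-like passage can land closer to a stored document than the raw question does.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding': lowercase word counts (illustration only)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = "Why is the sky blue?"
# Hand-written stand-in for the passage an LLM would generate.
hypothetical_doc = (
    "The sky looks blue because air molecules scatter short blue "
    "wavelengths of sunlight."
)
# A document that might already sit in the vector store.
stored_doc = (
    "Rayleigh scattering: air molecules scatter short wavelengths of "
    "sunlight more than long ones."
)

query_score = cosine(embed(query), embed(stored_doc))
hyde_score = cosine(embed(hypothetical_doc), embed(stored_doc))
print(f"query vs. stored doc:        {query_score:.2f}")
print(f"hypothetical vs. stored doc: {hyde_score:.2f}")
```

With a real embedding model the gap is usually less extreme than with word counts, but the direction of the effect is what HyDE relies on.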

# HyDE-RAG Implementation:

1. **Hypothetical Document Generation:** We use the LLM to generate a hypothetical passage that would answer the given query.
2. **Retrieval using Hypothetical Document:** We use the generated hypothetical document to retrieve relevant passages from our vector store.
3. **Final Response Generation:** Using the retrieved context, we generate a final response to the original query.
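
The three steps above can be sketched independently of any particular LLM or vector store. In the sketch below, `run_hyde`, `fake_generate`, and `fake_retrieve` are hypothetical names introduced only for illustration; the stand-ins return canned strings so the control flow can be followed without API keys.

```python
from typing import Callable, Dict, List


def run_hyde(
    query: str,
    generate: Callable[[str], str],
    retrieve: Callable[[str, int], List[str]],
    k: int = 3,
) -> Dict[str, str]:
    # Step 1: ask the generator for a passage that would answer the query.
    hypothetical_doc = generate(f"Write a passage that answers this question: {query}")
    # Step 2: search with the hypothetical passage, not the raw query.
    passages = retrieve(hypothetical_doc, k)
    context = "\n".join(passages)
    # Step 3: answer the original query using the retrieved context.
    answer = generate(f"Context: {context}\nQuestion: {query}\nAnswer:")
    return {
        "hypothetical_document": hypothetical_doc,
        "retrieved_context": context,
        "final_answer": answer,
    }


# Hypothetical stand-ins so the sketch runs without an LLM or a vector store.
calls = []


def fake_generate(prompt: str) -> str:
    calls.append(("generate", prompt))
    return f"generated text #{len(calls)}"


def fake_retrieve(text: str, k: int) -> List[str]:
    calls.append(("retrieve", text))
    return [f"passage {i}" for i in range(1, k + 1)]


result = run_hyde("What is HyDE?", fake_generate, fake_retrieve)
print(result)
```

Note that the retriever receives the generated passage, while the final prompt still carries the original question.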

# Setup

1. **[LLM](https://deepmind.google/technologies/gemini/pro/):** Google's free gemini-pro API endpoint ([Google API key](https://console.cloud.google.com/apis/credentials))
2. **[Vector Store](https://www.pinecone.io/learn/vector-database/):** [ChromaDB](https://www.trychroma.com/)
3. **[Embedding Model](https://qdrant.tech/articles/what-are-embeddings/):** [nomic-embed-text-v1.5](https://www.nomic.ai/blog/posts/nomic-embed-text-v1)
4. **[LLM Framework](https://python.langchain.com/v0.2/docs/introduction/):** LangChain
5. **[Hugging Face API Key](https://huggingface.co/settings/tokens)**





# Install required libraries

In [None]:
!pip install -q -U \
     sentence-transformers==3.0.1 \
     langchain==0.2.11 \
     langchain-google-genai==1.0.7 \
     langchain-chroma==0.1.2 \
     langchain-community==0.2.10 \
     langchain-huggingface==0.0.3 \
     einops==0.8.0


# Import the LangChain and Hugging Face embedding libraries

In [None]:
from langchain_google_genai import (
    ChatGoogleGenerativeAI,
    HarmBlockThreshold,
    HarmCategory,
)
from langchain_huggingface import HuggingFaceEmbeddings
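# Import from the dedicated packages installed above rather than the deprecated langchain.* paths
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.document_loaders import WebBaseLoader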



In [None]:
import os
import getpass

# Provide your Google API key. You can create one at the following link

[Google Gemini-Pro API Creation Link](https://console.cloud.google.com/apis/credentials)

[YouTube Video](https://www.youtube.com/watch?v=ZHX7zxvDfoc)



In [None]:
os.environ["GOOGLE_API_KEY"] = getpass.getpass()



# Provide your Hugging Face API key. You can create one at the following link

[Hugging Face API Creation Link](https://huggingface.co/settings/tokens)




In [None]:
os.environ["HF_TOKEN"] = getpass.getpass()



# Step 1: Load and preprocess data

In [None]:
def load_and_process_data(url):
    # Load data from web
    loader = WebBaseLoader(url)
    data = loader.load()

    # Split text into chunks (Experiment with Chunk Size and Chunk Overlap to get optimal chunking)
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = text_splitter.split_documents(data)

    return chunks
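
To get a feel for what `chunk_size` and `chunk_overlap` control, here is a simplified sketch. It assumes a plain character-level sliding window; `RecursiveCharacterTextSplitter` itself prefers to break on separators such as paragraphs and sentences, so its chunks will differ. The helper name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample_text = "abcdefghij" * 120  # 1,200 characters of filler text
sample_chunks = sliding_chunks(sample_text, chunk_size=500, chunk_overlap=50)
print([len(c) for c in sample_chunks])
# The last 50 characters of one chunk reappear at the start of the next.
print(sample_chunks[0][-50:] == sample_chunks[1][:50])
```

A larger overlap reduces the chance that a sentence needed for the answer is cut in half at a chunk boundary, at the cost of storing more redundant text.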

# Step 2: Create the vector store

In [None]:
def create_vector_store(chunks):
    # trust_remote_code is needed because this model downloads custom modeling code from the Hub
    embeddings = HuggingFaceEmbeddings(
        model_name="nomic-ai/nomic-embed-text-v1.5",
        model_kwargs={"trust_remote_code": True},
    )
    vectorstore = Chroma.from_documents(chunks, embeddings)
    return vectorstore
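
Conceptually, the vector store keeps one vector per chunk and, at query time, returns the chunks whose vectors are closest to the vector of the search text. The sketch below shows that idea as a brute-force top-k ranking over tiny hand-made vectors. It is an assumption-laden illustration: Chroma uses its own index and distance settings, and `top_k` is a hypothetical helper, not part of its API.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], doc_vecs: List[Sequence[float]], k: int = 3) -> List[int]:
    """Return the indices of the k stored vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Four made-up 2-dimensional "chunk embeddings".
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
best = top_k([1.0, 0.0], doc_vecs, k=2)
print(best)
```

In HyDE, the vector passed in as `query_vec` would come from the hypothetical document rather than from the user's question.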

# Step 3: Implement HyDE-RAG

1. **Hypothetical Document Generation:** We use the LLM to generate a hypothetical passage that would answer the given query.
2. **Retrieval using Hypothetical Document:** We use the generated hypothetical document to retrieve relevant passages from our vector store.
3. **Final Response Generation:** Using the retrieved context, we generate a final response to the original query.

In [None]:
def hyde_rag(query, vectorstore, llm):
    # Generate hypothetical document
    hyde_prompt = ChatPromptTemplate.from_template(
        "Given the following question, generate a hypothetical passage that would answer this question:\nQuestion: {query}\nHypothetical Passage:"
    )
    hyde_chain = hyde_prompt | llm
    hypothetical_doc = hyde_chain.invoke({"query": query})

    # Retrieve relevant documents using the hypothetical document
    retrieved_docs = vectorstore.similarity_search(hypothetical_doc.content, k=3)
    context = "\n".join([doc.page_content for doc in retrieved_docs])

    # Generate final response
    final_prompt = ChatPromptTemplate.from_template(
        "Based on the following context, please answer the question:\nContext: {context}\nQuestion: {query}\nAnswer:"
    )
    final_chain = final_prompt | llm
    final_response = final_chain.invoke({"context": context, "query": query})

    return {
        "hypothetical_document": hypothetical_doc.content,
        "retrieved_context": context,
        "final_answer": final_response.content
    }

# Step 4: Chunk the web data and load it into the Chroma vector store

In [None]:
# Initialize the gemini-pro language model (change temperature and other parameters as required)
llm = ChatGoogleGenerativeAI(
    model="gemini-pro",
    temperature=0.3,
    safety_settings={
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
    },
)

# Load and process data
url = "https://en.wikipedia.org/wiki/Artificial_intelligence"
chunks = load_and_process_data(url)

# Create vector store
vectorstore = create_vector_store(chunks)


# Step 5: Run HyDE-RAG

This implementation shows the key parts of HyDE-RAG:

1. Generation of a hypothetical document based on the query
2. Using the hypothetical document for retrieval instead of the original query
3. Final response generation using the retrieved context

In [None]:
# Example query
query = "What are the ethical considerations in AI development?"
result = hyde_rag(query, vectorstore, llm)

print("Hypothetical Document:")
print(result["hypothetical_document"])
print("\nRetrieved Context:")
print(result["retrieved_context"])
print("\nFinal Answer:")
print(result["final_answer"])

Hypothetical Document:
**Ethical Considerations in AI Development**

The rapid advancement of artificial intelligence (AI) has brought forth a myriad of ethical considerations that must be carefully navigated to ensure responsible and beneficial development. Here are some key ethical concerns that arise in AI development:

**Bias and Discrimination:** AI algorithms can inherit biases from the data they are trained on, leading to unfair or discriminatory outcomes. Developers must ensure that AI systems are trained on diverse and representative datasets to mitigate bias and promote inclusivity.

**Privacy and Data Protection:** AI systems often process vast amounts of personal data, raising concerns about privacy and data protection. Developers must implement robust security measures and obtain informed consent from individuals before collecting and using their data.

**Autonomy and Responsibility:** As AI systems become more autonomous, questions arise about who is responsible for their