## Hypothetical Document Embeddings (HyDE)

Modified from the LangChain cookbook: https://github.com/langchain-ai/langchain/tree/master/cookbook

HyDE uses the LLM to generate a "hypothetical" answer to the query, then embeds that answer (instead of the raw query) and uses the resulting vector for search.

HyDE = base embedding model + LLM chain (with prompts)
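
The formula above can be sketched without any model dependency. This is a minimal illustration and not LangChain's implementation: `fake_llm`, `toy_embed`, `hyde_embed_query`, and the tiny vocabulary are all hypothetical stand-ins for the LLM chain and the base embedding model.

```python
from typing import Callable, List

Vector = List[float]

# Hypothetical toy vocabulary; a real embedding model produces dense vectors instead.
VOCAB = ["burger", "fries", "menu", "podium", "platform"]


def toy_embed(text: str) -> Vector:
    """Stand-in for the base embedding model: counts vocabulary words."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def fake_llm(prompt: str) -> str:
    """Stand-in for the LLM chain: returns a canned hypothetical answer."""
    return "The menu includes a burger and fries"


def hyde_embed_query(
    question: str,
    generate: Callable[[str], str],
    embed: Callable[[str], Vector],
) -> Vector:
    """Embed the generated hypothetical answer instead of the raw question."""
    hypothetical_doc = generate(f"Question: {question}\nPassage:")
    return embed(hypothetical_doc)


question = "What items does McDonalds make?"
plain_vector = toy_embed(question)  # embeds the question itself
hyde_vector = hyde_embed_query(question, fake_llm, toy_embed)  # embeds the answer
print(plain_vector)
print(hyde_vector)
```

In this toy setup the raw question shares no words with the vocabulary, while the hypothetical answer does, which is the intuition behind searching with the answer's embedding.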

In [1]:
from langchain.chains import LLMChain, HypotheticalDocumentEmbedder
from langchain.prompts import PromptTemplate

from langchain.document_loaders import TextLoader
import langchain

In [2]:
from langchain_ollama.llms import OllamaLLM
from langchain_ollama import OllamaEmbeddings
from langchain_ollama import ChatOllama

In [3]:
model_name = "gemma2:2b"
ollama_embds = OllamaEmbeddings(model=model_name)
llm = OllamaLLM(model=model_name)

In [4]:
# Load with `web_search` prompt
embeddings = HypotheticalDocumentEmbedder.from_llm(
    llm,
    ollama_embds,  # ollama embeddings
    prompt_key="web_search"
)

In [6]:
embeddings.llm_chain.prompt

NameError: name 'embeddings' is not defined

In [7]:
langchain.debug = True

In [7]:
# Now we can use it as any embedding class!
result = embeddings.embed_query("What items does McDonalds make?")

[llm/start] [llm:OllamaLLM] Entering LLM run with input:
{
  "prompts": [
    "Please write a passage to answer the question \nQuestion: What items does McDonalds make?\nPassage:"
  ]
}
[llm/end] [llm:OllamaLLM] [24.47s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "McDonald's is famous for its classic fast-food menu, featuring iconic items like french fries and hamburgers.  They are known for their Big Mac, Chicken McNuggets, Quarter Pounder, McChicken, Happy Meal toys, and a variety of other burgers, sandwiches, and breakfast items.  Beyond the core offerings, McDonald's also has salads, wraps, oatmeal, fruit cups, and beverages like milkshakes and soda.  Their menu is designed to be customizable with various toppings, sauces, and combinations.  Whether you're craving something familiar or looking for a quick bite, there's likely something at McDonald's to satisfy your hunger! \n",
        "generatio

## Using your own prompts

In [8]:
prompt_template = """Please answer the user's question as a single food item
Question: {question}
Answer:"""

prompt = PromptTemplate(input_variables=["question"], template=prompt_template)

llm_chain = LLMChain(llm=llm, prompt=prompt)
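
To see what the LLM actually receives, the template can be rendered by hand. This is a rough sketch that uses plain `str.format` as a stand-in for `PromptTemplate`; `render_prompt` is a hypothetical helper, not a LangChain function.

```python
# Same template text as in the cell above.
prompt_template = """Please answer the user's question as a single food item
Question: {question}
Answer:"""


def render_prompt(template: str, question: str) -> str:
    """Fill the single {question} placeholder (hypothetical helper)."""
    return template.format(question=question)


rendered = render_prompt(prompt_template, "What is McDonalds best selling item?")
print(rendered)
```

The rendered string has the same shape as the `prompts` entry that appears in the debug output when `langchain.debug = True`.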


In [9]:
embeddings = HypotheticalDocumentEmbedder(
    llm_chain=llm_chain,
    # base_embeddings=bge_embeddings
    base_embeddings=ollama_embds
)

In [10]:
result = embeddings.embed_query(
    "What is is McDonalds best selling item?"
)

[llm/start] [llm:OllamaLLM] Entering LLM run with input:
{
  "prompts": [
    "Please answer the user's question as a single food item\nQuestion: What is is McDonalds best selling item?\nAnswer:"
  ]
}
[llm/end] [llm:OllamaLLM] [3.94s] Exiting LLM run with output:
{
  "generations": [
    [
      {
        "text": "**Big Mac** 🍔 \n",
        "generation_info": {
          "model": "gemma2:2b",
          "created_at": "2024-08-21T05:43:34.4961947Z",
          "response": "",
          "done": true,
          "done_reason": "stop",
          "context": [
            106,
            1645,
            108,
            5958,
            3448,
            573,
            2425,
            235303,
            235256,
            2872,
            685,
            476,
            3821,
            2960,
            2599,
            108,
            9413,
            235292,
            2439,
            603,
            603,
            

## Using HyDE

In [4]:
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma

# with open("../../state_of_the_union.txt") as f:
#     state_of_the_union = f.read()

loaders = [
    TextLoader('./data//whats_next_for_podium.txt',
               encoding='UTF8'),
]
docs = []
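for loader in loaders:
    docs.extend(loader.load())

# Target chunks of about 1000 characters with no overlap between them
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

texts = text_splitter.split_documents(docs)  # returns a list of Document chunks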

Created a chunk of size 1244, which is longer than the specified 1000
Created a chunk of size 1050, which is longer than the specified 1000
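
The "longer than the specified 1000" messages are consistent with a splitter that only cuts at separator boundaries, so a single long paragraph cannot be broken up. The following is a simplified sketch under that assumption, not LangChain's code; `split_on_separator` is a hypothetical function and the blank-line separator is assumed.

```python
from typing import List


def split_on_separator(text: str, chunk_size: int, separator: str = "\n\n") -> List[str]:
    """Greedily pack separator-delimited pieces into chunks of at most chunk_size.

    A single piece longer than chunk_size is never cut, so it ends up as an
    oversized chunk of its own.
    """
    chunks: List[str] = []
    current = ""
    for piece in text.split(separator):
        if len(piece) > chunk_size:
            print(f"Oversized piece of {len(piece)} characters (limit {chunk_size})")
        candidate = piece if not current else current + separator + piece
        if current and len(candidate) > chunk_size:
            # Adding this piece would overflow: close the current chunk first.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Four paragraphs of 30, 120, 40 and 40 characters, separated by blank lines.
sample = "a" * 30 + "\n\n" + "b" * 120 + "\n\n" + "c" * 40 + "\n\n" + "d" * 40
chunks = split_on_separator(sample, chunk_size=100)
print([len(chunk) for chunk in chunks])
```

If strict size limits matter, a splitter that falls back to finer separators would be the usual alternative.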


In [9]:
texts

[Document(metadata={'source': './data//whats_next_for_podium.txt'}, page_content="How Podium optimized agent behavior and reduced engineering intervention by 90% with LangSmith\nSee how Podium tests across the lifecycle development of their AI employee agent, using LangSmith for dataset curation and finetuning. They improved agent F1 response quality to 98% and reduced the need for engineering intervention by 90%.\n\n5 min read\nAug 15, 2024\nAbout Podium\nPodium is a communication platform that helps small businesses connect quickly with customers via phone, text, email, and social media. Small businesses often have high-touch interactions with customers — think automotive dealers, jewelers, bike shops — yet are understaffed. Podium's mission is to help these businesses respond to customer inquiries promptly so that they can convert leads into sales."),
 Document(metadata={'source': './data//whats_next_for_podium.txt'}, page_content='Podium data shows that responding to customer i

In [5]:
prompt_template = """Please answer the user's question as related to Large Language Models
Question: {question}
Answer:"""

prompt = PromptTemplate(input_variables=["question"], template=prompt_template)

llm_chain = LLMChain(llm=llm, prompt=prompt)


In [6]:
embeddings = HypotheticalDocumentEmbedder(
    llm_chain=llm_chain,
    base_embeddings=ollama_embds
)

In [None]:
docsearch = Chroma.from_documents(texts, embeddings)

In [None]:
query = "What is podium?"
docs = docsearch.similarity_search(query)
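
Conceptually, the search embeds the query and ranks the stored chunk vectors by closeness to it. Below is a minimal sketch that assumes cosine similarity as the metric (the vector store's actual metric may differ); `cosine_similarity`, `top_k`, and the 3-dimensional vectors are made up for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 1,
) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up chunk embeddings; real ones come from the embedding model.
doc_vectors = [
    [1.0, 0.0, 0.0],  # chunk 0
    [0.0, 1.0, 1.0],  # chunk 1
    [0.5, 0.5, 0.0],  # chunk 2
]
# With HyDE, this would be the embedding of the hypothetical answer.
query_vector = [0.0, 2.0, 2.0]

ranking = top_k(query_vector, doc_vectors, k=2)
print(ranking)
```

Because the query vector here comes from the hypothetical answer, a prompt that steers the LLM away from the topic of the documents can pull the ranking toward less relevant chunks.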

In [8]:
print(docs[0].page_content)

Their engineering team then found it helpful to upgrade to a larger model, curating the outputs into a smaller model (using a technique called model distillation). Upgrading their model went smoothly since model inputs and outputs were automatically captured in LangSmith’s traces, allowing the team to easily curate datasets.

Podium engineers also enriched LangSmith traces with metadata on customer profiles, business types, and other parameters important to their business. They grouped traces using specific identifiers in LangSmith, making it easy to aggregate related traces during data curation. This enriched data enabled Podium to create a higher-quality and balanced dataset, which improved model fine-tuning and helped them avoid overfitting).


In [12]:
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# RAG
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

final_rag_chain = (
    prompt
    | llm
    | StrOutputParser()
)

final_rag_chain.invoke({"context":docs[0].page_content,"question":query})

'Based on the provided text, **Podium** appears to be a company or organization that develops and uses AI models. \n\nHere\'s why:\n\n* **"Their engineering team..."**:  This suggests Podium has an engineering team responsible for developing and deploying their AI model.\n* **"Upgrading their model..."**: This implies they are actively involved in building, modifying, and improving their AI model. \n* **"LangSmith traces..."**: LangSmith is likely a tool or system that captures data and helps with the process of building and fine-tuning the AI model.\n* **"Enriched LangSmith traces..."**:  This indicates they are using LangSmith to add additional information to the data, making it more usable for their AI models.\n\n\nTherefore, based on the context provided, Podium is likely an organization involved in developing and implementing AI models. \n'

In [24]:
final_rag_chain.invoke({"context":docs[0].page_content,"question":query})

'Based on the context provided, **Podium** seems to be an engineering team or organization that develops and uses language models. \n\nHere\'s why:\n\n* **They use LangSmith:** This suggests Podium has a specific system for processing language model data (LangSmith might be a tool or platform).\n* **Curating datasets:**  The context mentions the engineering team "curating outputs" and "grouping traces" suggesting they work with large amounts of data. \n* **Model fine-tuning:** The team enhances their models with curated datasets, highlighting a focus on training language models for specific tasks (like customer service or business information analysis).\n\nTherefore, Podium likely focuses on building and applying sophisticated language models for various purposes like customer service interactions and business intelligence analysis.  \n'

## Comparing with plain embeddings (no HyDE)

In [None]:
from langchain_ollama import OllamaEmbeddings

In [None]:
vectorstore = Chroma.from_documents(
    texts,
    OllamaEmbeddings(model=model_name)
)

In [17]:
retriever = vectorstore.as_retriever(search_type="similarity")

In [20]:
template = """Answer the following question based on this context:

{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

In [22]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    # Merge the retrieved documents into a single block of text.
    return "\n\n".join(doc.page_content for doc in docs)


# Create the chain.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
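
The `|` composition used in the chain above can be pictured with a small toy model. This is only a sketch of the idea, not how LangChain implements runnables: `Step`, `fan_out`, and all the fake components are hypothetical.

```python
from typing import Any, Callable, Dict


class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


def fan_out(branches: Dict[str, Step]) -> Step:
    """Run every branch on the same input and collect the results in a dict."""
    return Step(lambda value: {key: step.invoke(value) for key, step in branches.items()})


# Fake components standing in for the retriever, prompt and LLM.
fake_retriever = Step(lambda q: ["Podium is a communication platform.", "It helps small businesses."])
join_docs = Step(lambda docs: "\n\n".join(docs))
passthrough = Step(lambda value: value)
fill_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda prompt: "ANSWER based on: " + prompt)

chain = (
    fan_out({"context": fake_retriever | join_docs, "question": passthrough})
    | fill_prompt
    | fake_llm
)
answer = chain.invoke("What is podium?")
print(answer)
```

The dict at the head of the chain plays the role of `fan_out` here: the same query string goes to the retriever branch and, unchanged, to the `question` slot.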

In [23]:
rag_chain.invoke(query)

'Podium is a communication platform designed to help small businesses efficiently connect with their customers through phone, text, email, and social media. Their goal is to enable these businesses to handle high-touch interactions like those found in automotive dealerships, jewelry stores, or bike shops, ultimately helping them convert leads into sales. \n'