# Question Answering with LangChain, OpenAI, and MultiQuery Retriever

This interactive workbook demonstrates how to use LangChain's [MultiQuery Retriever](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html) with Elasticsearch to generate similar queries for a given user input and apply all of them to retrieve a larger set of relevant documents from a vector store.

Before we begin, we split the fictional workplace documents into passages with `langchain`, use OpenAI to transform these passages into embeddings, and store them in Elasticsearch.

We will then ask a question, generate similar questions using LangChain and OpenAI, retrieve relevant passages from the vector store, and use LangChain and OpenAI again to answer the question from those passages.
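The retrieval idea described above can be sketched without any external services. This is a minimal illustration, not LangChain's implementation: the `PASSAGES` dictionary, the `keyword_retrieve` toy retriever, and the hard-coded query variants are all hypothetical stand-ins for the vector store and the LLM-generated queries.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory "index": passage id -> text (stands in for the vector store).
PASSAGES: Dict[str, str] = {
    "p1": "The sales team is organized by region.",
    "p2": "Remote work requires supervisor approval.",
    "p3": "Regional sales leads report to area vice presidents.",
}


def keyword_retrieve(query: str) -> List[str]:
    """Toy retriever: return ids of passages sharing at least one word with the query."""
    words = set(query.lower().split())
    return [
        pid
        for pid, text in PASSAGES.items()
        if words & set(text.lower().rstrip(".").split())
    ]


def multi_query_retrieve(
    queries: List[str], retrieve: Callable[[str], List[str]]
) -> List[str]:
    """Run every query variant and return the de-duplicated union of hits,
    preserving first-seen order."""
    seen: Dict[str, None] = {}
    for query in queries:
        for pid in retrieve(query):
            seen.setdefault(pid, None)
    return list(seen)


# In the notebook the variants come from the LLM; here they are hard-coded.
variants = ["team structure", "regional leads", "vice presidents"]
print(keyword_retrieve(variants[0]))
print(multi_query_retrieve(variants, keyword_retrieve))
```

The first variant alone finds only one passage, while the union across all variants also surfaces a second, related passage. That wider recall is the motivation for generating several phrasings of the same question.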

## Install packages and import modules

In [9]:
# 1. CLEANUP: Uninstalls all common LangChain components and related libraries.
# This prevents residual packages from causing dependency conflicts.
!python3 -m pip uninstall -y langchain langchain-core langchain-community langchain-text-splitters langchain_openai langchain-elasticsearch langgraph-prebuilt langchain-classic tiktoken

# 2. INSTALLATION: Installs all required base, core, and specialized component packages.
# We explicitly include langchain and langchain-core for version stability.
!python3 -m pip install -qU \
    jq \
    lark \
    tiktoken \
    langchain \
    langchain-core \
    langchain-community \
    langchain-text-splitters \
    langchain-openai \
    langchain-elasticsearch

Found existing installation: langchain 0.3.27
Uninstalling langchain-0.3.27:
  Successfully uninstalled langchain-0.3.27
Found existing installation: langchain-core 1.0.7
Uninstalling langchain-core-1.0.7:
  Successfully uninstalled langchain-core-1.0.7
Found existing installation: langchain-community 0.4.1
Uninstalling langchain-community-0.4.1:
  Successfully uninstalled langchain-community-0.4.1
Found existing installation: langchain-text-splitters 1.0.0
Uninstalling langchain-text-splitters-1.0.0:
  Successfully uninstalled langchain-text-splitters-1.0.0
Found existing installation: langchain-openai 0.3.35
Uninstalling langchain-openai-0.3.35:
  Successfully uninstalled langchain-openai-0.3.35
Found existing installation: langchain-elasticsearch 0.4.0
Uninstalling langchain-elasticsearch-0.4.0:
  Successfully uninstalled langchain-elasticsearch-0.4.0
Found existing installation: langchain-classic 1.0.0
Uninstalling langchain-classic-1.0.0:
  Successfully uninstalled langchain-c

## Connect to Elasticsearch

ℹ️ We're using an Elastic Cloud deployment of Elasticsearch for this notebook. If you don't have an Elastic Cloud deployment, sign up [here](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) for a free trial.

Because we are using an Elastic Cloud deployment, we'll use the **Cloud ID** to identify it. To find the Cloud ID for your deployment, go to https://cloud.elastic.co/deployments and select your deployment.

We will use [ElasticsearchStore](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.elasticsearch.ElasticsearchStore.html) to connect to our Elastic Cloud deployment, which makes it easy to create an index and index data. The documents themselves are loaded, split, and indexed in the steps that follow.
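For readers curious what a Cloud ID contains, here is a minimal sketch. It assumes the commonly documented layout: a label, a colon, and a base64 payload of the form `host$es_id$kibana_id`. The Cloud ID below is made up for illustration, and `decode_cloud_id` is a hypothetical helper, not part of any Elastic client library.

```python
import base64
from typing import Dict


def decode_cloud_id(cloud_id: str) -> Dict[str, str]:
    """Split a Cloud ID into its label, host, and Elasticsearch/Kibana ids.

    Assumes the payload is base64("host$es_id$kibana_id"); missing parts
    are returned as empty strings.
    """
    label, _, encoded = cloud_id.partition(":")
    decoded = base64.b64decode(encoded).decode("utf-8")
    parts = decoded.split("$") + ["", ""]
    host, es_id, kibana_id = parts[:3]
    return {"label": label, "host": host, "es_id": es_id, "kibana_id": kibana_id}


# Build a made-up Cloud ID so that no real deployment is referenced.
payload = "example.cloud.es.io:443$abc123$def456"
fake_cloud_id = "my-deployment:" + base64.b64encode(payload.encode()).decode()

print(decode_cloud_id(fake_cloud_id))
```

Because the payload is only encoded, not encrypted, a Cloud ID reveals the deployment's host and identifiers. Treat it, and especially the API keys used alongside it, as values that should not be committed to a shared notebook.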

In [1]:
# Read credentials at runtime instead of hardcoding secrets in the notebook.
from getpass import getpass

ELASTIC_CLOUD_ID = getpass("Elastic Cloud ID: ")
ELASTIC_API_KEY = getpass("Elastic API key: ")
OPENAI_API_KEY = getpass("OpenAI API key: ")

from langchain_openai.embeddings import OpenAIEmbeddings
from langchain_elasticsearch import ElasticsearchStore

# Completion-style LLM used later for query generation and answering.
from langchain_openai.llms import OpenAI

embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

vectorstore = ElasticsearchStore(
    es_cloud_id=ELASTIC_CLOUD_ID,
    es_api_key=ELASTIC_API_KEY,
    index_name="roland_index",
    embedding=embeddings,
)

# Deterministic LLM (temperature=0) used for query generation and answering.
llm = OpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)

# MultiQueryRetriever is imported later, in the cell where the retriever is built.

## Indexing Data into Elasticsearch
Let's download the sample dataset and deserialize the document.

In [2]:
from urllib.request import urlopen
import json

url = "https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/example-apps/chatbot-rag-app/data/data.json"

response = urlopen(url)
data = json.load(response)

with open("temp.json", "w") as json_file:
    json.dump(data, json_file)

### Split Documents into Passages

We’ll chunk documents into passages in order to improve the retrieval specificity and to ensure that we can provide multiple passages within the context window of the final question answering prompt.

Here we chunk documents into 800-token passages with an overlap of 400 tokens.

We use a simple splitter, but LangChain offers more advanced splitters that reduce the chance of context being lost.
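To make the chunk size and overlap concrete, here is a simplified sliding-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also looks for natural separators), and the placeholder tokens `t0`, `t1`, ... stand in for real tokenizer output. `chunk_tokens` is a hypothetical helper written for this illustration.

```python
from typing import List


def chunk_tokens(
    tokens: List[str], chunk_size: int, chunk_overlap: int
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


# A pretend 2000-token document.
tokens = [f"t{i}" for i in range(2000)]
chunks = chunk_tokens(tokens, chunk_size=800, chunk_overlap=400)

for chunk in chunks:
    print(chunk[0], chunk[-1], len(chunk))
```

With these settings each window advances by 400 tokens, so the 2000-token document yields four passages and every token away from the document's edges appears in two of them. The cost is roughly double the storage and embedding calls, in exchange for a lower chance that a relevant sentence is cut in half at a chunk boundary.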

In [3]:
from langchain_community.document_loaders import JSONLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter


def metadata_func(record: dict, metadata: dict) -> dict:
    # Populate the metadata dictionary with the keys name, summary, url, category, and updated_at.
    metadata["name"] = record.get("name")
    metadata["summary"] = record.get("summary")
    metadata["url"] = record.get("url")
    metadata["category"] = record.get("category")
    metadata["updated_at"] = record.get("updated_at")

    return metadata


# For more loaders https://python.langchain.com/docs/modules/data_connection/document_loaders/
# And 3rd party loaders https://python.langchain.com/docs/modules/data_connection/document_loaders/#third-party-loaders
loader = JSONLoader(
    file_path="temp.json",
    jq_schema=".[]",
    content_key="content",
    metadata_func=metadata_func,
)

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=800, chunk_overlap=400  # chunk size and overlap, measured in tokens
)
docs = loader.load_and_split(text_splitter=text_splitter)

### Bulk Import Passages

Now that we have split each document into passages of up to 800 tokens, we index them into Elasticsearch using [ElasticsearchStore.from_documents](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.elasticsearch.ElasticsearchStore.html#langchain.vectorstores.elasticsearch.ElasticsearchStore.from_documents).

We will use the Cloud ID, API key, and index name values set in the `Connect to Elasticsearch` step.

In [5]:
from langchain.retrievers.multi_query import MultiQueryRetriever

# from_documents is a classmethod: it embeds the passages and indexes them.
documents = ElasticsearchStore.from_documents(
    docs,
    embeddings,
    index_name="roland_index",
    es_cloud_id=ELASTIC_CLOUD_ID,
    es_api_key=ELASTIC_API_KEY,
)

llm = OpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)

retriever = MultiQueryRetriever.from_llm(vectorstore.as_retriever(), llm)

## Question Answering with MultiQuery Retriever

Now that the passages are stored in Elasticsearch, we can ask a question and retrieve the relevant passages.

In [6]:
# Import from langchain_core, where these classes and helpers are defined.
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate, format_document

import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

LLM_CONTEXT_PROMPT = ChatPromptTemplate.from_template(
    """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Be as verbose and educational in your response as possible.

    context: {context}
    Question: "{question}"
    Answer:
    """
)

LLM_DOCUMENT_PROMPT = PromptTemplate.from_template(
    """
---
SOURCE: {name}
{page_content}
---
"""
)


def _combine_documents(
    docs, document_prompt=LLM_DOCUMENT_PROMPT, document_separator="\n\n"
):
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)


_context = RunnableParallel(
    context=retriever | _combine_documents,
    question=RunnablePassthrough(),
)

chain = _context | LLM_CONTEXT_PROMPT | llm

ans = chain.invoke("what is the nasa sales team?")

print("---- Answer ----")
print(ans)

INFO:langchain.retrievers.multi_query:Generated queries: ['1. Can you provide information on the sales team at NASA?', '2. How does the sales team at NASA operate?', '3. What are the responsibilities of the sales team at NASA?']


---- Answer ----
The NASA sales team is a part of the Americas region in the sales organization. It is led by two Area Vice-Presidents, Laura Martinez for North America and Gary Johnson for South America. The team is responsible for serving customers in the United States, Canada, Mexico, Central and South America. They work closely with other departments to identify new business opportunities, maintain existing client relationships, and ensure customer satisfaction.
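The context-building step of the chain above (formatting each retrieved passage and joining the results) can be illustrated in plain Python. This is a sketch under stated assumptions: the `Passage` class is a hypothetical stand-in for LangChain's document objects, the sample passages and URL are invented, and `combine_passages` mimics rather than reuses `_combine_documents`.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Passage:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


# Same layout as LLM_DOCUMENT_PROMPT in the cell above.
DOCUMENT_TEMPLATE = "\n---\nSOURCE: {name}\n{page_content}\n---\n"


def combine_passages(passages: List[Passage], separator: str = "\n\n") -> str:
    """Render each passage with the template, then join them into one context string."""
    return separator.join(
        DOCUMENT_TEMPLATE.format(page_content=p.page_content, **p.metadata)
        for p in passages
    )


# Invented sample passages; extra metadata keys such as url are simply ignored.
passages = [
    Passage(
        "The sales org has three regions.",
        {"name": "Sales Org", "url": "https://example.invalid/a"},
    ),
    Passage("Remote work needs approval.", {"name": "WFH Policy"}),
]

context = combine_passages(passages)
print(context)
```

Labeling every passage with its `SOURCE` lets the LLM, and anyone reading the final prompt, tell which document each statement came from. Note that a passage whose metadata lacks `name` would raise a `KeyError` in this sketch.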


**Generate at least two new iterations of the previous cells. Be creative.** Did you master MultiQuery Retriever concepts through this lab?

In [7]:
# SUMMARY: Tests the MultiQuery Retriever with a complex question that requires
# the LLM to generate multiple sub-queries (e.g., 'What is the policy?',
# 'Who is the contact?', 'What's the official name?') to gather context from
# different parts of the knowledge base before synthesizing a single answer.

complex_query = "What is the process for submitting a bug report for the WonderVector5000, and who is the engineering lead for that product?"

ans_2 = chain.invoke(complex_query)

print("--- Query ---")
print(complex_query)
print("\n---- Answer ----")
print(ans_2)

INFO:langchain.retrievers.multi_query:Generated queries: ['1. How can I submit a bug report for the WonderVector5000? Who is the engineering lead for this product?', '2. What are the steps to follow when submitting a bug report for the WonderVector5000? Can you tell me who the engineering lead is for this product?', "3. Is there a specific process for reporting bugs related to the WonderVector5000? I'm also curious about the engineering lead for this product."]


--- Query ---
What is the process for submitting a bug report for the WonderVector5000, and who is the engineering lead for that product?

---- Answer ----

To submit a bug report for the WonderVector5000, you would first need to identify the issue and gather any relevant information, such as steps to reproduce the bug and any error messages. Then, you would need to log into the bug tracking system and create a new bug report, providing all the necessary details. This report would then be assigned to the engineering team responsible for the WonderVector5000.

The engineering lead for the WonderVector5000 would be the Principal Software Engineer, as they are responsible for leading the design, development, and maintenance of large-scale, mission-critical software applications and components. They also provide technical leadership and mentorship to software engineering teams, making them the point person for any issues related to the WonderVector5000.


In [11]:
# SUMMARY: Asks a short, single-topic policy question. The MultiQuery Retriever
# still generates several rephrasings of it, retrieves passages for each from the
# Elasticsearch vector store, and the LLM synthesizes one grounded answer from
# the combined context.

complex_query_2 = "What is the work from home policy?"

ans_3 = chain.invoke(complex_query_2)

print("--- Query ---")
print(complex_query_2)
print("\n---- Answer ----")
print(ans_3)

INFO:langchain.retrievers.multi_query:Generated queries: ["1. Can you provide information on the company's work from home policy?", '2. How does the work from home policy at this company compare to others in the industry?', '3. What are the guidelines and restrictions of the work from home policy in place?']


--- Query ---
What is the work from home policy?

---- Answer ----

The work from home policy is a set of guidelines and support provided to employees to conduct their work remotely. It was implemented in March 2020 in response to the COVID-19 pandemic and is designed to ensure the continuity and productivity of business operations. The policy applies to all eligible employees and allows them to work from home full-time while maintaining the same level of performance and collaboration as they would in the office. Employees must have approval from their direct supervisor and the HR department to be eligible for this arrangement. The company provides necessary equipment and resources for remote work, and employees are responsible for creating a comfortable and safe workspace. Effective communication, maintaining regular work hours and availability, and meeting performance expectations are also important aspects of the policy. Employees are required to accurately track their work hours an

In [10]:
# SUMMARY: Asks about one specific document (the fiscal year 2024 sales strategy).
# The generated query variants probe its components, its differences from earlier
# years, and its revenue impact, so the retrieved context covers more of the
# document before the LLM writes the final answer.

complex_query_3 = "What is the sales strategy for fiscal year 2024?"

ans_4 = chain.invoke(complex_query_3)

print("--- Query ---")
print(complex_query_3)
print("\n---- Answer ----")
print(ans_4)

INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the key components of the sales strategy for fiscal year 2024?', '2. How does the sales strategy for fiscal year 2024 differ from previous years?', '3. Can you provide insights into the sales strategy for fiscal year 2024 and its potential impact on revenue?']


--- Query ---
What is the sales strategy for fiscal year 2024?

---- Answer ----

The sales strategy for fiscal year 2024 is outlined in a document that includes key objectives, focus areas, and action plans for a tech company's sales operations. The primary goal of this strategy is to increase revenue, expand market share, and strengthen customer relationships in the company's target markets. The specific objectives for fiscal year 2024 include increasing revenue by 20%, expanding market share in key segments by 15%, retaining 95% of existing customers, and launching at least two new products or services in high-demand market segments. The focus areas of the strategy include targeting high-growth industries in existing markets, identifying and penetrating new markets, strengthening relationships with key accounts and strategic partners, pursuing new customers in underserved market segments, and developing tailored offerings for different customer segments. The action plans for achievi