# How to handle cases where no queries are generated

When a LangChain query-analysis step turns user input into search queries, it may sometimes generate no query at all. This can happen when the input is not a request for information (a greeting, for example), or when the model can respond directly without looking anything up in a data source.

In these situations, our LangChain workflow needs to be smart enough to handle the case where no queries are generated. We need to check the output of the query analysis step and decide whether to proceed with retrieving information from our data source (retriever) or to handle the situation in a different way.

To demonstrate this, we'll use simulated data to show how LangChain can gracefully handle scenarios where no queries are generated, and how we can add logic to our chain to make decisions based on the query analysis results.

Essentially, we're building LangChain workflows that are robust enough to handle all possible outputs of the query analysis, including the case where no queries are deemed necessary for retrieval.

# 1. Install Dependencies

In [16]:
%pip install -qU langchain langchain-community langchain-openai langchain-chroma


In [17]:
import getpass
import os

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass()

# Optional, uncomment to trace runs with LangSmith. Sign up here: https://smith.langchain.com.
# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass()

# 2. Create Index

This code does the following:

It takes a simple text string.

It uses OpenAI's "text-embedding-3-small" model to generate a numerical representation (embedding) of that string.

It stores the string and its embedding in a Chroma vector database.

It creates a retriever that can be used to find similar texts based on their embeddings.

In [18]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

texts = ["Harrison worked at Kensho"]
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_texts(
    texts,
    embeddings,
)
retriever = vectorstore.as_retriever()
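
To make "find similar texts based on their embeddings" more concrete, here is a minimal, dependency-free sketch of the kind of ranking a vector store performs. The three-dimensional vectors, the second text, and the helper names cosine_similarity and top_k are all made up for illustration; real embeddings from text-embedding-3-small have far more dimensions, and Chroma's internal distance computation may differ.

```python
import math

# Toy "embeddings": hand-written vectors standing in for real model output.
toy_index = {
    "Harrison worked at Kensho": [0.9, 0.1, 0.0],
    "The weather is sunny": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, index, k=4):
    """Return up to k texts, most similar first."""
    ranked = sorted(
        index,
        key=lambda text: cosine_similarity(query_vector, index[text]),
        reverse=True,
    )
    return ranked[:k]


# Pretend this is the embedding of "where did Harrison work".
query_vector = [1.0, 0.0, 0.0]
results = top_k(query_vector, toy_index, k=1)
print(results)  # ['Harrison worked at Kensho']
```

Note that asking for more results than the index holds (k=4 with two texts) simply returns everything available.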

# 3. Query analysis

a. Import necessary dependencies

b. Search Class Definition

class Search(BaseModel):: Defines a new class named Search that inherits from BaseModel. This makes it a Pydantic model.

c. query Field Definition

query: str = Field(..., description="Similarity search query applied to job record."): Defines a field named query within the Search model.

query: str: Specifies that the query field should be a string.

Field(...): Uses the Field function to configure the field.

...: The ellipsis (...) as the first argument indicates that this field is required. If a Search object is created without a value for query, Pydantic will raise a validation error.

description="Similarity search query applied to job record.": Sets the description of the query field. The description is included in the schema generated from the model, so when Search is bound to the language model as a tool, it tells the model what to put in this field. It also serves as documentation.



In [19]:
from pydantic import BaseModel, Field


class Search(BaseModel):
    """Search over a database of job records."""

    query: str = Field(
        ...,
        description="Similarity search query applied to job record.",
    )
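
The behavior of the required field can be checked without calling any model. The sketch below repeats the Search definition so that it runs on its own, and assumes Pydantic v2 (model_json_schema is the v2 method name); the variable names valid, missing_error, and query_schema are illustrative.

```python
from pydantic import BaseModel, Field, ValidationError


class Search(BaseModel):
    """Search over a database of job records."""

    query: str = Field(
        ...,
        description="Similarity search query applied to job record.",
    )


# Supplying the required field validates normally.
valid = Search(query="Harrison Kensho")

# Omitting the required field raises a ValidationError.
try:
    Search()
    missing_error = None
except ValidationError as exc:
    missing_error = exc

# The description is carried into the generated JSON schema (Pydantic v2 API).
query_schema = Search.model_json_schema()["properties"]["query"]

print(valid.query)
print(missing_error is not None)
print(query_schema["description"])
```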

### Design the LangChain chain

1. Import necessary dependencies

2. System Prompt Definition:

system = """...""": Defines a system prompt, a crucial part of chat-based interactions.

This prompt instructs the language model on its role:

It can issue search queries.

It should only use search when necessary.

It should respond normally if search isn't needed.

3. Chat Prompt Template Creation:

prompt = ChatPromptTemplate.from_messages([...]): Creates a chat prompt template.

("system", system): Adds the system prompt to the template.

("human", "{question}"): Adds a human message placeholder, where the user's question will be inserted.

4. Language Model Initialization:

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0): Initializes a ChatOpenAI object.

model="gpt-4o-mini": Specifies the OpenAI chat model to use.

temperature=0: Sets the temperature to 0, making the model's responses more deterministic (less random).

5. Tool Binding:

structured_llm = llm.bind_tools([Search]): Binds the Search tool to the language model.

bind_tools([Search]): Enables the language model to use the Search tool when it determines it's necessary.

Search is the Pydantic model defined above; it describes the structure of a search query.

With the tool bound, the language model can respond with a tool call whose arguments match the Search schema, or with a normal text reply if it decides no search is needed.

6. Query Analyzer Chain Creation:

query_analyzer = {"question": RunnablePassthrough()} | prompt | structured_llm: Creates a runnable chain.

{"question": RunnablePassthrough()}: Takes the raw input (here, the question string passed to invoke) and passes it through unchanged under the "question" key, producing the dictionary {"question": <input>} that the prompt template expects.

| prompt: Passes the question to the chat prompt template, which combines it with the system prompt.

| structured_llm: Passes the formatted prompt to the language model, which generates a response, potentially including a Search tool call.

Summary:

This code sets up a pipeline that:

1. Takes a user question as input.

2. Formats the question with a system prompt that encourages the language model to use search when appropriate.

3. Sends the formatted prompt to the language model, which is capable of generating responses and/or tool calls (specifically, Search tool calls).

4. Returns the response from the language model, which could include a structured search query.

This setup is useful for building systems that can intelligently decide whether to perform a search to answer a user's question, or just answer it directly.

In [20]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

system = """You have the ability to issue search queries to get information to help answer user information.

You do not NEED to look things up. If you don't need to, then just respond normally."""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm = llm.bind_tools([Search])
query_analyzer = {"question": RunnablePassthrough()} | prompt | structured_llm

In [21]:
query_analyzer.invoke("where did Harrison Work")

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_z8UHUeC9BbbbvNRKoQPjlC5i', 'function': {'arguments': '{"query":"Harrison Work"}', 'name': 'Search'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 16, 'prompt_tokens': 95, 'total_tokens': 111, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_06737a9306', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-eeb630ed-be4e-4d2f-a07a-0f6d673229b0-0', tool_calls=[{'name': 'Search', 'args': {'query': 'Harrison Work'}, 'id': 'call_z8UHUeC9BbbbvNRKoQPjlC5i', 'type': 'tool_call'}], usage_metadata={'input_tokens': 95, 'output_tokens': 16, 'total_tokens': 111, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reaso

# 4. Retrieval with query analysis

1. Import necessary dependencies

2. Output Parser Creation:

output_parser = PydanticToolsParser(tools=[Search]): Creates a PydanticToolsParser object.

tools=[Search]: Specifies the Pydantic models that the parser should use to parse tool calls. In this case, it's the Search model defined in section 3.

This means that the parser is configured to expect tool calls that match the structure defined in the Search model and to convert those tool calls into instances of the Search class.

In [22]:
from langchain_core.output_parsers.openai_tools import PydanticToolsParser
from langchain_core.runnables import chain

output_parser = PydanticToolsParser(tools=[Search])

1. @chain Decorator:

@chain: This decorator from LangChain transforms the custom_chain function into a runnable chain. This allows it to be used within LangChain's runnable framework.

2. custom_chain Function Definition:

def custom_chain(question):: Defines the function custom_chain that takes a question (presumably a string) as input.

3. query_analyzer.invoke(question):

response = query_analyzer.invoke(question): This line invokes the query_analyzer chain (defined in section 3) with the input question.
The query_analyzer chain analyzes the question and generates a Search tool call if the model decides that a search is necessary.
The result, an AIMessage, is stored in the response variable.

4. Tool Call Check:

if "tool_calls" in response.additional_kwargs:: This line checks whether the response from the query_analyzer chain contains tool calls.
response.additional_kwargs: This attribute of the response holds extra provider-specific data. With the OpenAI model used here, a "tool_calls" key is present only when the model requested a tool, as the outputs in this notebook show.
If the key is present, the query_analyzer determined that a search is necessary.

5. Tool Call Processing (if tool calls exist):

query = output_parser.invoke(response): This line invokes the output_parser (created in the previous cell) with the response from the query_analyzer.

The output_parser parses the tool calls from the response and converts them into Pydantic objects (in this case, Search objects).

The result of the parsing (a list of Search objects) is stored in the query variable.

docs = retriever.invoke(query[0].query): This line invokes the retriever (created in section 2) with the query extracted from the tool call.

query[0].query: This accesses the query attribute of the first Search object in the query list. This is the search query string.

The retriever performs a similarity search on the vector store and returns the most similar documents (or text chunks).

The results of the retrieval are stored in the docs variable.

The comment "# Could add more logic - like another LLM call - here" marks the place where you could add further processing, such as another language model call that answers the question from the retrieved documents.

return docs: This line returns the retrieved documents.

6. No Tool Call Handling (if no tool calls exist):

else: return response: If the response from the query_analyzer chain does not contain tool calls, this line returns the response directly. This means the query_analyzer determined that a search was not necessary, and the language model's response is returned as is.

In essence:

The custom_chain function implements the following logic:

It takes a user question as input.

It uses the query_analyzer to analyze the question and potentially generate a search query.

If a search query is generated:

It uses the retriever to perform a search.

It returns the search results.

If no search query is generated:

It returns the language model's response directly.

In [23]:
@chain
def custom_chain(question):
    response = query_analyzer.invoke(question)
    if "tool_calls" in response.additional_kwargs:
        query = output_parser.invoke(response)
        docs = retriever.invoke(query[0].query)
        # Could add more logic - like another LLM call - here
        return docs
    else:
        return response

In [24]:
custom_chain.invoke("where did Harrison Work")



[Document(id='a15875cd-41ea-41cf-8e23-e228a20e68d3', metadata={}, page_content='Harrison worked at Kensho')]

### Output explanation

When you run this cell, Chroma (the vector database) may also print a warning alongside the result; it is not reproduced in the output above.
"Number of requested results 4 is greater than number of elements in index 1": The retriever asked for the 4 most similar documents (the default), but the Chroma vector store contains only 1 document.
"updating n_results = 1": Chroma automatically reduced the number of results to 1, since that is the most it can return.
In short, the warning is harmless here: the index is very small, and the retriever’s default setting requests more documents than exist.

In [25]:
custom_chain.invoke("hi!")

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 11, 'prompt_tokens': 93, 'total_tokens': 104, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_06737a9306', 'finish_reason': 'stop', 'logprobs': None}, id='run-33eada19-6f80-48eb-9e07-352003ba2b28-0', usage_metadata={'input_tokens': 93, 'output_tokens': 11, 'total_tokens': 104, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [26]:
custom_chain.invoke("Who is david !")

AIMessage(content='Could you please provide more context about which David you are referring to? There are many notable individuals named David, including historical figures, celebrities, and fictional characters.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 34, 'prompt_tokens': 95, 'total_tokens': 129, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_06737a9306', 'finish_reason': 'stop', 'logprobs': None}, id='run-5eab6ba8-fa97-4abc-8dfe-fda4ec97ee41-0', usage_metadata={'input_tokens': 95, 'output_tokens': 34, 'total_tokens': 129, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})