# Query analysis

## Problem
In any question-answering application we need to search for, or retrieve, information based on a user question. To do this we create one or more indexes of the relevant information that allow us to search over it by text. In the simplest case, we can search on the user input directly. This approach has a few common failure modes:

* The index supports searches and filters against specific fields of the data, and the user input could be referring to any of these fields.
* The user input contains multiple distinct questions.
* Multiple queries are needed to get the relevant information.
* Search quality is sensitive to phrasing.
* There are multiple indexes that could be searched over, and the user input could be referring to any of them.

To handle these, we can do **query analysis** to translate the raw user question into a query or queries optimized for our indexes.

:::{.callout-note} 
This guide assumes familiarity with the basic building blocks of a simple RAG application outlined in the [Q&A with RAG Quickstart](/docs/use_cases/question_answering/quickstart).
:::

:::{.callout-note}
We focus here on retrieval, but query analysis is useful wherever unstructured user input needs to be routed, structured, or otherwise optimized for downstream use.
:::

## Techniques

Query analysis is the process of transforming a user input into a query optimized for your indexes or tools. This can involve any of the following techniques:

* [Query decomposition](/docs/use_cases/query_analysis/techniques/decomposition): If a user input contains multiple distinct questions, we can decompose the input into separate queries that will each be executed independently.
* [Query expansion](/docs/use_cases/query_analysis/techniques/expansion): If an index is sensitive to query phrasing, we can generate multiple paraphrased versions of the user question to increase our chances of retrieving a relevant result.
* [Hypothetical document embedding (HyDE)](/docs/use_cases/query_analysis/techniques/hyde): If we're working with a similarity search-based index, like a vector store, then searching on raw questions may not work well because their embeddings may not be very similar to those of the relevant documents. Instead it might help to have the model generate a hypothetical relevant document, and then use that to perform similarity search.
* [Step back prompting](/docs/use_cases/query_analysis/techniques/step_back): Sometimes search quality and model generations can be tripped up by the specifics of a question. One way to handle this is to first generate a more abstract, "step back" question and to query based on both the original and step back question.
* [Query structuring](/docs/use_cases/query_analysis/techniques/structuring): If our documents have multiple searchable/filterable attributes, we can infer from any raw user question which specific attributes should be searched/filtered over. For example, when a user input specifies something about video publication date, that should become a filter on the `publish_date` attribute of each document.
* [Routing](/docs/use_cases/query_analysis/techniques/routing): If we have multiple indexes and only a subset are useful for any given user input, we can route the input to only retrieve results from the relevant ones.
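
As a rough illustration of query structuring from the list above, the sketch below turns a mention of a year in the user input into a filter on `publish_date`. It is a toy stand-in that runs without any dependencies: in practice a model with structured output would do the inference. The `structure_query` function, the year regex, and the shape of the returned dictionary are all assumptions made for this example, not part of any library.

```python
import re
from typing import Any, Dict

# Hypothetical pattern: a four-digit year such as 1999 or 2023.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


def structure_query(user_input: str) -> Dict[str, Any]:
    """Return a search string plus any filters inferred from the input."""
    filters: Dict[str, Any] = {}
    match = YEAR_PATTERN.search(user_input)
    if match:
        # Assumed filter shape: restrict results to documents from that year.
        filters["publish_date"] = {"year": int(match.group(0))}
    # The search string is passed through unchanged in this sketch.
    return {"search": user_input, "filters": filters}


query = structure_query("videos on RAG published in 2023")
no_filter = structure_query("how do vector stores work")
```

Here `query` carries a `publish_date` filter alongside the search string, while `no_filter` has an empty filter dictionary because the input mentions no year.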

We can think of each of these techniques as taking in a string or list of Messages corresponding to the current conversation, and returning one or more query objects. At its simplest, with something like HyDE for example, the returned query might just be a rewritten search string. As we incorporate more of these techniques, our query objects will have not just a search string but also filters, the specific indexes to run the search over, multiple versions of the search string, etc.
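
One way to picture such a query object is as a small data class. This is only a sketch under stated assumptions: the `SearchQuery` name and its fields (`search`, `alternates`, `filters`, `indexes`) are hypothetical and chosen to mirror the description above, not taken from any library.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SearchQuery:
    """Hypothetical query object produced by query analysis."""

    # The (possibly rewritten) search string.
    search: str
    # Paraphrased versions of the search string, e.g. from query expansion.
    alternates: List[str] = field(default_factory=list)
    # Attribute filters, e.g. from query structuring.
    filters: Dict[str, Any] = field(default_factory=dict)
    # Names of the indexes to search, e.g. from routing.
    indexes: List[str] = field(default_factory=list)


# Simplest case: only a rewritten search string.
simple = SearchQuery(search="how does retrieval augmented generation work")

# A richer query combining several techniques.
rich = SearchQuery(
    search="RAG tutorial",
    alternates=["retrieval augmented generation walkthrough"],
    filters={"publish_date": {"year": 2023}},
    indexes=["videos"],
)
```

Giving every optional field an empty default means each technique can populate only the fields it is responsible for, and downstream retrieval code can treat both `simple` and `rich` uniformly.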

## End-to-end example

Head to the [end-to-end example](/docs/use_cases/query_analysis/e2e) to see how to put together different techniques and use them to actually retrieve documents.