From 3b8a4bc86b0b52548eb28fe16f44af679f386ae3 Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Fri, 26 Apr 2024 17:48:35 +0200
Subject: [PATCH] [DOCS] Add retrievers overview

---
 .../retrievers-overview.asciidoc | 131 ++++++++++++++++++
 .../search-your-data.asciidoc    |   3 +-
 2 files changed, 133 insertions(+), 1 deletion(-)
 create mode 100644 docs/reference/search/search-your-data/retrievers-overview.asciidoc

diff --git a/docs/reference/search/search-your-data/retrievers-overview.asciidoc b/docs/reference/search/search-your-data/retrievers-overview.asciidoc
new file mode 100644
index 0000000000000..c9b8b91e972d5
--- /dev/null
+++ b/docs/reference/search/search-your-data/retrievers-overview.asciidoc
@@ -0,0 +1,131 @@
+[[retrievers-overview]]
+== Retrievers
+
+// Will move to a top level "Retrievers and reranking" section once reranking is live
+
+preview::[]
+
+A retriever is an abstraction that was added to the Search API in *8.14.0*.
+This abstraction enables the configuration of multi-stage retrieval
+pipelines within a single `_search` call. This simplifies your search
+application logic, because you no longer need to configure complex searches via
+multiple {es} calls or implement additional client-side logic to
+combine results from different queries.
+
+This document provides a general overview of the retriever abstraction.
+For implementation details, including notable restrictions, check out the
+<> in the `_search` API docs.
+
+[discrete]
+[[retrievers-overview-types]]
+=== Retriever types
+
+Retrievers come in various types, each tailored for different search operations.
+The following retrievers are currently available:
+
+* <>. Returns top documents from a
+traditional https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl.html[query].
+Mimics a traditional query but in the context of a retriever framework. This
+ensures backward compatibility as existing `_search` requests remain supported.
+That way you can transition to the new abstraction at your own pace without
+mixing syntaxes.
+* <>. Returns top documents from a <>,
+in the context of a retriever framework.
+* <>. Combines and ranks multiple standard retrievers using
+the reciprocal rank fusion (RRF) algorithm. Allows you to combine multiple result sets
+with different relevance indicators into a single result set.
+An RRF retriever is a *compound retriever*, where its `filter` element is
+propagated to its sub retrievers.
++
+Sub retrievers may not use elements that
+are restricted by having a compound retriever as part of the retriever tree.
+See the <> for detailed
+examples and information on how to use the RRF retriever.
+
+[NOTE]
+====
+Stay tuned for more retriever types in future releases!
+====
+
+[discrete]
+=== What makes retrievers useful?
+
+Here's an overview of what makes retrievers useful and how they differ from
+regular queries.
+
+. *Simplified user experience*. Retrievers simplify the user experience by
+allowing entire retrieval pipelines to be configured in a single API call. This
+maintains backward compatibility with traditional query elements by
+automatically translating them to the appropriate retriever.
+. *Structured retrieval*. Retrievers provide a more structured way to define search
+operations. They allow searches to be described using a "retriever tree", a
+hierarchical structure that clarifies the sequence and logic of operations,
+making complex searches more understandable and manageable.
+. *Composability and flexibility*. Retrievers enable flexible composability,
+allowing you to build pipelines and seamlessly integrate different retrieval
+strategies into these pipelines. Retrievers make it easy to test out different
+retrieval strategy combinations.
+. *Compound operations*. A retriever can have sub retrievers.
+This allows complex nested searches where the results of one retriever feed into
+another, supporting sophisticated querying strategies that might involve
+multiple stages or criteria.
+. *Retrieval as a first-class concept*. Unlike
+traditional queries, where the query is a part of a larger search API call,
+retrievers are designed as standalone entities that can be combined or used in
+isolation. This enables a more modular and flexible approach to constructing
+searches.
+. *Enhanced control over document scoring and ranking*. Retrievers
+allow for more explicit control over how documents are scored and filtered. For
+instance, you can specify minimum score thresholds, apply complex filters
+without affecting scoring, and use parameters like `terminate_after` for
+performance optimizations.
+. *Integration with existing {es} functionalities*. Even though
+retrievers can be used instead of existing `_search` API syntax (like
+`query` and `knn`), they are designed to integrate seamlessly with things like
+pagination (`search_after`) and sorting. They also maintain compatibility with
+aggregation operations by treating the combination of all leaf retrievers as
+`should` clauses in a boolean query.
+. *Cleaner separation of concerns*. When using compound retrievers, only the
+query element is allowed, which enforces a cleaner separation of concerns
+and prevents the complexity that might arise from overly nested or
+interdependent configurations.
+
+[discrete]
+[[retrievers-overview-example]]
+=== Example: Before and after
+
+The following example demonstrates how using retrievers can
+simplify building and testing complex search pipelines.
+
+// TODO: Add concrete example(s) sourced from the hive mind
+
+[discrete]
+[[retrievers-overview-glossary]]
+=== Glossary
+// TODO: Probably remove this, is it useful?
+
+Here are some important terms:
+
+* *Retrieval Pipeline*. Defines the entire retrieval and ranking logic to
+produce top hits.
+* *Compound Retriever*.
+Builds on one or more retrievers, enhancing document retrieval and ranking logic.
+* *Combiners*. Compound retrievers that merge top hits
+from multiple sub-retrievers.
+//* NOT YET *Rerankers*. Special compound retrievers that reorder hits and may adjust the number of hits, with distinctions between first-stage and second-stage rerankers.
+
+[discrete]
+[[retrievers-overview-play-in-search]]
+=== Retrievers in action
+
+//Playground will be renamed
+
+The Search [Playground] builds Elasticsearch queries using the retriever abstraction.
+It automatically detects the fields and types in your index and builds a retriever tree based on your selections.
+
+You can use the [Playground] to experiment with different retriever configurations and see how they affect search results.
+
+Refer to the {kibana-ref}/playground.html[[Playground] documentation] for more information.
+
+
+
diff --git a/docs/reference/search/search-your-data/search-your-data.asciidoc b/docs/reference/search/search-your-data/search-your-data.asciidoc
index bed204985296c..e1c1618410f2f 100644
--- a/docs/reference/search/search-your-data/search-your-data.asciidoc
+++ b/docs/reference/search/search-your-data/search-your-data.asciidoc
@@ -43,10 +43,11 @@ DSL, with a simplified user experience. Create search applications based on your
 results directly in the Kibana Search UI.
 
 include::search-api.asciidoc[]
-include::search-application-overview.asciidoc[]
 include::knn-search.asciidoc[]
 include::semantic-search.asciidoc[]
+include::retrievers-overview.asciidoc[]
 include::learning-to-rank.asciidoc[]
 include::search-across-clusters.asciidoc[]
 include::search-with-synonyms.asciidoc[]
+include::search-application-overview.asciidoc[]
 include::behavioral-analytics/behavioral-analytics-overview.asciidoc[]