19 changes: 10 additions & 9 deletions docs/reference/mapping/types/semantic-text.asciidoc
@@ -12,13 +12,14 @@ Long passages are <<auto-text-chunking, automatically chunked>> to smaller secti

The `semantic_text` field type specifies an inference endpoint identifier that will be used to generate embeddings.
You can create the inference endpoint by using the <<put-inference-api>>.
This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.
If you don't specify an inference endpoint, the <<infer-service-elser,ELSER service>> is used by default.
This field type and the <<query-dsl-semantic-query,`semantic` query>> type make it simpler to perform semantic search on your data.

If you don’t specify an inference endpoint, the `inference_id` field defaults to `.elser-2-elasticsearch`, a preconfigured endpoint for the `elasticsearch` service.

Using `semantic_text`, you won't need to specify how to generate embeddings for your data, or how to index it.
The {infer} endpoint automatically determines the embedding generation, indexing, and query to use.

If you use the ELSER service, you can set up `semantic_text` with the following API request:
If you use the preconfigured `.elser-2-elasticsearch` endpoint, you can set up `semantic_text` with the following API request:

[source,console]
------------------------------------------------------------
@@ -34,7 +35,7 @@ PUT my-index-000001
}
------------------------------------------------------------

If you use a service other than ELSER, you must create an {infer} endpoint using the <<put-inference-api>> and reference it when setting up `semantic_text` as the following example demonstrates:
To use a custom {infer} endpoint instead of the default `.elser-2-elasticsearch`, you must create the endpoint using the <<put-inference-api>> and specify its `inference_id` when setting up the `semantic_text` field type.

[source,console]
------------------------------------------------------------
@@ -53,8 +54,7 @@ PUT my-index-000002
// TEST[skip:Requires inference endpoint]
<1> The `inference_id` of the {infer} endpoint to use to generate embeddings.
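
The difference between the default and the custom setup can be sketched in a few lines of Python. This is only an illustration of the request body shape: `semantic_text_mapping` is a hypothetical helper (not part of any client library), and the field name `inference_field` and the endpoint name `my-custom-endpoint` are made-up placeholders.

```python
import json


def semantic_text_mapping(field_name, inference_id=None):
    """Build a create-index body with a single semantic_text field.

    Hypothetical helper for illustration; not part of any client library.
    """
    field = {"type": "semantic_text"}
    if inference_id is not None:
        # Set inference_id only for a custom endpoint. Leaving it out
        # keeps the field on the default endpoint described above.
        field["inference_id"] = inference_id
    return {"mappings": {"properties": {field_name: field}}}


# Default endpoint: no inference_id in the mapping.
default_body = semantic_text_mapping("inference_field")

# Custom endpoint: placeholder name, assumed to exist already.
custom_body = semantic_text_mapping("inference_field", "my-custom-endpoint")

print(json.dumps(default_body, indent=2))
print(json.dumps(custom_body, indent=2))
```

Either printed body could be sent as the body of a `PUT <index-name>` request, provided the custom endpoint has been created beforehand.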


The recommended way to use semantic_text is by having dedicated {infer} endpoints for ingestion and search.
The recommended way to use `semantic_text` is by having dedicated {infer} endpoints for ingestion and search.
This ensures that search speed remains unaffected by ingestion workloads, and vice versa.
After creating dedicated {infer} endpoints for both, you can reference them using the `inference_id` and `search_inference_id` parameters when setting up the index mapping for an index that uses the `semantic_text` field.
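
As a rough illustration of how the two parameters divide the work, the Python sketch below resolves which endpoint a `semantic_text` field would use at index time and at search time. The helper `endpoints_in_use` and the endpoint names are hypothetical. The fallback rules are assumptions drawn from the parameter descriptions in this page: `.elser-2-elasticsearch` when `inference_id` is omitted, and the index-time endpoint when `search_inference_id` is omitted.

```python
DEFAULT_ENDPOINT = ".elser-2-elasticsearch"


def endpoints_in_use(field_mapping):
    """Return the endpoints used at index time and at search time.

    Hypothetical helper; the fallback behaviour is an assumption
    based on the parameter descriptions, not a client API.
    """
    index_endpoint = field_mapping.get("inference_id", DEFAULT_ENDPOINT)
    # Without search_inference_id, the same endpoint serves both phases.
    search_endpoint = field_mapping.get("search_inference_id", index_endpoint)
    return {"index": index_endpoint, "search": search_endpoint}


# Dedicated endpoints for ingestion and search (placeholder names).
dedicated = {
    "type": "semantic_text",
    "inference_id": "my-ingest-endpoint",
    "search_inference_id": "my-search-endpoint",
}

# Nothing specified: both phases fall back to the default endpoint.
implicit = {"type": "semantic_text"}

print(endpoints_in_use(dedicated))
print(endpoints_in_use(implicit))
```

Keeping the two endpoints separate in the mapping is what lets ingestion load and query load scale independently.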

@@ -82,10 +82,11 @@ PUT my-index-000003

`inference_id`::
(Required, string)
{infer-cap} endpoint that will be used to generate the embeddings for the field.
{infer-cap} endpoint that will be used to generate embeddings for the field.
By default, `.elser-2-elasticsearch` is used.
This parameter cannot be updated.
Use the <<put-inference-api>> to create the endpoint.
If `search_inference_id` is specified, the {infer} endpoint defined by `inference_id` will only be used at index time.
If `search_inference_id` is specified, the {infer} endpoint will only be used at index time.

`search_inference_id`::
(Optional, string)
@@ -208,7 +209,7 @@ PUT test-index
"properties": {
"infer_field": {
"type": "semantic_text",
"inference_id": "my-elser-endpoint"
"inference_id": ".elser-2-elasticsearch"
},
"source_field": {
"type": "text",

@@ -14,15 +14,15 @@ You don't need to define model related settings and parameters, or create {infer
The recommended way to use <<semantic-search,semantic search>> in the {stack} is following the `semantic_text` workflow.
When you need more control over indexing and query settings, you can still use the complete {infer} workflow (refer to <<semantic-search-inference,this tutorial>> to review the process).

This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration, but you can use any service and their supported models offered by the {infer-cap} API.
This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.


[discrete]
[[semantic-text-requirements]]
==== Requirements

This tutorial uses the <<infer-service-elser,ELSER service>> for demonstration, which is created automatically as needed.
To use the `semantic_text` field type with an {infer} service other than ELSER, you must create an inference endpoint using the <<put-inference-api>>.
This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, which is created automatically as needed.
To use the `semantic_text` field type with an {infer} service other than the `elasticsearch` service, you must create an inference endpoint using the <<put-inference-api>>.


[discrete]
@@ -48,7 +48,7 @@ PUT semantic-embeddings
// TEST[skip:TBD]
<1> The name of the field to contain the generated embeddings.
<2> The field to contain the embeddings is a `semantic_text` field.
Since no `inference_id` is provided, the <<infer-service-elser,ELSER service>> is used by default.
Since no `inference_id` is provided, the default endpoint `.elser-2-elasticsearch` for the <<infer-service-elasticsearch,`elasticsearch` service>> is used.
To use a different {infer} service, you must create an {infer} endpoint first using the <<put-inference-api>> and then specify it in the `semantic_text` field mapping using the `inference_id` parameter.


51 changes: 7 additions & 44 deletions docs/reference/search/search-your-data/semantic-text-hybrid-search
@@ -8,47 +8,12 @@ This tutorial demonstrates how to perform hybrid search, combining semantic sear

In hybrid search, semantic search retrieves results based on the meaning of the text, while full-text search focuses on exact word matches. By combining both methods, hybrid search delivers more relevant results, particularly in cases where relying on a single approach may not be sufficient.

The recommended way to use hybrid search in the {stack} is following the `semantic_text` workflow. This tutorial uses the <<inference-example-elser,`elser` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.

[discrete]
[[semantic-text-hybrid-infer-endpoint]]
==== Create the {infer} endpoint

Create an inference endpoint by using the <<put-inference-api>>:

[source,console]
------------------------------------------------------------
PUT _inference/sparse_embedding/my-elser-endpoint <1>
{
"service": "elser", <2>
"service_settings": {
"adaptive_allocations": { <3>
"enabled": true,
"min_number_of_allocations": 3,
"max_number_of_allocations": 10
},
"num_threads": 1
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The task type is `sparse_embedding` in the path as the `elser` service will
be used and ELSER creates sparse vectors. The `inference_id` is
`my-elser-endpoint`.
<2> The `elser` service is used in this example.
<3> This setting enables and configures adaptive allocations.
Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.

[NOTE]
====
You might see a 502 bad gateway error in the response when using the {kib} Console.
This error usually just reflects a timeout, while the model downloads in the background.
You can check the download progress in the {ml-app} UI.
====
The recommended way to use hybrid search in the {stack} is following the `semantic_text` workflow.
This tutorial uses the <<infer-service-elasticsearch,`elasticsearch` service>> for demonstration, but you can use any service and its supported models offered by the {infer-cap} API.

[discrete]
[[hybrid-search-create-index-mapping]]
==== Create an index mapping for hybrid search
==== Create an index mapping

The destination index will contain both the embeddings for semantic search and the original text field for full-text search. This structure enables the combination of semantic search and full-text search.

@@ -60,21 +25,19 @@ PUT semantic-embeddings
"properties": {
"semantic_text": { <1>
"type": "semantic_text",
"inference_id": "my-elser-endpoint" <2>
},
"content": { <3>
"content": { <2>
"type": "text",
"copy_to": "semantic_text" <4>
"copy_to": "semantic_text" <3>
}
}
}
}
------------------------------------------------------------
// TEST[skip:TBD]
<1> The name of the field to contain the generated embeddings for semantic search.
<2> The identifier of the inference endpoint that generates the embeddings based on the input text.
<3> The name of the field to contain the original text for lexical search.
<4> The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {infer} endpoint.
<2> The name of the field to contain the original text for lexical search.
<3> The textual data stored in the `content` field will be copied to `semantic_text` and processed by the {infer} endpoint.
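
To make the `copy_to` relationship concrete, here is a small Python sketch that rebuilds the mapping above as a dictionary (in its updated form, without an explicit `inference_id`) and lists which source fields feed each `semantic_text` field. `semantic_copy_targets` is a hypothetical helper written for this illustration only.

```python
import json

# The hybrid-search mapping from the example above, as a Python dict.
hybrid_mapping = {
    "mappings": {
        "properties": {
            "semantic_text": {"type": "semantic_text"},
            "content": {"type": "text", "copy_to": "semantic_text"},
        }
    }
}


def semantic_copy_targets(body):
    """Map each semantic_text field to the fields copied into it.

    Hypothetical helper for illustration only.
    """
    props = body["mappings"]["properties"]
    feeds = {}
    for name, spec in props.items():
        targets = spec.get("copy_to", [])
        if isinstance(targets, str):
            # copy_to accepts a single field name or a list of names.
            targets = [targets]
        for target in targets:
            if props.get(target, {}).get("type") == "semantic_text":
                feeds.setdefault(target, []).append(name)
    return feeds


feeds = semantic_copy_targets(hybrid_mapping)
print(json.dumps(feeds))
```

With this layout, the `content` field stays available for lexical matching while its text is also embedded through the `semantic_text` field.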

[NOTE]
====