From 68d3b9961ee33ae4d5043d8c91a4340ffd6f2e1d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Thu, 18 Sep 2025 11:59:35 +0200
Subject: [PATCH 1/7] [E&A] Marks ELSER on EIS as GA.

---
 explore-analyze/elastic-inference/eis.md | 27 +++++++++++----------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 1c1ba3ba0d..8bcc23235b 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -7,7 +7,7 @@ applies_to:
 
 # Elastic {{infer-cap}} Service [elastic-inference-service-eis]
 
-The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster.
+The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your environment.
 
 With EIS, you don't need to manage the infrastructure and resources required for {{ml}} {{infer}} by adding, configuring, and scaling {{ml}} nodes.
 Instead, you can use {{ml}} models for ingest, search, and chat independently of your {{es}} infrastructure.
@@ -15,7 +15,7 @@ Instead, you can use {{ml}} models for ingest, search, and chat independently of
 
 * Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.
 
-* You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS). {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview`
+* You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS). {applies_to}`stack: preview 9.1, ga 9.2` {applies_to}`serverless: ga`
 
 ## Region and hosting [eis-regions]
@@ -27,25 +27,20 @@ ELSER requests are managed by Elastic's own EIS infrastructure.
 ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS) [elser-on-eis]
 
 ```{applies_to}
-stack: preview 9.1
-serverless: preview
+stack: preview 9.1, ga 9.2
+serverless: ga
 ```
 
-ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for throughput and latency than ML nodes, and will continue to benchmark, remove limitations and address concerns as we move towards General Availability.
+ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for throughput and latency than ML nodes, and will continue to benchmark, remove limitations and address concerns.
 
-### Limitations
-
-While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.
+### Pricing
 
-#### Access
+ELSER on EIS usage is billed separately from your other Elastic deployment resources.
+For details about request-based pricing and billing dimensions, refer to the [ELSER on GPU item on the pricing page](https://www.elastic.co/pricing/serverless-search).
 
-This feature is being gradually rolled out to Serverless and Cloud Hosted customers.
-It may not be available to all users at launch.
-
-#### Uptime
+### Limitations
 
-There are no uptime guarantees during the Technical Preview.
-While Elastic will address issues promptly, the feature may be unavailable for extended periods.
+Elastic is continuously working to remove these constraints and further improve performance and scalability.
 
 #### Throughput and latency
 
 {{infer-cap}} throughput via this endpoint is expected to exceed that of {{infer}} operations on an ML node.
 However, throughput and latency are not guaranteed.
 Performance may vary during the Technical Preview.
@@ -58,6 +53,6 @@
 Batches are limited to a maximum of 16 documents. 
 This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.
 
-#### Rate Limits
+#### Rate Limits
 
 Rate limit for search and ingest is currently at 2000 requests per minute.

From b8b78bfa8442d56e159b9821763c94e60ca58bb4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Tue, 23 Sep 2025 11:11:41 +0200
Subject: [PATCH 2/7] Update explore-analyze/elastic-inference/eis.md

Co-authored-by: Sean Handley
---
 explore-analyze/elastic-inference/eis.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 8bcc23235b..1400e3fe8c 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -31,7 +31,7 @@ stack: preview 9.1, ga 9.2
 serverless: ga
 ```
 
-ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for throughput and latency than ML nodes, and will continue to benchmark, remove limitations and address concerns.
+ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for throughput than ML nodes and equivalent performance for latency. We will continue to benchmark, remove limitations and address concerns.
 
 ### Pricing
 

From c2c380be33569c6b5924fde21879bec20ae0f422 Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Thu, 16 Oct 2025 15:44:26 +0200
Subject: [PATCH 3/7] restore stuff lost in merge commit

---
 explore-analyze/elastic-inference/eis.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 6f46c61a01..aeaeaa0022 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -38,6 +38,21 @@ ELSER on EIS enables you to use the ELSER model on GPUs, without having to manag
 
 ### Using the ELSER on EIS endpoint
 
+You can now use `semantic_text` with the new ELSER endpoint on EIS. To learn how to use the `.elser-2-elastic` inference endpoint, refer to [Using ELSER on EIS](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md#using-elser-on-eis).
+
+#### Get started with semantic search with ELSER on EIS
+
+[Semantic Search with `semantic_text`](/solutions/search/semantic-search/semantic-search-semantic-text.md) has a detailed tutorial on using the `semantic_text` field and using the ELSER endpoint on EIS instead of the default endpoint. This is a great way to get started and try the new endpoint.
+
+### Limitations
+
+While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.
+
+#### Uptime
+
+There are no uptime guarantees during the Technical Preview.
+While Elastic will address issues promptly, the feature may be unavailable for extended periods.
+
 Elastic is continuously working to remove these constraints and further improve performance and scalability.
 
 #### Throughput and latency

From dfa79edec35a4087d312eb6d3308fd7cdc9581 Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Thu, 16 Oct 2025 15:45:55 +0200
Subject: [PATCH 4/7] remove tech preview limitations

---
 explore-analyze/elastic-inference/eis.md | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index aeaeaa0022..815d196cbc 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -44,24 +44,9 @@ You can now use `semantic_text` with the new ELSER endpoint on EIS. To learn how
 
 [Semantic Search with `semantic_text`](/solutions/search/semantic-search/semantic-search-semantic-text.md) has a detailed tutorial on using the `semantic_text` field and using the ELSER endpoint on EIS instead of the default endpoint. This is a great way to get started and try the new endpoint.
 
-### Limitations
+## Limitations
 
-While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.
-
-#### Uptime
-
-There are no uptime guarantees during the Technical Preview.
-While Elastic will address issues promptly, the feature may be unavailable for extended periods.
-
-Elastic is continuously working to remove these constraints and further improve performance and scalability.
-
-#### Throughput and latency
-
-{{infer-cap}} throughput via this endpoint is expected to exceed that of {{infer}} operations on an ML node.
-However, throughput and latency are not guaranteed.
-Performance may vary during the Technical Preview.
-
-#### Batch size
+### Batch size
 
 Batches are limited to a maximum of 16 documents.
 This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.
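A note between patches for readers following the series: the `.elser-2-elastic` inference endpoint that PATCH 3/7 documents is consumed by pointing a `semantic_text` field at it. A minimal sketch in Console syntax (the index name `my-index` and field name `content` are illustrative assumptions, not taken from the patches; `inference_id` selects the endpoint instead of the default):

```console
PUT my-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": ".elser-2-elastic"
      }
    }
  }
}
```

Documents indexed into `content` would then be embedded by ELSER on EIS rather than by a model running on a local ML node.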
From 0ed96e2d9811e3d9a0869a581cdece667af3798f Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Thu, 16 Oct 2025 16:43:31 +0200
Subject: [PATCH 5/7] Fix heading levels

Co-authored-by: Max Jakob
Co-authored-by: florent-leborgne
---
 explore-analyze/elastic-inference/eis.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 815d196cbc..1c107e8bbc 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -44,9 +44,9 @@ You can now use `semantic_text` with the new ELSER endpoint on EIS. To learn how
 
 [Semantic Search with `semantic_text`](/solutions/search/semantic-search/semantic-search-semantic-text.md) has a detailed tutorial on using the `semantic_text` field and using the ELSER endpoint on EIS instead of the default endpoint. This is a great way to get started and try the new endpoint.
 
-## Limitations
+### Limitations
 
-### Batch size
+#### Batch size
 
 Batches are limited to a maximum of 16 documents.
 This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.

From a72dfbaff728efe9e99e2ac2be2beca17ba1939f Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Tue, 21 Oct 2025 12:04:44 +0200
Subject: [PATCH 6/7] Remove rate limits

Removed rate limits section from eis.md.
---
 explore-analyze/elastic-inference/eis.md | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 1c107e8bbc..d12d212550 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -51,10 +51,6 @@ You can now use `semantic_text` with the new ELSER endpoint on EIS. To learn how
 Batches are limited to a maximum of 16 documents. 
 This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.
 
-#### Rate limits
-
-Rate limit for search and ingest is currently at 500 requests per minute. This allows you to ingest approximately 8000 documents per minute at 16 documents per request.
-
 ## Pricing
 
 All models on EIS incur a charge per million tokens. The pricing details are at our [Pricing page](https://www.elastic.co/pricing/serverless-search) for the Elastic Managed LLM and ELSER.

From f3f82196040ecae5afa9897c705ef135339e21a7 Mon Sep 17 00:00:00 2001
From: Liam Thompson
Date: Tue, 21 Oct 2025 12:36:17 +0200
Subject: [PATCH 7/7] clarify that ingest latency is not something we optimize for

Co-authored-by: Max Jakob
---
 explore-analyze/elastic-inference/eis.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index d12d212550..256be23e33 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -34,7 +34,7 @@ stack: preview 9.1, ga 9.2
 serverless: ga
 ```
 
-ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for throughput than ML nodes and equivalent performance for latency. We will continue to benchmark, remove limitations and address concerns.
+ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for ingest throughput than ML nodes and equivalent performance for search latency. We will continue to benchmark, remove limitations and address concerns.
 
 ### Using the ELSER on EIS endpoint
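The batch-size limitation that remains at the end of the series matters in practice for `_bulk` ingestion: each request into an index with a `semantic_text` field backed by ELSER on EIS should carry at most 16 documents. A hedged sketch in Console syntax (the index name `my-index` and field name `content` are illustrative, not taken from the patches):

```console
POST my-index/_bulk
{ "index": {} }
{ "content": "first document to embed with ELSER on EIS" }
{ "index": {} }
{ "content": "second document" }
```

A larger corpus would be split client-side into successive `_bulk` requests of 16 documents or fewer.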