navigation_title: Elasticsearch
mapped_pages:
  - https://www.elastic.co/guide/en/serverless/current/elasticsearch-billing.html
applies_to:
  serverless:
    elasticsearch: ga
products:
  - id: cloud-serverless
description: Learn about how costs for Elasticsearch Serverless projects are calculated, and strategies you can use to lower your costs.
---

# {{es}} billing dimensions [elasticsearch-billing]

{{es-serverless}} projects are priced based on consumption of the underlying infrastructure that supports your use case with the performance characteristics you need.
Measurements are in virtual compute units (VCUs).
Each VCU represents a fraction of RAM, CPU, and local disk for caching.

The number of VCUs you need is determined by:

* Search Power setting
* Machine learning usage

For detailed {{es-serverless}} project rates, refer to the [{{es-serverless}} pricing page](https://www.elastic.co/pricing/serverless-search).

## VCU types: search, indexing, and ML [elasticsearch-billing-information-about-the-vcu-types-search-ingest-and-ml]

{{es-serverless}} uses the following VCU types:

* **Indexing:** The VCUs used to index incoming documents. Indexing VCUs account for compute resources consumed for ingestion. This is based on ingestion rate and amount of data ingested at any given time. Transforms and ingest pipelines also contribute to ingest VCU consumption.
* **Search:** The VCUs used to return search results with the latency and queries per second (QPS) you require. Search VCUs are calculated as a factor of the compute resources needed to run search queries, search throughput, and latency. Search VCUs are not charged per search request. Instead, they are a factor of the compute resources that scale up and down based on amount of searchable data, search load (QPS), and performance (latency and availability).
* **Machine learning:** The VCUs used to perform inference, NLP tasks, and other ML activities. ML VCUs are a factor of the models deployed and number of ML operations such as inference for search and ingest. ML VCUs are typically consumed for generating embeddings during ingestion and during semantic search or reranking.
* **Tokens:** The Elastic Managed LLM is charged per 1 million input and output tokens. The LLM powers all AI Search features such as Playground and AI Assistant for Search, and is enabled by default.
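
To make the indexing dimension concrete, here is a minimal sketch of an ingest pipeline, the kind of processing that contributes to indexing VCU consumption. The pipeline name and fields are hypothetical:

```console
PUT _ingest/pipeline/cleanup-pipeline
{
  "description": "Hypothetical pipeline: each processor runs at ingest and consumes indexing VCUs",
  "processors": [
    {
      "lowercase": {
        "field": "message"
      }
    },
    {
      "set": {
        "field": "pipeline_processed",
        "value": true
      }
    }
  ]
}
```

Every document routed through a pipeline like this does extra work at ingest, so heavier pipelines and transforms translate into higher indexing VCU usage.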

## Data storage and billing [elasticsearch-billing-information-about-the-search-ai-lake-dimension-gb]

{{es-serverless}} projects store data in the [Search AI Lake](/deploy-manage/deploy/elastic-cloud/project-settings.md#elasticsearch-manage-project-search-ai-lake-settings). You are charged per GB of stored data at rest. Note that if you perform operations at ingest such as vectorization or enrichment, the size of your stored data will differ from the size of the original source data.
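
One way to keep an eye on this dimension is to compare stored index size against the size of your source data. A quick check, assuming the cat indices API is available in your project and using a placeholder index name:

```console
GET _cat/indices/my-index?v=true&h=index,docs.count,store.size
```

If `store.size` is much larger than the raw source data, ingest-time operations such as vectorization are likely contributing to your storage costs.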

## Managing {{es}} costs [elasticsearch-billing-managing-elasticsearch-costs]

You can control costs using the following strategies:

* **Search Power setting**: [Search Power](/deploy-manage/deploy/elastic-cloud/project-settings.md#elasticsearch-manage-project-search-power-settings) controls the speed of searches against your data. With Search Power, you can improve search performance by adding more resources for querying, or you can reduce provisioned resources to cut costs.
* **Search boost window**: By limiting the number of days of [time series data](/solutions/search/ingest-for-search.md#elasticsearch-ingest-time-series-data) that are available for caching, you can reduce the number of search VCUs required.
* **Machine learning trained model autoscaling**: [Trained model autoscaling](/deploy-manage/autoscaling/trained-model-autoscaling.md) is always enabled and cannot be disabled, ensuring efficient resource usage, reduced costs, and optimal performance without manual configuration.

Trained model deployments automatically scale down to zero allocations after 24 hours without any inference requests. When they scale up again, they remain active for 5 minutes before they can scale down. During these cooldown periods, you will continue to be billed for the active resources.
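
To check whether a deployment has scaled down to zero allocations, and is therefore sitting idle, you can inspect the trained model statistics. A minimal sketch, assuming you have at least one deployed model:

```console
GET _ml/trained_models/_stats
```

In the response, the allocation counts under `deployment_stats` show whether a deployment is currently scaled down.
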
* **Indexing strategies**: Consider your indexing strategies and how they might impact overall VCU usage and costs.
  To ensure optimal performance and cost-effectiveness for your project, it's important to consider how you structure your data.

Consolidate small indices for better efficiency.
In general, avoid a design where your project contains hundreds of very small indices, specifically those under 1GB each.
Avoiding small indices is important because every index in {{es}} has a certain amount of resource overhead.
{{es}} needs to maintain metadata for each index to keep it running smoothly.
When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices.
Higher resource consumption can lead to higher costs and potentially impact the overall performance of your project.

If your use case naturally generates many small, separate streams of data, the recommended approach is to implement a process to consolidate them into fewer, larger indices. This practice leads to more efficient resource utilization. By grouping your data into larger indices, you can ensure a more performant and cost-efficient experience with {{es-serverless}}.
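
As a sketch of what such a consolidation process might look like, you could periodically reindex small indices into one larger index and then delete the sources. The index names here are hypothetical:

```console
POST _reindex
{
  "source": {
    "index": ["logs-small-000001", "logs-small-000002"]
  },
  "dest": {
    "index": "logs-consolidated"
  }
}
```

An alias that points at the consolidated index keeps existing search requests working while the small source indices are retired.
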
6 changes: 1 addition & 5 deletions solutions/search/get-started/semantic-search.md

## Prerequisites

- If you're using [{{es-serverless}}](/solutions/search/serverless-elasticsearch-get-started.md), create a project with a general purpose configuration. To add the sample data, you must have a `developer` or `admin` predefined role or an equivalent custom role.
- If you're using [{{ech}}](/deploy-manage/deploy/elastic-cloud/cloud-hosted.md) or [running {{es}} locally](/solutions/search/run-elasticsearch-locally.md), start {{es}} and {{kib}}. To add the sample data, log in with a user that has the `superuser` built-in role.

To learn about role-based access control, check out [](/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles.md).

## Create a vector database

When you create vectors (or _vectorize_ your data), you convert complex and nuanced documents into multidimensional numerical representations.
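
For example, a mapping with a `semantic_text` field asks {{es}} to generate those numerical representations (embeddings) for you at ingest time. This is a minimal sketch with hypothetical index and field names:

```console
PUT semantic-example
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text"
      }
    }
  }
}
```
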
2 changes: 1 addition & 1 deletion solutions/search/vector/bring-own-vectors.md

## Prerequisites

- If you're using {{es-serverless}}, create a project with the general purpose configuration. To add the sample data, you must have a `developer` or `admin` predefined role or an equivalent custom role.
- If you're using {{ech}} or a self-managed cluster, start {{es}} and {{kib}}. The simplest method to complete the steps in this guide is to log in with a user that has the `superuser` built-in role.

To learn about role-based access control, check out [](/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles.md).
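
As a preview of the syntax this guide covers, here is a minimal sketch: a `dense_vector` mapping for pre-generated vectors, followed by a kNN query against it. The index name, field name, and dimension count are hypothetical:

```console
PUT my-vector-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 3
      }
    }
  }
}

POST my-vector-index/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector": [0.1, 0.2, 0.3],
    "k": 2,
    "num_candidates": 10
  }
}
```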