
[RFC] Proposal for a Disk-based Tiered Caching Mechanism in OpenSearch #9001

Open
kiranprakash154 opened this issue Jul 31, 2023 · 4 comments
Labels: discuss, enhancement, RFC, Roadmap:CoPS (Cost, Performance, Scale), Search:Performance

Comments

@kiranprakash154
Contributor

I'm writing to propose a new caching approach for OpenSearch that could significantly enhance its performance.

OpenSearch is used primarily for two purposes:

  1. Search: OpenSearch provides robust support for text-based searches, such as when a user searches for an item on an e-commerce platform like Amazon.com.
  2. Log Analytics: It enables the indexing of logs and other time-series data, allowing users to create comprehensive analytical dashboards, either using OpenSearch Dashboards or other proprietary software.

When dealing with log analytics, there is a consistent pattern: the indexed documents are time-bound, and new data only arrives with later timestamps. For instance, if a query is generated to find the count of 4xx errors between two timestamps (T1 and T2, with T2 in the past), the result will invariably remain the same. This property presents an opportunity to cache the computed result, allowing for faster retrieval and a reduced processing load.
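To make this concrete, here is a minimal Java sketch, with hypothetical names (this is not OpenSearch internals), of why such results are safe to cache indefinitely:

```java
import java.time.Instant;

// Hypothetical sketch: a cache key for a time-bounded aggregation. Once the
// upper bound t2 is entirely in the past, the set of matching documents can
// no longer change, so a cached result for this key never needs invalidation.
record RangeQueryKey(String index, String queryHash, Instant t1, Instant t2) {
    boolean resultIsImmutable() {
        return t2.isBefore(Instant.now());
    }
}
```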

Presently, OpenSearch incorporates three types of in-memory, bounded caches:

  1. Shard Request Cache: Caches the shard-level results of search requests locally on each involved shard; the coordinating node compiles these shard-level results into a global result set.
  2. Node Query Cache: Caches the results of queries used in the filter context, facilitating quick lookups. The cache, shared by all shards on a node, uses an LRU eviction policy.
  3. Field Data Cache: Stores field data and global ordinals that support aggregations on certain field types. As these are on-heap data structures, monitoring the cache's use is crucial.

These caches are bounded in size, and entries are subject to eviction as new or more frequent search requests demand cache space. This eviction mechanism can force the recomputation of evicted queries, adding to the overall system overhead.
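For illustration, here is a minimal sketch of today's behavior as a bounded LRU map (illustrative names, not the actual OpenSearch cache implementation); once an entry is evicted, the next identical query pays the full recomputation cost:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch: a bounded, in-memory LRU cache. When an entry is
// evicted, the next identical query must recompute its result from scratch.
class BoundedQueryCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedQueryCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true gives LRU semantics
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // the evicted result is simply lost
    }

    V getOrCompute(K key, Function<K, V> expensiveQuery) {
        return computeIfAbsent(key, expensiveQuery); // miss => full recomputation
    }
}
```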

Given the limitations of the current caching strategy, I propose implementing an optional disk-based caching tier. This tier could leverage either a remote data store (such as Amazon S3 or Azure Blob Storage), the disk on the node where the shard lives, or a combination of both.

The rationale for this proposal stems from our hypothesis that the cost of recomputation exceeds the cost of a disk seek or a call to external storage. Introducing a disk-based cache tier would significantly reduce the need for such recomputations, leading to more efficient query processing and improved system performance.
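As a rough sketch of the idea (hypothetical names and file layout; results are assumed to be serialized to bytes, and collision handling and bookkeeping are omitted), evicted on-heap entries would spill to a disk tier rather than being discarded:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed tiering (not the actual design):
// entries evicted from the on-heap tier spill to disk, so a later request
// pays a disk read instead of a full recomputation.
class TieredCache {
    private final Map<String, byte[]> heapTier = new ConcurrentHashMap<>();
    private final Path diskDir;

    TieredCache(Path diskDir) {
        this.diskDir = diskDir;
    }

    byte[] get(String key) throws IOException {
        byte[] value = heapTier.get(key);
        if (value != null) return value;      // heap hit
        Path file = diskFile(key);
        if (Files.exists(file)) {
            value = Files.readAllBytes(file); // disk hit: read, don't recompute
            heapTier.put(key, value);         // optionally promote back to heap
            return value;
        }
        return null;                          // miss: caller recomputes and puts
    }

    void evictToDisk(String key) throws IOException {
        byte[] value = heapTier.remove(key);
        if (value != null) Files.write(diskFile(key), value);
    }

    // Hashed filename; collision handling omitted for brevity.
    private Path diskFile(String key) {
        return diskDir.resolve(Integer.toHexString(key.hashCode()));
    }
}
```

In such a design, evictToDisk would be wired into the on-heap tier's eviction path; the policy for what spills to local disk versus remote storage is an open design question.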

Kindly review the proposal and provide feedback. We believe that this approach to caching would enhance OpenSearch's performance, especially in scenarios where high data throughput and fast query processing are of paramount importance.

@kiranprakash154 added the enhancement, untriaged, discuss, Search, and RFC labels on Jul 31, 2023
@Bukhtawar
Collaborator

Given the limitations of the current caching strategy, I propose implementing an optional disk-based caching tier. This tier could leverage either a remote data store (such as Amazon S3 or Azure Blob Storage), the disk on the node where the shard lives, or a combination of both.

Thanks for the proposal!
Curious why not leverage mmap or off-heap based caches, since they provide faster data access than slower disk seeks. This comes with the caveat that, for safety constraints, the contents need to be immutable, so if anything changes we have to rebuild the cache. This would help unleash unused system memory, as long as we can keep it bounded.
With the #8891 tiered file cache proposal, the cache could also be automatically tiered across local and remote storage, although the cache and the data would have to be handled through different policies?
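For reference, a minimal sketch of the mmap idea (assuming a write-once, immutable cache file, per the safety caveat above):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative sketch: mapping an immutable cache file lets the OS page
// cache serve reads from memory when it can, avoiding explicit disk seeks.
class MmapCacheReader {
    static MappedByteBuffer map(Path cacheFile) throws IOException {
        try (FileChannel ch = FileChannel.open(cacheFile, StandardOpenOption.READ)) {
            // READ_ONLY mapping; if the underlying data changes, the file
            // must be rebuilt and remapped.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }
}
```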

@sgup432
Contributor

sgup432 commented Aug 1, 2023

@Bukhtawar An off-heap tier does make sense, but it is still constrained by memory for larger datasets, whereas a disk tier is not; the tradeoff is latency. That said, as part of this work we are also considering offering an off-heap tier as an option.
A disk tier should provide substantial latency improvements for most types of queries (those whose results cannot all fit in memory), since it acts as a simple key-value store and can return results in a few milliseconds. Here we can consider leveraging mmap for better performance, exploiting system memory internally. The disk tier can also be used to warm up the cache, either keeping entries on disk or promoting them to memory accordingly.
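A hedged sketch of the warm-up idea mentioned above (the file layout and promotion policy here are assumptions, not the actual design):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

// Sketch: on startup, walk the disk tier and promote a bounded number of
// cached results back into memory so the first queries avoid recomputation.
class DiskTierWarmer {
    static Map<String, byte[]> warmUp(Path diskDir, int maxPromotions) throws IOException {
        Map<String, byte[]> heapTier = new ConcurrentHashMap<>();
        try (Stream<Path> files = Files.list(diskDir)) {
            for (Path f : files.limit(maxPromotions).toList()) {
                heapTier.put(f.getFileName().toString(), Files.readAllBytes(f));
            }
        }
        return heapTier;
    }
}
```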

@msfroh msfroh removed the untriaged label Aug 9, 2023
@Bukhtawar
Collaborator

I am not opposed to the disk tier; the point I am suggesting is that we use a tiered approach (heap -> off-heap -> disk) based on access patterns and space/memory constraints.
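A hypothetical sketch of that lookup chain (tier implementations deliberately left abstract; all names are illustrative):

```java
import java.util.Optional;

// Sketch of the tiered lookup: try heap, then off-heap, then disk,
// promoting an entry to the hotter tiers on a hit.
interface CacheTier {
    Optional<byte[]> get(String key);
    void put(String key, byte[] value);
}

class TierChain {
    private final CacheTier[] tiers; // ordered hottest (heap) to coldest (disk)

    TierChain(CacheTier... tiers) {
        this.tiers = tiers;
    }

    Optional<byte[]> get(String key) {
        for (int i = 0; i < tiers.length; i++) {
            Optional<byte[]> hit = tiers[i].get(key);
            if (hit.isPresent()) {
                // Promotion: copy the entry into every hotter tier.
                for (int j = 0; j < i; j++) tiers[j].put(key, hit.get());
                return hit;
            }
        }
        return Optional.empty();
    }
}
```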

@sgup432
Contributor

sgup432 commented Aug 18, 2023

@Bukhtawar Makes sense!

@anasalkouz added the Search:Performance label and removed the Search label on Sep 20, 2023
@sohami added the Roadmap:CoPS (Cost, Performance, Scale) label on May 14, 2024