Docs - query caching (#11584)
* Update caching.md

Knowledge from https://the-asf.slack.com/archives/CJ8D1JTB8/p1597781107153900

Update caching.md

A few additional updates OTBO https://the-asf.slack.com/archives/CJ8D1JTB8/p1608669046041300

* Update caching.md

Typos

* Amendments on the segment cache

Significant updates on content around the segment cache, pull process, and in-memory cache

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/basic-cluster-tuning.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update docs/operations/basic-cluster-tuning.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Update basic-cluster-tuning.md

typo

* Update docs/querying/caching.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>

* Whole-query caching update

Made more succinct and removed specific config to change.

* Update docs/design/historical.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
petermarshallio and techdocsmith committed Apr 18, 2022
1 parent 408b46a commit 5167d328b10c5b0fd858c06c79bc8eb253bd5a39
Showing 4 changed files with 46 additions and 21 deletions.
@@ -39,17 +39,31 @@ org.apache.druid.cli.Main server historical

### Loading and serving segments

Each Historical process copies or "pulls" segment files from Deep Storage to local disk in an area called the *segment cache*. Set `druid.segmentCache.locations` to configure the size and location of the segment cache on each Historical process. See [Historical general configuration](../configuration/index.html#historical-general-configuration).

See the [Tuning Guide](../operations/basic-cluster-tuning.html#segment-cache-size) for more information.
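
As a sketch only, the segment cache location for a Historical might be configured in its runtime properties like this; the path and `maxSize` values are hypothetical and should match your own disks:

```
# Hypothetical example: a single 300 GB segment cache volume
druid.segmentCache.locations=[{"path":"/mnt/druid/segment-cache","maxSize":300000000000}]
```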

The [Coordinator](../design/coordinator.html) controls the assignment of segments to Historicals and the balance of segments between Historicals. Historical processes do not communicate directly with each other, nor do they communicate directly with the Coordinator. Instead, the Coordinator creates ephemeral entries in Zookeeper in a [load queue path](../configuration/index.html#path-configuration). Each Historical process maintains a connection to Zookeeper, watching those paths for segment information.

For more information about how the Coordinator assigns segments to Historical processes, see [Coordinator](../design/coordinator.html).

When a Historical process detects a new entry in the Zookeeper load queue, it checks its own segment cache. If no information about the segment exists there, the Historical process first retrieves metadata about the segment from Zookeeper, including where the segment is located in Deep Storage and how to decompress and process it.

For more information about segment metadata and Druid segments in general, see [Segments](../design/segments.html).

After a Historical process pulls down and processes a segment from Deep Storage, Druid advertises the segment as being available for queries from the Broker. This announcement by the Historical is made via Zookeeper, in a [served segments path](../configuration/index.html#path-configuration).
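
Both the load queue path and the served segments path come from the Zookeeper path configuration. As an illustration, the relevant properties look something like this; the values shown mirror the conventional defaults derived from `druid.zk.paths.base` and are not required settings:

```
# Zookeeper path configuration (illustrative values matching the usual defaults)
druid.zk.paths.base=/druid
druid.zk.paths.loadQueuePath=/druid/loadQueue
druid.zk.paths.servedSegmentsPath=/druid/servedSegments
```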

For more information about how the Broker determines what data is available for queries, see [Broker](broker.html).

To make data from the segment cache available for querying as soon as possible, Historical services search the local segment cache upon startup and advertise the segments found there.

### Loading and serving segments from cache

The segment cache uses [memory mapping](https://en.wikipedia.org/wiki/Mmap). The cache consumes memory from the underlying operating system so Historicals can hold parts of segment files in memory to increase query performance at the data level. The amount of memory available for the segment cache is affected by the size of the Historical JVM, its heap and direct memory buffers, and other processes running on the operating system.

At query time, if the required part of a segment file is available in the memory-mapped cache, or "page cache", the Historical re-uses it and reads it directly from memory. If it is not in the memory-mapped cache, the Historical reads that part of the segment from disk. In this case, there is potential for new data to flush other segment data from memory. This means that the closer the free operating system memory is to `druid.server.maxSize`, the more likely it is that segment data will be available in memory, reducing query times. Conversely, the lower the free operating system memory, the more likely a Historical is to read segments from disk.

Note that this memory-mapped segment cache is in addition to other [query-level caches](../querying/caching.html).

### Querying segments

@@ -90,11 +90,9 @@ Tuning the cluster so that each Historical can accept 50 queries and 10 non-quer

#### Segment Cache Size

For better query performance, do not allocate segment data to a Historical in excess of the system free memory. The closer the system's free memory is to the total size of the configured `druid.segmentCache.locations`, the more segment data the Historical can hold in the memory-mapped segment cache.

Druid uses the total size of `druid.segmentCache.locations` to calculate the total segment data size assigned to a Historical. For some rarer use cases, you can override this behavior with the `druid.server.maxSize` property.
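
To make the sizing logic concrete, consider a purely hypothetical Historical with 256 GiB of RAM, a 24 GiB heap, and 16 GiB of direct memory, leaving roughly 200 GiB free for the operating system page cache. A configuration in that spirit might look like the following; the numbers are illustrative, not recommendations:

```
# Hypothetical sizing: keep assignable segment data close to free system memory (~200 GiB)
druid.segmentCache.locations=[{"path":"/mnt/druid/segment-cache","maxSize":200000000000}]
# Optional override; by default druid.server.maxSize is derived from the cache locations above
druid.server.maxSize=200000000000
```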

#### Number of Historicals

@@ -32,30 +32,40 @@ If you're unfamiliar with Druid architecture, review the following topics before

For instructions to configure query caching see [Using query caching](./using-caching.md).

Cache monitoring, including the hit rate and number of evictions, is available in [Druid metrics](../operations/metrics.html#cache).
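
If a service does not already emit these metrics, one way to enable them is to include the cache monitor in that service's monitor list (shown here in isolation; merge it with any monitors you already configure):

```
# Emit cache metrics such as hit rate and evictions from this service
druid.monitoring.monitors=["org.apache.druid.client.cache.CacheMonitor"]
```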

Query-level caching is in addition to [data-level caching](../design/historical.md) on Historicals.

## Cache types

Druid supports two types of query caching:

- [Per-segment caching](#per-segment-caching) stores partial query results for a specific segment. It is enabled by default.
- [Whole-query caching](#whole-query-caching) stores final query results.

Druid invalidates any cache the moment any underlying data changes, to avoid returning stale results. This is especially important for `table` datasources that have highly variable underlying data segments, including real-time segments.

> **Druid can store cache data on the local JVM heap or in an external distributed key/value store (e.g. memcached).**
>
> The default is a local cache based upon [Caffeine](https://github.com/ben-manes/caffeine). The default maximum cache storage size is the minimum of 1 GiB or ten percent of the maximum runtime memory for the JVM, with no cache expiration. See [Cache configuration](../configuration/index.md#cache-configuration) for information on how to configure cache storage. When using Caffeine, the cache sits inside the JVM heap and is directly measurable. Heap usage grows up to the maximum configured size, after which the least recently used segment results are evicted and replaced with newer results.
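
If you prefer to be explicit rather than rely on the defaults described above, a minimal local-cache configuration might look like the following sketch; the size shown is arbitrary:

```
# Local Caffeine cache capped at roughly 256 MiB of heap, with no expiration
druid.cache.type=caffeine
druid.cache.sizeInBytes=268435456
```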
### Per-segment caching

The primary form of caching in Druid is a *per-segment results cache*. This cache stores partial query results on a per-segment basis and is enabled on Historical services by default.

The per-segment results cache allows Druid to maintain a low-eviction-rate cache for segments that do not change, which is especially important for segments that [Historical](../design/historical.html) processes pull into their local _segment cache_ from [deep storage](../dependencies/deep-storage.html). Real-time segments, on the other hand, continue to have results computed at query time.

Druid may merge per-segment cached results with the results of later queries that have a similar shape, with similar filters, aggregations, and so on; for example, a query that is identical except that it covers a different time period.

Per-segment caching is controlled by the parameters `useCache` and `populateCache`.

Use per-segment caching with real-time data. For example, your queries request data actively arriving from Kafka alongside intervals in segments that are loaded on Historicals. Druid can merge cached results from Historical segments with real-time results from the stream. [Whole-query caching](#whole-query-caching), on the other hand, is not helpful in this scenario because new data from real-time ingestion will continually invalidate the entire cached result.
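
As an illustrative sketch, the per-segment cache on a Historical can be switched on through the following runtime properties, shown here with their enabled values; see [Using query caching](./using-caching.md) for the full set of options:

```
# Enable reading from and writing to the per-segment cache on a Historical
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true
```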

### Whole-query caching

With *whole-query caching*, Druid caches the entire results of individual queries, meaning the Broker no longer needs to merge per-segment results from data processes.

Use *whole-query caching* on the Broker to increase query efficiency when there is little risk of ingestion invalidating the cache at the segment level, particularly when you are _not_ using real-time ingestion. For example, if your queries tend to use batch-ingested data, per-segment caching is less efficient: the underlying segments hardly ever change, yet Druid would still need to gather and merge per-segment results for each query.

## Where to enable caching

@@ -69,9 +79,9 @@ For instance, whole-query caching is a good option when you have queries that in

- On Brokers for small production clusters with less than five servers.

Avoid using the per-segment cache on the Broker for large production clusters. When the Broker cache is enabled (`druid.broker.cache.populateCache` is `true`) and `populateCache` _is not_ `false` in the [query context](../querying/query-context.html), individual Historicals will _not_ merge individual segment-level results, and instead pass these back to the lead Broker. The Broker must then carry out a large merge from _all_ segments on its own.

**Whole-query cache** is available exclusively on Brokers.

## Performance considerations for caching
Caching enables increased concurrency on the same system, which can lead to noticeable performance improvements for queries on Druid clusters that handle concurrent, mixed workloads.
@@ -61,11 +61,14 @@ To control **segment caching** on the Broker, set the `useCache` and `populateCa
```
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
```

To control **whole-query caching** on the Broker, set the `useResultLevelCache` and `populateResultLevelCache` runtime properties. For example, to set the Broker to use and populate the whole-query cache for queries:

```
druid.broker.cache.useResultLevelCache=true
druid.broker.cache.populateResultLevelCache=true
```

See [Broker caching](../configuration/index.md#broker-caching) for a description of all available Broker cache configurations.

## Enabling caching in the query context
