Added information about the Warmup API #393
Conversation
docs/knn/warmup.md
Outdated
    ---

    # Warmup API
    ## Overview
You can just delete this ## Overview line. We generally try to avoid stacked headers, and in this case, it's fair to assume the introduction is an overview.
docs/knn/warmup.md
Outdated
    # Warmup API
    ## Overview
    The HNSW graphs used to perform k-Approximate Nearest Neighbor Search are stored as `.hnsw` files with other Lucene segment files. In order to perform search on these graphs, they need to be loaded into native memory. If the graphs have not yet been loaded into native memory, upon search, they will first be loaded and then searched. This can cause high latency during initial queries. To avoid this, users will often run random queries during a warmup period. After this warmup period, the graphs will be loaded into native memory and their production workloads can begin. This process is indirect and requires extra effort.
Tip here: whenever you use "this" or "that" as a demonstrative pronoun, force yourself to add a noun afterwards. The difference in clarity is huge.
- "This can cause high latency" vs. "This loading time can cause high latency"
- "To avoid this" vs. "To avoid this situation"
docs/knn/warmup.md
Outdated
    # Warmup API
    ## Overview
    The HNSW graphs used to perform k-Approximate Nearest Neighbor Search are stored as `.hnsw` files with other Lucene segment files. In order to perform search on these graphs, they need to be loaded into native memory. If the graphs have not yet been loaded into native memory, upon search, they will first be loaded and then searched. This can cause high latency during initial queries. To avoid this, users will often run random queries during a warmup period. After this warmup period, the graphs will be loaded into native memory and their production workloads can begin. This process is indirect and requires extra effort.
Avoid future tense. So something like "users often run random queries during a warmup period to load graphs into native memory. After this warmup period, they can start their production workloads. This process is indirect and requires extra effort."
docs/knn/warmup.md
Outdated
    ## Overview
    The HNSW graphs used to perform k-Approximate Nearest Neighbor Search are stored as `.hnsw` files with other Lucene segment files. In order to perform search on these graphs, they need to be loaded into native memory. If the graphs have not yet been loaded into native memory, upon search, they will first be loaded and then searched. This can cause high latency during initial queries. To avoid this, users will often run random queries during a warmup period. After this warmup period, the graphs will be loaded into native memory and their production workloads can begin. This process is indirect and requires extra effort.

    As an alternative, you can run the k-NN plugin's warmup API on whatever indices you are interested in searching over. This API will load all the graphs for all of the shards (primaries and replicas) of all the indices specified in the request into native memory. After this process completes, you will be able to start searching against their indices with no initial latency penalties. The warmup API is idempotent, so if a segment's graphs are already loaded into memory, this operation will have no impact on them. It only loads graphs that are not currently in memory.
... interested in searching.
This API loads...
After this process finishes, you can start searching against...
... this operation has no impact on them.
docs/knn/warmup.md
Outdated
    As an alternative, you can run the k-NN plugin's warmup API on whatever indices you are interested in searching over. This API will load all the graphs for all of the shards (primaries and replicas) of all the indices specified in the request into native memory. After this process completes, you will be able to start searching against their indices with no initial latency penalties. The warmup API is idempotent, so if a segment's graphs are already loaded into memory, this operation will have no impact on them. It only loads graphs that are not currently in memory.

    ## Usage
    This command will perform warmup on index1, index2, and index3:
This request performs a warmup on three indices:
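For reference, the request under review takes roughly this shape (the exact endpoint path is not quoted in this thread, so treat it as an assumption based on the plugin's `_opendistro/_knn` namespace):

```
GET /_opendistro/_knn/warmup/index1,index2,index3?pretty
```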
docs/knn/warmup.md
Outdated
    ```
    `total` indicates how many shards the warmup operation was performed on. `successful` indicates how many shards succeeded and `failed` indicates how many shards have failed.

    The call will not return until the warmup operation is complete or the request times out. If the request times out, the operation will still be going on in the cluster. To monitor this, use the Elasticsearch `_tasks` API.
The call does not return a response until...
... the operation still continues on the cluster...
Maybe include a sample call to the _tasks API if we don't have anything to link to.
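The simple monitoring call being suggested here is the same one quoted later in this conversation:

```
GET /_tasks
```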
docs/knn/warmup.md
Outdated
    The call will not return until the warmup operation is complete or the request times out. If the request times out, the operation will still be going on in the cluster. To monitor this, use the Elasticsearch `_tasks` API.

    Following the completion of the operation, use the k-NN `_stats` API to see what has been loaded into the graph.
Link to the Settings and statistics page.
to see what the plugin loaded into the graph.
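A minimal stats call, assuming the plugin's standard `_opendistro/_knn/stats` endpoint (the endpoint itself is not quoted in this diff), would look like:

```
GET /_opendistro/_knn/stats?pretty
```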
docs/knn/warmup.md
Outdated
    Following the completion of the operation, use the k-NN `_stats` API to see what has been loaded into the graph.

    ## Best practices
    In order for the warmup API to function properly, you need to follow a few best practices. First, you should not be running any merge operations on the indices you want to warm up. The reason for this is that, during merge, the k-NN plugin creates new segments, and old segments are (sometimes) deleted. You may see the situation where the warmup API loads graphs A and B into native memory, but segment C is created from segments A and B being merged. The graphs for A and B will no longer be in memory and neither will the graph for C. Then, the initial penalty of loading graph C on the first queries will still be present.
Make these commands.
"For the warmup API to function properly, follow these best practices. First, do not run merge operations on indices that you want to warm up. During merge, the k-NN plugin creates new segments, and..."
"For example, you could encounter a situation in which the warmup API loads graphs A and B..."
"In this case, the initial penalty for loading graph C is still present."
docs/knn/warmup.md
Outdated
    ## Best practices
    In order for the warmup API to function properly, you need to follow a few best practices. First, you should not be running any merge operations on the indices you want to warm up. The reason for this is that, during merge, the k-NN plugin creates new segments, and old segments are (sometimes) deleted. You may see the situation where the warmup API loads graphs A and B into native memory, but segment C is created from segments A and B being merged. The graphs for A and B will no longer be in memory and neither will the graph for C. Then, the initial penalty of loading graph C on the first queries will still be present.

    Second, you should first confirm that all of the graphs of interest can fit into native memory before running warmup. If they cannot all fit into memory, the cache will thrash.
"Second, confirm that all graphs you want to warm up can fit into native memory. See the knn.memory.circuit_breaker.limit statistic for guidance. High graph memory usage causes cache thrashing."
docs/knn/warmup.md
Outdated
    Second, you should first confirm that all of the graphs of interest can fit into native memory before running warmup. If they cannot all fit into memory, the cache will thrash.

    Lastly, you should not index any documents you want to load into the cache. Writing new information to segments prevents the Warmup API from loading the graphs until they are searchable, so you would have to run the Warmup API again after indexing is complete.
Lastly, do not index any documents you want to load into the cache.
prevents the warmup API
run the warmup API again after indexing finishes.
docs/knn/settings.md
Outdated
    `script_query_requests` | The number of query requests that use [the KNN script](../#custom-scoring).
    `script_query_errors` | The number of errors during script queries.

    ## Tasks
Unfortunately, I think this is just the wrong spot for this content. The _tasks API isn't specific to KNN; any content on it should just go into the Elasticsearch section. That said, documenting the entire _tasks API is a large task in and of itself, so I think the right call here is to just include the simple GET call in the other file, like this:
"
To monitor the warmup operation, call the Elasticsearch _tasks API:
GET _tasks
"
You can omit the response, revert the header changes, and we'll (you'll?) eventually document the _tasks API fully under the Elasticsearch header.
docs/knn/settings.md
Outdated
    GET /_tasks
    ```

    This sample request returns the tasks currently running on a node named `odfe-node1`.
Remove and possibly save somewhere on your system for later use.
docs/knn/warmup.md
Outdated
    `total` indicates how many shards the k-NN plugin attempted to warm up. The response also includes the number of shards the plugin succeeded and failed to warm up.

    The call does not return until the warmup operation is complete or the request times out. If the request times out, the operation still continues on the cluster. To monitor the warmup operation, use the [Elasticsearch `_tasks` API](../settings#tasks).
Remove link
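For reference, a warmup response with the fields described above might look like this (only `total`, `successful`, and `failed` come from the quoted text; the `_shards` wrapper and the counts are illustrative assumptions):

```json
{
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  }
}
```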
docs/knn/warmup.md
Outdated
    # Warmup API

    The HNSW graphs used to perform k-Approximate Nearest Neighbor Search are stored as `.hnsw` files with other Lucene segment files. In order to perform search on these graphs, they need to be loaded into native memory. If the graphs have not yet been loaded into native memory, upon search, they will first be loaded and then searched. This loading time can cause high latency during initial queries. To avoid this situation, users will often run random queries during a warmup period. After this warmup period, the graphs will be loaded into native memory and their production workloads can begin. This loading process is indirect and requires extra effort.
A couple more instances of future tense that I missed before:
"If the graphs have not yet been loaded into native memory, upon search, they will first be loaded and then searched." -> "If the plugin has not yet loaded the graphs into native memory, it loads them when it receives a search request. This loading time..."
"To avoid this situation, users will often run random queries during a warmup period." -> To avoid this situation, users often run random queries...
Issue# 367: Add information on KNN Warmup API
Description of changes: Added new page of information about the Warmup API
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.