Add some initial docs on the new settings #3847
Conversation
🔍 Preview links for changed docs |
Force-pushed from 7f600ab to ea5d10d
Vale Linting Results
Summary: 3 warnings, 12 suggestions found
⚠️ Warnings (3)
| File | Line | Rule | Message |
|---|---|---|---|
| deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md | 131 | Elastic.DontUse | Don't use 'very'. |
| deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md | 133 | Elastic.DontUse | Don't use 'Note that'. |
| solutions/search/vector/knn.md | 1247 | Elastic.DontUse | Don't use 'Note that'. |
💡 Suggestions (12)
| File | Line | Rule | Message |
|---|---|---|---|
| deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md | 131 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |
| deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md | 131 | Elastic.WordChoice | Consider using 'refer to (if it's a document), view (if it's a UI element)' instead of 'see', unless the term is in the UI. |
| deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md | 131 | Elastic.Acronyms | 'HNSW' has no definition. |
| deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md | 133 | Elastic.FutureTense | 'will need' might be in future tense. Write in the present tense to describe the state of the product as it is now. |
| solutions/search/vector/knn.md | 328 | Elastic.Capitalization | 'BFloat16 vector encoding [knn-search-bfloat16]' should use sentence-style capitalization. |
| solutions/search/vector/knn.md | 333 | Elastic.FutureTense | 'will automatically' might be in future tense. Write in the present tense to describe the state of the product as it is now. |
| solutions/search/vector/knn.md | 335 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |
| solutions/search/vector/knn.md | 1245 | Elastic.FutureTense | 'will read' might be in future tense. Write in the present tense to describe the state of the product as it is now. |
| solutions/search/vector/knn.md | 1245 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |
| solutions/search/vector/knn.md | 1245 | Elastic.FutureTense | 'will read' might be in future tense. Write in the present tense to describe the state of the product as it is now. |
| solutions/search/vector/knn.md | 1247 | Elastic.FutureTense | 'will only' might be in future tense. Write in the present tense to describe the state of the product as it is now. |
| solutions/search/vector/knn.md | 1247 | Elastic.Semicolons | Use semicolons judiciously. |
taking a look at this but poked quickly into elastic/elasticsearch#138492 — your doc change here makes it seem like
shainaraskas left a comment
generally looks good. provided some recommended style edits for you.
- ## Use Direct IO when the vector data does not fit in RAM
+ ## Use on-disk rescoring when the vector data does not fit in RAM
is the old direct IO guidance still valid in 9.1? did your team plan to not move forward with it (i.e. it is going from preview to removed)? Wonder if we need to keep it visible, but if it is going from preview to removed, your approach of just editing it out is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The guidance is the same, but how you use it has changed. The JVM option has been replaced with an index setting.
serverless: unavailable
- If your indices are of type `bbq_hnsw` and your nodes don't have enough off-heap RAM to store all vector data in memory, you may see very high query latencies. Vector data includes the HNSW graph, quantized vectors, and raw float32 vectors.
+ If you use quantized indices and your nodes don't have enough off-heap RAM to store all vector data in memory, you may see very high query latencies. Vector data includes the HNSW graph, quantized vectors, and raw float vectors.
Suggested change:
- If you use quantized indices and your nodes don't have enough off-heap RAM to store all vector data in memory, you may see very high query latencies. Vector data includes the HNSW graph, quantized vectors, and raw float vectors.
+ If you use quantized indices and your nodes don't have enough off-heap RAM to store all vector data in memory, then you might experience high query latencies. Vector data includes the HNSW graph, quantized vectors, and raw float vectors.
- In these scenarios, direct IO can significantly reduce query latency. Enable it by setting the JVM option `vector.rescoring.directio=true` on all vector search nodes in your cluster.
- Only use this option if you're experiencing very high query latencies on indices of type `bbq_hnsw`. Otherwise, enabling direct IO may increase your query latencies.
+ In these scenarios, on-disk rescoring can significantly reduce query latency. Enable it by setting the `on_disk_rescore: true` option on your vector indices. Note that your data will need to be re-indexed or force-merged to use the new setting in subsequent searches.
Suggested change:
- In these scenarios, on-disk rescoring can significantly reduce query latency. Enable it by setting the `on_disk_rescore: true` option on your vector indices. Note that your data will need to be re-indexed or force-merged to use the new setting in subsequent searches.
+ In these scenarios, on-disk rescoring can significantly reduce query latency. Enable it by setting the `on_disk_rescore: true` option on your vector indices. Your data must be re-indexed or force-merged to use the new setting in subsequent searches.
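To make the doc guidance concrete, the new setting could be shown in a request. This is a minimal sketch, not the final documented API: the `dense_vector` field type and `bbq_hnsw` index option are existing Elasticsearch surface, the index name and dims are hypothetical, and the assumption that `on_disk_rescore` sits directly under the index settings block is taken from this PR's description only.

```console
PUT /my-vector-index
{
  "settings": {
    "index": {
      "on_disk_rescore": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 768,
        "index_options": { "type": "bbq_hnsw" }
      }
    }
  }
}
```

Vectors indexed after the setting is enabled are rescored from disk; segments written before the change need a re-index or force merge, as the suggested text above notes.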
{applies_to} stack: ga 9.3
+ Instead of storing raw vectors as 4-byte values, you can use `element_type: bfloat16` to store each dimension as a 2-byte value. This can be useful if your indexed vectors are at bfloat16 precision already, or if you wish to reduce the disk space required to store vector data. Elasticsearch will automatically truncate 4-byte float values to 2-byte bfloat16 values when indexing vectors.
Suggested change:
- Instead of storing raw vectors as 4-byte values, you can use `element_type: bfloat16` to store each dimension as a 2-byte value. This can be useful if your indexed vectors are at bfloat16 precision already, or if you wish to reduce the disk space required to store vector data. Elasticsearch will automatically truncate 4-byte float values to 2-byte bfloat16 values when indexing vectors.
+ Instead of storing raw vectors as 4-byte values, you can use `element_type: bfloat16` to store each dimension as a 2-byte value. This can be useful if your indexed vectors are at bfloat16 precision already, or if you want to reduce the disk space required to store vector data. When this element type is used, {{es}} automatically truncates 4-byte float values to 2-byte bfloat16 values when indexing vectors.
+ Due to the reduced precision of bfloat16, any vectors retrieved from the index may have slightly different values to those originally indexed.
Suggested change:
- Due to the reduced precision of bfloat16, any vectors retrieved from the index may have slightly different values to those originally indexed.
+ Due to the reduced precision of bfloat16, any vectors retrieved from the index might have slightly different values to those originally indexed.
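The bfloat16 option being documented could be illustrated with a mapping sketch. This assumes `element_type: bfloat16` is accepted wherever `element_type: float` is today, per this diff; the index name and dims are hypothetical, and the feature applies to stack 9.3+ per the applies_to block above.

```console
PUT /my-bfloat16-index
{
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "element_type": "bfloat16"
      }
    }
  }
}
```

Vectors submitted as 4-byte floats are truncated to bfloat16 at index time, so retrieved vectors can differ slightly from what was indexed.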
serverless: unavailable
+ By default, Elasticsearch will read raw vector data into memory to perform rescoring. This may have an effect on performance if the vector data is too large to all fit in off-heap memory at once. By specifying the `on_disk_rescore: true` index setting, Elasticsearch will read vector data from disk directly during rescoring.
Suggested change:
- By default, Elasticsearch will read raw vector data into memory to perform rescoring. This may have an effect on performance if the vector data is too large to all fit in off-heap memory at once. By specifying the `on_disk_rescore: true` index setting, Elasticsearch will read vector data from disk directly during rescoring.
+ By default, {{es}} reads raw vector data into memory to perform rescoring. This can have an effect on performance if the vector data is too large to all fit in off-heap memory at once. When the `on_disk_rescore: true` index setting is set, {{es}} reads vector data directly from disk during rescoring.
+ Note that this setting will only apply to newly indexed vectors; to apply the option to all vectors in the index, the vectors must be re-indexed or force-merged after changing the setting.
Suggested change:
- Note that this setting will only apply to newly indexed vectors; to apply the option to all vectors in the index, the vectors must be re-indexed or force-merged after changing the setting.
+ This setting only applies to newly indexed vectors. To apply the option to all vectors in the index, the vectors must be re-indexed or force-merged after changing the setting.
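Since the setting only applies to newly indexed vectors, the re-index-or-force-merge step could be shown explicitly. The force merge API is existing Elasticsearch surface; the index name is hypothetical, and `max_num_segments=1` is just one reasonable choice for fully rewriting segments.

```console
POST /my-vector-index/_forcemerge?max_num_segments=1
```

A force merge rewrites the index's segments, so the merged segments pick up the new `on_disk_rescore` behavior without a full re-index.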
@shainaraskas I'm not sure the best way to handle the docs in elastic/elasticsearch#138492 to make it clear bfloat16 is in 9.3+, other than specifying
Add docs for bfloat16 and on_disk_rescore for changes in elastic/elasticsearch#138492