diff --git a/docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc b/docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc
index 6cf4fccd09948..92bc012ee7fa1 100644
--- a/docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc
+++ b/docs/reference/search/aggregations/bucket/significantterms-aggregation.asciidoc
@@ -474,12 +474,29 @@ http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNIX_LINES
 
 ===== Execution hint
 
-There are two mechanisms by which terms aggregations can be executed: either by using field values directly in order to aggregate
-data per-bucket (`map`), or by using ordinals of the field values instead of the values themselves (`ordinals`). Although the
-latter execution mode can be expected to be slightly faster, it is only available for use when the underlying data source exposes
-those terms ordinals. Moreover, it may actually be slower if most field values are unique. Elasticsearch tries to have sensible
-defaults when it comes to the execution mode that should be used, but in case you know that an execution mode may perform better
-than the other one, you have the ability to provide Elasticsearch with a hint:
+added[1.2.0] Added the `global_ordinals`, `global_ordinals_hash` and `global_ordinals_low_cardinality` execution modes
+
+deprecated[1.3.0] Removed the `ordinals` execution mode
+
+There are different mechanisms by which terms aggregations can be executed:
+
+ - by using field values directly in order to aggregate data per-bucket (`map`)
+ - by using ordinals of the field and preemptively allocating one bucket per ordinal value (`global_ordinals`)
+ - by using ordinals of the field and dynamically allocating one bucket per ordinal value (`global_ordinals_hash`)
+
+Elasticsearch tries to have sensible defaults so this is something that generally doesn't need to be configured.
+
+`map` should only be considered when very few documents match a query. Otherwise the ordinals-based execution modes
+are significantly faster. By default, `map` is only used when running an aggregation on scripts, since they don't have
+ordinals.
+
+`global_ordinals` is the second fastest option, but the fact that it preemptively allocates buckets can be memory-intensive,
+especially if you have one or more sub-aggregations. It is used by default on top-level terms aggregations.
+
+`global_ordinals_hash`, unlike `global_ordinals` and `global_ordinals_low_cardinality`, allocates buckets dynamically,
+so memory usage is linear in the number of values of the documents that are part of the aggregation scope. It is used by default
+in inner aggregations.
+
 
 [source,js]
 --------------------------------------------------
@@ -495,6 +512,7 @@ than the other one, you have the ability to provide Elasticsearch with a hint:
     }
 --------------------------------------------------
 
-<1> the possible values are `map` and `ordinals`
+<1> the possible values are `map`, `global_ordinals` and `global_ordinals_hash`
 
 Please note that Elasticsearch will ignore this execution hint if it is not applicable.
+
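
Review note: the hunk context elides the body of the `[source,js]` example, so the following is only a minimal Python sketch of how a client might build a request body carrying one of the hint values listed in the patched `<1>` callout. The helper name `significant_terms_body`, the aggregation name `tags`, and the field name `tags` are hypothetical and chosen for illustration; the client-side check against the three values is an assumption of this sketch, not something Elasticsearch requires.

```python
import json

# Hint values taken from the patched <1> callout in the diff above.
ALLOWED_HINTS = {"map", "global_ordinals", "global_ordinals_hash"}


def significant_terms_body(field, hint, agg_name="tags"):
    """Build a search body with a significant_terms aggregation and an execution hint.

    `agg_name` and `field` are illustrative placeholders, not names from the patch.
    """
    if hint not in ALLOWED_HINTS:
        raise ValueError(f"unsupported execution_hint: {hint!r}")
    return {
        "aggs": {
            agg_name: {
                "significant_terms": {
                    "field": field,
                    "execution_hint": hint,
                }
            }
        }
    }


body = significant_terms_body("tags", "map")
print(json.dumps(body, indent=2))
```

Rejecting unknown values on the client side is a design choice of this sketch only; per the documentation text, Elasticsearch itself ignores a hint that is not applicable.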