diff --git a/explore-analyze/aggregations.md b/explore-analyze/aggregations.md index 879567a889..179ae5ccaf 100644 --- a/explore-analyze/aggregations.md +++ b/explore-analyze/aggregations.md @@ -21,8 +21,7 @@ An aggregation summarizes your data as metrics, statistics, or other analytics. * [Bucket](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket.html) aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria. * [Pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html) aggregations that take input from other aggregations instead of documents or fields. - -## Run an aggregation [run-an-agg] +## Run an aggregation [run-an-agg] You can run aggregations as part of a [search](../solutions/search/querying-for-search.md) by specifying the [search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html)'s `aggs` parameter. The following search runs a [terms aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) on `my-field`: @@ -71,9 +70,7 @@ Aggregation results are in the response’s `aggregations` object: 1. Results for the `my-agg-name` aggregation. - - -## Change an aggregation’s scope [change-agg-scope] +## Change an aggregation’s scope [change-agg-scope] Use the `query` parameter to limit the documents on which an aggregation runs: @@ -98,8 +95,7 @@ GET /my-index-000001/_search } ``` - -## Return only aggregation results [return-only-agg-results] +## Return only aggregation results [return-only-agg-results] By default, searches containing an aggregation return both search hits and aggregation results. 
To return only aggregation results, set `size` to `0`: @@ -117,7 +113,6 @@ GET /my-index-000001/_search } ``` - ## Run multiple aggregations [run-multiple-aggs] You can specify multiple aggregations in the same request: @@ -140,8 +135,7 @@ GET /my-index-000001/_search } ``` - -## Run sub-aggregations [run-sub-aggs] +## Run sub-aggregations [run-sub-aggs] Bucket aggregations support bucket or metric sub-aggregations. For example, a terms aggregation with an [avg](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html) sub-aggregation calculates an average value for each bucket of documents. There is no level or depth limit for nesting sub-aggregations. @@ -191,8 +185,6 @@ The response nests sub-aggregation results under their parent aggregation: 1. Results for the parent aggregation, `my-agg-name`. 2. Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`. - - ## Add custom metadata [add-metadata-to-an-agg] Use the `meta` object to associate custom metadata with an aggregation: @@ -231,8 +223,7 @@ The response returns the `meta` object in place: } ``` - -## Return the aggregation type [return-agg-type] +## Return the aggregation type [return-agg-type] By default, aggregation results include the aggregation’s name but not its type. To return the aggregation type, use the `typed_keys` query parameter. @@ -252,11 +243,10 @@ GET /my-index-000001/_search?typed_keys The response returns the aggregation type as a prefix to the aggregation’s name. -::::{important} +::::{important} Some aggregations return a different aggregation type from the type in the request. 
For example, the terms, [significant terms](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-significantterms-aggregation.html), and [percentiles](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-percentile-aggregation.html) aggregations return different aggregation types depending on the data type of the aggregated field. :::: - ```console-result { ... @@ -270,8 +260,6 @@ Some aggregations return a different aggregation type from the type in the reque 1. The aggregation type, `histogram`, followed by a `#` separator and the aggregation’s name, `my-agg-name`. - - ## Use scripts in an aggregation [use-scripts-in-an-agg] When a field doesn’t exactly match the aggregation you need, you should aggregate on a [runtime field](../manage-data/data-store/mapping/runtime-fields.md): @@ -298,15 +286,12 @@ GET /my-index-000001/_search?size=0 Scripts calculate field values dynamically, which adds a little overhead to the aggregation. In addition to the time spent calculating, some aggregations like [`terms`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) and [`filters`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filters-aggregation.html) can’t use some of their optimizations with runtime fields. In total, performance costs for using a runtime field vary from aggregation to aggregation. - -## Aggregation caches [agg-caches] +## Aggregation caches [agg-caches] For faster responses, {{es}} caches the results of frequently run aggregations in the [shard request cache](https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-request-cache.html). To get cached results, use the same [`preference` string](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-shard-routing.html#shard-and-node-preference) for each search.
If you don’t need search hits, [set `size` to `0`](#return-only-agg-results) to avoid filling the cache. {{es}} routes searches with the same preference string to the same shards. If the shards' data doesn’t change between searches, the shards return cached aggregation results. - -## Limits for `long` values [limits-for-long-values] +## Limits for `long` values [limits-for-long-values] When running aggregations, {{es}} uses [`double`](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) values to hold and represent numeric data. As a result, aggregations on [`long`](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) numbers greater than `2^53` are approximate. - diff --git a/explore-analyze/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md b/explore-analyze/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md index ebb39332f5..6398ade59e 100644 --- a/explore-analyze/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md +++ b/explore-analyze/aggregations/tutorial-analyze-ecommerce-data-with-aggregations-using-query-dsl.md @@ -7,11 +7,8 @@ mapped_pages: - https://www.elastic.co/guide/en/elasticsearch/reference/current/aggregations-tutorial.html --- - - # Tutorial: Analyze eCommerce data with aggregations using Query DSL [aggregations-tutorial] - This hands-on tutorial shows you how to analyze eCommerce data using {{es}} [aggregations](../aggregations.md) with the `_search` API and Query DSL. You’ll learn how to: @@ -21,7 +18,6 @@ You’ll learn how to: * Compare performance across product categories * Track moving averages and cumulative totals - ## Requirements [aggregations-tutorial-requirements] You’ll need: @@ -42,8 +38,6 @@ You’ll need: * Select the **Other sample data sets** collapsible. * Add the **Sample eCommerce orders** data set. This will create and populate an index called `kibana_sample_data_ecommerce`.
- - ## Inspect index structure [aggregations-tutorial-inspect-data] Before we start analyzing the data, let’s examine the structure of the documents in our sample eCommerce index. Run this command to see the field [mappings](../../manage-data/data-store/index-basics.md#elasticsearch-intro-documents-fields-mappings): @@ -55,6 +49,7 @@ GET kibana_sample_data_ecommerce/_mapping The response shows the field mappings for the `kibana_sample_data_ecommerce` index. ::::{dropdown} Example response + ```console-response { "kibana_sample_data_ecommerce": { @@ -271,34 +266,28 @@ The response shows the field mappings for the `kibana_sample_data_ecommerce` ind 3. `geoip.location`: Geographic coordinates stored as geo_point for location-based queries 4. `products.properties`: Nested structure containing details about items in each order - :::: - The sample data includes the following [field data types](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html): * [`text`](https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html) and [`keyword`](https://www.elastic.co/guide/en/elasticsearch/reference/current/keyword.html) for text fields - - * Most `text` fields have a `.keyword` subfield for exact matching using [multi-fields](https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html) + * Most `text` fields have a `.keyword` subfield for exact matching using [multi-fields](https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html) * [`date`](https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html) for date fields * 3 [numeric](https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html) types: - - * `integer` for whole numbers - * `long` for large whole numbers - * `half_float` for floating-point numbers + * `integer` for whole numbers + * `long` for large whole numbers + * `half_float` for floating-point numbers * 
[`geo_point`](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) for geographic coordinates * [`object`](https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html) for nested structures such as `products`, `geoip`, `event` Now that we understand the structure of our sample data, let’s start analyzing it. - ## Get key business metrics [aggregations-tutorial-basic-metrics] Let’s start by calculating important metrics about orders and customers. - ### Get average order size [aggregations-tutorial-order-value] Calculate the average order value across all orders in the dataset using the [`avg`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-avg-aggregation.html) aggregation. @@ -321,8 +310,8 @@ GET kibana_sample_data_ecommerce/_search 2. A meaningful name that describes what this metric represents 3. Configures an `avg` aggregation, which calculates a simple arithmetic mean - ::::{dropdown} Example response + ```console-result { "took": 0, @@ -354,11 +343,8 @@ GET kibana_sample_data_ecommerce/_search 3. Results appear under the name we specified in the request 4. The average order value is calculated dynamically from all the orders in the dataset - :::: - - ### Get multiple order statistics at once [aggregations-tutorial-order-stats] Calculate multiple statistics about orders in one request using the [`stats`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-stats-aggregation.html) aggregation. @@ -380,8 +366,8 @@ GET kibana_sample_data_ecommerce/_search 1. A descriptive name for this set of statistics 2. `stats` returns count, min, max, avg, and sum at once - ::::{dropdown} Example response + ```console-result { "aggregations": { @@ -402,22 +388,17 @@ GET kibana_sample_data_ecommerce/_search 4. `"avg"`: Average value per order across all orders 5. 
`"sum"`: Total revenue from all orders combined - :::: - ::::{tip} The [stats aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-stats-aggregation.html) is more efficient than running individual min, max, avg, and sum aggregations. :::: - - ## Analyze sales patterns [aggregations-tutorial-sales-patterns] Let’s group orders in different ways to understand sales patterns. - ### Break down sales by category [aggregations-tutorial-category-breakdown] Group orders by category to see which product categories are most popular, using the [`terms`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html) aggregation. @@ -444,8 +425,8 @@ GET kibana_sample_data_ecommerce/_search 4. Limit to top 5 categories 5. Order by number of orders (descending) - ::::{dropdown} Example response + ```console-result { "took": 4, @@ -501,11 +482,8 @@ GET kibana_sample_data_ecommerce/_search 4. Category name. 5. Number of orders in this category. - :::: - - ### Track daily sales patterns [aggregations-tutorial-daily-sales] Group orders by day to track daily sales patterns using the [`date_histogram`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html) aggregation. @@ -533,8 +511,8 @@ GET kibana_sample_data_ecommerce/_search 4. Formats dates in response using [date patterns](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html) (e.g. "yyyy-MM-dd"). Refer to [date math expressions](https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#date-math) for additional options. 5. When `min_doc_count` is 0, returns buckets for days with no orders, useful for continuous time series visualization. - ::::{dropdown} Example response + ```console-result { "took": 2, @@ -723,16 +701,12 @@ GET kibana_sample_data_ecommerce/_search 4. 
`key` is the same date represented as the Unix timestamp for this bucket 5. `doc_count` counts the number of documents that fall into this time bucket - :::: - - ## Combine metrics with groupings [aggregations-tutorial-combined-analysis] Now let’s calculate [metrics](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics.html) within each group to get deeper insights. - ### Compare category performance [aggregations-tutorial-category-metrics] Calculate metrics within each category to compare performance across categories. @@ -776,8 +750,8 @@ GET kibana_sample_data_ecommerce/_search 4. Average order value in the category 5. Total number of items sold - ::::{dropdown} Example response + ```console-result { "aggregations": { @@ -813,11 +787,8 @@ GET kibana_sample_data_ecommerce/_search 4. Average order value for this category 5. Total quantity of items sold - :::: - - ### Analyze daily sales performance [aggregations-tutorial-daily-metrics] Let’s combine metrics to track daily trends: daily revenue, unique customers, and average basket size. @@ -859,8 +830,8 @@ GET kibana_sample_data_ecommerce/_search 2. Uses the [`cardinality`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html) aggregation to count unique customers per day 3. Average number of items per order - ::::{dropdown} Example response + ```console-result { "took": 119, @@ -1324,13 +1295,10 @@ GET kibana_sample_data_ecommerce/_search :::: - - ## Track trends and patterns [aggregations-tutorial-trends] You can use [pipeline aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html) on the results of other aggregations. Let’s analyze how metrics change over time. - ### Smooth out daily fluctuations [aggregations-tutorial-moving-average] Moving averages help identify trends by reducing day-to-day noise in the data. 
Let’s observe sales trends more clearly by smoothing daily revenue variations, using the [Moving Function](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-movfn-aggregation.html) aggregation. @@ -1371,8 +1339,8 @@ GET kibana_sample_data_ecommerce/_search 5. Use a 3-day window — use different window sizes to see trends at different time scales. 6. Use the built-in unweighted average function in the `moving_fn` aggregation. - ::::{dropdown} Example response + ```console-result { "took": 13, @@ -1747,17 +1715,13 @@ GET kibana_sample_data_ecommerce/_search 4. First day has no smoothed value as it needs previous days for the calculation 5. Moving average starts from second day, using a 3-day window - :::: - ::::{tip} Notice how the smoothed values lag behind the actual values - this is because they need previous days' data to calculate. The first day will always be null when using moving averages. :::: - - ### Track running totals [aggregations-tutorial-cumulative] Track running totals over time using the [`cumulative_sum`](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-cumulative-sum-aggregation.html) aggregation. @@ -1793,8 +1757,8 @@ GET kibana_sample_data_ecommerce/_search 2. `cumulative_sum` adds up values across buckets 3. Reference the revenue we want to accumulate - ::::{dropdown} Example response + ```console-result { "took": 4, @@ -2169,11 +2133,8 @@ GET kibana_sample_data_ecommerce/_search 4. `revenue`: Daily revenue for this date 5. `cumulative_revenue`: Running total of revenue up to this date - :::: - - ## Next steps [aggregations-tutorial-next-steps] Refer to the [aggregations reference](../aggregations.md) for more details on all available aggregation types.
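To build intuition for the two pipeline aggregations used in this tutorial, the following is a minimal Python sketch of the arithmetic behind a 3-day unweighted moving window and a cumulative sum. It does not call {{es}}. It assumes, as the tutorial's notes describe, that the moving window covers the preceding buckets and excludes the current one, so the first day has no smoothed value. The function names and the `daily_revenue` figures are hypothetical and are not taken from the sample data set.

```python
from itertools import accumulate


def moving_unweighted_avg(values, window=3):
    """Average of up to `window` preceding values; the current value is excluded."""
    smoothed = []
    for i in range(len(values)):
        previous = values[max(0, i - window):i]
        # The first bucket has no previous values, so it gets no smoothed value.
        smoothed.append(sum(previous) / len(previous) if previous else None)
    return smoothed


def cumulative_sum(values):
    """Running total of the values, one entry per bucket."""
    return list(accumulate(values))


# Hypothetical daily revenue figures, for illustration only.
daily_revenue = [100.0, 200.0, 300.0, 400.0]

smoothed = moving_unweighted_avg(daily_revenue, window=3)
running = cumulative_sum(daily_revenue)

print(smoothed)  # [None, 100.0, 150.0, 200.0]
print(running)   # [100.0, 300.0, 600.0, 1000.0]
```

The smoothed series lags the raw series because each value depends only on earlier days, while the running total always ends at the sum of all daily values.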