diff --git a/docs/reference/aggregations/bucket/datehistogram-aggregation.asciidoc b/docs/reference/aggregations/bucket/datehistogram-aggregation.asciidoc index 2ee40b24a8548..2ee9025b6ded8 100644 --- a/docs/reference/aggregations/bucket/datehistogram-aggregation.asciidoc +++ b/docs/reference/aggregations/bucket/datehistogram-aggregation.asciidoc @@ -10,122 +10,194 @@ that here the interval can be specified using date/time expressions. Time-based data requires special support because time-based intervals are not always a fixed length. -==== Setting intervals - -There seems to be no limit to the creativity we humans apply to setting our -clocks and calendars. We've invented leap years and leap seconds, standard and -daylight savings times, and timezone offsets of 30 or 45 minutes rather than a -full hour. While these creations help keep us in sync with the cosmos and our -environment, they can make specifying time intervals accurately a real challenge. -The only universal truth our researchers have yet to disprove is that a -millisecond is always the same duration, and a second is always 1000 milliseconds. -Beyond that, things get complicated. - -Generally speaking, when you specify a single time unit, such as 1 hour or 1 day, you -are working with a _calendar interval_, but multiples, such as 6 hours or 3 days, are -_fixed-length intervals_. - -For example, a specification of 1 day (1d) from now is a calendar interval that -means "at -this exact time tomorrow" no matter the length of the day. A change to or from -daylight savings time that results in a 23 or 25 hour day is compensated for and the -specification of "this exact time tomorrow" is maintained. But if you specify 2 or -more days, each day must be of the same fixed duration (24 hours). In this case, if -the specified interval includes the change to or from daylight savings time, the -interval will end an hour sooner or later than you expect. 
- -There are similar differences to consider when you specify single versus multiple -minutes or hours. Multiple time periods longer than a day are not supported. - -Here are the valid time specifications and their meanings: +==== Calendar and Fixed intervals -milliseconds (ms) :: -Fixed length interval; supports multiples. +When configuring a date histogram aggregation, the interval can be specified +in two ways: calendar-aware time intervals, and fixed time intervals. -seconds (s) :: -1000 milliseconds; fixed length interval (except for the last second of a -minute that contains a leap-second, which is 2000ms long); supports multiples. +Calendar-aware intervals understand that daylight savings changes the length +of specific days, months have different numbers of days, and leap seconds can +be tacked onto a particular year. -minutes (m) :: +Fixed intervals are, by contrast, always multiples of SI units and do not change +based on calendaring context. + +[NOTE] +.Combined `interval` field is deprecated +================================== +deprecated[7.2, `interval` field is deprecated] Historically both calendar and fixed +intervals were configured in a single `interval` field, which led to confusing +semantics. Specifying `1d` would be treated as a calendar-aware time, +whereas `2d` would be interpreted as fixed time. To get "one day" of fixed time, +the user would need to specify the next smaller unit (in this case, `24h`). + +This combined behavior was often unknown to users, and even users who knew about +it found it difficult to use and confusing. + +This behavior has been deprecated in favor of two new, explicit fields: `calendar_interval` +and `fixed_interval`. + +By forcing a choice between calendar and fixed intervals up front, the semantics of the interval +are clear to the user immediately and there is no ambiguity. The old `interval` field +will be removed in the future. 
+================================== + +===== Calendar Intervals + +Calendar-aware intervals are configured with the `calendar_interval` parameter. +Calendar intervals can only be specified in "singular" quantities of the unit +(`1d`, `1M`, etc). Multiples, such as `2d`, are not supported and will throw an exception. + +The accepted units for calendar intervals are: + +minute (`m`, `1m`) :: All minutes begin at 00 seconds. -* One minute (1m) is the interval between 00 seconds of the first minute and 00 +One minute is the interval between 00 seconds of the first minute and 00 seconds of the following minute in the specified timezone, compensating for any -intervening leap seconds, so that the number of minutes and seconds past the -hour is the same at the start and end. -* Multiple minutes (__n__m) are intervals of exactly 60x1000=60,000 milliseconds -each. +intervening leap seconds, so that the number of minutes and seconds past the +hour is the same at the start and end. -hours (h) :: +hour (`h`, `1h`) :: All hours begin at 00 minutes and 00 seconds. -* One hour (1h) is the interval between 00:00 minutes of the first hour and 00:00 +One hour (1h) is the interval between 00:00 minutes of the first hour and 00:00 minutes of the following hour in the specified timezone, compensating for any intervening leap seconds, so that the number of minutes and seconds past the hour -is the same at the start and end. -* Multiple hours (__n__h) are intervals of exactly 60x60x1000=3,600,000 milliseconds -each. +is the same at the start and end. -days (d) :: + +day (`d`, `1d`) :: All days begin at the earliest possible time, which is usually 00:00:00 (midnight). -* One day (1d) is the interval between the start of the day and the start of +One day (1d) is the interval between the start of the day and the start of the following day in the specified timezone, compensating for any intervening time changes. 
-* Multiple days (__n__d) are intervals of exactly 24x60x60x1000=86,400,000 -milliseconds each. -weeks (w) :: +week (`w`, `1w`) :: -* One week (1w) is the interval between the start day_of_week:hour:minute:second -and the same day of the week and time of the following week in the specified +One week is the interval between the start day_of_week:hour:minute:second +and the same day of the week and time of the following week in the specified timezone. -* Multiple weeks (__n__w) are not supported. -months (M) :: +month (`M`, `1M`) :: -* One month (1M) is the interval between the start day of the month and time of +One month is the interval between the start day of the month and time of day and the same day of the month and time of the following month in the specified timezone, so that the day of the month and time of day are the same at the start and end. -* Multiple months (__n__M) are not supported. -quarters (q) :: +quarter (`q`, `1q`) :: -* One quarter (1q) is the interval between the start day of the month and +One quarter (1q) is the interval between the start day of the month and time of day and the same day of the month and time of day three months later, so that the day of the month and time of day are the same at the start and end. + -* Multiple quarters (__n__q) are not supported. -years (y) :: +year (`y`, `1y`) :: -* One year (1y) is the interval between the start day of the month and time of -day and the same day of the month and time of day the following year in the +One year (1y) is the interval between the start day of the month and time of +day and the same day of the month and time of day the following year in the specified timezone, so that the date and time are the same at the start and end. + -* Multiple years (__n__y) are not supported. -NOTE: -In all cases, when the specified end time does not exist, the actual end time is -the closest available time after the specified end. 
+===== Calendar Interval Examples +As an example, here is an aggregation requesting bucket intervals of a month in calendar time: -Widely distributed applications must also consider vagaries such as countries that -start and stop daylight savings time at 12:01 A.M., so end up with one minute of -Sunday followed by an additional 59 minutes of Saturday once a year, and countries -that decide to move across the international date line. Situations like -that can make irregular timezone offsets seem easy. +[source,js] +-------------------------------------------------- +POST /sales/_search?size=0 +{ + "aggs" : { + "sales_over_time" : { + "date_histogram" : { + "field" : "date", + "calendar_interval" : "month" + } + } + } +} +-------------------------------------------------- +// CONSOLE +// TEST[setup:sales] -As always, rigorous testing, especially around time-change events, will ensure -that your time interval specification is -what you intend it to be. +If you attempt to use multiples of calendar units, the aggregation will fail because only +singular calendar units are supported: -WARNING: -To avoid unexpected results, all connected servers and clients must sync to a -reliable network time service. 
+[source,js] +-------------------------------------------------- +POST /sales/_search?size=0 +{ + "aggs" : { + "sales_over_time" : { + "date_histogram" : { + "field" : "date", + "calendar_interval" : "2d" + } + } + } +} +-------------------------------------------------- +// CONSOLE +// TEST[setup:sales] +// TEST[catch:bad_request] -==== Examples +[source,js] +-------------------------------------------------- +{ + "error" : { + "root_cause" : [...], + "type" : "x_content_parse_exception", + "reason" : "[1:82] [date_histogram] failed to parse field [calendar_interval]", + "caused_by" : { + "type" : "illegal_argument_exception", + "reason" : "The supplied interval [2d] could not be parsed as a calendar interval.", + "stack_trace" : "java.lang.IllegalArgumentException: The supplied interval [2d] could not be parsed as a calendar interval." + } + } +} + +-------------------------------------------------- +// NOTCONSOLE + +===== Fixed Intervals + +Fixed intervals are configured with the `fixed_interval` parameter. + +In contrast to calendar-aware intervals, fixed intervals are a fixed number of SI +units and never deviate, regardless of where they fall on the calendar. One second +is always composed of 1000ms. This allows fixed intervals to be specified in +any multiple of the supported units. + +However, it means fixed intervals cannot express other units such as months, +since the duration of a month is not a fixed quantity. Attempting to specify +a calendar interval like month or quarter will throw an exception. + +The accepted units for fixed intervals are: + +milliseconds (ms) :: + +seconds (s) :: +Defined as 1000 milliseconds each + +minutes (m) :: +All minutes begin at 00 seconds. -Requesting bucket intervals of a month. +Defined as 60 seconds each (60,000 milliseconds) + +hours (h) :: +All hours begin at 00 minutes and 00 seconds. 
+Defined as 60 minutes each (3,600,000 milliseconds) + +days (d) :: +All days begin at the earliest possible time, which is usually 00:00:00 +(midnight). + +Defined as 24 hours (86,400,000 milliseconds) + +===== Fixed Interval Examples + +If we try to recreate the "month" `calendar_interval` from earlier, we can approximate that with +30 fixed days: [source,js] -------------------------------------------------- @@ -135,7 +207,7 @@ POST /sales/_search?size=0 "sales_over_time" : { "date_histogram" : { "field" : "date", - "calendar_interval" : "month" + "fixed_interval" : "30d" } } } @@ -144,11 +216,7 @@ POST /sales/_search?size=0 // CONSOLE // TEST[setup:sales] -You can also specify time values using abbreviations supported by -<> parsing. -Note that fractional time values are not supported, but you can address this by -shifting to another -time unit (e.g., `1.5h` could instead be specified as `90m`). +But if we try to use a calendar unit that is not supported, such as weeks, we'll get an exception: [source,js] -------------------------------------------------- @@ -158,7 +226,7 @@ POST /sales/_search?size=0 "sales_over_time" : { "date_histogram" : { "field" : "date", - "fixed_interval" : "90m" + "fixed_interval" : "2w" } } } @@ -166,6 +234,50 @@ POST /sales/_search?size=0 -------------------------------------------------- // CONSOLE // TEST[setup:sales] +// TEST[catch:bad_request] + +[source,js] +-------------------------------------------------- +{ + "error" : { + "root_cause" : [...], + "type" : "x_content_parse_exception", + "reason" : "[1:82] [date_histogram] failed to parse field [fixed_interval]", + "caused_by" : { + "type" : "illegal_argument_exception", + "reason" : "failed to parse setting [date_histogram.fixedInterval] with value [2w] as a time value: unit is missing or unrecognized", + "stack_trace" : "java.lang.IllegalArgumentException: failed to parse setting [date_histogram.fixedInterval] with value [2w] as a time value: unit is missing or unrecognized" 
+ } + } +} + +-------------------------------------------------- +// NOTCONSOLE + +===== Notes + +In all cases, when the specified end time does not exist, the actual end time is +the closest available time after the specified end. + +Widely distributed applications must also consider vagaries such as countries that +start and stop daylight savings time at 12:01 A.M., so end up with one minute of +Sunday followed by an additional 59 minutes of Saturday once a year, and countries +that decide to move across the international date line. Situations like +that can make irregular timezone offsets seem easy. + +As always, rigorous testing, especially around time-change events, will ensure +that your time interval specification is +what you intend it to be. + +WARNING: +To avoid unexpected results, all connected servers and clients must sync to a +reliable network time service. + +NOTE: Fractional time values are not supported, but you can address this by +shifting to another time unit (e.g., `1.5h` could instead be specified as `90m`). + +NOTE: You can also specify time values using abbreviations supported by +<> parsing. ===== Keys @@ -522,8 +634,6 @@ control the order using the `order` setting. This setting supports the same `order` functionality as <>. -deprecated[6.0.0, Use `_key` instead of `_time` to order buckets by their dates/keys] - ===== Using a script to aggregate by day of the week When you need to aggregate the results by day of the week, use a script that