Support offset in composite aggs (#50609) (#50808)
Adds support for the `offset` parameter to the `date_histogram` source
of composite aggs. The `offset` parameter is supported by the normal
`date_histogram` aggregation and is useful for folks who need to
measure things from, say, 6am one day to 6am the next day.

This is implemented by creating a new `Rounding` that knows how to
handle offsets and delegates to other rounding implementations. That
implementation doesn't fully implement the `Rounding` contract: it omits
`nextRoundingValue`. Composite aggs never call that method, so I can't
be sure that any implementation I add would be correct. I propose
leaving it throwing `UnsupportedOperationException` until it is needed.
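
The delegation idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not the Java code from this commit: `DayRounding`, `OffsetRounding`, and their method names are hypothetical stand-ins, and the delegate here only floors epoch milliseconds to UTC midnight.

```python
DAY_MS = 24 * 60 * 60 * 1000


class DayRounding:
    """Hypothetical delegate: floors epoch millis to UTC midnight."""

    def round(self, millis: int) -> int:
        # Python's % is non-negative for a positive divisor, so this
        # floors correctly for negative timestamps too.
        return millis - millis % DAY_MS


class OffsetRounding:
    """Hypothetical wrapper that shifts every bucket start by a fixed offset."""

    def __init__(self, delegate, offset_ms: int):
        self.delegate = delegate
        self.offset_ms = offset_ms

    def round(self, millis: int) -> int:
        # Shift into the delegate's frame, round there, then shift back.
        return self.delegate.round(millis - self.offset_ms) + self.offset_ms

    def next_rounding_value(self, millis: int) -> int:
        # Deliberately unsupported, mirroring the choice described above.
        raise NotImplementedError("not needed by composite aggs")
```

Wrapping an existing rounding keeps the offset logic in one place instead of repeating it in every rounding implementation.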

Closes #48757
nik9000 committed Jan 9, 2020
1 parent c51303d commit 1d8e51f
Showing 10 changed files with 362 additions and 53 deletions.
66 changes: 66 additions & 0 deletions docs/reference/aggregations/bucket/composite-aggregation.asciidoc
@@ -287,6 +287,72 @@ Time zones may either be specified as an ISO 8601 UTC offset (e.g. `+01:00` or
`-08:00`) or as a timezone id, an identifier used in the TZ database like
`America/Los_Angeles`.

*Offset*

include::datehistogram-aggregation.asciidoc[tag=offset-explanation]

[source,console,id=composite-aggregation-datehistogram-offset-example]
----
PUT my_index/_doc/1?refresh
{
  "date": "2015-10-01T05:30:00Z"
}
PUT my_index/_doc/2?refresh
{
  "date": "2015-10-01T06:30:00Z"
}
GET my_index/_search?size=0
{
  "aggs": {
    "my_buckets": {
      "composite" : {
        "sources" : [
          {
            "date": {
              "date_histogram" : {
                "field": "date",
                "calendar_interval": "day",
                "offset": "+6h",
                "format": "iso8601"
              }
            }
          }
        ]
      }
    }
  }
}
----

include::datehistogram-aggregation.asciidoc[tag=offset-result-intro]

[source,console-result]
----
{
  ...
  "aggregations": {
    "my_buckets": {
      "after_key": { "date": "2015-10-01T06:00:00.000Z" },
      "buckets": [
        {
          "key": { "date": "2015-09-30T06:00:00.000Z" },
          "doc_count": 1
        },
        {
          "key": { "date": "2015-10-01T06:00:00.000Z" },
          "doc_count": 1
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
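
The bucket keys in the response above can be reproduced by hand. The following Python sketch assumes UTC and a plain "subtract the offset, floor to midnight, add the offset back" rule; `day_bucket_key` is a hypothetical helper written for this illustration, not an Elasticsearch API.

```python
from datetime import datetime, timedelta, timezone


def day_bucket_key(ts: datetime, offset: timedelta) -> datetime:
    """Return the start of the offset day bucket that contains ts (UTC only)."""
    shifted = ts - offset
    midnight = shifted.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + offset


# The two documents indexed in the example request.
docs = [
    datetime(2015, 10, 1, 5, 30, tzinfo=timezone.utc),
    datetime(2015, 10, 1, 6, 30, tzinfo=timezone.utc),
]
keys = [day_bucket_key(d, timedelta(hours=6)) for d in docs]
for key in keys:
    print(key.isoformat())
```

The 05:30 document falls before 6am, so it lands in the bucket that started at 06:00 on 30 September, while the 06:30 document starts a new bucket at 06:00 on 1 October.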

include::datehistogram-aggregation.asciidoc[tag=offset-note]

===== Mixing different values source

The `sources` parameter accepts an array of values source.
@@ -461,16 +461,19 @@ the bucket covering that day will only hold data for 23 hours instead of the usu
where you'll have only an 11h bucket on the morning of 27 March when the DST shift
happens.

[[search-aggregations-bucket-datehistogram-offset]]
===== Offset

// tag::offset-explanation[]
Use the `offset` parameter to change the start value of each bucket by the
specified positive (`+`) or negative (`-`) offset duration, such as `1h` for
an hour, or `1d` for a day. See <<time-units>> for more possible time
duration options.

For example, when using an interval of `day`, each bucket runs from midnight
to midnight. Setting the `offset` parameter to `+6h` changes each bucket
to run from 6am to 6am:
// end::offset-explanation[]

[source,console]
-----------------------------
@@ -498,8 +501,10 @@ GET my_index/_search?size=0
}
-----------------------------

// tag::offset-result-intro[]
Instead of a single bucket starting at midnight, the above request groups the
documents into buckets starting at 6am:
// end::offset-result-intro[]

[source,console-result]
-----------------------------
@@ -525,8 +530,10 @@ documents into buckets starting at 6am:
-----------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

// tag::offset-note[]
NOTE: The start `offset` of each bucket is calculated after `time_zone`
adjustments have been made.
// end::offset-note[]
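
One way to read this note is that buckets are first aligned to midnight in the requested time zone and only then shifted by the offset. The Python sketch below illustrates that ordering under assumptions: `bucket_start` is a hypothetical helper, and it uses a fixed UTC offset zone (`-08:00`) so that daylight saving transitions do not come into play.

```python
from datetime import datetime, timedelta, timezone


def bucket_start(ts: datetime, tz: timezone, offset: timedelta) -> datetime:
    """Start of the offset day bucket containing ts, for a fixed-offset zone."""
    # Step 1: time zone adjustment, i.e. view the shifted instant in local time.
    local = (ts - offset).astimezone(tz)
    # Step 2: floor to local midnight.
    local_midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    # Step 3: apply the offset on top of the zone-aligned bucket start.
    return (local_midnight + offset).astimezone(timezone.utc)


pacific = timezone(timedelta(hours=-8))  # fixed ISO 8601 offset -08:00
six_hours = timedelta(hours=6)

early = bucket_start(datetime(2015, 10, 1, 13, 30, tzinfo=timezone.utc), pacific, six_hours)
late = bucket_start(datetime(2015, 10, 1, 14, 30, tzinfo=timezone.utc), pacific, six_hours)
print(early.isoformat(), late.isoformat())
```

Here 13:30Z is 05:30 local time, so it belongs to the bucket that began at 06:00 local time the previous day, while 14:30Z is 06:30 local time and opens the next bucket.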

===== Keyed Response

@@ -390,6 +390,40 @@ setup:
  - match: { aggregations.test.buckets.0.key.date: "2017-10-21" }
  - match: { aggregations.test.buckets.0.doc_count: 1 }

---
"Composite aggregation with date_histogram offset":
  - skip:
      version: " - 7.5.99"
      reason: offset introduced in 7.6.0

  - do:
      search:
        rest_total_hits_as_int: true
        index: test
        body:
          aggregations:
            test:
              composite:
                sources: [
                  {
                    "date": {
                      "date_histogram": {
                        "field": "date",
                        "calendar_interval": "1d",
                        "offset": "4h",
                        "format": "iso8601" # Format makes the comparisons a little more obvious
                      }
                    }
                  }
                ]

  - match: {hits.total: 6}
  - length: { aggregations.test.buckets: 2 }
  - match: { aggregations.test.buckets.0.key.date: "2017-10-19T04:00:00.000Z" }
  - match: { aggregations.test.buckets.0.doc_count: 1 }
  - match: { aggregations.test.buckets.1.key.date: "2017-10-21T04:00:00.000Z" }
  - match: { aggregations.test.buckets.1.doc_count: 1 }

---
"Composite aggregation with after_key in the response":
- skip:
@@ -702,7 +736,6 @@ setup:
reason: geotile_grid is not supported until 7.5.0
- do:
search:
rest_total_hits_as_int: true
index: test
body:
aggregations:
@@ -725,7 +758,8 @@ setup:
]
after: { "geo": "12/730/1590", "kw": "foo" }

- match: {hits.total: 6}
- match: { hits.total.value: 6 }
- match: { hits.total.relation: "eq" }
- length: { aggregations.test.buckets: 3 }
- match: { aggregations.test.buckets.0.key.geo: "12/1236/1533" }
- match: { aggregations.test.buckets.0.key.kw: "bar" }