composite agg fails to bucket date_nanos with date_histogram source #53168

Closed
benwtrent opened this issue Mar 5, 2020 · 2 comments · Fixed by #53315
@benwtrent
Member

This bug is present in at least 7.6, and probably earlier.

A composite agg with a date_histogram source does not behave the same as a plain date_histogram agg on a date_nanos field.

Example:
Let's build a date_nanos index with some docs:

PUT mah_nano
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date_nanos"
      }
    }
  }
}

POST mah_nano/_doc
{
  "@timestamp": "2019-07-15T16:50:31+00:00"
}

POST mah_nano/_doc
{
  "@timestamp": "2019-07-16T16:50:31+00:00"
}

POST mah_nano/_doc
{
  "@timestamp": "2019-07-17T16:50:31+00:00"
}

POST mah_nano/_doc
{
  "@timestamp": "2019-07-17T16:55:31+00:00"
}

A good call from a date_histogram agg:

GET mah_nano/_search?size=0
{
  "aggs": {
    "buckets": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "day"
      },
      "aggs": {
        "avg": {
          "avg": {
            "field": "@timestamp"
          }
        }
      }
    }
  }
}

Returns:

"aggregations" : {
    "buckets" : {
      "buckets" : [
        {
          "key_as_string" : "2019-07-15T00:00:00.000Z",
          "key" : 1563148800000,
          "doc_count" : 1,
          "avg" : {
            "value" : 1.563209431E12,
            "value_as_string" : "2019-07-15T16:50:31.000Z"
          }
        },
        {
          "key_as_string" : "2019-07-16T00:00:00.000Z",
          "key" : 1563235200000,
          "doc_count" : 1,
          "avg" : {
            "value" : 1.563295831E12,
            "value_as_string" : "2019-07-16T16:50:31.000Z"
          }
        },
        {
          "key_as_string" : "2019-07-17T00:00:00.000Z",
          "key" : 1563321600000,
          "doc_count" : 2,
          "avg" : {
            "value" : 1.563382381E12,
            "value_as_string" : "2019-07-17T16:53:01.000Z"
          }
        }
      ]
    }
  }

But the following composite agg does not return the same thing:

GET mah_nano/_search?size=0
{
  "aggs": {
    "buckets": {
      "composite": {
        "sources": [
          {
            "timestamp": {
              "date_histogram": {
                "field": "@timestamp",
                "calendar_interval": "day"
              }
            }
          }
        ]
      },
      "aggs": {
        "avg": {
          "avg": {
            "field": "@timestamp"
          }
        }
      }
    }
  }
}

Instead, there are strange null values and bucket keys:

"aggregations" : {
    "buckets" : {
      "after_key" : {
        "timestamp" : 1563382530921600000
      },
      "buckets" : [
        {
          "key" : {
            "timestamp" : 1563209430940800000
          },
          "doc_count" : 1,
          "avg" : {
            "value" : null
          }
        },
        {
          "key" : {
            "timestamp" : 1563295830940800000
          },
          "doc_count" : 1,
          "avg" : {
            "value" : null
          }
        },
        {
          "key" : {
            "timestamp" : 1563382230940800000
          },
          "doc_count" : 1,
          "avg" : {
            "value" : null
          }
        },
        {
          "key" : {
            "timestamp" : 1563382530921600000
          },
          "doc_count" : 1,
          "avg" : {
            "value" : null
          }
        }
      ]
    }
  }
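
The odd keys above look consistent with the day rounding being applied to the raw nanosecond values as if they were milliseconds. The Python sketch below only checks that arithmetic under that assumption; it is not Elasticsearch code, and `DAY_MS`, `epoch_seconds`, and `floor_to` are names made up for illustration.

```python
from datetime import datetime

DAY_MS = 86_400_000  # one UTC day, expressed in milliseconds


def epoch_seconds(iso: str) -> int:
    """Whole seconds since the epoch for an ISO-8601 timestamp with an offset."""
    return int(datetime.fromisoformat(iso).timestamp())


def floor_to(value: int, interval: int) -> int:
    """Round value down to the nearest multiple of interval."""
    return value - value % interval


# The four documents indexed in the example above.
docs = [
    "2019-07-15T16:50:31+00:00",
    "2019-07-16T16:50:31+00:00",
    "2019-07-17T16:50:31+00:00",
    "2019-07-17T16:55:31+00:00",
]

# Millisecond values rounded by a millisecond interval (what date_histogram returns).
expected_keys = [floor_to(epoch_seconds(d) * 1_000, DAY_MS) for d in docs]

# Assumption: nanosecond values rounded by the same millisecond interval.
odd_keys = [floor_to(epoch_seconds(d) * 1_000_000_000, DAY_MS) for d in docs]

for iso, good, odd in zip(docs, expected_keys, odd_keys):
    print(iso, good, odd)
```

Under that assumption, `expected_keys` lines up with the date_histogram keys and `odd_keys` reproduces the four composite keys in the response above.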
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@nik9000
Member

nik9000 commented Mar 9, 2020

This is what you get with a non-nanos-date:

{
  "took" : 17,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "buckets" : {
      "after_key" : {
        "timestamp" : 1563321600000
      },
      "buckets" : [
        {
          "key" : {
            "timestamp" : 1563148800000
          },
          "doc_count" : 1,
          "avg" : {
            "value" : 1.563209431E12,
            "value_as_string" : "2019-07-15T16:50:31.000Z"
          }
        },
        {
          "key" : {
            "timestamp" : 1563235200000
          },
          "doc_count" : 1,
          "avg" : {
            "value" : 1.563295831E12,
            "value_as_string" : "2019-07-16T16:50:31.000Z"
          }
        },
        {
          "key" : {
            "timestamp" : 1563321600000
          },
          "doc_count" : 2,
          "avg" : {
            "value" : 1.563382381E12,
            "value_as_string" : "2019-07-17T16:53:01.000Z"
          }
        }
      ]
    }
  }
}

Which, I think, is pretty much what you should expect from a nanos date. I'll see if I can make it do that.
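
One way to state that expectation as a check is to reduce both responses to the same shape and compare them. This is a minimal Python sketch: `bucket_summary` is a hypothetical helper, and the bucket values are copied from the responses in this issue.

```python
def bucket_summary(agg: dict, source: str = "timestamp") -> dict:
    """Map each bucket key to (doc_count, avg value).

    Composite buckets wrap the key in an object keyed by source name,
    while date_histogram buckets carry the key directly.
    """
    summary = {}
    for bucket in agg["buckets"]:
        key = bucket["key"]
        if isinstance(key, dict):
            key = key[source]
        summary[key] = (bucket["doc_count"], bucket["avg"]["value"])
    return summary


# date_histogram response on the date_nanos field (from the issue description).
date_histogram_agg = {"buckets": [
    {"key": 1563148800000, "doc_count": 1, "avg": {"value": 1.563209431e12}},
    {"key": 1563235200000, "doc_count": 1, "avg": {"value": 1.563295831e12}},
    {"key": 1563321600000, "doc_count": 2, "avg": {"value": 1.563382381e12}},
]}

# composite response on a millisecond date field (from the comment above).
composite_agg = {"after_key": {"timestamp": 1563321600000}, "buckets": [
    {"key": {"timestamp": 1563148800000}, "doc_count": 1, "avg": {"value": 1.563209431e12}},
    {"key": {"timestamp": 1563235200000}, "doc_count": 1, "avg": {"value": 1.563295831e12}},
    {"key": {"timestamp": 1563321600000}, "doc_count": 2, "avg": {"value": 1.563382381e12}},
]}

same = bucket_summary(date_histogram_agg) == bucket_summary(composite_agg)
print(same)
```

Running the same comparison against the composite response from a date_nanos field would be the regression check for this issue.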

nik9000 added a commit to nik9000/elasticsearch that referenced this issue Mar 9, 2020
It looks like `date_nanos` fields weren't likely to work properly in
composite aggs because composites iterate field values using points and
we weren't converting the points into milliseconds. Because the doc
values were coming back in milliseconds we ended up getting very confused
and just never collecting sub-aggregations.

This fixes that by adding a `parsePointAsMillis` method to
`DateFieldMapper.Resolution`, which is similar in name and function to
`NumberFieldMapper.NumberType`'s `parsePoint` except that it normalizes
to milliseconds, which is what aggs need at the moment.

Closes elastic#53168
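
The idea in the commit message can be sketched outside of Elasticsearch. The Python below is an illustration only, not the actual Java change: the function names are hypothetical, the resolution is passed as a plain string, and the 8-byte layout (big-endian with the sign bit flipped so that byte order matches numeric order) is an assumption about how long points are encoded.

```python
import struct

NANOS_PER_MILLI = 1_000_000
SIGN_BIT = 1 << 63


def encode_long_point(value: int) -> bytes:
    """Encode a signed 64-bit value as 8 sortable bytes (assumed layout)."""
    return struct.pack(">Q", (value & 0xFFFFFFFFFFFFFFFF) ^ SIGN_BIT)


def decode_long_point(point: bytes) -> int:
    """Inverse of encode_long_point."""
    (raw,) = struct.unpack(">Q", point)
    raw ^= SIGN_BIT
    # Reinterpret the unsigned 64-bit result as a signed value.
    return raw - (1 << 64) if raw >= SIGN_BIT else raw


def parse_point_as_millis(point: bytes, resolution: str) -> int:
    """Decode a point and normalize it to milliseconds.

    resolution is "milliseconds" or "nanoseconds" (hypothetical labels).
    """
    value = decode_long_point(point)
    if resolution == "nanoseconds":
        return value // NANOS_PER_MILLI
    return value


# A date_nanos point for 2019-07-15T16:50:31Z, normalized to milliseconds.
nanos_point = encode_long_point(1_563_209_431_000_000_000)
print(parse_point_as_millis(nanos_point, "nanoseconds"))
```

With the point normalized this way, it is in the same unit as the millisecond doc values, so the rounding and the sub-aggregation collection see matching values.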
nik9000 added a commit that referenced this issue Mar 10, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this issue Mar 10, 2020
nik9000 added a commit that referenced this issue Mar 11, 2020