BUG when sorting (realtime) topN results by postAggregation vs aggregation #6375

@max-schmidt54321

I'm running a simple topN query to get the trend of pageviews on recent (realtime) data, which is served by the middle manager:

{
   "queryType":"topN",
   "dataSource":"pageviews",
   "dimension":"onlineId",
   "metric": {
      "type": "numeric",
      "metric": "score"
   },
   "granularity": {"type": "duration", "duration": 1900000, "origin": "2018-09-25T09:12:00.000Z"},
   "threshold": 60,
   "intervals":[
      "2018-09-25T09:12:00.000Z/PT30M"
   ],
   "aggregations":[
      {
         "type":"filtered",
         "filter":{
            "type":"interval",
            "dimension":"__time",
            "intervals":[
               "2018-09-25T09:12:00.000Z/PT15M"
            ]
         },
         "aggregator":{
            "type":"longSum",
            "name":"total_old",
            "fieldName":"count"
         }
      },
      {
         "type":"filtered",
         "filter":{
            "type":"interval",
            "dimension":"__time",
            "intervals":[
               "2018-09-25T09:27:00.000Z/PT15M"
            ]
         },
         "aggregator":{
            "type":"longSum",
            "name":"total_new",
            "fieldName":"count"
         }
      }
   ],
   "postAggregations":[
      {
         "type":"arithmetic",
         "name":"score",
         "fn":"-",
         "fields":[
            {
               "type":"fieldAccess",
               "fieldName":"total_new"
            },
            {
               "type":"fieldAccess",
               "fieldName":"total_old"
            }
         ]
      }
   ]
}

The results differ depending on which metric I sort by: sorting by the postAggregation "score" returns wrong values, while sorting by the aggregation "total_new" returns correct results.

Example Results:

Sorted by postAggregation "metric":"score"

"result": [
            {
                "total_old": 344,
                "onlineId": "10264391",
                "total_new": 1424,
                "score": 1080
            },
            {
                "total_old": 134,
                "onlineId": "6372612",
                "total_new": 606,
                "score": 472
            },
            {
                "total_old": 12,
                "onlineId": "10271038",
                "total_new": 263,
                "score": 251
            },
            {
                "total_old": 53,
                "onlineId": "10261042",
                "total_new": 285,
                "score": 232
            },
	    ...
]

Sorted by aggregation "metric":"total_new"

"result": [
            {
                "total_old": 1250,
                "onlineId": "10264391",
                "total_new": 1424,
                "score": 174
            },
            {
                "total_old": 421,
                "onlineId": "6372612",
                "total_new": 606,
                "score": 185
            },
            {
                "total_old": 408,
                "onlineId": "10271038",
                "total_new": 360,
                "score": -48
            },
            {
                "total_old": 251,
                "onlineId": "10261042",
                "total_new": 285,
                "score": 34
            },
	    ...
]
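To make the discrepancy easier to see, here is a small Python sketch that compares the two result sets. The numbers are copied from the four rows shown above (the elided rows are not included), and the variable and function names are only illustrative. It checks that "score" equals total_new - total_old within each result set, and then shows how far the underlying aggregates drift apart between the two sort orders.

```python
# Rows copied from the two result sets above, keyed by onlineId.
sorted_by_score = {
    "10264391": {"total_old": 344, "total_new": 1424, "score": 1080},
    "6372612": {"total_old": 134, "total_new": 606, "score": 472},
    "10271038": {"total_old": 12, "total_new": 263, "score": 251},
    "10261042": {"total_old": 53, "total_new": 285, "score": 232},
}
sorted_by_total_new = {
    "10264391": {"total_old": 1250, "total_new": 1424, "score": 174},
    "6372612": {"total_old": 421, "total_new": 606, "score": 185},
    "10271038": {"total_old": 408, "total_new": 360, "score": -48},
    "10261042": {"total_old": 251, "total_new": 285, "score": 34},
}


def score_is_consistent(rows):
    """True if score == total_new - total_old for every row."""
    return all(
        r["score"] == r["total_new"] - r["total_old"] for r in rows.values()
    )


def aggregate_gaps(wrong, right):
    """Per onlineId, how much each aggregate is lower in `wrong` than in `right`."""
    return {
        online_id: {
            "total_old": right[online_id]["total_old"] - wrong[online_id]["total_old"],
            "total_new": right[online_id]["total_new"] - wrong[online_id]["total_new"],
        }
        for online_id in wrong
    }


gaps = aggregate_gaps(sorted_by_score, sorted_by_total_new)

if __name__ == "__main__":
    print(score_is_consistent(sorted_by_score))      # the arithmetic itself is fine
    print(score_is_consistent(sorted_by_total_new))
    for online_id, gap in gaps.items():
        print(online_id, gap)
```

In both result sets the subtraction is internally consistent, so the postAggregation arithmetic does not look like the problem. What changes is the aggregates feeding it: total_old is much lower in every row when sorting by "score", and for onlineId 10271038 total_new is lower as well.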

Notes:

  • Druid v0.12.3
  • Data is ingested from Kafka
  • When I set the threshold high enough, both sort orders return the correct results
  • When I query data from the historicals (one day earlier), both sort orders return the correct results
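The notes above (correct results with a high threshold, and on historicals) would be consistent with topN being approximate: according to the Druid documentation, each data process ranks its own top K rows and returns only those to the Broker. The following Python sketch is a toy model of that behaviour, not Druid's actual implementation. The two segments, their contents, and all function names are hypothetical and chosen only to show how truncating per segment by a post-aggregated metric could undercount an aggregate. Whether this is what happens in the query above is an assumption, not something confirmed here.

```python
from collections import defaultdict

# Hypothetical per-segment partial aggregates (NOT data from the issue).
# segment_1 holds mostly "old" rows, segment_2 only "new" rows.
segment_1 = [
    {"onlineId": "a", "total_old": 100, "total_new": 10},
    {"onlineId": "b", "total_old": 5, "total_new": 2},
    {"onlineId": "c", "total_old": 1, "total_new": 1},
]
segment_2 = [
    {"onlineId": "a", "total_old": 0, "total_new": 110},
    {"onlineId": "b", "total_old": 0, "total_new": 8},
    {"onlineId": "c", "total_old": 0, "total_new": 1},
]


def score(row):
    """Same arithmetic as the postAggregation: total_new - total_old."""
    return row["total_new"] - row["total_old"]


def total_new(row):
    return row["total_new"]


def top_k(rows, k, key):
    """Keep only the k highest-ranked rows."""
    return sorted(rows, key=key, reverse=True)[:k]


def merge(partials):
    """Sum the aggregates of whatever rows each segment returned."""
    merged = defaultdict(lambda: {"total_old": 0, "total_new": 0})
    for part in partials:
        for row in part:
            merged[row["onlineId"]]["total_old"] += row["total_old"]
            merged[row["onlineId"]]["total_new"] += row["total_new"]
    return [{"onlineId": oid, **agg} for oid, agg in merged.items()]


def query(k, key):
    """Toy topN: truncate per segment first, then merge and rank again."""
    partials = [top_k(seg, k, key) for seg in (segment_1, segment_2)]
    return top_k(merge(partials), k, key)


if __name__ == "__main__":
    print(query(2, score))      # "a" loses its segment_1 row: total_old is 0
    print(query(2, total_new))  # "a" is complete: total_old 100, total_new 120
    print(query(3, score))      # k large enough: nothing is truncated
```

In this toy model, ranking segment_1 by score puts "a" last (10 - 100 = -90), so its row is dropped before merging and the merged total_old for "a" is too low, which inflates its score. Ranking by total_new keeps "a" in both segments, and raising k to cover every row removes the effect entirely.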
