druid post aggregation KeyError for arithmetic division type #7511

Open
david-z-johnson opened this issue May 15, 2019 · 7 comments

@david-z-johnson

commented May 15, 2019

I added a custom post-aggregation in Superset, but got a KeyError.

Environment

(please complete the following information):

  • superset version: 0.28.1 / 0.29.0rc6 / 0.32.0rc2, installed via pip
  • python version: 3.6.8

Checklist

Make sure these boxes are checked before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Call stack:

    2019-05-15 10:40:59,534:ERROR:root:'totalActionTimeMs'
    Traceback (most recent call last):
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/viz.py", line 406, in get_df_payload
        df = self.get_df(query_obj)
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/viz.py", line 211, in get_df
        self.results = self.datasource.query(query_obj)
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/connectors/druid/models.py", line 1342, in query
        client=client, query_obj=query_obj, phase=2)
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/connectors/druid/models.py", line 940, in get_query_str
        return self.run_query(client=client, phase=phase, **query_obj)
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/connectors/druid/models.py", line 1141, in run_query
        metrics_dict)
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/connectors/druid/models.py", line 903, in metrics_and_post_aggs
        postagg, post_aggs, saved_agg_names, visited_postaggs, metrics_dict)
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/connectors/druid/models.py", line 869, in resolve_postagg
        required_fields, metrics_dict)
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/connectors/druid/models.py", line 830, in find_postaggs_for
        metrics_dict[name] for name in postagg_names
      File "/home/work/superset-venv/lib/python3.6/site-packages/superset/connectors/druid/models.py", line 831, in
        if metrics_dict[name].metric_type == POST_AGG_TYPE
    KeyError: 'totalActionTimeMs'
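
The last two frames of the traceback show a direct dictionary lookup, `metrics_dict[name]`, failing for a name that the post-aggregation refers to. Below is a minimal sketch of that failure mode. It is not Superset's actual code: the `Metric` tuple, the `find_postaggs_strict` function, and the contents of `metrics_dict` are hypothetical stand-ins, and the assumption is simply that the dictionary does not contain the referenced names.

```python
from collections import namedtuple

# Hypothetical stand-in for a Superset metric record; only the attribute
# that appears in the traceback (metric_type) is modelled here.
Metric = namedtuple("Metric", ["metric_name", "metric_type"])

POST_AGG_TYPE = "postagg"


def find_postaggs_strict(postagg_names, metrics_dict):
    """Mimic the lookup shape from the traceback: index the dict directly."""
    return [
        metrics_dict[name]
        for name in postagg_names
        if metrics_dict[name].metric_type == POST_AGG_TYPE
    ]


# Assumption: the dict holds only the post-aggregation itself, so the
# names it references are absent.
metrics_dict = {
    "avgActionTimeInMs": Metric("avgActionTimeInMs", POST_AGG_TYPE),
}

try:
    find_postaggs_strict(["totalActionTimeMs", "countRows"], metrics_dict)
    error = None
except KeyError as exc:
    error = exc

# The first missing name is the one carried by the exception.
print(repr(error))
```

In this sketch the first missing name visited is the one reported, so the name in the error depends on iteration order.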

Superset custom metric

type: postagg
json:
    {
        "type": "arithmetic",
        "name": "avgActionTimeInMs",
        "fn": "/",
        "fields": [
            {
                "type": "fieldAccess",
                "name": "totalActionTimeMs",
                "fieldName": "totalActionTimeMs"
            },
            {
                "type": "fieldAccess",
                "name": "countRows",
                "fieldName": "countRows"
            }
        ]
    }
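
To see which metric names a post-aggregation like the one above depends on, the `fieldAccess` nodes can be walked recursively. The following is a small sketch for inspecting the JSON by hand; `referenced_fields` is a hypothetical helper written for this example, not a Superset function.

```python
import json

POSTAGG_JSON = """
{
    "type": "arithmetic",
    "name": "avgActionTimeInMs",
    "fn": "/",
    "fields": [
        {"type": "fieldAccess", "name": "totalActionTimeMs",
         "fieldName": "totalActionTimeMs"},
        {"type": "fieldAccess", "name": "countRows",
         "fieldName": "countRows"}
    ]
}
"""


def referenced_fields(postagg):
    """Collect fieldName values from fieldAccess nodes, recursing into "fields"."""
    names = set()
    if postagg.get("type") == "fieldAccess":
        names.add(postagg["fieldName"])
    for child in postagg.get("fields", []):
        names |= referenced_fields(child)
    return names


required = referenced_fields(json.loads(POSTAGG_JSON))
print(sorted(required))  # ['countRows', 'totalActionTimeMs']
```

Each name in `required` has to resolve to a metric before the division can be computed.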

Druid metric spec

    "metricsSpec": [
        {
            "type": "count",
            "name": "countRows"
        },
        {
            "type" : "doubleSum",
            "name" : "totalActionTimeMs",
            "fieldName" : "actionTimeInMs"
        }
    ],

@issue-label-bot added the #bug label May 15, 2019

@issue-label-bot


commented May 15, 2019

Issue-Label Bot is automatically applying the label #bug to this issue, with a confidence of 0.96. Please mark this comment with 👍 or 👎 to give our bot feedback!


@elukey

Contributor

commented May 15, 2019

@david-z-johnson I'd suggest testing 0.32 or later to get some attention; there are a ton of fixes between your version and the most recent ones.

@david-z-johnson

Author

commented May 16, 2019

@elukey, I tried the latest version, 0.32.0rc2, and got the same result as before.

Below is the error call stack:

    2019-05-16 20:56:54,345:ERROR:root:'countRows'
    Traceback (most recent call last):
      File "/home/monitor/incubator-superset/superset/viz.py", line 410, in get_df_payload
        df = self.get_df(query_obj)
      File "/home/monitor/incubator-superset/superset/viz.py", line 213, in get_df
        self.results = self.datasource.query(query_obj)
      File "/home/monitor/incubator-superset/superset/connectors/druid/models.py", line 1286, in query
        client=client, query_obj=query_obj, phase=2)
      File "/home/monitor/incubator-superset/superset/connectors/druid/models.py", line 883, in get_query_str
        return self.run_query(client=client, phase=phase, **query_obj)
      File "/home/monitor/incubator-superset/superset/connectors/druid/models.py", line 1084, in run_query
        metrics_dict)
      File "/home/monitor/incubator-superset/superset/connectors/druid/models.py", line 846, in metrics_and_post_aggs
        postagg, post_aggs, saved_agg_names, visited_postaggs, metrics_dict)
      File "/home/monitor/incubator-superset/superset/connectors/druid/models.py", line 812, in resolve_postagg
        required_fields, metrics_dict)
      File "/home/monitor/incubator-superset/superset/connectors/druid/models.py", line 773, in find_postaggs_for
        metrics_dict[name] for name in postagg_names
      File "/home/monitor/incubator-superset/superset/connectors/druid/models.py", line 774, in
        if metrics_dict[name].metric_type == POST_AGG_TYPE
    KeyError: 'countRows'

@elukey

Contributor

commented May 16, 2019

There might be something wrong with the Druid post aggregation; I haven't checked closely. Can you confirm that it works as expected when querying Druid directly?

@david-z-johnson

Author

commented May 17, 2019

@elukey, I checked: the following query JSON works when sent to Druid directly over HTTP:

    {
        "queryType": "groupBy",
        "dataSource": "xxxxxxxxxx",
        "granularity": "day",
        "dimensions": [
            "actionName"
        ],
        "aggregations": [
            {
                "type": "count",
                "name": "countRows"
            },
            {
                "type": "doubleSum",
                "name": "totalActionTimeMs",
                "fieldName": "actionTimeInMs"
            }
        ],
        "postAggregations": [
            {
                "type": "arithmetic",
                "name": "avgActionTimeInMs",
                "fn": "/",
                "fields": [
                    {
                        "type": "fieldAccess",
                        "name": "totalActionTimeMs",
                        "fieldName": "totalActionTimeMs"
                    },
                    {
                        "type": "fieldAccess",
                        "name": "countRows",
                        "fieldName": "countRows"
                    }
                ]
            }
        ],
        "intervals": [
            "2019-05-01T00:00:00.000Z/2019-05-16T00:00:00.000Z"
        ]
    }
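
For anyone who wants to repeat this check outside Superset, here is a rough sketch of posting the same query to a Druid broker's native query endpoint (`/druid/v2/`) using only the standard library. The broker address `localhost:8082` is an assumption and must be adjusted for your cluster, `build_druid_request` is a hypothetical helper, and the masked `dataSource` value is kept exactly as posted.

```python
import json
import urllib.request

# Assumption: broker host and port; replace with your own broker address.
BROKER_URL = "http://localhost:8082/druid/v2/"


def build_druid_request(query, url=BROKER_URL):
    """Wrap a native Druid query dict in a JSON POST request (nothing is sent yet)."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


query = {
    "queryType": "groupBy",
    "dataSource": "xxxxxxxxxx",  # masked in the original report
    "granularity": "day",
    "dimensions": ["actionName"],
    "aggregations": [
        {"type": "count", "name": "countRows"},
        {"type": "doubleSum", "name": "totalActionTimeMs",
         "fieldName": "actionTimeInMs"},
    ],
    "postAggregations": [
        {
            "type": "arithmetic",
            "name": "avgActionTimeInMs",
            "fn": "/",
            "fields": [
                {"type": "fieldAccess", "name": "totalActionTimeMs",
                 "fieldName": "totalActionTimeMs"},
                {"type": "fieldAccess", "name": "countRows",
                 "fieldName": "countRows"},
            ],
        }
    ],
    "intervals": ["2019-05-01T00:00:00.000Z/2019-05-16T00:00:00.000Z"],
}

request = build_druid_request(query)

# To run it against a live broker, uncomment:
# with urllib.request.urlopen(request) as response:
#     rows = json.load(response)
```

Building the request separately from sending it keeps the sketch runnable without a cluster.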

@elukey

Contributor

commented May 18, 2019

Follow-up question: when you check the datasource via Sources -> Druid datasources -> your_datasource, do you see the countRows metric, or are you defining it in another way? (I'm trying to understand, since I don't have a ton of experience with Druid.)

My goal is to reproduce the issue, but so far I have only been able to do so by adding names of metrics that are not listed in the datasource's details.
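
The condition described here, a post-aggregation naming metrics that the datasource does not list, can be checked up front with a set difference. This is only a sketch: `missing_metric_names` is a hypothetical helper, and the second list of datasource metric names (`"count"`) is made up for illustration.

```python
def missing_metric_names(required_fields, datasource_metric_names):
    """Return the referenced names that the datasource does not list, sorted."""
    return sorted(set(required_fields) - set(datasource_metric_names))


# Names referenced by the post-aggregation in this issue.
required = ["totalActionTimeMs", "countRows"]

# Case 1: the datasource lists both metrics, so nothing is missing.
all_present = missing_metric_names(required, ["countRows", "totalActionTimeMs"])

# Case 2 (made-up datasource): neither referenced name is listed.
none_present = missing_metric_names(required, ["count"])

print(all_present)   # []
print(none_present)  # ['countRows', 'totalActionTimeMs']
```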

@david-z-johnson

Author

commented May 20, 2019

@elukey, the fields countRows and totalActionTimeMs can both be seen in the Superset Druid datasource.
The Druid metric spec:

    "metricsSpec": [
        {
            "type": "count",
            "name": "countRows"
        },
        {
            "type" : "doubleSum",
            "name" : "totalActionTimeMs",
            "fieldName" : "actionTimeInMs"
        }
    ]