Better support for Druid cardinality estimation metrics #613

axeisghost · 2016-06-13T22:46:42Z

Druid's default aggregators include some that run at ingestion time and are designed specifically for cardinality estimation. hyperUnique and thetaSketch are two such aggregations already included in the Druid package. They create metric columns at ingestion, and we want Caravel to automatically add a count_distinct metric for our users on top of these two kinds of metric columns.

The change keeps the original logic for the is_count_distinct flag on ordinary columns. For hyperUnique and thetaSketch columns, it sets is_count_distinct to true when metadata is refreshed, and the metric for each of those columns is generated automatically with the corresponding aggregator JSON.
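For readers who want a concrete picture of the behavior described above, here is a minimal sketch. It is not the code from this PR: the helper names `is_count_distinct` and `count_distinct_metric`, the `count_distinct__` name prefix, and the use of Druid's `cardinality` aggregator for ordinary columns are assumptions made for illustration. The `hyperUnique` and `thetaSketch` aggregator shapes (`type`, `name`, `fieldName`) follow the Druid aggregation spec.

```python
import json

# Druid column types that already hold cardinality sketches built at ingestion time.
SKETCH_TYPES = ("hyperUnique", "thetaSketch")


def is_count_distinct(column_type, user_flag=False):
    """Hypothetical helper: sketch columns are always flagged,
    other columns keep whatever the user configured."""
    return user_flag or column_type in SKETCH_TYPES


def count_distinct_metric(column_name, column_type):
    """Hypothetical helper: return (metric_name, aggregator_json) for a column."""
    name = "count_distinct__" + column_name  # assumed naming convention
    if column_type in SKETCH_TYPES:
        # Sketch columns are queried with the aggregator matching their type.
        agg = {"type": column_type, "name": name, "fieldName": column_name}
    else:
        # Assumption: ordinary columns fall back to the cardinality aggregator.
        agg = {"type": "cardinality", "name": name, "fieldNames": [column_name]}
    return name, json.dumps(agg)
```

Calling `count_distinct_metric("user_theta", "thetaSketch")` (the column name is made up) would yield a metric whose JSON uses the `thetaSketch` aggregator directly, so the query reads the pre-built sketch instead of recomputing cardinality from raw values.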

Some screenshots of this feature applied to the wikiticker database:

The columns after refreshing wikiticker's metadata. hyperUnique and thetaSketch columns are marked as count_distinct columns.

The metrics generated from those columns.


coveralls · 2016-06-13T22:59:06Z

Coverage decreased (-0.08%) to 82.015% when pulling 7da9a90 on joshwalters:druid_thetasketch into 267c019 on airbnb:master.

axeisghost · 2016-06-14T01:06:33Z

The coverage decrease seems to be because the test cases do not include any hyperUnique or thetaSketch columns, so the branches added in this change were not exercised.

mistercrunch · 2016-06-14T03:49:47Z

Neat!

axeisghost added 3 commits June 10, 2016 13:58

added recognition of thetasketch and HLL metrics

4d4918d

make sure the name agreed with SQL convention

f3e8b5d

Merge branch 'master' of https://github.com/airbnb/caravel into druid…

7da9a90

…_thetasketch

axeisghost changed the title ~~Druid thetasketch~~ Better support for Druid cardinality estimation metrics Jun 13, 2016

joshwalters mentioned this pull request Jun 14, 2016

Can I use sketch columns with druid datasource #433

Closed

mistercrunch merged commit 347c39b into apache:master Jun 14, 2016

zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 17, 2021

chore: upgrade @types/react (apache#613)

761ccea

zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 24, 2021

chore: upgrade @types/react (apache#613)

e8c6c49

zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 25, 2021

chore: upgrade @types/react (apache#613)

032b1c4

zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 26, 2021

chore: upgrade @types/react (apache#613)

269a7df

mistercrunch added the 🏷️ bot and 🚢 0.9.1 labels Feb 19, 2024
