Better support for Druid cardinality estimation metrics #613

Merged
merged 3 commits into from
Jun 14, 2016

@axeisghost (Contributor) commented Jun 13, 2016

Druid's default aggregators include some that run at ingestion time specifically for cardinality estimation. hyperUnique and thetaSketch are two such aggregators that are already included in the Druid package. Each creates a metric column at ingestion, and we want Caravel to automatically add a count_distinct metric for our users from these two kinds of metric columns.

The change keeps the original logic for is_count_distinct columns. For hyperUnique and thetaSketch columns, it sets is_count_distinct to true when refreshing metadata, and the metric for each of those columns is generated automatically with the corresponding aggregator JSON.
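As a rough illustration of the behavior described above, here is a minimal sketch rather than the code in this PR. The helper names `is_count_distinct` and `count_distinct_metric_json`, the `SKETCH_TYPES` constant, and the `count_distinct__` metric-name prefix are assumptions made for the example. The JSON shapes follow Druid's `hyperUnique`, `thetaSketch`, and `cardinality` aggregator specs.

```python
import json

# Hypothetical constant: column types that already hold a pre-aggregated sketch.
SKETCH_TYPES = {"hyperUnique", "thetaSketch"}


def is_count_distinct(column_type, user_flag=False):
    """Keep the user's own flag, but always flag sketch-typed columns."""
    return user_flag or column_type in SKETCH_TYPES


def count_distinct_metric_json(column_name, column_type):
    """Build the Druid aggregator JSON for a count_distinct metric."""
    name = "count_distinct__" + column_name  # assumed naming scheme
    if column_type in SKETCH_TYPES:
        # Sketch columns are read back with the aggregator of the same type.
        spec = {"type": column_type, "name": name, "fieldName": column_name}
    else:
        # Plain columns fall back to Druid's query-time cardinality aggregator.
        spec = {"type": "cardinality", "name": name, "fieldNames": [column_name]}
    return json.dumps(spec)


print(count_distinct_metric_json("user_unique", "hyperUnique"))
print(count_distinct_metric_json("user", "STRING"))
```

The sketch branches only on the column type, so a column the user has flagged by hand would still take the `cardinality` path.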

Some screenshots of this feature applied to the wikiticker database:
screenshot_5
The columns after refreshing wikiticker's metadata. hyperUnique and thetaSketch columns are marked as count_distinct columns.
screenshot_6
The metrics generated from those columns.
screenshot_7

@axeisghost changed the title from "Druid thetasketch" to "Better support for Druid cardinality estimation metrics" on Jun 13, 2016
@coveralls commented Jun 13, 2016

Coverage decreased (-0.08%) to 82.015% when pulling 7da9a90 on joshwalters:druid_thetasketch into 267c019 on airbnb:master.

@axeisghost (Contributor, Author) commented

Coverage seems to have decreased because the test cases do not include any hyperUnique or thetaSketch columns, so the new branches in this change were not exercised.

@mistercrunch (Member) commented

Neat!

@mistercrunch mistercrunch merged commit 347c39b into apache:master Jun 14, 2016
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 17, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 24, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 25, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 26, 2021
@mistercrunch added the 🏷️ bot and 🚢 0.9.1 labels Feb 19, 2024