Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(api): count nulls with topk #8531

Merged
merged 1 commit into from Mar 14, 2024
Merged

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Mar 3, 2024

Default to count star instead of count(expr) in topk API. Closes #8518.

BREAKING CHANGE: expr.topk(...) now includes null counts. The row count of the topk call will not differ, but the number of nulls counted will no longer be zero. To drop the null row use the dropna method.

@cpcloud cpcloud added this to the 9.0 milestone Mar 3, 2024
@cpcloud cpcloud added feature Features or general enhancements breaking change Changes that introduce an API break at any level labels Mar 3, 2024
@cpcloud cpcloud force-pushed the topk-count-nulls branch 2 times, most recently from 0de09d6 to b67bbc2 Compare March 3, 2024 17:52
@NickCrews
Copy link
Contributor

related, since this is breaking perhaps want to consider these both at the same time: #8540

@cpcloud cpcloud force-pushed the topk-count-nulls branch 2 times, most recently from 115860a to 3a889bd Compare March 14, 2024 09:22
@cpcloud
Copy link
Member Author

cpcloud commented Mar 14, 2024

@NickCrews Making column names match should be done in a separate PR

BREAKING CHANGE: `expr.topk(...)` now includes null counts. The row count of the `topk` call will not differ, but the number of nulls counted will no longer be zero. To drop the `null` row use the `dropna` method.
@cpcloud cpcloud requested a review from kszucs March 14, 2024 09:49
@cpcloud cpcloud added the ci-run-cloud Add this label to trigger a run of Bigquery and Snowflake in CI label Mar 14, 2024
@cpcloud cpcloud enabled auto-merge (squash) March 14, 2024 09:50
@ibis-docs-bot ibis-docs-bot bot removed the ci-run-cloud Add this label to trigger a run of Bigquery and Snowflake in CI label Mar 14, 2024
@cpcloud cpcloud disabled auto-merge March 14, 2024 09:50
@cpcloud cpcloud enabled auto-merge (squash) March 14, 2024 09:51
@cpcloud cpcloud disabled auto-merge March 14, 2024 09:51
@cpcloud cpcloud enabled auto-merge (squash) March 14, 2024 09:51
Copy link
Member

@kszucs kszucs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@cpcloud cpcloud merged commit 54c2c70 into ibis-project:main Mar 14, 2024
83 checks passed
@cpcloud cpcloud deleted the topk-count-nulls branch March 14, 2024 11:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Changes that introduce an API break at any level feature Features or general enhancements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: topk() always says a column contains 0 NULLs
3 participants