
Conversation

@pepijnve
Contributor

Which issue does this PR close?

Rationale for this change

Spark's bit_count function always operates on 64-bit values, while the original bit_count implementation in datafusion_spark operated on the native width of the input value.
To fix this, a custom bit-counting implementation was ported over from the Java Spark implementation. That isn't really necessary, though: sign-extending the signed integer input to i64 and then using i64::count_ones gives exactly the same result and is less obscure.
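For illustration, a minimal sketch of the approach described above (the generic helper below is hypothetical, not the actual datafusion_spark code):

```rust
/// Hypothetical helper illustrating the idea: widen the signed integer to
/// i64 (a sign-extending conversion) and count the set bits of the 64-bit
/// value, matching Spark's 64-bit semantics.
fn spark_bit_count<T: Into<i64>>(value: T) -> i32 {
    let widened: i64 = value.into();
    widened.count_ones() as i32
}

fn main() {
    // A negative i8 sign-extends to 0xFFFF_FFFF_FFFF_FFFF, so all 64 bits
    // are set, whereas counting on the native 8-bit width would give 8.
    assert_eq!(spark_bit_count(-1_i8), 64);
    assert_eq!(spark_bit_count(5_i32), 2);
    assert_eq!(spark_bit_count(0_i64), 0);
}
```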

What changes are included in this PR?

Remove the custom bit-counting logic and use i64::count_ones instead.

Are these changes tested?

Covered by existing tests that were added for #18225

Are there any user-facing changes?

No

@github-actions github-actions bot added the spark label Nov 20, 2025
Contributor

@alamb alamb left a comment

Looks like a nice improvement to me -- thanks @pepijnve

@alamb alamb changed the title from "Remove unnecessary bit counting code" to "Remove unnecessary bit counting code from spark bit_count" Nov 20, 2025
Contributor

@comphead comphead left a comment

Thanks @pepijnve. Between Spark/JVM and Rust there are sometimes discrepancies, for example in how decimals, regexps, etc. are treated.

Please add tests for booleans: true, false, and null.
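For reference, a hedged sketch of the boolean cases being requested, assuming Spark's semantics where a boolean is treated as 0/1 before the bits are counted and a null input yields null (the helper is hypothetical, not the datafusion_spark API):

```rust
// Hypothetical helper, not the datafusion_spark API: treats a boolean as
// 0/1, widens it to i64, and counts the set bits; a null input stays null.
fn bit_count_bool(value: Option<bool>) -> Option<i32> {
    value.map(|b| (b as i64).count_ones() as i32)
}

fn main() {
    assert_eq!(bit_count_bool(Some(true)), Some(1));
    assert_eq!(bit_count_bool(Some(false)), Some(0));
    assert_eq!(bit_count_bool(None), None);
}
```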
