Skip to content

COUNT(expr) always returns the COUNT(colname) #12023

@rtyler

Description

@rtyler

Describe the bug

The COUNT() function appears to accept expressions but kind of disregards user input and then just drives into walls.

Imagine for example a table which has id[int], status[int], url[utf8] and has 50 rows with status == 200 and 50 rows with status == 500

SELECT COUNT(status) FROM source

Will return 100

SELECT COUNT(status > 200) FROM source

will also return 100 😕

It appears that this the count function is just taking the first argument in the expression and calling it a day, so the invocation syntax is valid (no errors) but produces incorrect results.

To Reproduce

No response

Expected behavior

Ideally I would like the predicate expression in COUNT() to work as it does in Spark SQL, which would result in SELECT COUNT(status == 500) as errors, COUNT(status == 200) as oks FROM source producing two columns:

+-----------+-------+
|   errors  |  oks  |
+-----------+-------+
|     50    |  50   |
+-----------+-------+

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions