Skip to content

Fix aggregate group by count#645

Merged
levkk merged 6 commits intopgdogdev:mainfrom
meetcleo:fix-aggregate-group-by-count
Dec 4, 2025
Merged

Fix aggregate group by count#645
levkk merged 6 commits intopgdogdev:mainfrom
meetcleo:fix-aggregate-group-by-count

Conversation

@mijoharas
Copy link
Copy Markdown
Contributor

Addresses #638.

I had a bit of spare time, and was curious about this.

So, it turns out the issue is an incorrectly parsed out group_by most of the time. The group_by expects an index into the target_list, so I made that work for the case where you have the exact table_name.column_name or same column_name in the select and the group by queries.

this feels a little bit janky to me, but 🤷 figured I'd raise for feedback, especially as I see a draft PR for a rewritten query engine.

(michael)@127.0.0.1:6432 16:54:06 [repro_sharded]
> select count(1), user_id from example group by example.user_id;
 count | user_id
-------+---------
     6 |       1
     3 |       2
     4 |       3
(3 rows)

(michael)@127.0.0.1:6432 16:54:07 [repro_sharded]
> select count(1), example.user_id from example group by example.user_id;
 count | user_id
-------+---------
     6 |       1
     3 |       2
     4 |       3
(3 rows)

(michael)@127.0.0.1:6432 16:54:14 [repro_sharded]
> select example.user_id, count(1), example.user_id from example group by example.user_id;
unexpected field count in "D" message
(michael)@127.0.0.1:6432 16:54:21 [repro_sharded]
> select example.user_id, count(1) from example group by example.user_id;
 user_id | count
---------+-------
       1 |     6
       2 |     3
       3 |     4
(3 rows)

(michael)@127.0.0.1:6432 16:54:28 [repro_sharded]
> select user_id, count(1) from example group by example.user_id;
 user_id | count
---------+-------
       3 |     4
       1 |     6
       2 |     3
(3 rows)

(michael)@127.0.0.1:6432 16:54:32 [repro_sharded]
> select example.user_id, count(1) from example group by user_id;
 user_id | count
---------+-------
         |    13
(1 row)

(michael)@127.0.0.1:6432 16:54:37 [repro_sharded]
> select user_id, count(1) from example group by user_id;
 user_id | count
---------+-------
       3 |     4
       1 |     6
       2 |     3
(3 rows)

See here for results, including some cases where it fails to work correctly. (where we duplicate the column in the select, and where we specify inconsistent example.user_id / user_id (NOTE: it only fails if we're more specific in the select than in the group_by.

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 3, 2025

Codecov Report

❌ Patch coverage is 93.11927% with 15 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
pgdog/src/frontend/router/parser/aggregate.rs 93.11% 15 Missing ⚠️

📢 Thoughts on this report? Let us know!

@mijoharas mijoharas marked this pull request as draft December 3, 2025 17:12
@mijoharas
Copy link
Copy Markdown
Contributor Author

oh, and this'd probably want some tests if we went with something like this. 😅.

Copy link
Copy Markdown
Collaborator

@levkk levkk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very reasonable. I remember building support for this actually, I'm not entirely sure where it went. Maybe I'm wrong, it's been a few months. Would you be able to add a quick test, just to make sure the parsing logic works as you'd expect. Can put it in the same file and assert that it detected the correct column index. Cheers!

@mijoharas
Copy link
Copy Markdown
Contributor Author

Ok, nice, I'll throw those in tomorrow.

@mijoharas mijoharas force-pushed the fix-aggregate-group-by-count branch from b1fea01 to 31336bc Compare December 4, 2025 13:55
@mijoharas mijoharas marked this pull request as ready for review December 4, 2025 13:55
@mijoharas
Copy link
Copy Markdown
Contributor Author

Alright, I cleaned it up a bit, and added some tests. Should be good. There might be more tests than we actually want, so feel free to delete (a few of them cover the same ground).

Copy link
Copy Markdown
Collaborator

@levkk levkk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

@levkk levkk merged commit 345bd28 into pgdogdev:main Dec 4, 2025
8 checks passed
dev-lew pushed a commit to dev-lew/pgdog that referenced this pull request Feb 18, 2026
Addresses pgdogdev#638.

I had a bit of spare time, and was curious about this.

So, it turns out the issue is an incorrectly parsed out `group_by` most
of the time. The `group_by` expects an index into the `target_list`, so
I made that work for the case where you have the exact
`table_name.column_name` or same `column_name` in the `select` and the
`group by` queries.

this feels a little bit janky to me, but 🤷 figured I'd raise for
feedback, especially as I see a draft PR for a rewritten query engine.

```sql
(michael)@127.0.0.1:6432 16:54:06 [repro_sharded]
> select count(1), user_id from example group by example.user_id;
 count | user_id
-------+---------
     6 |       1
     3 |       2
     4 |       3
(3 rows)

(michael)@127.0.0.1:6432 16:54:07 [repro_sharded]
> select count(1), example.user_id from example group by example.user_id;
 count | user_id
-------+---------
     6 |       1
     3 |       2
     4 |       3
(3 rows)

(michael)@127.0.0.1:6432 16:54:14 [repro_sharded]
> select example.user_id, count(1), example.user_id from example group by example.user_id;
unexpected field count in "D" message
(michael)@127.0.0.1:6432 16:54:21 [repro_sharded]
> select example.user_id, count(1) from example group by example.user_id;
 user_id | count
---------+-------
       1 |     6
       2 |     3
       3 |     4
(3 rows)

(michael)@127.0.0.1:6432 16:54:28 [repro_sharded]
> select user_id, count(1) from example group by example.user_id;
 user_id | count
---------+-------
       3 |     4
       1 |     6
       2 |     3
(3 rows)

(michael)@127.0.0.1:6432 16:54:32 [repro_sharded]
> select example.user_id, count(1) from example group by user_id;
 user_id | count
---------+-------
         |    13
(1 row)

(michael)@127.0.0.1:6432 16:54:37 [repro_sharded]
> select user_id, count(1) from example group by user_id;
 user_id | count
---------+-------
       3 |     4
       1 |     6
       2 |     3
(3 rows)
```

See here for results, including some cases where it fails to work
correctly. (where we duplicate the column in the select, and where we
specify inconsistent `example.user_id` / `user_id` (NOTE: it only fails
if we're more specific in the `select` than in the `group_by`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants