Fix aggregate group by count#645
Conversation
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
oh, and this'd probably want some tests if we went with something like this. 😅. |
levkk
left a comment
There was a problem hiding this comment.
Looks very reasonable. I remember building support for this actually, I'm not entirely sure where it went. Maybe I'm wrong, it's been a few months. Would you be able to add a quick test, just to make sure the parsing logic works as you'd expect. Can put it in the same file and assert that it detected the correct column index. Cheers!
|
Ok, nice, I'll throw those in tomorrow. |
b1fea01 to
31336bc
Compare
|
Alright, I cleaned it up a bit, and added some tests. Should be good. There might be more tests than we actually want, so feel free to delete (a few of them cover the same ground). |
Addresses pgdogdev#638. I had a bit of spare time, and was curious about this. So, it turns out the issue is an incorrectly parsed out `group_by` most of the time. The `group_by` expects an index into the `target_list`, so I made that work for the case where you have the exact `table_name.column_name` or same `column_name` in the `select` and the `group by` queries. this feels a little bit janky to me, but 🤷 figured I'd raise for feedback, especially as I see a draft PR for a rewritten query engine. ```sql (michael)@127.0.0.1:6432 16:54:06 [repro_sharded] > select count(1), user_id from example group by example.user_id; count | user_id -------+--------- 6 | 1 3 | 2 4 | 3 (3 rows) (michael)@127.0.0.1:6432 16:54:07 [repro_sharded] > select count(1), example.user_id from example group by example.user_id; count | user_id -------+--------- 6 | 1 3 | 2 4 | 3 (3 rows) (michael)@127.0.0.1:6432 16:54:14 [repro_sharded] > select example.user_id, count(1), example.user_id from example group by example.user_id; unexpected field count in "D" message (michael)@127.0.0.1:6432 16:54:21 [repro_sharded] > select example.user_id, count(1) from example group by example.user_id; user_id | count ---------+------- 1 | 6 2 | 3 3 | 4 (3 rows) (michael)@127.0.0.1:6432 16:54:28 [repro_sharded] > select user_id, count(1) from example group by example.user_id; user_id | count ---------+------- 3 | 4 1 | 6 2 | 3 (3 rows) (michael)@127.0.0.1:6432 16:54:32 [repro_sharded] > select example.user_id, count(1) from example group by user_id; user_id | count ---------+------- | 13 (1 row) (michael)@127.0.0.1:6432 16:54:37 [repro_sharded] > select user_id, count(1) from example group by user_id; user_id | count ---------+------- 3 | 4 1 | 6 2 | 3 (3 rows) ``` See here for results, including some cases where it fails to work correctly. (where we duplicate the column in the select, and where we specify inconsistent `example.user_id` / `user_id` (NOTE: it only fails if we're more specific in the `select` than in the `group_by`.
Addresses #638.
I had a bit of spare time, and was curious about this.
So, it turns out the issue is an incorrectly parsed out
group_bymost of the time. Thegroup_byexpects an index into thetarget_list, so I made that work for the case where you have the exacttable_name.column_nameor samecolumn_namein theselectand thegroup byqueries.this feels a little bit janky to me, but 🤷 figured I'd raise for feedback, especially as I see a draft PR for a rewritten query engine.
See here for results, including some cases where it fails to work correctly. (where we duplicate the column in the select, and where we specify inconsistent
example.user_id/user_id(NOTE: it only fails if we're more specific in theselectthan in thegroup_by.