Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: countDistinct with multiple columns #2449

Merged
merged 11 commits into from Mar 12, 2018

Conversation

@joelmukuthu
Copy link
Contributor

@joelmukuthu joelmukuthu commented Jan 31, 2018

Closes #1776

@joelmukuthu joelmukuthu force-pushed the feature/count-distinct-multiple branch from 1259fc2 to 3c1236e Jan 31, 2018
@@ -2498,6 +2498,31 @@ describe("QueryBuilder", function() {
});
});

it("count distinct with multiple columns", function() {
Copy link
Member

@elhigu elhigu Feb 1, 2018

+1 for writing test first <3 Also integration smoke test would be needed though to make sure that syntax is correct.

@joelmukuthu
Copy link
Contributor Author

@joelmukuthu joelmukuthu commented Feb 16, 2018

Sorry for the delay on this, but I'm looking to implement it now and found a weird edge-case. At the moment, you can specify an alias for countDistinct like builder.countDistinct('column as alias') and that get's transformed to select count(distinct column) as alias.

This of course won't work with multiple fields so I suggest we change this to the regular object notation like builder.countDistinct({ alias: 'column' }) or with multiple fields builder.countDistinct({ alias: ['column1', 'column2'] }).

I'll leave the current behaviour and only add the object notation as a new feature but maybe we can even add a deprecation notice for it. What do you think @elhigu?

@joelmukuthu joelmukuthu force-pushed the feature/count-distinct-multiple branch from dde6e3e to e264765 Feb 16, 2018
@joelmukuthu joelmukuthu force-pushed the feature/count-distinct-multiple branch from 7bd1528 to acfe102 Feb 16, 2018
@joelmukuthu joelmukuthu force-pushed the feature/count-distinct-multiple branch 2 times, most recently from 38f0c35 to 0f1b681 Feb 16, 2018
@joelmukuthu joelmukuthu force-pushed the feature/count-distinct-multiple branch from 0f1b681 to 0d42905 Feb 16, 2018
@elhigu
Copy link
Member

@elhigu elhigu commented Feb 18, 2018

I'm thinking about what usually { alias: ['foo', 'bar']} syntax should do. I don't know if there is any use cases which should be left untouched. Are you planning to add special case how that syntax works only with .countDistinct or general change how {alias:[]} is interpreted?

@joelmukuthu I'm sorry I'm not sure if I understood what are you planning to do and what effects it will have to knex backwards compatibility? :)

@joelmukuthu
Copy link
Contributor Author

@joelmukuthu joelmukuthu commented Feb 18, 2018

I could be missing something but as far as I could tell the parts I touched are only dealing with building statements of type aggregate (excluding aggregateRaw) so the new {alias: [column1, column2]} syntax would only be used for those. I think that makes sense since you can't do

select (column1, column2) as alias from table

(or is it possible?) but you can definitely do

select sum(column1, column2) as alias from table

I made sure that the changes don't break the current behaviour, so for instance this test still passes: https://github.com/joelmukuthu/knex/blob/a3a5255ecbb2b4613eab21c237473d407f42591b/test/unit/query/builder.js#L2878-L2905.

FYI I decided to not bother fixing that failing mariasql test since it seems the dialect is being dropped anyway

@elhigu elhigu mentioned this pull request Feb 20, 2018
@joelmukuthu joelmukuthu force-pushed the feature/count-distinct-multiple branch from 7f8c238 to 10c3e22 Feb 22, 2018
joelmukuthu added 2 commits Feb 23, 2018
for count with multiple distinct columns in databases where this is
not supported
@joelmukuthu joelmukuthu force-pushed the feature/count-distinct-multiple branch from 2072f4a to f471d35 Feb 23, 2018
@joelmukuthu
Copy link
Contributor Author

@joelmukuthu joelmukuthu commented Feb 23, 2018

This is ready for review. Notes:

  • Added support for multiple calls for all aggregate methods:
// single column:
builder.count('column1', 'column2');
// multiple columns:
builder.countDistinct('column1', 'column2');
  • Added support for object alias notation for all aggregate methods:
// single column:
builder.count({ alias: 'column' });
// multiple columns:
builder.countDistinct({ alias: ['column1', 'column2'] });
  • Kept support for string alias notation, though this only works with one column:
builder.count('column as alias');

I skipped validating whether or not the dialect supports calling whatever aggregate function with multiple columns i.e. no error is thrown if a user does something they're not supposed to do like countDistinct with multiple columns if the database doesn't support it.

I also skipped integration tests for aggregates with multiple columns for these dialects.

@joelmukuthu
Copy link
Contributor Author

@joelmukuthu joelmukuthu commented Mar 12, 2018

@elhigu
Copy link
Member

@elhigu elhigu commented Mar 12, 2018

I think this is good, sorry for taking so much time lookin these. I've been even more busy lately than I was before 😬

@elhigu elhigu merged commit fbef9fc into knex:master Mar 12, 2018
1 check passed
@joelmukuthu joelmukuthu deleted the feature/count-distinct-multiple branch Mar 12, 2018
@joelmukuthu
Copy link
Contributor Author

@joelmukuthu joelmukuthu commented Mar 12, 2018

@elhigu no worries, thanks for taking a look! We appreciate the hard work you put in :)

@jrschumacher
Copy link

@jrschumacher jrschumacher commented Dec 30, 2018

@joelmukuthu was this tested with sqlite? I'm getting an error:

select count(distinct `CustomerThing`.`Customer_id`, `CustomerThing`.`thing`) from `CustomerThing` - SQLITE_ERROR: wrong number of arguments to function count()

jrschumacher added a commit to jrschumacher/bookshelf that referenced this issue Dec 30, 2018
@kibertoad
Copy link
Collaborator

@kibertoad kibertoad commented Dec 30, 2018

@jrschumacher No, MySQL and PostgreSQL only, judging by tests. Does SQLite support all necessary features for this?

@jrschumacher
Copy link

@jrschumacher jrschumacher commented Dec 31, 2018

@kibertoad after some tests:

select count(distinct 'CustomerThing'.'Customer_id', 'CustomerThing'.'thing') from 'CustomerThing' - wrong number of arguments to function count()
select count(distinct('CustomerThing'.'Customer_id', 'CustomerThing'.'thing')) from 'CustomerThing' - row value misused

Works

select count(*) from (
  select distinct "CustomerThing"."Customer_id", "CustomerThing".thing from "CustomerThing"
) "CustomerThing"

@kibertoad
Copy link
Collaborator

@kibertoad kibertoad commented Dec 31, 2018

@jrschumacher Would you consider raising a PR with a fix and expanded test?

@jrschumacher
Copy link

@jrschumacher jrschumacher commented Dec 31, 2018

@kibertoad Absolutely, just wanting to get some input. I'm not familiar with the use of distinct. I'm working on bookshelf/bookshelf#1936

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants