Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Distinct modifier support for aggregate functions #1046

Merged
merged 1 commit into from
Nov 9, 2015
Merged

Distinct modifier support for aggregate functions #1046

merged 1 commit into from
Nov 9, 2015

Conversation

ErisDS
Copy link
Contributor

@ErisDS ErisDS commented Nov 4, 2015

This adds support for count(distinct *) as discussed in #1028.

I have also added support for avg(distinct *) and sum(distinct *) as both of these seemed valuable, I have not added distinct support for min and max as I don't think that adding distinct to those aggregates can impact the outcome. If it is felt that these should be added, I'm more than happy to update the PR.

There may be some confusion here between the standard distinct expression, and the distinct expression that goes inside of an aggregate function. I'm not sure whether to refer to it as 'inner distinct' or perhaps 'aggregate distinct'. When building the statement, I used aggregateDistinct to differ from the standard distinct.

I should probably also rename both _distinct() and _distinctFlag to also clarify the distinction between the distincts (see what I did there? 😛) but wanted to ask if there were thoughts on the name - naming things is always the hardest part.

I have tested this locally against sqlite3 only.

More than happy to provide more tests or other updates as necessary.


closes #1028

  • add support for count(distinct *), avg(distinct *) and sum(distinct *)
  • min and max don't really make sense with distinct, so didn't add those

@elhigu
Copy link
Member

elhigu commented Nov 7, 2015

Looks perfect to me. Only thing I would have changed would have been to add changed code and autogenerated files in separate commits :) @rhys-vdw any comments to this one? It had good tests and implementation was pretty much the same with e.g. not modifier.

EDIT: Actually now I noticed that documentation of the new APIs is still missing from index.html

@ErisDS
Copy link
Contributor Author

ErisDS commented Nov 8, 2015

I can always pull changes out into separate commits if that is desirable, however I think having separate commits for src and built code would be undesirable - it is not useful history & creates a point at which things will not work as you expect.

I've tried to implement this in an idiomatic way so it fits the library - I felt the naming was definitely confusing so I've updated that. These names are long, but other than calling it aggDistinct I wasn't sure where to go with it 😁 happy to change it.

Also updated the docs, good shout @elhigu. Also happy to modify / improve them if anyone has a better idea.

@@ -26,6 +26,7 @@ function Builder(client) {
this._joinFlag = 'inner';
this._boolFlag = 'and';
this._notFlag = false;
this._aggregateDistinctFlag = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to store this on QueryBuilder instance.

@rhys-vdw
Copy link
Member

rhys-vdw commented Nov 9, 2015

@ErisDS @elhigu: Having generated code in the repository at all is undesirable. Don't worry about it. The solution is to add it to the .gitignore, but there will have to be some tweaks to the release script first.

Sorry for the wait, @ErisDS. I approve of the API change.

One thing to change in the implementation before I can merge:

Instead of adding an _aggregateDistinct(), just pass an extra argument directly to _aggregate.

  // Retrieve the sum of the distinct values of a given column.
  sumDistinct: function(column) {
    return this._aggregate('sum', column, true);
  },

closes #1028

- add support for count(distinct *), avg(distinct *) and sum(distinct *)
- min and max don't really make sense with distinct, so didn't add those
@ErisDS
Copy link
Contributor Author

ErisDS commented Nov 9, 2015

I have updated this PR with the suggested changes. Reduced surface area 👍

I think whether or not generated code should be checked in depends a lot on the project and how it works - there are pros and cons from a git perspective. In this case, as the documentation in the repo depends on the generated code, I don't think it's so bad to have it checked in. I didn't realise how the docs worked at first - but actually, writing docs is a great extra check that your code does what you expect - in fact, the docs almost work like a REPL, very handy.

May I (in a separate PR) update the contribution guidelines with what I learned about the documentation, and that an update there is needed when making api changes?

rhys-vdw added a commit that referenced this pull request Nov 9, 2015
Distinct modifier support for aggregate functions
@rhys-vdw rhys-vdw merged commit bd97610 into knex:master Nov 9, 2015
@rhys-vdw
Copy link
Member

rhys-vdw commented Nov 9, 2015

I have updated this PR with the suggested changes.

Great, thanks for that.

I think whether or not generated code should be checked in depends a lot on the project and how it works - there are pros and cons from a git perspective.

The included knex.min.js script matters in the gh-pages branch, so that can be worked around. Docs, as you are obviously aware, are not generated content. I think the docs would be the same other than missing the script outside of gh-pages.

May I (in a separate PR) update the contribution guidelines with what I learned about the documentation, and that an update there is needed when making api changes?

Yes, please. 👍

@ErisDS ErisDS deleted the issue-1028 branch November 9, 2015 20:01
@ErisDS
Copy link
Contributor Author

ErisDS commented Nov 9, 2015

Another reason why it may be desirable to keep the generated files in git - I just changed my package.json file to "knex": "git://github.com/tgriesser/knex.git#bd97610" to check if this fixes an issue I'm encountering in my project (it doesn't... that's another story).

I know there are both pros and cons to keeping built files in the repo, just wanted to suggest that if it isn't causing major headaches (I don't know if it is or isn't ), changing it may result in some you aren't expecting 😉

@rhys-vdw
Copy link
Member

rhys-vdw commented Nov 9, 2015

@ErisDS That case can be addressed with an install script in package.json.

@rhys-vdw
Copy link
Member

rhys-vdw commented Nov 9, 2015

I've done a bit of research on the matter recently as I just made this change to Bookshelf. I'm fairly convinced it's the right way. Note that the generated code is not necessarily correct at each commit - so keeping lib/ under version control could be more problematic than you realize. Also nobody wants to be solving merge conflicts in compiled output.

@tgriesser
Copy link
Member

I think that having /src with a postinstall script as react-router does it will be the way to go here, https://github.com/rackt/react-router/blob/620d2703b6314a03b8a54a1fccaac097d436567a/npm-scripts/postinstall.js that way if you have a copy off of github it should also work so long as you have babel installed.

@rhys-vdw
Copy link
Member

@tgriesser Problem here is that the global version of babel may be incorrect. IMO it's totally valid to call npm install babel in the postinstall script and use the devDependencies version of Babel. Babel has just undergone some massive changes so you can't rely on configuration being compatible across the board.

I've been advised that the publish script is technically the correct place for this. It is also run on install, but will run before a release.

@rhys-vdw
Copy link
Member

Here is how I handled it in Bookshelf. Still requires some work.

@ricardogama
Copy link

@tgriesser @rhys-vdw This feature is awesome, when can we expect a new release?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to do count(distinct) ?
5 participants