Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: Add type-safe count, avg, sum, max and min aggregate functions #1487

Merged
merged 13 commits into from
Nov 27, 2023

Conversation

L-Mario564
Copy link
Contributor

@L-Mario564 L-Mario564 commented Nov 10, 2023

This PR implements type-safety for the following aggregate functions to all dialects:

  • count
  • avg
  • sum
  • max
  • min

⚠️ Everything below this is now outdated ⚠️ Check this comment for an up to date implementation.

The following is valid Drizzle syntax:

await db.select({
  scores: count('*'),
  myScores: count('*').filterWhere(eq(scores.setById, 1)), // `filterWhere` not available for MySQL
  noReplays: count(distinct(scores.setById)),
  avg: avg(scores.score, { mode: 'number' }), // `avg` and `sum` return a string by default in case the user desires greater control over the floating point value's precision
  total: sum(scores.score),
  highest: max(scores.score),
  lowest: min(scores.score),
  oldest: min(scores.submittedAt), // This will be returned as a Date instead of a number
}).from(scores);

This PR also opens the doors for contributors to add other (built-in) functions via the BuiltInFunction class that sort of acts like a wrapper for the sql operator. So something like:

sql<string>`lower(employees.name)`

Can be implemented as:

lower(employees.name)

Just to name a very simple example.

Implement `count`, `avg`, `sum`, `max` and `min` to PG dialect
Implement `count`, `avg`, `sum`, `max` and `min` to MySQL dialect and all of the previously mentioned and `total` to SQLite dialect
Copy link
Collaborator

@Angelelz Angelelz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really impressive work! You can tell this was a lot of work.
Although I'm going through the code wondering why do you think the BuiltInFunction class was necessary?
Also, it seems like all the implementations for each dialect are exactly the same. Do you think we could put it in a common place?
Please keep in mind that if we implement the built-in functions as suggested, none of the changes in the dialect, query-builders, and sql file would be necessary. We should avoid complexity as much as we can.

drizzle-orm/src/built-in-function.ts Outdated Show resolved Hide resolved
drizzle-orm/src/mysql-core/functions/aggregate.ts Outdated Show resolved Hide resolved
drizzle-orm/src/mysql-core/functions/aggregate.ts Outdated Show resolved Hide resolved
drizzle-orm/src/pg-core/dialect.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@dankochetov dankochetov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's how the count() function might be implemented a bit clearer:

class Count extends SQL<number> {
    constructor(value: SQLWrapper) {
        super([sql`cast(count(${value}) as int)`]);
    }

    filterWhere(value: SQLWrapper) {
        this.queryChunks.push(sql` filter (where ${value})`);
        return this;
    }
}

function count(value: SQLWrapper) {
    return new Count(value);
}

For other operators, if they don't require extra chaining methods, you can use the same approach as with filter operators (sql/expressions/conditions.ts)

@Angelelz
Copy link
Collaborator

Here's how the count() function might be implemented a bit clearer:

class Count extends SQL<number> {
    constructor(value: SQLWrapper) {
        super([sql`cast(count(${value}) as int)`]);
    }

    filterWhere(value: SQLWrapper) {
        this.queryChunks.push(sql` filter (where ${value})`);
        return this;
    }
}

function count(value: SQLWrapper) {
    return new Count(value);
}

For other operators, if they don't require extra chaining methods, you can use the same approach as with filter operators (sql/expressions/conditions.ts)

Just for clarification, we just discussed this case, we shouldn't do any casting at the database level:

constructor(value: SQLWrapper) {
    super([sql`count(${value})`]);
    this.decoder = Number;
    return return this as SQL<number>
}

Avoiding casting at db level would prevent possible issues down the road with function composition, ie: max(count(users.name)), that are dialect specific.

@L-Mario564
Copy link
Contributor Author

Adding to @Angelelz's comment, having (in this case) Count extend SQL creates a circular dependency issue if created in a file that isn't in src/sql/sql.ts, so, either these aggregate functions have to be shoved into sql.ts, making it very out of place when classes like Table and Column are in their own separate common and dialect files, or we can take some inspiration from a code snippet written by @Angelelz which although a bit hacky, does get the job done, could be cleaned up for it to be reusable across the other aggregate functions and would keep the code organized. The snippet is the following:

interface count extends SQL {
  filterWhere: (where: SQL) => SQL;
}
const count = (value: SQL | AnyColumn) => {
  interface Count extends SQL {
    filterWhere: (where: SQL) => SQL;
  }
  const count = sql`count(${value})`;
  (count as Count)["filterWhere"] = (where: SQL) => {
    return count.append(sql` filter (where ${where})`);
  };
  return count as Count;
};

// This works
count(users.name).filterWhere(eq(users.name, "user"));

@dankochetov Would like to know your thoughts on how I should reimplement this.

@Angelelz
Copy link
Collaborator

Where is the circular dependency? This file should only be importing from sql/sql.ts? and no other file in the repo should be importing from here...

@L-Mario564
Copy link
Contributor Author

Where is the circular dependency? This file should only be importing from sql/sql.ts? and no other file in the repo should be importing from here...

Not to familiar with this circular dependency error and therefore not sure why it's there in the first place, but if I import SQL as a value (class) to extend it, it throws a circular dependency error at runtime. Like for example, extending the SQL class within /pg-core/functions/aggregate.ts results in this behavior, regardless if I import it from sql/sql.ts or sql/index.ts. Not sure if I'm missing something here.

@Angelelz
Copy link
Collaborator

Can you push your changes so I can take a look?

@L-Mario564
Copy link
Contributor Author

Can you push your changes so I can take a look?

Whilst trying to do an example for this error, now I'm not getting it. It seems like a joke but I'm being serious here.
Not sure what exactly may have caused this error to disappear, I imported SQL from ~/sql/index.ts and extended that class, it compiled fine and tests are running as expected.
Mind you, I encountered this error on my first attempt of implementing this, which was done before the release of Drizzle ORM 0.29.0. After the release, I merged drizzle-team's commits from beta into this branch, and did so after I was done with this implementation, so something could have changed that now makes this a non-issue, whatever that may be.

With that said, it seems a reimplementation is very much possible without any "hacks", without disorganizing the code and, most importantly, without introducing greater complexity like I've done so far.

@L-Mario564 L-Mario564 marked this pull request as draft November 14, 2023 20:46
@L-Mario564
Copy link
Contributor Author

After discussing the implementation of these functions with @Angelelz, I've made the following changes:

  • BuiltInFunction class has been removed and now all functions are just wrappers for sql in a similar fashion to the functions in ~/sql/expressions/conditions.ts. This also removes the ability to do filter (where ...) for PG and SQLite.
  • Each aggregate function is now dialect agnostic, and are now exported from drizzle-orm instead of drizzle-orm/[dialect]-core.
  • Removed the distinct function and its related classes and types in favor of having a distinct and non-distinct variant of each function (except min and max due to adding distinct resulting in the same output). For example: count and countDistinct.
  • Removed count('*') syntax in favor of just count(). So db.select({ value: count() }).from(user) would be as follows in SQL: select count(*) from "user". This change was made to make it syntactically consistent with db.select().
  • Removed total function for SQLite, seeing as it's already incredibly similar to sum anyway.

With all these changes in place and using the same example as I did previously, the current implementation looks as follows:

await db.select({
  scores: count(),
  noReplays: countDistinct(scores.setById),
  avg: avg(scores.score).mapWith(Number), // `avg` and `sum` return a string by default in case the user desires greater control over the floating point value's precision
  total: sum(scores.score),
  highest: max(scores.score),
  lowest: min(scores.score),
  oldest: min(scores.submittedAt), // This will be returned as a Date instead of a number
}).from(scores);

@L-Mario564 L-Mario564 changed the title Feat: Add (some) type-safe aggregate functions Feat: Add type-safe count, avg, sum, max and min aggregate functions Nov 18, 2023
@L-Mario564 L-Mario564 marked this pull request as ready for review November 18, 2023 23:02
@AndriiSherman
Copy link
Member

❤️

Copy link
Collaborator

@Angelelz Angelelz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Thank you for your work. Just making sure that generic parameter is necessary, I saw it in several places. 👍🏻

drizzle-orm/src/mysql-core/dialect.ts Outdated Show resolved Hide resolved
@@ -141,7 +142,7 @@ export class PgDeleteBase<
returning(
fields: SelectedFieldsFlat = this.config.table[Table.Symbol.Columns],
): PgDeleteReturning<this, TDynamic, any> {
this.config.returning = orderSelectedFields(fields);
this.config.returning = orderSelectedFields<PgColumn>(fields);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be left over code from your initial implementation. Or is it necessary to make the compiler happy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of both. Yes, it's leftover from the previous implementation, but decided to leave it as it has nothing to do with the BuiltInFunction class. This makes it consistent with the MySQL dialect, as MySqlColumn was specified there but not in PG or SQLite for some reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Angelelz Let me know if you think this should stay or not.

@yaziciahmet
Copy link

yaziciahmet commented Nov 21, 2023

Just throwing it out here: In order to use aggregate functions in a select statement, either all the selected expressions need to be aggregate functions, or there needs to be a group by statement in the query. For instance following is illegal select name, count(*) from users. Could we somehow put a guard against this with type checks?

@L-Mario564
Copy link
Contributor Author

Just throwing it out here: In order to use aggregate functions in a select statement, either all the selected expressions need to be aggregate functions, or there needs to be a group by statement in the query. For instance following is illegal select name, count(*) from users. Could we somehow put a guard against this with type checks?

Although a good idea on paper, this would likely require many changes to existing types as well as having to add new ones, and even then, there can be scenarios in which catching that sort of thing at a type level is simply not possible, so it would be a ton of effort with very little reward as it may not handle certain cases. This is out this PR's scope.

@Angelelz
Copy link
Collaborator

Angelelz commented Nov 23, 2023

Just throwing it out here: In order to use aggregate functions in a select statement, either all the selected expressions need to be aggregate functions, or there needs to be a group by statement in the query. For instance following is illegal select name, count(*) from users. Could we somehow put a guard against this with type checks?

This seems like a good fit for a linter. Definitely out of the scope for this PR.
Edit: Thank you for pointing it out though

@AndriiSherman
Copy link
Member

Will be added to 0.29.1 release

Check for group by will be added to eslint package, that @Angelelz created and I guess we will release it tomorrow

Just throwing it out here: In order to use aggregate functions in a select statement, either all the selected expressions need to be aggregate functions, or there needs to be a group by statement in the query. For instance following is illegal select name, count(*) from users. Could we somehow put a guard against this with type checks?

@AndriiSherman AndriiSherman merged commit 255a7c8 into drizzle-team:beta Nov 27, 2023
4 checks passed
@phiresky
Copy link

Question: Why were these added as mapWith(String)? Why weren't they added with accurate types? (i.e. sum(int) returns bigint, sum(bigint) returns numeric, etc)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants