
Ability to generate queries with distinct/group by #915

Closed
daurnimator opened this issue Jul 19, 2017 · 76 comments
Labels
enhancement a feature, ready for implementation

Comments

@daurnimator
Contributor

daurnimator commented Jul 19, 2017

PostgREST should have a way to perform a SELECT DISTINCT query.

I suggest a new reserved query word 'distinct'.

  • With no value it would just do a SELECT DISTINCT, e.g. /mytable?distinct
  • Otherwise it is a comma-separated list of columns (or computed columns) to be used inside SELECT DISTINCT ON (...), e.g. /mytable?distinct=name would be SELECT DISTINCT ON (name) * FROM mytable

Note that the columns you are distinct on will also need an order applied.
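
For context: PostgreSQL requires the DISTINCT ON expressions to match the leftmost ORDER BY expressions, so the generated query would have to look roughly like this (updated_at standing in for whatever tiebreak order the client requests):

-- One row per name; ORDER BY must lead with name, and the remaining keys
-- decide which row is kept for each group
SELECT DISTINCT ON (name) * FROM mytable ORDER BY name, updated_at DESC;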

@begriffs
Member

begriffs commented Aug 6, 2017

This has certainly been a frequently requested feature. My only hesitation is making "distinct" another reserved word, meaning it would block any filters on a column that happened to be called "distinct". However, like "select", this one is a SQL reserved word too, so it makes sense to reserve it as a special URL parameter.

I hesitate to mess with the URL parsing and query generation parts of the code myself; that's more @steve-chavez's domain. Perhaps he can estimate the difficulty of this feature.

daurnimator added a commit to daurnimator/postgrest that referenced this issue Aug 15, 2017
@steve-chavez
Member

steve-chavez commented Aug 15, 2017

I'm not sure this should be implemented: by allowing distinct to be applied to any column unrestricted, clients could potentially DDoS the database.

I bumped into a slow DISTINCT query in PostgreSQL a while ago and solved it by using a GROUP BY instead; I remember DISTINCT generating a more expensive seq scan. I don't have the details anymore, but a quick search suggests the problem could still persist:

https://dba.stackexchange.com/questions/93158/how-to-speed-up-select-distinct
https://dba.stackexchange.com/questions/63268/postgresql-being-slow-on-count-distinct-for-dates

As mentioned in one of the answers, you can mostly avoid doing DISTINCT with better database design or better queries.

So in general I think DISTINCT is an expensive operation (similar to GROUP BY) and is better exposed behind a view or a stored procedure.

@ruslantalpa
Contributor

@steve-chavez this is a good point against this. PostgREST should expose only safe/fast operations.

@russelldavies
Contributor

For what it's worth, DISTINCT is just syntactic sugar for GROUP BY, and I think in modern versions of Postgres it's even implemented using the same code. (And DISTINCT ON is a Postgres extension from way back that's a bit of a performance hack.) So @steve-chavez's conclusion is correct.
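
In SQL terms:

-- These two queries return the same result set:
SELECT DISTINCT a, b FROM t;
SELECT a, b FROM t GROUP BY a, b;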

@daurnimator
Contributor Author

daurnimator commented Aug 16, 2017

As mentioned in one of the answers, you can mostly avoid doing DISTINCT with better database design or better queries.

I have many situations where I've wished for PostgREST to have a distinct operator.
E.g. a table that contains events (timestamp, device, event_type, data): I want the latest event for each device, subject to filters such as timestamp/event_type. How would you allow a query to be written that accomplishes this? With my proposal it would be http://myapp.com/api/events?timestamp=lt.2017-08-08&event_type=eq.alarm&order=timestamp.desc&distinct=device
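
The generated SQL would presumably be along these lines:

-- Filters map to WHERE; distinct=device becomes DISTINCT ON (device), and the
-- requested timestamp.desc order is prefixed with device to satisfy DISTINCT ON
SELECT DISTINCT ON (device) *
FROM events
WHERE "timestamp" < '2017-08-08' AND event_type = 'alarm'
ORDER BY device, "timestamp" DESC;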

@steve-chavez
Member

The query would have to be behind a stored procedure/view. It's definitely necessary to use DISTINCT in some cases, but as I've mentioned before, exposing distinct unrestricted poses a threat to the database; a similar argument has been made for GROUP BY support in #158 (comment).

@daurnimator
Contributor Author

daurnimator commented Aug 16, 2017

The query would have to be behind a stored procedure/view.

When it's behind a view/stored procedure, you can no longer use the arbitrary filters that I love about PostgREST.

It's definitely necessary to use DISTINCT in some cases,

In that case I think it's a requirement that we include the functionality.

but as I've mentioned before, exposing distinct unrestricted poses a threat to the database

If you are afraid of performance issues, then perhaps we could only allow distinct when a JWT claim (or other setting?) permits it?
However, I think there are already performance questions around PostgREST: for a large table, the default route with no filters will produce a huge response that may slow the database down to build.

@steve-chavez
Member

You can create a stored procedure that returns setof events and apply filters on it as on a table (see the tests). The same goes for a view: you can create it as in create view distinct_devices as select distinct device, timestamp.. and do regular filters on that.

@daurnimator
Contributor Author

You can create a stored procedure that returns setof events and apply filters on it as on a table (see the tests). The same goes for a view: you can create it as in create view distinct_devices as select distinct device, timestamp.. and do regular filters on that.

But the filtering needs to happen before the distinct operation.
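
To illustrate with the events example, assuming the view bakes in select distinct on (device) ... order by device, "timestamp" desc:

-- Filtering the view runs *after* its DISTINCT: a device only shows up if its
-- single latest event happens to be an alarm
SELECT * FROM distinct_devices WHERE event_type = 'alarm';

-- What the client actually wants: filter first, then take the latest per device
SELECT DISTINCT ON (device) * FROM events
WHERE event_type = 'alarm'
ORDER BY device, "timestamp" DESC;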

@ruslantalpa
Contributor

There will always be a fight between two opposing needs: exposing more SQL features vs. protecting the database. Striking the balance is hard. In this particular thread I am with @steve-chavez for now.

Now, if features like this http://www.databasesoup.com/2015/02/why-you-might-need-statementcostlimit.html make it into core, we might relax a bit and expose more of the SQL; in that situation I would go with implementing the group by (while at the same time defaulting the cost limit to a low value).

Should we close this for now? I don't see it going further in the immediate future.

@ruslantalpa ruslantalpa changed the title distinct/unique Ability to generate queries with distinct/unique/group by Aug 26, 2017
@ruslantalpa ruslantalpa added the enhancement a feature, ready for implementation label Aug 26, 2017
@daurnimator
Contributor Author

This is probably my most wanted feature; and it certainly gets requested a lot in the chat room.

Perhaps we could have a toggle in the config file, allow_distinct, for the performance-concerned?

@steve-chavez
Member

@ruslantalpa I also think the idea of the cost limit is a good one, though it's not that reliable; a max limit could be worked out. That is also a must for other features like multiple requests in a transaction. I made a related comment on "Customize query timeouts" #249; maybe we should continue the discussion there and see if we can implement that feature independently.

@daurnimator That toggle could easily be abused by users; I don't think it should be added to the config. This phrase summarizes my reasoning:

A well-designed system makes it easy to do the right things and annoying (but not impossible) to do the wrong things.

Users should think twice before exposing expensive operations such as DISTINCT. As it is now, PostgREST should not make it easy for users to make those kinds of mistakes.

@russelldavies
Contributor

@daurnimator Have you had a look at pREST? It exposes more SQL directly, at the cost of potential performance issues, so may satisfy your DISTINCT needs via its GROUP BY support.

@ashish369in

Just chiming in to request that this enhancement be implemented. Distinct and group by would make quite a few views unnecessary in our case.

@steve-chavez
Member

If one's really careful and uses the format function, this can be done with a proc and dynamic SQL.

Example for DISTINCT:

create function projects(dist name) returns setof projects as $_$
begin
  -- %I quotes the value as a SQL identifier, preventing injection through dist
  return query execute format($$
    select distinct on(%I) * from projects
  $$, dist);
end; $_$ language plpgsql;

Then you can use all of the pgrst filters normally:

# filtering
curl "localhost:3000/rpc/projects?id=eq.1&dist=name"
# selecting
curl "localhost:3000/rpc/projects?select=id,name&dist=name"

Still, one should think twice before exposing DISTINCT to clients.

@ruslantalpa
Contributor

ruslantalpa commented Aug 2, 2020

I've thought about this and come up with a solution.
The idea is this: group by and function calling in select operate on the "virtual table" returned by the where filters. In the context of public APIs, you never want to return more than a couple of thousand rows, so allowing group by on a few thousand rows poses no risk. The problem then becomes how you make views that do not allow returning a lot of rows in the first place. And it's done like this:

First, you have a function like this one:

create or replace function validate(
  valid boolean,
  err text,
  details text default '',
  hint text default '',
  errcode text default 'P0001'
) returns boolean as $$
begin
   if valid then
      return true;
   else
      -- abort the request with a custom error code and payload
      RAISE EXCEPTION '%', err USING
      DETAIL = details,
      HINT = hint,
      ERRCODE = errcode;
   end if;
end
$$ stable language plpgsql;

and you define the view like so:

create view test.protected_books as
  select id, title, publication_year, author_id
  from private.books
  where
    validate(
      (current_setting('request.path', true) != '/protected_books') or -- do not check for filters when the view is embedded
      ( -- check at least one filter is set
        -- this branch is activated only when request.path = /protected_books
        (current_setting('request.get.id', true) is not null) or
        (current_setting('request.get.publication_year', true) is not null) or
        (current_setting('request.get.author_id', true) is not null)
      ),
      'Filter parameters not provided',
      'Please provide at least one of id, publication_year, author_id filters'
    );

This is the way 😁. The first parameter of validate can be any boolean expression, and you can make arbitrarily complex checks on the incoming requests; for example, on a time-series table one could check that both start/end dates are provided and that the diff is no more than a month. For this to work, the request.get needs to be made available in postgrest :). I've only implemented it in my commercial version, along with support for group by, function calling in select like max/sum ... and window functions; not released yet, only on my computer, and it seems to work great.
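
A sketch of the time-series check described above (request.get.* is the hypothetical mechanism from this comment, and qs_value is a made-up helper that strips the lt./gt. operator prefix from the raw filter value):

create view test.events_window as
  select * from private.events
  where
    validate(
      qs_value('request.get.start_ts') is not null
      and qs_value('request.get.end_ts') is not null
      and qs_value('request.get.end_ts')::timestamptz
          - qs_value('request.get.start_ts')::timestamptz <= interval '1 month',
      'Time range filter missing or too wide',
      'Provide start_ts and end_ts filters spanning at most one month'
    );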

@wolfgangwalther
Member

So the general idea would be to protect an endpoint against unwanted, unlimited requests by throwing an error for those.

I guess you could achieve that in a function called through pre-request (http://postgrest.org/en/latest/configuration.html#pre-request). If you throw an error there, that should work. You would of course need the query data there as well, but maybe it could be passed in as an argument (e.g. of type json) to the pre-request function. I think this approach would be a bit more flexible regarding the "checking of query string filters", because you can also look for filters where you don't know the name beforehand. I imagine this is not as easy with the request.get.... approach.

@steve-chavez
Member

the request.get needs to be made available in postgrest

If that one refers to the URL query string, then it would be better as request.query_string.<key> or maybe request.qs.<key>. I can see support in postgrest for that, though I'm not sure the value (eq.some, lt.123) would be useful for anything other than checking whether it is set. Unless we somehow change the pgrst prefix to the pg operator, lt.123 to < 123 (not worth it, probably).

support for group by, function calling in select like max/sum

Support for aggregates looks cool. Personally I would prefer the pre-request approach that Wolfgang mentioned; that way the VIEWs are not tangled with pgrst logic. Looks like that could be done by conditionally checking the path in the pre-request and then checking that the filters are applied.

Come to think of it, the required filter approach looks like an extended version of pg-safeupdate.

@ruslantalpa
Contributor

pre-request was a design mistake imo. There's rarely a valid use-case for it that can't be implemented better in the proxy.
Having the conditions checked in the view definition, with static boolean logic (and not in an unrelated imperative function), is exactly the point: a self-contained "smart view".
Query string as opposed to separate parameters can work (the query can be split up by a db function), but I don't know which way is better/easier to use; I haven't thought about it.

@ruslantalpa
Contributor

About the values: you are talking as if the programming env in pg is some crippled thing. It's trivial to write a function to split the lt.123 value and let the consuming function decide what to do with that (since it knows the context).

@wolfgangwalther
Member

pre-request was a design mistake imo. There's rarely a valid use-case for it that can't be implemented better in the proxy.

How about checking a custom header for validity based on data in the DB? That's exactly what I am doing for a subdomain-based multi-tenancy setup. Proxy sets the subdomain as a header and redirects the request. If that client/subdomain doesn't exist, I need to throw. I wouldn't know how to do that without pre-request and only proxy-based. So definitely not a design mistake, but very much needed.

Having the conditions checked in the view definition, with static boolean logic (and not in an unrelated imperative function) is exactly the point, a self contained "smart view".

That's for sure a positive, I agree. Although you can turn that around as well: If you need multiple endpoints to be limited in a very similar way, you can reduce code duplication a lot by putting that in a pre-request function instead of the VIEW.

But anyway: When setting the query string parameters as variables, both approaches can be taken.

Query string as opposed to separate parameters can work (the query can be split up by a db function), but I don't know which way is better/easier to use; I haven't thought about it.

I think it was just a different name for the .get. part that Steve was suggesting; the idea was still to split the query string up into parameters.

@wolfgangwalther
Member

About the values: you are talking as if the programming env in pg is some crippled thing. It's trivial to write a function to split the lt.123 value and let the consuming function decide what to do with that (since it knows the context).

It will not be as trivial, though, once you start adding nested logic like or=(a.eq.A,b.eq.B,...) and more complex stuff.

I thought about adding the already-parsed filters for a second, but I doubt that would be any less complex, because it would mean we had to pass the whole logic tree in. I'm not sure that would be worth it, tbh; but if we wanted to have the logic tree available, passing it as a JSON argument to pre-request seems much more likely to solve that.

Given the complexity of implementing something that really works well not only for a couple of cases but also in general, I wonder whether we should maybe take a completely different route to solve the original problem:

What do you guys think about adding some kind of limit on the expected cost of the query? So if we were to do something roughly along the lines of:

BEGIN;

SET request.x=... -- all kinds of variables, like we currently do (no need for .get. / .qs.)

EXPLAIN read_query;
SET pgrst.estimated_cost=<from the explain>;

SELECT pre_request_function(); -- can use the estimated cost here to throw if you'd like

read_query; -- or use the estimated cost in the query implicitly: VIEW definition throwing like Ruslan suggested.

@wolfgangwalther
Member

We could even provide a sane default for pgrst.max_estimated_cost and throw immediately if we wanted to. Of course with a nice error message explaining how to increase/disable the allowed cost if needed. ;)
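
A sketch of what such a check could look like in a pre-request function, assuming the proposed (not yet existing) pgrst.estimated_cost and pgrst.max_estimated_cost variables were set by PostgREST:

create function check_cost() returns void as $$
begin
  -- current_setting(..., true) yields NULL when unset, so the comparison is
  -- simply skipped unless both variables are present
  if current_setting('pgrst.estimated_cost', true)::numeric
     > current_setting('pgrst.max_estimated_cost', true)::numeric then
    raise exception 'query too expensive' using
      hint = 'Add more selective filters or raise pgrst.max_estimated_cost';
  end if;
end
$$ language plpgsql;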

@steve-chavez
Member

steve-chavez commented Aug 18, 2020

@wolfgangwalther Yeah, I believe that approach would also be worth exploring. Check this related issue: #249 (comment) (some ideas around the pg_plan_filter module).

@wolfgangwalther
Member

I thought a bit more about the proposed addition to the query string in the form of groupby=.... I think we can do without it.

As long as we detect the possible aggregation functions via the schema cache, we should know whenever an aggregate is called. Once that's the case, we can add all columns mentioned in the select= part that are not aggregates to the GROUP BY. They'd need to be added anyway if they were not functionally dependent on each other, and in that case it doesn't hurt to add them either.
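
A sketch of the implied mapping (the aggregate call syntax here is only illustrative):

-- Hypothetical request:  GET /employees?select=department,salary.sum()
-- department is not an aggregate, so it gets added to GROUP BY implicitly:
SELECT department, sum(salary) FROM employees GROUP BY department;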

This would:

  • still allow us to add groupby= later, if we need to. But I guess this should solve a lot of use-cases already. And we shouldn't try to expose the full SQL language via HTTP anyway, so it's fine if not everything is possible there.
  • allow us to add something like a Prefer: groupby=distinct,cube header to use some of the options available in grouping sets: DISTINCT, CUBE and ROLLUP should be possible here.
  • give us the knowledge on when to run a cost-check-query (as soon as we are in aggregation mode)
  • allow us to move filters on aggregates to a HAVING clause

The last part would not be compatible with the FILTER approach suggested above. But I guess we'd have the same conflict with ORDER BY already, which could also be used inside the aggregate, or on the outside, based on the result of the aggregate. So maybe we add a slightly different syntax for filters; some examples based on the shortened example from above:

GET /users?select=males&males=count.*&males.gender=eq.male
GET /users?select=males&males=count.*&males.filter.gender=eq.male&males.filter.order=name.asc
GET /users?select=males&males=count.*&males?gender=eq.male&males?order=name.asc

I actually like the ? syntax a lot: it has a lot of similarity to the first ? in the query string, which adds filters and order to the main query, much like males? adds filters and order to the aggregate. I think ? in the query string is allowed by spec as well, so that should work.

@steve-chavez
Member

I thought a bit more about the proposed addition to the query string in the form of groupby=.... I think we can do without it.
And we shouldn't try to expose the full SQL language via HTTP anyway, so it's fine if not everything is possible there.

This reminds me of this project's syntax (it got good feedback on HN), where they also do aggregates without a group by. So I agree: we don't necessarily have to expose full SQL, and if we can improve/restrict its syntax we should do it (the HTTP QUERY method will also require a DSL from us).

I actually like the ? syntax a lot - it has a lot of similarity to the first ? in the querystring

Hm, ? needs URL-encoding though; it's a reserved character.

@wolfgangwalther
Member

Hm, ? needs URL-encoding though; it's a reserved character.

It's a reserved character, yes, but not all reserved characters need percent-encoding everywhere.

See RFC 3986 2.2:

2.2. Reserved Characters
[...]
If data for a URI component would conflict with a reserved
character's purpose as a delimiter, then the conflicting data must be
percent-encoded before the URI is formed.
[...]
URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component. [...]

and RFC 3986 3.4:

3.4. Query
[...]
The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.
[...]
The characters slash ("/") and question mark ("?") may represent data
within the query component. Beware that some older, erroneous
implementations may not handle such data correctly when it is used as
the base URI for relative references (Section 5.1), apparently
because they fail to distinguish query data from path data when
looking for hierarchical separators. However, as query components
are often used to carry identifying information in the form of
"key=value" pairs and one frequently used value is a reference to
another URI, it is sometimes better for usability to avoid percent-
encoding those characters.

@wolfgangwalther
Member

wolfgangwalther commented Feb 14, 2022

In #2164 (comment) I mentioned that it would be helpful to think about a syntax to express whether filters apply to the WHERE or the HAVING part of the query. In that context we should also think about the FILTER (WHERE ...) mentioned above.

There are probably a few ways to do that, but the one that I have in mind currently is like this:

  • First, we need to settle on a syntax on how to call the aggregate functions in select. I suggested this:
    GET /employees?select=salary.sum()
  • Now, we can add regular main-query-WHERE-clause filters with the syntax as usual:
    GET /employees?select=total:salary.sum()&position=eq.department+lead
  • With dot separated subfilters, we can add FILTER (WHERE ...):
    GET /employees?select=females:salary.sum(),males:salary.sum()&females.gender=eq.female&males.gender=eq.male
    This is much less confusing than the ? syntax I proposed a few comments above, because it's very much in line with how we filter on embeddings. (See the SQL sketch after this list.)
  • With filters on the aggregation call itself, we can add HAVING filters:
    GET /employees?select=n:count(),mean:salary.avg()&n=gt.10
    The parentheses in count() tell us that it's a function call and through the schema cache we know that this will be an aggregation. Because of that, we know we need to move the filter to HAVING.
  • For the last example (HAVING) it should be possible to use the syntax without aliases, to use it as a filter, but not select it:
    GET /employees?select=salary.avg()&count()=gt.10
    This doesn't make much sense for FILTER, I think. Note how count() needs parentheses to tell it apart from a column named count.
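
Roughly, the FILTER and HAVING examples above would map to SQL like this (a sketch, assuming an employees(salary, gender) table):

-- FILTER (WHERE ...):
--   GET /employees?select=females:salary.sum(),males:salary.sum()
--       &females.gender=eq.female&males.gender=eq.male
SELECT sum(salary) FILTER (WHERE gender = 'female') AS females,
       sum(salary) FILTER (WHERE gender = 'male')   AS males
FROM employees;

-- HAVING (no grouping columns selected, so no GROUP BY is needed):
--   GET /employees?select=n:count(),mean:salary.avg()&n=gt.10
SELECT count(*) AS n, avg(salary) AS mean
FROM employees
HAVING count(*) > 10;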

Edit:

One thing I am not sure about, yet, is what happens when we use aggregation inside embeddings. Example:

  • WHERE-clause should work fine:

    GET /companies?select=*,employees(total:salary.sum())&employees.position=eq.department+lead
  • FILTER (WHERE ...) should work fine:

    GET /companies?select=*,employees(females:salary.avg(),males:salary.avg())&employees.females.gender=eq.female&employees.males.gender=eq.male
  • HAVING is fine too, but note the small details. The following would add the gt.10 filter to the HAVING clause of the subquery:

    GET /companies?select=*,employees(n:count(),mean:salary.avg())&employees.n=gt.10

    While, if Order parent by child's column (for to-one mapping) #1414 (comment) were supported, the following would add the filter to the outer query's WHERE:

    GET /companies?select=*,employees(n:count(),mean:salary.avg())&employees->>n=gt.10

Ok, should work fine. Just wanted to write those down to see it.

@steve-chavez
Member

Looks amazingly consistent 🥇

GET /employees?select=salary.sum()
GET /users?select=count()

So right now the syntax is:

  • aggregate(): a select aggregate(t) from tbl t
  • col.aggregate(): a select aggregate(col) from tbl

One question, does it ever make sense to have an aggregate function call with two columns as an input?

select aggregate(col1, col2) from tbl t

Would we need an aggregate(col1,col2) syntax in that case?

@wolfgangwalther
Member

One question, does it ever make sense to have an aggregate function call with two columns as an input?

json_object_agg is a built-in example that would make use of two columns, I guess.

Would we need an aggregate(col1,col2) syntax in that case?

I'd say no. Those cases seem special enough that we don't need to support them out of the box with our HTTP syntax. It's still possible to create custom aggregates on the whole row that use those two columns internally.
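
For example (a sketch, with a made-up pairs table), an aggregate over the whole row type that uses two of its columns internally, mimicking json_object_agg(k, v):

CREATE TABLE pairs (k text, v text);

-- State-transition function: receives the whole row and picks the two columns
CREATE FUNCTION pairs_obj_sfunc(acc jsonb, rec pairs) RETURNS jsonb AS $$
  SELECT coalesce(acc, '{}'::jsonb) || jsonb_build_object(rec.k, rec.v);
$$ LANGUAGE sql;

-- The aggregate takes a single whole-row argument, which fits the
-- one-argument aggregate() syntax discussed above
CREATE AGGREGATE pairs_obj(pairs) (
  SFUNC = pairs_obj_sfunc,
  STYPE = jsonb
);

-- SELECT pairs_obj(p) FROM pairs p;  -- same result as json_object_agg(k, v)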

@jdgamble555

Have you guys given up on group by? It is no more dangerous than any other filter. This is extremely important for basic use cases, without the extra work of a View.

J

@wolfgangwalther
Member

Have you guys given up on group by?

Why do you think so?

We have 175 open issues. Anything we've given up on is closed.

@jdgamble555

Why do you think so?

Because this is a necessary feature for basic use cases (like sorting a table by reactions or votes), and it has been 2 years with no updates. If you have to use VIEWs for basic use cases, the library is incomplete.

I 100% disagree with the dangerous argument here. Even with that argument, there are safety things you could put into place.

J

@wolfgangwalther
Member

Because this is a necessary feature for basic use cases

That's a bold statement.

and it has been 2 years with no updates.

Not true. You could either say: it has not been implemented for more than 4 years, since the issue was opened. Or you could say: there have been no updates since February 16th of this year.

the library is incomplete.

100% agree. There's so many ideas to make PostgREST better - I doubt we will be "complete" any time soon.

there are safety things you could put into place.

Yes, certainly. We discussed a few here and elsewhere already. Maybe you can help us put those in place. PR welcome!

@jdgamble555

That's a bold statement.

I don't see how. Any app that is past a todo app is going to need total counts.

Not true. You could either say: it has not been implemented for more than 4 years, since the issue was opened. Or you could say: there have been no updates since February 16th of this year.

OK, there has been no action for more than 4 years, and there are negative comments that are quite controversial on the usefulness of this. Just because you can do anything with a View doesn't mean you should have to for basic use cases.

100% agree. There's so many ideas to make PostgREST better - I doubt we will be "complete" any time soon.

I think I meant complete with basic features. I wasn't trying to sound negative. It is a trigger for me personally when people turn down feature requests for one-sided reasons, and nothing gets done for years because of it. It is definitely fair to say PRs are welcome. That being said, I know nothing of Haskell.

GraphQL handles a lot of these cases with aggregations, which works due to the nature of graphs. I just want to make sure this feature is not "turned down" more than anything.

J

@steve-chavez
Member

An alternative syntax to the above:

GET /employees?select=salary.sum()

GET /employees?select=total:$f.sum(salary)

This is related to the dollar operators proposal on #2125 (comment).

@codeayra

codeayra commented Feb 9, 2023

By any chance, can we do GROUP BY now with the PostgREST APIs?

@jdgamble555

jdgamble555 commented Nov 23, 2023

I wanted to get the distinct tags for this schema:

CREATE TABLE tags (
  name text,
  post_id uuid REFERENCES posts(post_id) ON DELETE CASCADE,
  created_at timestamptz NOT NULL DEFAULT now(),
  PRIMARY KEY (post_id, name)
);

So, using a computed-fields hack, I found a use for it in filters:

GET /tags?distinct_tag=is.true

Choose your version:

Compare post_id

CREATE OR REPLACE FUNCTION distinct_tag(tags)
  RETURNS boolean AS $$
  SELECT NOT EXISTS (
    SELECT 1
    FROM tags
    WHERE name = $1.name
    AND post_id < $1.post_id
  );
$$ LANGUAGE sql;

Positive equivalent

CREATE OR REPLACE FUNCTION distinct_tag(tags)
  RETURNS boolean AS $$
  SELECT EXISTS (
    SELECT 1
    FROM tags t
    WHERE t.name = $1.name
    AND t.post_id <= $1.post_id
    EXCEPT
    SELECT 1
    FROM tags t
    WHERE t.name = $1.name
    AND t.post_id < $1.post_id
  );
$$ LANGUAGE sql;

DISTINCT ON

CREATE OR REPLACE FUNCTION distinct_tag(tags)
  RETURNS boolean AS $$
  SELECT EXISTS (
    SELECT 1
    FROM (SELECT DISTINCT ON (name) post_id, name FROM tags) subquery
    WHERE name = $1.name
    AND post_id = $1.post_id
  );
$$ LANGUAGE sql;

Window Functions

CREATE OR REPLACE FUNCTION distinct_tag(tags)
  RETURNS boolean AS $$
  SELECT EXISTS (
    SELECT 1
    FROM (
      SELECT name, post_id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY post_id) AS rn
      FROM tags
    ) subquery
    WHERE rn = 1
    AND name = $1.name
    AND post_id = $1.post_id
  );
$$ LANGUAGE sql;

There might also be a way to use "in" and "isdistinct", but I'm not sure.

But at what point is an rpc not just easier?

CREATE OR REPLACE FUNCTION distinct_tags()
RETURNS SETOF tags AS $$
  SELECT DISTINCT ON (name) * FROM tags 
$$ LANGUAGE sql;
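
(Callable as GET /rpc/distinct_tags; since it returns setof tags, the usual filters and select still apply, though on the result of the DISTINCT.)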

Of course, in reality we would usually want to sort by the count too:

CREATE OR REPLACE FUNCTION tag_count(tags)
  RETURNS bigint AS $$
  SELECT COUNT(*) FROM tags
  WHERE name = $1.name;
$$ LANGUAGE sql;

GET /tags?select=name,count:tag_count&distinct_tag=is.true&order=tag_count

I find these computed fields to be much more manageable than long rpc functions, especially when you can do all the joins after the fact.

Of course if it just worked like this:

GET /tags?distinct=name&order=distinct.asc

We maybe wouldn't need all this mess. Either way, cool to know some hacks.

NOTE: If you can simplify any of my examples, please let me know :)

J

@wolfgangwalther
Member

NOTE: If you can simplify any of my examples, please let me know :)

With PostgREST 12's support for aggregate functions, this can easily be done like this:

GET /tags?select=name,count()&order=name.asc
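
Which corresponds, roughly, to:

SELECT name, count(*) FROM tags GROUP BY name ORDER BY name ASC;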
