Skip to content
This repository has been archived by the owner on Feb 24, 2024. It is now read-only.

Ordering by random #671

Open
4bo opened this issue Apr 21, 2022 · 8 comments
Open

Ordering by random #671

4bo opened this issue Apr 21, 2022 · 8 comments
Labels
how-tos Goal oriented guides. Not in depth.

Comments

@4bo
Copy link

4bo commented Apr 21, 2022

as postgres supports order by random(), please add similar functionality to PostgREST


Edit: Similar requests:

@wolfgangwalther
Copy link
Member

as postgres supports [...], please add similar functionality to PostgREST

PostgreSQL supports a lot of stuff, that PostgREST doesn't. Adding full support for everything that PostgreSQL does is certainly not achievable. I wouldn't consider this a valid argument for supporting anything.


[...] order by random() [...]

We can look at this from two different angles:

  • We have not implemented calling arbitrary functions in filters or order parameters. If we implemented that, calling random should work, too. However, I don't see us implementing generic function support for a variety of reasons, one of them the inability to use indexes and the problem of DoS attacks.
  • random() feels like kind of a special case, because it does not depend on any other column - so there is not really a point of using an index at all. Not sure how many other expressions that don't depend on any columns make sense to use, but random() certainly does.

In any case, I think you can already use it right now with a virtual / computed column. Just define it like this:

create table my_table (...);

create function random(my_table) returns double precision
language sql as 'select random()';

You can then use it as a column on this endpoint:

GET /my_table?order=random

You could probably generalize the function to take any as the argument type, to allow this for all your endpoints.

@steve-chavez
Copy link
Member

order by random() will always cause a full table scan, so I don't think we'll support it.

Supporting TABLESAMPLE sounds more likely to happen.

More details at: https://www.2ndquadrant.com/en/blog/tablesample-and-other-methods-for-getting-random-tuples/

@steve-chavez
Copy link
Member

TABLESAMPLE seems a bit complex to expose to clients.

Since doing

create function random(anyelement) returns double precision
language sql as 'select random()';

Is pretty simple and makes random available for all tables, we could turn this into a how-to in docs.

This depends on PostgREST/postgrest#2442, because we'd need to be able to restrict to which columns the order by can be applied.

@steve-chavez
Copy link
Member

steve-chavez commented May 29, 2023

Similarly to the estimated count, perhaps we can apply the order by random() for low counts and then TABLESAMPLE for higher counts.

We could do it through an Accept: application/vnd.pgrst.random+json.

Edit: Maybe a header isn't right here as different media types can support random.

@steve-chavez steve-chavez transferred this issue from PostgREST/postgrest Aug 28, 2023
@steve-chavez steve-chavez added how-tos Goal oriented guides. Not in depth. and removed idea labels Aug 28, 2023
@steve-chavez
Copy link
Member

pg16 has a new any_value aggregate: https://www.postgresql.org/docs/16/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE

Add aggregate function ANY_VALUE() which returns any value from a set

Haven't tried it yet but seems it could be used in conjunction with custom media types to get a random row for any relation.

@wolfgangwalther
Copy link
Member

Add aggregate function ANY_VALUE() which returns any value from a set

Haven't tried it yet but seems it could be used in conjunction with custom media types to get a random row for any relation.

It does not return a random row, however. It returns "a" row. IIRC, the current implementation is that it returns the "first" rows it finds. That's certainly not going to be a true random.

@steve-chavez
Copy link
Member

IIRC, the current implementation is that it returns the "first" rows it finds. That's certainly not going to be a true random.

Ah, ok. Then it's like first on: https://wiki.postgresql.org/wiki/First/last_(aggregate)

@wolfgangwalther
Copy link
Member

Ah, ok. Then it's like first on

Nope, it's not, because this is only an implementation detail and not guaranteed to stay that way. You can't rely on it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
how-tos Goal oriented guides. Not in depth.
Development

No branches or pull requests

3 participants