Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behavior for computed relationships using scalar functions (`returns <tablename>) #2481

Closed
laurenceisla opened this issue Sep 21, 2022 · 6 comments · Fixed by #2540
Labels
bug embedding resource embedding

Comments

@laurenceisla
Copy link
Member

laurenceisla commented Sep 21, 2022

With the fixture

create schema api;

create table api.premieres (
  id text primary key,
  film_id text -- references api.films(id)
);
insert into api.premieres values
  ('P1', 'F1'),
  ('P2', 'F1'),
  ('P3', 'F2'),
  ('P4', 'F3')
 ;

create table api.films(id text primary key);
insert into api.films values('F1'), ('F2'), ('F4');

and the functions

create or replace function api.computed_films_scalar(api.premieres) returns api.films as $$
  select * from api.films where id = $1.film_id
$$ stable language sql;

create or replace function api.computed_premiers_o2m(api.films) returns setof api.premieres as $$
  select * from api.premieres where film_id = $1.id
$$ stable language sql;

create or replace function api.computed_premiers_scalar(api.films) returns api.premieres as $$
  select * from api.premieres where film_id = $1.id
$$ stable language sql;
GET /premieres?select=*,computed_films_scalar(*)

-- questionable usage, expected results (taken as m2m relationship)
[{"id":"P1","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, 
 {"id":"P2","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, 
 {"id":"P3","film_id":"F2","computed_films_scalar":[{"id":"F2"}]}, 
 {"id":"P4","film_id":"F3","computed_films_scalar":[{"id":null}]}]

If we want a rowtype-returning scalar function used as a computed relationship, my suggestion would be marking it as in rows 1, because such a function always returns one row, no more, no less.

-- check the film existence
select f from api.films f where id = 'F1';
--   f   
-- ------
--  (F1)
-- (1 row)

select f from api.films f where id = 'F3';
-- f 
-- ---
-- (0 rows)

--- the differences between set-returning function and scalar function
-- scalar function returns one row even there are two matching rows
select * from api.computed_premiers_scalar((select f from api.films f where id = 'F1'));
--  id | film_id 
-- ----+---------
--  P1 | F1
-- (1 row)

-- set-returning function returns all the matching rows
select * from api.computed_premiers_o2m((select f from api.films f where id = 'F1'));
--  id | film_id 
-- ----+---------
--  P1 | F1
--  P2 | F1
-- (2 rows)

-- scalar function returns one row even there are no matching row
select * from api.computed_premiers_scalar((select f from api.films f where id = 'F3'));
--  id | film_id 
-- ----+---------
--  ¤  | ¤
-- (1 row)

-- set-returning function returns no rows
select * from api.computed_premiers_o2m((select f from api.films f where id = 'F3'));
-- id | film_id 
-- ----+---------
-- (0 rows)

Originally posted by @Iced-Sun in #2475 (comment)

@laurenceisla
Copy link
Member Author

laurenceisla commented Sep 21, 2022

There are two options we could go for:

  1. Dissallow the usage of scalar functions that returns <tablename> for computed relationships
  2. Fix it and return only a to-one relationship.

The issue may be in this line:

p.prorows = 1 as single_row

And adapting the query that calls the RPC.

@laurenceisla laurenceisla added bug embedding resource embedding labels Sep 21, 2022
@steve-chavez
Copy link
Member

Disallow the usage of scalar functions that returns for computed relationships

I think we should disallow and document it. The problem with not including a SETOF is that the function will not be inlinable.

I believe we should enforce this where we can otherwise it can lead to performance problems down the line.

@wolfgangwalther
Copy link
Member

Disallow the usage of scalar functions that returns for computed relationships

I think we should disallow and document it.

Please don't. The function is not inlineable, because we currently have it in the FROM part of a query. But maybe we can construct the query differently for those kind of functions, so that it is in the SELECT part?

In this case it would be inlineable. This might also be a much better approach to the whole ROWS 1 thing. We don't necessarily need to make all computed relationships return SETOF, I think.

Plus, the new computed relationship feature is not only about relationships, but it is also a "computed columns on steroids" feature. A bit more general.

Something that we couldn't do with regular computed columns before: Select only some of the fields from a computed column that returns a composite type:

CREATE TABLE x (
  y INT
);

CREATE FUNCTION abc(x, OUT a INT, OUT b INT, OUT c INT)
-- ...

With regular computed column syntax I would always get something like this:

{
  "y": 1,
  "abc": {
    "a": 2,
    "b": 3,
    "c": 4
  }
}

But with the new syntax, I could do something like select=*,asd(a) and get this:

{
  "y": 1,
  "abc": {
    "a": 2
  }
}

@Iced-Sun
Copy link

Please don't. The function is not inlineable, because we currently have it in the FROM part of a query. But maybe we can construct the query differently for those kind of functions, so that it is in the SELECT part?

In this case it would be inlineable. This might also be a much better approach to the whole ROWS 1 thing. We don't necessarily need to make all computed relationships return SETOF, I think.

For a scalar function, the wiki states it must meet the following conditions to be inlineable:

  1. the function body consists of a single, simple, SELECT expression
  2. the body query must return exactly one column

So I think inlining a scalar function is not particularly attractive.

With regular computed column syntax I would always get something like this:

{
  "y": 1,
  "abc": {
    "a": 2,
    "b": 3,
    "c": 4
  }
}

Not exactly the same, but it can be worked around as select=*,abc->>a:

{
  "y": 1,
  "a": 2
}

@wolfgangwalther
Copy link
Member

Not exactly the same, but it can be worked around as select=*,abc->>a:

Yeah. But something like select=*,abc(a,b) etc. will already become a mess, soon.

the function body consists of a single, simple, SELECT expression

That's the same in the FROM case, too.

the body query must return exactly one column

Yes, but that could be a SELECT x FROM x, i.e. return a composite type, and not individual columns.

So I think inlining a scalar function is not particularly attractive.

Yeah, I think inlining is not a strict requirement at all times. But there are cases where it's very helpful. So we should ideally have queries that would allow whoever is writing the function to make it inlineable. We can't control what is inside the function, but we can control how we call it, which affects inlineability, too.

@Iced-Sun
Copy link

Iced-Sun commented Sep 28, 2022

IMHO, allowing or denying the scalar functions used as computed relationship is more like a matter of taste. Inling a scalar function for select a, f(a) from t (where f(x)=x+1) is simply expanded to select a, a+1, in which few optimizations could be applied except constant-folding.

If the desicion is to allow using the scalar functions as computed relationship, there is a subtlety between a scalar function and a rows 1 set-returning function:

GET /premieres?select=*,computed_films_m2o(*)

-- expected usage and expected results
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}}, 
 {"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}}, 
 {"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}, 
 {"id":"P4","film_id":"F3","computed_films_m2o":null}]
GET /premieres?select=*,computed_films_scalar(*)

-- questionable usage, expected results (taken as m2m relationship)
[{"id":"P1","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, 
 {"id":"P2","film_id":"F1","computed_films_scalar":[{"id":"F1"}]}, 
 {"id":"P3","film_id":"F2","computed_films_scalar":[{"id":"F2"}]}, 
 {"id":"P4","film_id":"F3","computed_films_scalar":[{"id":null}]}]

Notice the difference of the last row (because a set-returning function returns empty results and a scalar function returns a null row).

An inner-join makes the case trickier:

GET /premieres?select=*,computed_films_m2o!inner(*)

-- expected usage and expected results
[{"id":"P1","film_id":"F1","computed_films_m2o":{"id":"F1"}}, 
 {"id":"P2","film_id":"F1","computed_films_m2o":{"id":"F1"}}, 
 {"id":"P3","film_id":"F2","computed_films_m2o":{"id":"F2"}}
GET /premieres?select=*,computed_films_scalar!inner(*)

-- questionable usage, unexpected results(inner join didn't work)
[{"id":"P1","film_id":"F1","computed_films_scalar":[{"id":"F1"}]},
 {"id":"P2","film_id":"F1","computed_films_scalar":[{"id":"F1"}]},
 {"id":"P3","film_id":"F2","computed_films_scalar":[{"id":"F2"}]},
 {"id":"P4","film_id":"F3","computed_films_scalar":[{"id":null}]}]

wolfgangwalther added a commit to wolfgangwalther/postgrest that referenced this issue Oct 27, 2022
Fixes PostgREST#2481

Signed-off-by: Wolfgang Walther <walther@technowledgy.de>
wolfgangwalther added a commit to wolfgangwalther/postgrest that referenced this issue Oct 27, 2022
Fixes PostgREST#2481

Signed-off-by: Wolfgang Walther <walther@technowledgy.de>
wolfgangwalther added a commit to wolfgangwalther/postgrest that referenced this issue Oct 27, 2022
Fixes PostgREST#2481

Signed-off-by: Wolfgang Walther <walther@technowledgy.de>
wolfgangwalther added a commit that referenced this issue Oct 27, 2022
Fixes #2481

Signed-off-by: Wolfgang Walther <walther@technowledgy.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug embedding resource embedding
Development

Successfully merging a pull request may close this issue.

4 participants