Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: ability to call pg 11 stored procedure #1210

Open
Iced-Sun opened this issue Nov 10, 2018 · 24 comments
Open

Feature Request: ability to call pg 11 stored procedure #1210

Iced-Sun opened this issue Nov 10, 2018 · 24 comments
Labels
enhancement a feature, ready for implementation

Comments

@Iced-Sun
Copy link

Environment

  • PostgreSQL version: 11
  • PostgREST version: 0.5.1
  • Operating system: not relevant

Description of issue

  1. create the stored procedure (not a function)
create procedure api.test_proc(a int) language sql as $$
select 1;
$$;
  1. make the request
POST /api/rpc/test_proc
Content-Type: application/json

{"a": 1}
  1. postgrest cannot call a stored procedure for now
: HTTP/1.1 400 Bad Request
: Server: nginx/1.10.3 (Ubuntu)
: Date: Sat, 10 Nov 2018 09:04:34 GMT
: Content-Type: application/json; charset=utf-8
: Transfer-Encoding: chunked
: Connection: keep-alive
:
: {"hint":"To call a procedure, use CALL.","details":null,"code":"42809","message":"api.test_proc(a => integer) is a procedure"}
@steve-chavez steve-chavez added the enhancement a feature, ready for implementation label Nov 14, 2018
@steve-chavez
Copy link
Member

At first I thought PROCEDUREs can't have return values, but turns out they work with INOUT parameters(only OUT parameters don't work btw).

create procedure public.test_proc(inout a int, inout b int) language sql as $$
select 1, 2;
$$;
 
call test_proc(3, 4);

 a | b 
---+---
 1 | 2
(1 row)

Going to have to think this through. Had the idea that a 204 No Content would always be the response.

@steve-chavez
Copy link
Member

call syntax is severely limited. We cannot dynamically construct the input(arguments) and output(result) in SQL as we have done with traditional stored functions.

Arguments

Not possible to:

-- use ctes
with smth as(
  select 1
)
call test_proc(3, 4);
-- ERROR:  42601: syntax error at or near "call"

-- use subqueries
call test_proc((select 3), (select 4));
-- ERROR:  0A000: cannot use subquery in CALL argument

-- it's possible to call functions in arguments though
create table myrowtype(a int, b int);
call test_proc((json_populate_record(null::myrowtype, '{"a":1,"b":2}')).a,(json_populate_record(null::myrowtype, '{"a":1,"b":2}')).b);
 a | b
---+---
 1 | 2

-- but not possible to use "AS" inside call
call test_proc((json_to_record('{"a":1,"b": 3}') as x(a int, b int)).a, (json_to_record('{"a":1,"b": 3}') as x(a int, b int)).b);
-- ERROR:  42601: syntax error at or near "as"

Result

Not possible to:

-- wrap call inside a function
select row_to_json((call test_proc(3, 4)));
-- ERROR:  42601: syntax error at or near "test_proc"

-- use ctes for storing call results
with smth as(
  call test_proc(3, 4)
)
select * from smth;

Proposal

For the arguments it's possible to do smth like:

call test_proc(a:= ('{"a":1,"b": 3}'::json->>'a')::int, b:= ('{"a":1,"b": 3}'::json->>'b')::int);

Downside is that the json must be copied. We could also do the conversion to json in haskell. Overall I think the arguments part is solvable.

For the result, we'd have to get all the stored procs inout parameters in the schema cache, decode dynamically(not sure if possible) and convert to json in haskell. For now, I'm inclining to not return anything but the status codes since INOUT is already limiting.

@steve-chavez
Copy link
Member

One more limitation for stored procedures:

If CALL is executed in a transaction block, then the called procedure cannot execute transaction control statements. Transaction control statements are only allowed if CALL is executed in its own transaction.

From https://www.postgresql.org/docs/11/sql-call.html

That means we cannot use our regular transaction wraparound for stored procedures. If we use transaction wraparound then the stored proc can't have COMMITs/ROLLBACKs inside(useless basically).

@steve-chavez
Copy link
Member

steve-chavez commented Jun 19, 2019

That also means if we allow calling pg11 stored procedures we could not do our usual ROLE switching nor enforcing the search_path.

It breaks too much of our foundation to be worth adding for now.
If new PostgreSQL versions improve the situation we could reconsider.

@steve-chavez steve-chavez removed the enhancement a feature, ready for implementation label Jun 19, 2019
@steve-chavez
Copy link
Member

I made a draft implementation at https://github.com/steve-chavez/postgrest/tree/pg11-proc.
If COMMIT/ROLLBACK is used inside the stored procs then the following error is shown:

{
    "code": "2D000",
    "details": null,
    "hint": null,
    "message": "invalid transaction termination"
}

@W1M0R
Copy link

W1M0R commented May 14, 2020

@steve-chavez There can be a few reasons for the invalid transaction termination error. First of all, for commit/rollback to work, the root routine that calls the procedure, must be a procedure itself. If, let's say, there is a runUserFunction that itself is made with create function and it does call user_routine, then user_routine will fail if it is a procedure and it does a commit.

Other reasons for that error, could be when the called procedure has security definer in its signature, or uses a set clause.

@steve-chavez
Copy link
Member

@W1M0R The main issue is that procedures can't do nested transactions. See https://dba.stackexchange.com/a/249453/158470.

Until that issue is solved in pg, stored procedures cannot be supported in postgrest(for the above reasons).

@W1M0R
Copy link

W1M0R commented May 15, 2020

@steve-chavez Don't procedures just do nested transactions differently, i.e. instead of begin/end, they just call other procedures.

From: https://www.postgresql.org/docs/12/plpgsql-transactions.html

A new transaction is started automatically after a transaction is ended using these commands, so there is no separate START TRANSACTION command. (Note that BEGIN and END have different meanings in PL/pgSQL.)

@steve-chavez
Copy link
Member

steve-chavez commented May 15, 2020

@W1M0R To put it concisely, if this can be made to work:

BEGIN;
SET LOCAL search_path = 'public, api';
SET LOCAL role = 'anon';
CALL test_proc(3, 4); --  proc that uses transactions inside(has COMMIT)
COMMIT;

Then postgrest could support calling procedures.

@W1M0R
Copy link

W1M0R commented May 15, 2020

Here is a message dated 2019-08-26 that confirms what you are saying, i.e. that this is indeed a current restriction of Postgres: https://www.postgresql.org/message-id/flat/99b87249-d08c-8082-340e-84273150b59e%402ndquadrant.com#74880e948075c03e3e209a60dd33af06

The problem lies in having to wrap the procedure call inside the transaction. Forgive my lack of understanding, but is it not possible to have the code base detect a procedure, and then call it outside the transaction, i.e. avoid a custom transaction all together and ensure that the procedure is called as the root?

@steve-chavez
Copy link
Member

Forgive my lack of understanding, but is it not possible to have the code base detect a procedure, and then call it outside the transaction, i.e. avoid a custom transaction all together and ensure that the procedure is called as the root?

Yes, but the problem is that we need to ensure the SET LOCAL ROLE anon is done. This is integral to postgrest auth: http://postgrest.org/en/v7.0.0/auth.html#authentication-sequence.

Besides from that, all of our HTTP logic wouldn't work too: http://postgrest.org/en/v7.0.0/api.html#http-logic. Those features relies on a bunch of SET LOCALs.

We cannot ensure the user created procedure would do that.

@W1M0R
Copy link

W1M0R commented May 16, 2020

Thanks for helping me understand the broader issues of the problem @steve-chavez.

@laurenceisla
Copy link
Member

Some users are used to stored procedures as a way to mutate data without returning values, not necessarily for the need of using nested transactions (e.g. #1739 ). So, even if nested transactions are still not available, we can support calling procedures while documenting that restriction.

@steve-chavez
Copy link
Member

I think we could fully support stored procedures transactions by applying some conventions on the procedure definition.

  1. To ensure our role and search path protections are maintained(see above), we can enforce the procedure to be declared with those as configuration parameters:
create or replace procedure my_proc() as $$ 
  select 'do something';
$$ language sql 
set role = 'anon'
set search_path = 'api, public';

anon must match our db-anon-role value, search_path must match the db-schema + db-extra-search-path value. Otherwise we refuse to run the stored procedure.

  1. Using Wolfgang's idea here we can pass our HTTP Context(JWT, headers, etc) to the procedure if it defines special parameters.
create or replace procedure my_proc(other_param int, "request.jwt.claims" json, "request.headers" json) as $$ 
  select 'do something';
$$ language sql 
set role = 'anon'
set search_path = 'api, public';

Then the user would have to switch role manually. This should be fine because anon doesn't have much privilege, to do more the user would need to switch anyway.

create or replace procedure my_proc(other_param int, "request.jwt.claims" json, "request.headers" json) as $$ 
  select set_config('role', ("request.jwt.claims"::json)->>'role');
  select 'do something';
$$ language sql 
set role = 'anon'
set search_path = 'api, public';

To keep user queries conditioned by our http context(e.g. current_setting('request.jwt.claims', true)::json->>'email') working:

create or replace procedure my_proc(other_param int, "request.jwt.claims" json, "request.headers" json) as $$ 
  select set_config('role', ("request.jwt.claims"::json)->>'role');
  select set_config('request.jwt.claims', "request.jwt.claims");
  select 'do something';
$$ language sql 
set role = 'anon'
set search_path = 'api, public';

By doing the above, we could then omit our wrapping transaction for calling procedures.

@wolfgangwalther
Copy link
Member

set role = 'anon'

Better to enforce the authenticator role here instead of anon. PostgREST could run without db-anon-role at all. And since we'd expect the stored procedure to make a call to SET ROLE anyway, it should start with a clean authenticator role.

@steve-chavez steve-chavez added the enhancement a feature, ready for implementation label Apr 8, 2023
@steve-chavez
Copy link
Member

steve-chavez commented Apr 9, 2023

The restrictions for pg 11 procedures should be:

  • authenticator would need GRANT EXECUTE on the procedure. Makes sense as we're basically conceding PostgREST authentication flow and transaction settings to the user.

    • Under openapi-mode = "follow-privileges", OpenAPI would not consider the EXECUTE privilege on authenticator and would only show procedures if anon/webuser/members_of_authenticator have EXECUTE privilege on them.
  • set role to 'authenticator' must be enforced on the procedure. i.e. we'll only allow procedures defined as create procedure .. set role to 'authenticator'.

  • No SECURITY DEFINER procedures. Makes sense as with security definer we essentially do create procedure .. set role to 'postgres/superuser_like', which conflicts with the previous restriction.

    • Also pg docs say "A SECURITY DEFINER procedure cannot execute transaction control statements"(ref), which makes it pointless to use a procedure for this, a function would be better.
  • Only callable with POST. There's no volatility(STABLE, IMMUTABLE) marker on procedures.

  • The search_path restriction doesn't seem necessary.

  • No filters will be applied on procedures

  • Since it's not really possible to apply transformations on inputs and outputs(as shown above), we should only allow a json input and output for starters. Later on this could be extended to other types like text or bytea. This way the implementation is simpler and also will tie in later with custom media types as discussed on Add pgrst.accept setting to RPC #1582.

-- we would recognize "request.body" and "response.body" as input and output
CREATE PROCEDURE hello("request.body" json, inout "response.body" json default null) AS $$
BEGIN
  -- role switch here...
  select json_build_object('hello', "request.body"->'name') into "response.body";
END$$
LANGUAGE PLPGSQL
SET role = 'authenticator';

-- later on: GET /rpc/hello "Content-Type: text/plain" "Accept: text/plain"
-- CREATE PROCEDURE hello("request.body" text, inout "response.body" text default null)
-- after #1582, custom media types could be done as
-- CREATE PROCEDURE hello("request.body" bytea, inout "response.body" "application/custom-mime" default null)

call hello("request.body" := '{"name": "world"}');
    response.body
---------------------
 {"hello" : "world"}

(inout is used in this example because for some reason out parameters are mandatory when using CALL)

Edit: out params are only supported from pg 14(ref)


Notes:

  • CALL is not preparable.
PREPARE fooplan (json, json) AS CALL test.hello($1, $2);
ERROR:  syntax error at or near "CALL"
  • Function and procedure clash on naming
create or replace procedure test.hello("request.body" json, inout "response.body" json)
language plpgsql as $$
begin
  select json_build_object('hello', current_setting('request.body', true)) into "response.body";
end$$;

create  function test.hello(json, json) returns void as $$ select 1 $$ language sql;
ERROR:  function "hello" already exists with same argument types
  • A procedure can be overloaded and its peers can be regular functions
create  function test.hello(json, json, json) returns void as $$ select 1 $$ language sql;
CREATE FUNCTION

@steve-chavez
Copy link
Member

Controlling the isolation level inside procedures is a bit weird:

CREATE OR REPLACE PROCEDURE hello(inout "response.body" json default null)
LANGUAGE PLPGSQL AS $$
begin
  commit;
  set transaction isolation level repeatable read;
  select json_build_object('hello', 'world') into "response.body";
END$$;

The first commit must be done, otherwise pg will err. See https://dba.stackexchange.com/a/289339/158470 - btw, the default_transaction_isolation method mentioned in that SO link could also be used to set isolation level on functions, as mentioned on #2468 (comment).

@steve-chavez
Copy link
Member

response.headers and response.status can also be used.

These might be a problem, since we would depend on the order of the parameters for decoding the response in Haskell(no way to change this with CALL). The safest way would be to have a custom type like:

CREATE TYPE rest_response as (body json, headers json, status text);

CREATE OR REPLACE PROCEDURE hello("request.body" json, OUT res rest_response)

But then we'd have the problem that the json body is inside the type, so custom media types couldn't be used later(unless we create a new wrapper type for each one, which would unwieldy).

So we would need an additional OUT parameter for headers/status, it can be a type or we could also group them both in a single JSON. I'll leave this part for a later enhancement.

@steve-chavez
Copy link
Member

steve-chavez commented Apr 11, 2023

authenticator would need GRANT EXECUTE on the procedure. Makes sense as we're basically conceding PostgREST authentication flow and transaction settings to the user.

The above one is bad. So I've been thinking about another way. How about if we just use session-level variables? Basically for every session that hits /rpc/pg11_procedure:

SET role TO <jwt_role>;
SET request.jwt.claims TO <claims>;
CALL myprocedure();
RESET ALL;

https://www.postgresql.org/docs/current/sql-reset.html

Using current_setting("request.jwt.claims", true) inside myprocedure works fine too.

If we always set the role at the start of the session, the RESET ALL might not be needeed.


No use doing RESET ALL since the proc CALL might fail and the next statement won't run. It'd be better to ensure the state is always right with the first bunch of SET.

@steve-chavez
Copy link
Member

SET role TO <jwt_role>;
SET request.jwt.claims TO ;

So far the above seems to work fine.

In an attempt to not limit the function signature to res OUT json I tried to transform the procedure output in SQL, by putting the CALL results into a session-level setting:

-- having
-- create or replace procedure test.mult_them(a int, b int, c int, res out int)
-- language plpgsql as $$
-- begin
--   select a*b*c into res;
-- end$$;

do $$
declare res int;
begin
  call test.add_them(1, 2, 3, res);
  perform set_config('response.body', res::text, false);
exception
  when others then
    reset all; -- ensure the session-level settings are reset
    raise; -- reraise the exception so the error is reported to the client
end $$;

But since the anonymous DO block starts a transaction, the procedure cannot use transaction control statements inside(fails with "invalid transaction termination").. defeating the original purpose.

create or replace procedure test.hello("request.body" json, inout "response.body" json)

So it's not possible to transform the output in SQL and the above restriction on signature must remain.

I think transforming the procs params in SQL to JSON might be doable with a session-level setting but since the body can be big, storing that result in a session-level setting could be trouble.

I think this is still better as no authenticator privilege will be required for the procedure nor switching roles manually.

Will add some tests to make sure the session-level settings don't conflict with the transaction-level ones.

@steve-chavez
Copy link
Member

create or replace procedure test.hello("request.body" json, inout "response.body" json)
So it's not possible to transform the output in SQL and the above restriction on signature must remain.

By restricting the proc signature, I'm sure we'll cause a lot of confusion for users - increasing support burden.

Also while experimenting with procedures, me and @laurenceisla found out that they don't support using SAVEPOINTs(ref).

Doing a ROLLBACK inside can be accomplished by doing RAISE on a function, as we have recommended for a long time now.

One thing that is unique about store procedures is setting the transaction isolation level, but this can be supported on functions with #2755 (by setting default_transaction_isolation, the same way is done on procedures).

For now I'll stop working on this as functions seem to be a fine replacement.

@foxydevloper
Copy link

foxydevloper commented Dec 1, 2023

I agree that considering the trade-offs, procedure support isn't a good fit for postgrest. I was originally hoping for support since I had a procedure that doesn't return anything, but I only just realized functions can return void, and it gives a proper 204 no content response, which is all I wanted procedures for. There should definitely be an error message if you try calling a procedure with /rpc/ syntax though because for a long time I was very confused as to why my /rpc/ call to a procedure wasn't working, since all it said was "couldn't find in schema cache".
The documentation should also mention this. The fact it's called "stored procedures" is incredibly confusing when "procedures" in postgres don't work.

@MyProxyPass
Copy link

Adding my opinion here. When browsing through the admittedly basic procedure examples above, it looked like all of those were simply selecting from a table. I DO think there is a use case for allowing Postgrest to execute stored procedures especially in the case of upserts. If the purpose of Postgrest is to simplify and expedite the data interaction with the database -- I think it would be helpful to take that extra step and allow conditional inserts or potential transformations to occur in that layer. Using Postgrest, how would one pass values from the API to the database engine and either insert a new record if some key doesn't exist or update a value in an existing row if they key did? The API shouldn't need to know this and should pass the information to the database and let it decide how to handle the scenario (of which, a stored procedure is an ideal place to put that decision making logic).

@steve-chavez
Copy link
Member

@MyProxyPass Hm, do you think that only SQL stored procedures are able to execute writes? SQL functions can do those just fine.

(Though as you point out, maybe we need a clear example of that in the docs)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement a feature, ready for implementation
Development

No branches or pull requests

7 participants