Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PL/pgSQL’s transaction control model disallows critical use cases #2464

Open
bllewell opened this issue Sep 30, 2019 · 4 comments
Open

PL/pgSQL’s transaction control model disallows critical use cases #2464

bllewell opened this issue Sep 30, 2019 · 4 comments
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue

Comments

@bllewell
Copy link
Contributor

bllewell commented Sep 30, 2019

Jira Link: DB-1962
Note

YugabyteDB re-uses the source code that implements the “upper half” of PostgreSQL Version 11.2. See HERE. This means that the problem that this PostgreSQL issue tracks affects users of YugabyteDB too.

Transaction Management in PL/pgSQL

The official doc on the topic (see Chapter 43.8. Transaction Management) says absolutely nothing to allow the user to expect the problem described here. Rather, it implies that there will be no such problem.


Correction added by bllewell on Wednesday 2-Oct-2019

The very last line of the doc at the link I gave says this

A transaction cannot be ended inside a block with exception handlers.


It shows a small sample PL/pgSQL procedure that executes an “insert” statement in a loop and alternatingly executes “commit” or “rollback” inside the loop. And it mentions no restrictions.

However, if a PL/pgSQL procedure has a block statement with an exception section (this might be the procedure’s defining block, or an inner block) like this:

create or replace procedure p()
  language plpgsql
as $$
begin
  rollback;
  begin
    insert into t(k, v) values(1, 42);
    commit;
  exception when others then
    raise;
  end;
end
$$;

then that block’s executable section must not issue “commit”. Doing so causes this run-time error:

cannot commit while a subtransaction is active

Googling for this message finds nothing useful in the PostgreSQL docs but it does find THIS DISCUSSION on Stack Overflow. This attributes the restriction to the fact that entry into the executable section of a block statement that has an exception section implicitly creates an anonymous savepoint and entry into a handler in the exception section implicitly rolls back to that anonymous savepoint. It seems to imply that there’s a laws-of-physics limitation at the root of it. But this is bogus. Oracle Database is happy to let you commit in the executable section of a PL/SQL block statement that has an exception section. So it strikes me that the PostgreSQL behavior is, rather, an emergent property of an unfortunate implementation.

The limitation is hugely problematic. Here are two example use cases that suffer.

Use Case One

A general principle of proper practice insists that the management of all the SQL statements that implement an application’s database backend must be encapsulated behind a hard-shell stored procedure API—which API (by suitable use of privileges) ensures that the users, as which client-side programs connect, cannot access any tables or views. An extension to this insists that no “raw” database error should ever escape the hard shell because the texts of such errors reveal names of schemas, tables, constraints, procedures, and so on. And hackers are helped by knowing such names. For example, hackers provoke errors by submitting “bad” input in order to discover exploitable SQL injection vulnerabilities.

The practice to counter this has an “others” handler in every procedure among the set that jointly defines the database API that logs details about the error to an incident table and gets back the new unique incident ID. This is then given back to the client-side code, for example by encoding the number into the text of a user-defined exception like this:

“Unexpected error. Please note incident number 421742 and contact support@acme.com with this number.”

However, some errors occur at “commit” time. An obvious example is when a foreign key constraint is defined with the “deferrable initially deferred” option and a DML violates the constraint.

This means that the API-defining procedures must issue “commit” in their executable sections and handle any exceptions that the commit causes in their exception sections. This, then, extends further to any unexpected exception that bubbles up from called procedures that jointly implement the API’s functionality.

Use Case Two

The correct semantics of some transactions are guaranteed only when these are executed at the serializable isolation level. (This is why the level exists.) This implies that serialization errors will occur. And the occurrence of such an error implies that the transaction must be retried. (This is what the error text says.) Of course, to honor the spirit of encapsulation, the retry loop must be programmed in PL/pgSQL behind the API’s hard shell. However, serialization errors often occur only at “commit” time. This, too, implies the need for a block statement whose executable section issues “commit” and whose exception section has a handler for the named serialization error exception that will signal the need for retry.

Summary

The limitations of PL/pgSQL’s transaction control model are so severe that stored procedures in PostgreSQL simply cannot be used for their intended purpose.

Appendix

Here is a code illustration of Use Case One (“deferrable initially deferred” foreign key). Create the tables, thus—initially without the “deferrable initially deferred” option—and insert a master row:

create table masters(
  mk int
  constraint masters_mk_check check(mk > 0)
  constraint masters_pk primary key,

  v varchar(10)
    constraint masters_v_nn not null
    constraint masters_v_unq unique);

create table details(
  mk int
  generated by default as identity,

  dk int
  generated always as identity,

  v varchar(20)
  constraint details_v_nn not null,

  constraint details_pk primary key(mk, dk),

  constraint details_fk foreign key(mk)
    references masters(mk)
    match full
    on delete cascade
    on update restrict
  );

insert into masters(mk, v) values(1, 'Mary');

Create the procedure that EITHER catches the expected error and sends back a sanitized version OR catches all unexpected errors, logs the report, and sends back an anodyne message:

create or replace procedure insert_detail(new_mk in int, new_v in varchar)
  language plpgsql as
$$
begin
  begin
    insert into details(mk, v) values(new_mk, new_v);
  exception
    when foreign_key_violation then
      raise info
        'the new detail record has no master record'
        using errcode = '88888'; 

    when others then
      -- Log the full details of the error to the incident table.
      -- Get back the incident ID, say 421742.
      raise info
        'Unexpected error. Indident ID: %', 421742
        using errcode = '99999'; 
  end;
end;
$$;

Test it first by inserting a detail for the existing master:

call insert_detail(1, 'hammer');

Now test it by trying to create a masterless detail:

call insert_detail(2, 'saw');

We see this message:

INFO:  88888: the new detail record has no master record

as intended.

Now alter the FK constraint:

alter table details
  alter constraint details_fk
  deferrable initially deferred;

and try again to insert the masterless detail. Now we see this:

ERROR:  23503: insert or update on table "details" violates foreign key constraint "details_fk"
DETAIL:  Key (mk)=(2) is not present in table "masters".
SCHEMA NAME:  public
TABLE NAME:  details
CONSTRAINT NAME:  details_fk
LOCATION:  ri_ReportViolation, ri_triggers.c:2776

So our attempt to secure critical names has failed, and the hacker has some very useful facts.

As explained, the reason for this outcome is that the error is caused by "commit". But "commit" is illegal in the block statement's "executable" section because it has an "exception" section.

@bllewell bllewell assigned bllewell and unassigned bllewell Sep 30, 2019
@ddorian ddorian added the area/ysql Yugabyte SQL (YSQL) label Sep 30, 2019
@wishdev
Copy link

wishdev commented Sep 30, 2019

The doc you pointed to clearly says the following as the last line

"A transaction cannot be ended inside a block with exception handlers."

Rightly or wrongly - it is clearly noted.

@bllewell
Copy link
Contributor Author

bllewell commented Oct 1, 2019

Thanks, John Higgins. I'm embarrassed to say that I missed that sentence. I see it now. So I withdraw the part of the "correction" that I submitted. Of course, I'm still left frustrated with respect to finding a way to implement solutions for the use cases I've described.

@bllewell
Copy link
Contributor Author

bllewell commented Oct 1, 2019

I got an email from Christophe Pettus. It started with this:

This is documented; it's the very last line of the page you reference in the Github issue:

A transaction cannot be ended inside a block with exception handlers.

Yes, just as John Higgins pointed out. I'm embarrassed.

Christophe's email continued thus:

Your discussion doesn't answer the specific issue: BEGIN/EXCEPTION/END in pl/pgSQL is implemented by savepoints. What semantics should COMMIT / ROLLBACK have inside of that? Doing a COMMIT / ROLLBACK (at the database level) at that point would lose the savepoints, which would break the BEGIN/EXCEPTION/END semantics.

It's not clear to me what the alternative semantics would be. Can you propose specific database behavior for a COMMIT or ROLLBACK inside a BEGIN/EXCEPTION/END block which retain the savepoint behavior of BEGIN/EXCEPTION/END?

Oracle Database implements the semantics that I need. I downloaded the freely available XE edition and implemented this:

create table masters(
  mk int
  constraint masters_mk_check check(mk > 0)
  constraint masters_pk primary key,

  v varchar(10)
    constraint masters_v_nn not null
    constraint masters_v_unq unique);

create table details(
  mk int,

  dk int
  generated always as identity,

  v varchar(20)
  constraint details_v_nn not null,

  constraint details_pk primary key(mk, dk),

  constraint details_fk foreign key(mk)
    references masters(mk)
    on delete cascade
    deferrable initially deferred
  );

These "create table" statements are simpler that what I used in PostgreSQL (above)—and they work there too. They're sufficient to support the two procedures (below) that demonstrate the semantics that I want.

Here's the first:

create or replace procedure insert_detail(new_mk in int, new_v in varchar2) is
  foreign_key_violation exception;
  pragma exception_init(foreign_key_violation, -02291);
  error_on_commit exception;
  pragma exception_init(error_on_commit, -02091);
begin
  begin
    insert into details(mk, v) values(new_mk, new_v);
    commit;
  exception
    when foreign_key_violation then
      DBMS_Output.Put_Line(
        'the new detail record has no master record');

    when error_on_commit then
      if sqlerrm() like '%ORA-02291%' then
        DBMS_Output.Put_Line(
          'error_on_commit: foreign_key_violation');
      else
        -- Log the full details of the error to the incident table.
        -- Get back the incident ID, say 421742.
        DBMS_Output.Put_Line(
          'Unexpected error. Incident ID: '||421742);
      end if;

    when others then
      -- Log the full details of the error to the incident table.
      -- Get back the incident ID, say 421742.
      DBMS_Output.Put_Line(
        'Unexpected error. Incident ID: '||421742);
  end;
end;
/

It has some Oracle idioms—but the body is pretty much what you'd write in PL/pgSQL. (DBMS_Output.Put_Line() is a reasonable substitute for "raise info". This favors the use case where a procedure implements a business txn that must succeed, or fail, atomically. Here, the "savepoint" notion is semantically wrong. I like to write "rollback" in each handler for its self-doc value.

And here's the second:

create or replace procedure p is
  value_too_large_for_column exception;
  pragma exception_init(value_too_large_for_column, -12899);
begin
  rollback;
  for j in 1..15 loop
    begin
      insert into masters(mk, v) values(j, rpad('x', j, 'x'));
    exception
       when value_too_large_for_column then
        DBMS_Output.Put_Line('j = '||to_char(j)||' caught value_too_large_for_column');
    end;
  end loop;
  commit;
end;
/

Executing it (after deleting from "masters" and committing) produces this output:

j = 11 caught value_too_large_for_column
j = 12 caught value_too_large_for_column
j = 13 caught value_too_large_for_column
j = 14 caught value_too_large_for_column
j = 15 caught value_too_large_for_column

And selecting from "masters" shows this:

MK V
-- ----------
 1 x
 2 xx
 3 xxx
 4 xxxx
 5 xxxxx
 6 xxxxxx
 7 xxxxxxx
 8 xxxxxxxx
 9 xxxxxxxxx
10 xxxxxxxxxx

We see that the failed statement execution that brings you to a hander (and only that) is rolled back by the Oracle system. Clearly there's an anonymous savepoint notion at work here, because the effect of statement executions that don't fail survives.

This seems to be what the PostgreSQL implementation seeks to achieve. But the implementation brings the penalty that I've described that nullifies the general value of stored procedures.

The key point is that, using Oracle Database, I am allowed to "commit" in a block's executable section and it's exactly this that I must do to be able to handle an exception that's caused by the "commit".

@bllewell
Copy link
Contributor Author

I also discovered—and this is documented HERE—that “A SECURITY DEFINER procedure cannot execute transaction control statements.” There seems to be no workaround. The effect of this is to disallow yet another critical use case.

A time-honored principle of good practice—that implements the principle of least privilege—uses at least four owner-and-schema pairs:

Owner-schema data: owns the tables, indexes, constraints, and triggers. data grants “select”, “insert”, “update”, and “delete” on its tables to code.

Owner-schema code owns the substantive PG/plSQL code. And every function and procedure pwned by code is security definer. code grants “execute” on exactly and only the subprograms that express the database API to owner-schema api. In other words, api cannot execute “helper” subprograms.

Owner-schema api owns “jacket” PG/plSQL subprograms that directly invoke exactly and only the code-owned subprograms that express the database API. api grants “execute” on these to client. Again, every function and procedure pwned by api is security definer

Owner client has no schema (and so cannot own objects) and is the only principal whose “connect” credentials are known to outside-of the database clients. The default schema for client is set to api.

In such a regime, \df — when connected as client — lists exactly and only the subprograms that express the database API. Other listing commands show nothing at all of how the app is implemented. This perfectly meets the definition of the principle of least privilege.

With some caveats PL/pgSQL in PostgreSQL v 11.2 does let you establish this regime.

However, there’s more.

Another time-honored security principle defines what to do when an unexpected error occurs. (An unexpected error, by definition, is one that can be caught only by an “others” handler.) It says that when an unexpected error occurs, no information must be leaked outside of the database? Why? Simple — ‘cos such information (names of schemas, tables, constraints, indexes, procedures, functions, lacking privileges for user x on object y.z, and so on) are gold-dust for “professional” hackers. This principle is easily honored using databases other than PostgreSQL by writing an “others” hander in every API-defining subprogram that assembles all the facts about the error into a report and commits these to a dedicated “incidents” table.

The net effect of the various restrictions that PL/pgSQL imposes on transaction control prevents the implementation of this scheme.

@yugabyte-ci yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue labels Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug priority/medium Medium priority issue
Projects
Status: No status
Development

No branches or pull requests

4 participants