Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PostgREST watch mode #1512

Closed
steve-chavez opened this issue Apr 30, 2020 · 9 comments · Fixed by #1542
Closed

PostgREST watch mode #1512

steve-chavez opened this issue Apr 30, 2020 · 9 comments · Fixed by #1542
Labels
idea Needs of discussion to become an enhancement, not ready for implementation

Comments

@steve-chavez
Copy link
Member

The idea is to provide a watch: true config option that reloads the schema cache without the need for restarting or sending a SIGUSR1 to pgrst from another process. This would make it more ergonomic for development.

Since our schema cache queries are fast (take 400 ms on a big schema) I think we can just re-run the queries on every request. It could be possible that some requests respond with a 503 while pgrst is reloading. In this case the request would have to be retried. IIRC postgrest sends a message asking the user to retry as well. So it seems acceptable to behave this way.

watch would print a big warning telling it should not be used for production.

@steve-chavez steve-chavez added the idea Needs of discussion to become an enhancement, not ready for implementation label Apr 30, 2020
@theonewolf
Copy link

@steve-chavez is there any way we could signal Postgrest from within PostgreSQL? That way we could tell it to reload for a period of time, but not for every request. Maybe this was your idea though (I wasn't sure where this config would live).

Huge thumbs up on this idea. I'm testing Postgrest in environments where I can't easily send it a SIGUSR1---things like Google's Cloud Run (managed containers).

It's amazing how far I can get with the standard Postgrest container, but I can't easily update my live instances when schemas change in the DB. Rare, but it's still a minor issue.

@steve-chavez
Copy link
Member Author

steve-chavez commented May 4, 2020

@theonewolf One possible solution would be to make postgrest LISTEN on a channel. Then from psql:

NOTIFY postgrest;

And this would reload the schema cache.

That could be offered in addition to the unix signals. It seems possible in principle, I believe the authenticator role doesn't need any extra privilege for that.

We could use https://github.com/diogob/hasql-notifications for this.

steve-chavez added a commit to steve-chavez/postgrest that referenced this issue Jun 10, 2020
Fixes PostgREST#1512

Helps on environments where you can't send unix signals(Windows, managed
containers). Also provides better UX for schema reloads - NOTIFY
can be send from pg clients(psql, pgadmin).

There's limitation for now, if the LISTEN connection is lost the
connection won't be retried.
steve-chavez added a commit to steve-chavez/postgrest that referenced this issue Jun 10, 2020
Fixes PostgREST#1512

Helps on environments where you can't send unix signals(Windows, managed
containers). Also provides better UX for schema reloads - NOTIFY
can be send from pg clients(psql, pgadmin).

There's limitation for now, if the LISTEN connection is lost the
connection won't be retried.
steve-chavez added a commit to steve-chavez/postgrest that referenced this issue Jun 10, 2020
Fixes PostgREST#1512

Helps on environments where you can't send unix signals(Windows, managed
containers). Also provides better UX for schema reloads - NOTIFY
can be send from pg clients(psql, pgadmin).

By default, `NOTIFY pgrst` - with no payload - should be done to reload
the schema cache. Notifications with a payload will be ignored.
The channel name can be configured with `db-channel`.

There's limitation for now, if the LISTEN connection is lost the
connection won't be retried.
@steve-chavez
Copy link
Member Author

Now that I've been playing with the new NOTIFY schema reload, I've noticed that the original issue of a "watch mode" can be solved by setting an EVENT TRIGGER:

CREATE OR REPLACE FUNCTION pgrst_watch() RETURNS event_trigger AS $$
BEGIN
  NOTIFY pgrst;
END;
$$ LANGUAGE plpgsql;

-- enable the watch mode
CREATE EVENT TRIGGER pgrst_watch ON ddl_command_end EXECUTE PROCEDURE pgrst_watch();
                                                                                     
-- disable the watch mode
DROP EVENT TRIGGER pgrst_watch;

This is something we can recommend on the docs.

@theonewolf
Copy link

I'll check this out again, thanks for following up on this thread. I have dev time for this on Saturday.

@ruslantalpa
Copy link
Contributor

before this can be recommended (and it is the right path to implement watch/autoreload) note that ddl_command_end is triggered for every ddl statement which means that, in a production setting for example, when you run a migration, this will trigger "reload" a lot of times (which is expensive). This should be implemented similar to how pgrst handles HUP events, if one is in progress, the ones that follow are ignored, + the reaction to the first one should have a few ms delay.

@steve-chavez
Copy link
Member Author

@theonewolf Cool! You can use "docker run stevechavez/postgrest:latest" to test it. Would be great to know if the NOTIFY reload is indeed working on Google Cloud Run.

@steve-chavez
Copy link
Member Author

@ruslantalpa Great idea! I originally thought this mode would be for development only, but as you mention it could work in production as well. And with this postgrest would be more "set it and forget it" :)

if one is in progress, the ones that follow are ignored + the reaction to the first one should have a few ms delay.

Right now I'm using the same connectionWorker used for the SIGHUP, so this might already work. But I'll have to do some testing to be certain.

Additionally, we can also limit the event trigger. Check this function:

CREATE OR REPLACE FUNCTION pgrst_watch() RETURNS event_trigger AS $$
DECLARE
  obj record;
BEGIN
  FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
  LOOP
      RAISE NOTICE '% % object: % % % % % % % %',
                   tg_event,
                   tg_tag,
                   obj.classid,
                   obj.objid,
                   obj.objsubid,
                   obj.command_tag,
                   obj.object_type,
                   obj.schema_name,
                   obj.object_identity,
                   obj.in_extension;
  END LOOP;
  NOTIFY pgrst;
END;
$$ LANGUAGE plpgsql;

create table c(id int);
NOTICE:  00000: ddl_command_end CREATE TABLE object: 1259 1295602 0 CREATE TABLE table test test.c f
LOCATION:  exec_stmt_raise, pl_exec.c:3803
CREATE TABLE

create function minus(a int, b int) returns int as $$ select a - b; $$ language sql;
NOTICE:  00000: ddl_command_end CREATE FUNCTION object: 1255 1295606 0 CREATE FUNCTION function test test.minus(integer,integer) f
LOCATION:  exec_stmt_raise, pl_exec.c:3803
CREATE FUNCTION

With some conditionals on those attributes we could limit the events fired. Maybe skip some commands too(here's the full list). We'll need to work it out and find what would be the minimum events needed.

steve-chavez added a commit that referenced this issue Jun 25, 2020
Fixes #1512

Helps on environments where you can't send unix signals(Windows, managed
containers). Also provides better UX for schema reloads - NOTIFY
can be sent from pg clients(psql, pgadmin).

`NOTIFY pgrst` - with no payload - should be done to reload the schema cache.
Notifications with a payload will be ignored.

The channel can be enabled with `db-channel-enabled`(false by default)
and its name can be configured with `db-channel`.

The LISTEN thread uses a dedicated pg connection.
This connection is recovered if it fails.
A debounce of 1ms is done in case too many NOTIFYs arrive.
@steve-chavez
Copy link
Member Author

NOTIFY reload is merged!

I'll be working on docs and finish the pg event trigger on PostgREST/postgrest-docs#335.

@wolfgangwalther
Copy link
Member

wolfgangwalther commented Dec 11, 2020

With some conditionals on those attributes we could limit the events fired. Maybe skip some commands too(here's the full list). We'll need to work it out and find what would be the minimum events needed.

I think our best bets here are:

  • obj.object_type - this should give us a lot less values to care about than the command
  • obj.schema_name - we should probably send this as the payload to PostgREST, to decide whether we care about this. If it's in db.schemas or db.extra-search-path, yes, otherwise, no? This doesn't work for all types, some need to be checked for even when they are outside of the schemas (tables, fk constraints, because of source columns?). scratch that, I don't think it's worth it. This might lead to multiple NOTIFYs in the same transaction as well, because of different payload. Bad! Just object_type should be good!

monacoremo pushed a commit to monacoremo/postgrest that referenced this issue Jul 17, 2021
Fixes PostgREST#1512

Helps on environments where you can't send unix signals(Windows, managed
containers). Also provides better UX for schema reloads - NOTIFY
can be sent from pg clients(psql, pgadmin).

`NOTIFY pgrst` - with no payload - should be done to reload the schema cache.
Notifications with a payload will be ignored.

The channel can be enabled with `db-channel-enabled`(false by default)
and its name can be configured with `db-channel`.

The LISTEN thread uses a dedicated pg connection.
This connection is recovered if it fails.
A debounce of 1ms is done in case too many NOTIFYs arrive.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
idea Needs of discussion to become an enhancement, not ready for implementation
Development

Successfully merging a pull request may close this issue.

4 participants