Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restart required for detecting table relations on schema change #475

Closed
sscarduzio opened this issue Jan 28, 2016 · 30 comments
Closed

Restart required for detecting table relations on schema change #475

sscarduzio opened this issue Jan 28, 2016 · 30 comments
Labels
enhancement a feature, ready for implementation

Comments

@sscarduzio
Copy link
Contributor

I have an event_group table with fields group_id and event_id referencing events.id and groups.idrespectively.

The weird thing is that even if hydrating the event IDs works perfectly:

GET /event_group?select=events{*}
[
  {
    "events": {
      "doc": {
     ...
       }
  }
]

When I try to do the same with groups , I receive this error:

GET  /event_group?select=groups{*}
 {
  "hint": null,
  "details": null,
  "code": "42703",
  "message": "column groups.group_id does not exist"
}

why does it expect groups.group_id and not groups.id? I didn't need a event.event_id for events!

Here's the SQL:

CREATE TABLE "1".event_group
(
  group_id character varying(32) NOT NULL,
  event_id character varying(36),
  CONSTRAINT fk_eid FOREIGN KEY (event_id) REFERENCES "1".events (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE NO ACTION,
  CONSTRAINT fk_groups FOREIGN KEY (group_id) REFERENCES "1".groups (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT uq_gid_eid UNIQUE (event_id, group_id)
)


CREATE TABLE "1".events
(
  doc jsonb,
  id character varying(36) NOT NULL DEFAULT uuid_generate_v4(),
  CONSTRAINT pkey_eid PRIMARY KEY (id)
)

CREATE TABLE "1".groups
(
  id character varying(32) NOT NULL DEFAULT uuid_generate_v4(),
  name character varying(128) NOT NULL,
  description character varying(800),
  CONSTRAINT group_pkey PRIMARY KEY (id)
)
@begriffs
Copy link
Member

This was fixed by restarting your postgrest server, correct? If so I'll rename this issue to discuss ways for PostgREST to detect schema changes.

@sscarduzio
Copy link
Contributor Author

Restarting PostgREST has fixed my issue indeed. But honestly, before I consider this resolved, I'd love to see some notes on the documentation that makes this (probably temporary) side effect evident to the users.
Do you want to take care of this or shall I?

@begriffs
Copy link
Member

I'd love your help with the docs if you could.

sscarduzio added a commit to sscarduzio/postgrest that referenced this issue Jan 29, 2016
sscarduzio added a commit to sscarduzio/postgrest that referenced this issue Jan 29, 2016
sscarduzio added a commit to sscarduzio/postgrest that referenced this issue Jan 29, 2016
@begriffs begriffs changed the title Cannot hydrate external id column by foreign key only on one column Restart required for detecting table relations on schema change Jan 29, 2016
@begriffs
Copy link
Member

Here are the options I can see for handling schema change.

  • Don't cache knowledge of the schema
    • When a request comes in that needs to know the db structure, query for the relevant parts of the schema then and there
    • 👍 keeps postgrest stateless
    • 👍 fast propagation of schema changes to the API
    • 👎 may need to break up the consolidated schema detection code into little pieces for each type of check
    • 👎 slow and wasteful when the schema doesn't change much (which is probably the common case)
  • Refresh schema structure cache on a sysadmin signal
    • Typically in unix it is SIGHUP, but we could also listen for a database NOTIFY as well so that a stored proc could cause the schema reload
    • 👍 Only makes the sql calls when it has to
    • 👍 Probably not a whole lot to change about our code
    • 👎 You have to remember to SIGHUP just like you currently have to remember to restart, so still potentially confusing
    • 👎 Painstaking to do on a cluster of load-balanced postgrests via a unix signal, but maybe OK if they all listen for a db notify
  • Poll for changes
    • 👍 nobody has to remember to restart or send a signal
    • 👍 can be implemented in addition to the refresh signals
    • 👎 most APIs probably run for a long time without schema change so polling is wasteful
  • Refresh on ddl_command_end event trigger
    • 👍 happens for all and only the kind of schema-altering events we care about
    • 👍 available in postgres 9.3+
    • 👍 no manual signaling required, and schema changes propagate quickly to the API
    • 👎 requires the db admin to create the trigger which translates ddl_command_end into a pg_notify for which postgrest can listen. Maybe there is a way that postgrest can create such a trigger for the duration of its run?
    • 👍 supports manual refresh too by issuing a sql notify statement that mimics the one generated by our ddl event trigger

This last option is really nice. Ideally we can have postgrest set it up without any admin intervention. Any options or pros/cons I'm missing?

@eric-brechemier
Copy link
Contributor

If you number the options:

  1. No change, keep current behavior: schema knowledge is loaded when postgrest starts
  2. Don't cache knowledge of the schema
  3. Refresh schema structure cache on a sysadmin signal
  4. Poll for changes
  5. Refresh on ddl_command_end event trigger

The options 2 and 4 seem worse than current behavior, 3 and 5 look OK and could probably both be implemented.

If we go for option 5, just like for authentication, it would make sense to keep this feature out of postgrest and implement it as a separate postgrest-supervisor or something.

@begriffs
Copy link
Member

How about this, on startup have postgrest run this sql:

CREATE OR REPLACE FUNCTION public.notify_ddl_postgrest()
  RETURNS event_trigger
 LANGUAGE plpgsql
  AS $$
BEGIN
  NOTIFY ddl_command_end;
END;
$$;

DROP EVENT TRIGGER IF EXISTS ddl_postgrest;
CREATE EVENT TRIGGER ddl_postgrest ON ddl_command_end
   EXECUTE PROCEDURE public.notify_ddl_postgrest();

Then any schema change will notify on channel ddl_command_end. Also an admin can call public.notify_ddl_postgrest to manually tell postgrest to reload the schema. If the function is available with the appropriate permissions then you could access it over http at /rpc/notify_ddl_postgrest.

The missing piece is that postgrest needs to be able to listen for changes. @nikita-volkov would it be feasible to add support for listen in hasql? Something similar to this?

@nikita-volkov
Copy link

@begriffs There actually is a PR from Leon Smith with that: https://github.com/nikita-volkov/hasql/pull/43/files . I just need to adapt it a little better.

@sscarduzio
Copy link
Contributor Author

I think PostgREST produces enough value to be a tad more "invasive" in terms of integration.

I mean, when one wants to use PostgREST, he/she may be fine with the fact this component will require an "installation phase" in which it (automatically) creates triggers, tables and whatever else it needs to provide all these non trivial features.

I imagine the UX would be something along the lines of:

$ postgrest postgres://postgres@127.0.0.1/somedb -a postgres -s "1"
WARNING: PostgREST is not installed in "somedb". Some features might not work as expected.
SOLUTION: restart using the --install option

This way, we can let people opt out the custom triggers and tables if they don't use the advanced features (user management, schema auto-refresh, etc.). But we'd offer a turn-key setup process for the rest of the users (as opposed to "if you want user management, follow this howto..").

@begriffs
Copy link
Member

@sscarduzio I agree, in fact for this feature we just need to create a proc and a trigger so maybe postgrest should just do it without asking. The trigger shouldn't add much of a burden on anyone's db because it triggers only on DDL changes.

@ruslantalpa
Copy link
Contributor

I dont agree :) in order for this to work, the authenticator needs to be granted high privileges, and for what? Compromise security for a small convinience?

@begriffs
Copy link
Member

begriffs commented Feb 1, 2016

Does it take high privileges to create a function and a trigger?

@ruslantalpa
Copy link
Contributor

I would consider the ability to create procedures and triggers "high privileges".
Why not simply implement (3) and let some other script do the refresh in whatever way it chooses to.
In production there is no "random" schema change, usually it's done using CI so why not allow the ability to issue a HUP at the end of the deployment script?

@eric-brechemier
Copy link
Contributor

In production there is no "random" schema change, usually it's done using CI so why not allow the ability to issue a HUP at the end of the deployment script?

Exactly my thought after I posted my previous comment. For example, I am using git to deploy new versions on a server, and the repository includes also the nginx configuration of the virtual host and the sqitch folder with SQL scripts.

I currently run nginx -s reload after the git push to reload nginx configuration, I would simply add service reload postgrest or something similar to reload postgrest configuration after deploying database changes with sqitch deploy.

@sscarduzio
Copy link
Contributor Author

Well, plot twist: there's not only "production" that counts.
I'm creating tables as I progress with the user stories: postgres - the stateful layer of my architecture -keeps the pace with me (no need to restart), why should my development cycle include restarting a supposedly "stateless" layer? It's really awkward, looks like we're back in Tomcat times.

@eric-brechemier
Copy link
Contributor

Well, plot twist: there's not only "production" that counts.

Also agreed, but development uses extra tools, not installed on the server.

I am still suggesting to have a separate, officially sanctioned executable to refresh postgrest config at regular intervals or using elevated privileges to detect schema changes during development.

@sscarduzio
Copy link
Contributor Author

so you suggest I have a while true SIGHUP sleep 1 in my dev env? Is it what you'd propose for tomcat as well?

@ruslantalpa
Copy link
Contributor

In your dev env you create those triggers then listen to notify events with a 10 line script and restart postgrest

@eric-brechemier
Copy link
Contributor

My memories of Tomcat days are slowly fading away, but afaik using a separate process to monitor (e.g. file) changes to reload the server is still pretty much current state of the art:

or more generally

@sscarduzio
Copy link
Contributor Author

I still don't understand why it's ok for developers if the developer experience is crap. We have the solution in hand to make this a pleasant thing to use, why pulling back?

If you don't want postgrest to retain CREATE permissions, just don't give them. Postgrest will print a warning (as I showed) that schema won't be refreshed.

@davidkuep
Copy link

New to this - so not sure if this is a silly option- but why not make the notify/advanced features a separate extension for postgres itself?

It could be installed separately in its own schema-

  1. the authenticator user wouldn't need any special permissions as there could still just be listen/notify sql code that resides in the postgrest-extension schema
  2. Said code could then notify with the name of schemas that have been refreshed and the postgrest instance would simply have to know what schema it was running and only update if its name was called
  3. If the postgrest extension isn't installed everything would still work, you just won't get the advanced features and dropping the postgrest extension would drop the advanced features etc. ]

Anyway, there are probably cons here that I'm not considering, but just wondering why it's not a part of the discussion.

@begriffs
Copy link
Member

begriffs commented Apr 3, 2016

OK how about this: postgrest listens for both SIGHUP and a specially named SQL event and reloads the schema in either situation. In production you would send the server a signal, and in development you create an event trigger on ddl_command_end and have it notify postgrest through the sql event.

In development there would be a snippet of sql that the developer would have to run once in their database which would enable postgrest auto schema change detection.

@begriffs begriffs added the enhancement a feature, ready for implementation label Apr 3, 2016
@begriffs
Copy link
Member

Look what's now available! https://github.com/cocreature/hasql-notifications

@matthewmueller
Copy link

Has this been resolved? Seems like if your DB changes, you'll end up with some failed API requests until postgrest has been restarted. Are there any plans to implement the auto-refresh via notifications?

@begriffs
Copy link
Member

begriffs commented Nov 12, 2016

It hasn't been resolved in full generality, but postgrest does listen for SIGHUP nowadays which will cause it to reload the schema. It's less intrusive than having to restart the server.

@matthewmueller
Copy link

Ah gotcha, so the idea would be that you SSH into your container / dyno and trigger a SIGHUP on that process?

@begriffs
Copy link
Member

For now yeah. Having postgrest LISTEN for a specially named event seems feasible for v0.4.1. Once that is implemented we can add a section in the docs about hooking ddl_command_end up to NOTIFY that event for immediate postgrest schema reloads on DDL changes.

@matthewmueller
Copy link

Got it, thanks for your help!

@begriffs
Copy link
Member

I followed up on the hasql-notifications library which seems to have run into a race condition bug last year. We'll see if that gets resolved, and then it'll be pretty easy to finally get this issue closed.

@jmealo
Copy link
Contributor

jmealo commented May 31, 2018

@begriffs: JarvusInnovations/lapidus#19

It seems as though it'd be trivial to create a tiny sidecar application that sends SIGHUP to PostgREST if the Haskell library isn't ready yet and just suggest its use in the documentation.

Go:
http://coussej.github.io/2015/09/15/Listening-to-generic-JSON-notifications-from-PostgreSQL-in-Go/

Rust:
sfackler/rust-postgres@bb837bd

Elixr/Erlang:
https://github.com/elixir-ecto/postgrex/blob/v0.13.3/lib/postgrex/notifications.ex#L78

I think it may be possible to do this with a shell script assuming the user is providing credentials using something libpq uses. I'm not seeing any examples though.

@swuecho
Copy link
Contributor

swuecho commented Jun 15, 2019

what is about the part " a specially named SQL event and reloads the schema"? I saw #570 implement the SIGNUP part.

for my use case, postgrest run in docker. I run test in another docker. send a signal from one container to another is possible I guess, but not interesting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement a feature, ready for implementation
Development

No branches or pull requests

9 participants