-
-
Notifications
You must be signed in to change notification settings - Fork 297
Implement dynamic background worker registration #710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement dynamic background worker registration #710
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh cool. This looks interesting.
A test confirming this works would be fantastic!
This also changes termination API for background workers a little bit, enforcing shutdown-scoped limitations on the functionality available.
|
@workingjubilee I've added the first draft of the test; please let me know what you think. |
It's not really doing anything as it hooks into postgres initialization Solution: remove it and put a note in comments
|
@workingjubilee I've updated this PR to be based on the latest I've considered changing the API for the startup hook to make it impossible to use dynamic registration if the hook is provided, but for now, I decided to keep it as is unless you want me to. Something like: pub struct BackgroundWorkerBuilder<const Dynamic: bool = true> { .... }
pub fn enable_shmem_access(mut self: Self) -> Self { ... }
pub fn shmem_startup_hook(mut self: Self, unsafe extern "C" fn()) -> BackgroundWorkerBuilder<false> { ... }
impl BackgroundWorkerBuilder<true> {
pub fn load_dynamic(self: Self) -> BackgroundWorkerHandle { ... }
} |
|
@workingjubilee @eeeebbbbrrrr I've updated the documentation as requested. |
This is rather unpleasant behaviour as it is unrecoverable. Solution: check if the PID was correctly configured and return an error if it wasn't
|
@eeeebbbbrrrr I've replaced the blocking on |
It now properly reflects the fact that dynamic background workers are supported.
Solution: document it
Solution: remove it
|
I think this looks good now. Let’s have @workingjubilee confirm next week before we merge. Thanks for the work, @yrashk! |
|
Anything else I can do here to make this ready to be merged? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fine to me, but there might be something I'm missing, so it's probably better if @workingjubilee takes a look
In some cases, it is not desirable for the builder to load the background worker. For example, when actual loading is handled by some external code that uses this structure from the C API. Solution: make `BackgroundWorkerBuilder::pg_sys_worker` public
7bb5fe9 to
48a9151
Compare
Its naming and scoping is problematic, it's not obvious what it does. Solution: convert it into an `Into` trait implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, the presence of tests makes this much easier to review. I have a few nits that aren't terribly important, but I also have a number of design questions about how this is being handled and some edge-case-related concerns before this can be merged. However it looks like we're getting close.
| let worker = BackgroundWorkerBuilder::new("dynamic_bgworker") | ||
| .set_library("pgx_tests") | ||
| .set_function("bgworker") | ||
| .set_argument(123i32.into_datum()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...wait, what happens if this argument is None? I realize this is almost certainly going to be a case of severe programmer error, but we should test the None case as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My assumption here was that it was up to the worker to handle the argument correctly. What exactly do you suggest to test? Maybe I missing your train of thought.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I mostly mean that this basically goes Option<Datum> -> NullableDatum -> Option<Datum> (via Postgres) and we should make sure, as simple as it sounds, that if we say the Datum is "SQL NULL" (i.e. None) that it also becomes a None on the other side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
( I have recently become suspicious of these kinds of transitions after finding out that sometimes Postgres will pass values that are basically nullptr_t, true and so I am wondering if there are other hijinx in store. )
I am also kind of wondering what happens if a BackgroundWorker, er, fucks up massively, and if PGX can actually recover from that with our current API surface.
| pg_sys::BgwHandleStatus_BGWH_NOT_YET_STARTED => BackgroundWorkerStatus::NotYetStarted, | ||
| pg_sys::BgwHandleStatus_BGWH_STOPPED => BackgroundWorkerStatus::Stopped, | ||
| pg_sys::BgwHandleStatus_BGWH_POSTMASTER_DIED => BackgroundWorkerStatus::PostmasterDied, | ||
| _ => unreachable!(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| _ => unreachable!(), | |
| _ => unreachable!("postgres returned unknown BgwHandleStatus"), |
pgx/src/bgworkers.rs
Outdated
| /// | ||
| /// Requires `BackgroundWorkerBuilder.bgw_notify_pid` to be set to `pg_sys::MyProcPid`, otherwise it'll | ||
| /// return [`BackgroundWorkerStatus::Untracked`] error | ||
| pub fn wait_for_shutdown(&self) -> Result<(), BackgroundWorkerStatus> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is erroneous to do anything but reap the worker. We should enforce that.
| pub fn wait_for_shutdown(&self) -> Result<(), BackgroundWorkerStatus> { | |
| pub fn wait_for_shutdown(self) -> Result<(), BackgroundWorkerStatus> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% sure this would be the correct handling. Waiting for an event that may not succeed shouldn't necessarily reap the handle, should it? I agree that this is rather a corner case, but let's say it returns, um, Untracked error status, and you still have the pid to work with (of course, you can save it beforehand, but I am just trying to explore the idea why do we need to reap it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. If it was Untracked, then what valid, safe actions remain with that pid that require it to remain attached to its source TerminatingBackgroundWorker type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going outside of PG's envelope, sending a signal to that PID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. Isn't that fine though because that means we're doing another form of IPC then (and quite possibly unsafe code! or at least code that isn't in PGX) and we still have dispensed with the idea of it being some kind of BackgroundWorker, we're just trying to murder what we now see as a rogue process with whatever sky-lasers the OS has to offer us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait_for_shutdown isn't a request for murder; it simply waits until the worker dies. Very Zen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, right, but interacting with the PID "directly" isn't that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would somewhat prefer to optimistically assume that consuming is okay (especially since a consuming-oriented approach is what's used by the rest of this new API surface) and revisit this assumption if we find a use case that is blocked by this.
| /// | ||
| /// Requires `BackgroundWorkerBuilder.bgw_notify_pid` to be set to `pg_sys::MyProcPid`, otherwise it'll | ||
| /// return [`BackgroundWorkerStatus::Untracked`] error | ||
| pub fn wait_for_shutdown(self) -> Result<(), BackgroundWorkerStatus> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when this is called on a worker that hasn't been started yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading the code suggests that this is fine. It'll simply wait until it is stopped.
|
A design question that came up in my mind that I don't know enough about BackgroundWorkers to remark on: Can a BackgroundWorker, if registered and loaded dynamically, move to any other lifecycle point? Or does it always progress in a fairly linear fashion (from "not yet started" to "active" to "ended")? In other words: is this enum truly an enum or is this trying to represent a state machine? We can enforce a much more rigid progression if so, if we want to. This would make the design more complex but would also make it harder to misuse the interface. |
From @workingjubilee: When a handle is a noun, typical Rust form is to use its name unadorned (almost all our "smart pointers" are also "handles", technically speaking). Exceptions are for verbs (JoinHandle) and truly ambiguous types (e.g. RawHandle and OwnedHandle). Solution: rename relevant types
I can't say I am an expert in the intricacies of PostgreSQL. Reading this: typedef enum BgwHandleStatus and the documentation suggests that it is linear. not yet started -> started -> stopped and unclear. What kind of state machine did you have in mind that it might represent? |
|
I figured something like that would be the case. I was wondering if using a generic-enforced state machine (as in https://hoverbear.org/blog/rust-state-machine-pattern/ 3.3 and 3.4) might be better as a model (with the obvious clause that we can go from the "ending" back to the start again if we restart the worker, I think?) or if that would be too cumbersome. We kind of already have that half-way here. |
My current feeling about this is that it is a) an overkill for now b) Postgres handles restarts itself so one can't really go "stopped->not_started->started" on their own. I propose to break this out into a separate discussion. I am more than willing to commit some time to improving these small details in followups. |
|
Hm. I think it's fine to defer some of this to followups and I think we'll need a little time experimenting with this before we can be sure of what the right API is, anyways, so a major break on it is inevitable, but in the general case we should try to tend towards stricter (within reason) APIs -> less strict APIs. It's harder to take back relaxations of the rules when you've already smoothed a given path for something. |
|
I'm 100% on this with you. Good APIs take time and stricter are better, but
only when not taken to an extreme. It should still be usable :)
I'll be happy to help improving this over time.
Do we have anything blocking this PR that I can address before merging as a
first version?
…On Tue., Oct. 18, 2022, 7:45 p.m. Jubilee, ***@***.***> wrote:
Hm. I think it's fine to defer some of this to followups and I think we'll
need a little time experimenting with this before we can be sure of what
the right API is, anyways, so a major break on it is inevitable, but in the
general case we should try to tend towards stricter (within reason) APIs ->
less strict APIs. It's harder to take back relaxations of the rules when
you've already smoothed a given path for something.
—
Reply to this email directly, view it on GitHub
<#710 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAADRCCDHW5XBZOO2JMA3DWD5OGNANCNFSM6AAAAAAQVNP5L4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
It doesn't match the signature of `DynamicBackgroundWorker::wait_for_shutdown` Solution: ensure they are the same and there's no confusion
|
@workingjubilee I've addressed the issue with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then away we go!
|
Woohoo! ❤️ |
No description provided.