reg_or_locate/3 for start funs that return {ok, Pid} #22

Closed
peerst opened this Issue Jul 18, 2012 · 3 comments

peerst commented Jul 18, 2012

The current version of reg_or_locate/3 spawns the fun parameter if the name can't be located.

I'd like to use something like reg_or_locate/3, e.g. with supervisor:start_child/2 on a simple_one_for_one supervisor, or to start a gen_server and its ilk.

Is there a deeper reason why something like this is not in the API? Maybe I missed an easy way to handle this use case, or does something in gproc prevent implementing it?

If it was only omitted because nobody needed it, would you add the functionality to reg_or_locate, maybe as a variant taking an {M, F, Args} tuple or a fun plus an argument list? Or do you think another function, e.g. start_or_locate, would be better?

I would implement it and send a pull request if you think it's feasible.

Owner

uwiger commented Jul 18, 2012

I did consider the other variants of starting the process, but there are indeed some deeper reasons to be careful. At least:

  • I don't want to execute arbitrary code in the gproc server
  • I don't want to execute potentially blocking operations in the gproc server

If the server called supervisor:start_child(S, Args), the call would block until the child in question is fully initialized and acks back to the supervisor, which then acks back to the gproc server. Meanwhile, no other registration requests would be serviced.

What I would prefer is if the child could start asynchronously, and then add itself to the supervisor. This should be doable, and would be a reasonably good extension of the supervisor behavior - a bit like the gen_server:enter_loop() function, which allows a process to "become" a gen_server, even though it wasn't started as such.

What gproc could (and probably should) do is start the process using proc_lib:spawn_opt/2, and it should probably also allow the caller to provide spawn options. One way to do this would be to allow the {M,F,A} form, but be very restrictive about which functions it's willing to execute - e.g. {proc_lib, spawn, [F]}, {erlang,spawn,[F]} (which would instead become spawn_monitor, since gproc needs to monitor the process), {proc_lib, spawn_opt, [F, Opts]}.

The spawn_link functions don't make sense either, since we don't want the process to be linked to gproc.
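
A rough sketch of the restricted {M,F,A} idea described above, not gproc's actual implementation. The module name restricted_spawn_sketch and the function spawn_restricted/1 are hypothetical. It accepts only the three forms listed, rejects link and monitor options, and always returns a monitored, unlinked process:

```erlang
-module(restricted_spawn_sketch).
-export([spawn_restricted/1]).

%% Hypothetical helper: accept only a small whitelist of spawn forms
%% and return {Pid, MonitorRef}. Anything else fails with badarg.
spawn_restricted({erlang, spawn, [F]}) when is_function(F, 0) ->
    %% erlang:spawn is turned into spawn_monitor, as suggested above.
    erlang:spawn_monitor(F);
spawn_restricted({proc_lib, spawn, [F]}) when is_function(F, 0) ->
    monitor_pid(proc_lib:spawn(F));
spawn_restricted({proc_lib, spawn_opt, [F, Opts]})
  when is_function(F, 0), is_list(Opts) ->
    case lists:all(fun allowed_opt/1, Opts) of
        true  -> monitor_pid(proc_lib:spawn_opt(F, Opts));
        false -> erlang:error(badarg)
    end;
spawn_restricted(_Other) ->
    erlang:error(badarg).

%% Linking to the caller is not wanted, and monitoring is set up here
%% rather than through spawn options.
allowed_opt(link)         -> false;
allowed_opt(monitor)      -> false;
allowed_opt({monitor, _}) -> false;
allowed_opt(_)            -> true.

monitor_pid(Pid) when is_pid(Pid) ->
    {Pid, erlang:monitor(process, Pid)}.
```

Monitoring after the spawn leaves a small window in which the process may already have exited; in that case the monitor still delivers a 'DOWN' message (with reason noproc), so the caller is notified either way.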

One could imagine a future implementation of gproc that would manage to ensure atomicity without serializing all registrations via a central server, making even blocking operations practical, as they could execute in the calling process. I did try such an implementation in a very early version, but the only really positive outcome of that was an ICFP paper on QuickCheck [1] which illustrated in detail why the implementation was subtly but dangerously (and irrevocably) flawed. :)

[1] http://publications.lib.chalmers.se/publication/125252-finding-race-conditions-in-erlang-with-quick-check-and-pulse

peerst commented Jul 18, 2012

Yeah I feared the issue would be blocking.

Well the workaround without changing anything would be:

  1. see if it's registered: if yes, use the pid
  2. call start_child
  3. register in the child's init; if the name is already registered, return ignore
  4. if supervisor:start_child gets us a pid, use it
  5. if not, look it up again; it should have been registered from somewhere else ... if not, don't know what to do, so probably crash ;-)
  6. make sure the race doesn't happen often, otherwise it might get wasteful
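
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the module name start_or_locate_sketch and the function start_or_locate/2 are hypothetical, and the lookup and start operations are passed in as funs so the sketch does not depend on a running gproc. It relies on supervisor:start_child/2 returning {ok, undefined} when the child's init returns ignore.

```erlang
-module(start_or_locate_sketch).
-export([start_or_locate/2]).

%% Locate :: fun(() -> pid() | undefined)
%% Start  :: fun(() -> {ok, pid()} | {ok, undefined} | {error, term()})
start_or_locate(Locate, Start) ->
    case Locate() of
        Pid when is_pid(Pid) ->
            {ok, Pid};                      % step 1: already registered
        undefined ->
            case Start() of                 % step 2: ask the supervisor
                {ok, New} when is_pid(New) ->
                    {ok, New};              % step 4: we won the race
                {ok, undefined} ->
                    relocate(Locate);       % step 5: child returned ignore
                {error, Reason} ->
                    erlang:error({start_failed, Reason})
            end
    end.

%% Second lookup after losing the race; crash if nobody registered.
relocate(Locate) ->
    case Locate() of
        Pid when is_pid(Pid) -> {ok, Pid};
        undefined            -> erlang:error(not_registered)
    end.
```

With gproc, Locate would be something like fun() -> gproc:where({n, l, Name}) end and Start something like fun() -> supervisor:start_child(Sup, [Name]) end. Step 3 lives in the child: its init/1 calls gproc:reg({n, l, Name}) and returns ignore if that call fails with badarg because the name is already taken.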

The workaround isn't very nice, but I'm a bit scared of messing with the supervisor implementation to make async starting possible. Maybe I'll have a little look at the simple start_child (it probably only makes sense in this case anyway).
The API for an async start followed by adding the process to a supervisor is a bit messy, since you could accidentally end up with a process that doesn't match the child spec and would therefore change if restarted.

The article, BTW, looks very interesting; I'm reading it at the moment.

Owner

uwiger commented Jul 18, 2012

Yes, but this kind of workaround can be implemented as a wrapper around the standard gproc functions. ;-)

There would be that slight risk when adding a process dynamically to a supervisor, but I think it's an acceptable risk. It is a bit similar in that way to Joe Armstrong's on_exit handler (Programming Erlang, ch 9.2, pp 152-154).

uwiger closed this May 24, 2013
