Breaking changes warning! Process.register(...) #72

Closed
louthy opened this issue Dec 10, 2015 · 0 comments

louthy commented Dec 10, 2015

I am currently re-writing the Process.register(...) and Process.deregister(...) behaviour. If you rely on this then be aware of what's coming in the next NuGet release (it's already on the master branch).

Just to recap, this is how it worked previously (there's a code sketch after the list):

  • Calling Process.register(processId, name) would create a new proxy Process at the address /registered/<name>
  • Process.register(...) would return the proxy's process ID
  • Because /registered is outside the scope of the node's address space (/node-name/...), it essentially acted as a DNS service for any process to find any other process by a known name.
  • This was achieved by calling Process.find(name), which didn't actually do any searching; it just built a ProcessId in the format /registered/<name>
  • The proxy at that address would deal with the routing of the messages to the registered Process
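
As a rough sketch of that previous behaviour (the snippets in this issue assume a using static LanguageExt.Process at the top of the file; the inbox body here is illustrative):

    // Previously: register created a proxy Process at /registered/echo
    ProcessId echo  = spawn<string>("echo", msg => Console.WriteLine(msg));
    ProcessId proxy = register(echo, "echo");   // returned the proxy's ProcessId

    // Any process could then address it by a known name, with no searching involved
    ProcessId found = find("echo");             // just builds /registered/echo
    tell(found, "hello");                       // routed via the proxy to echo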

Problems with that solution are:

  • There is no cluster-wide coordination of registered processes
  • That means if two separate nodes register the same process name then the two proxy processes would be fighting over the shared inbox. This would lead to undefined behaviour.
  • Proxy processes aren't great for subscriptions. You can't subscribe to a proxy and have it auto-subscribe to the thing it's proxying (this may change, but right now it's a limitation).

Dispatchers

Now that the Process system supports dispatchers we can implement a more advanced and robust system. Dispatchers put control on the sender's side. Here's an example:

    ProcessId pid1 = spawn("proc1", ... );
    ProcessId pid2 = spawn("proc2", ... );
    ProcessId pid3 = spawn("proc3", ... );

    ProcessId pid = Dispatch.broadcast(pid1,pid2,pid3);
    tell(pid, msg);

In that example 3 processes are grouped into one ProcessId. You can then tell, ask, subscribe, etc. because it's just a regular ProcessId. The Process system itself can spot that there are multiple processes referenced and it deals with the dispatch, without a router or proxy Process.

In the above example pid looks like this:

    /disp/broadcast/[/node-name/user/proc1,/node-name/user/proc2,/node-name/user/proc3]

The disp part tells the system to use a named dispatcher, and the broadcast part is the name of the dispatcher (you can register your own dispatchers via Dispatch.register(...)). There are several built-in dispatchers: broadcast, least-busy, round-robin, random, first. The name of the dispatcher decides the bespoke behaviour to run on: [/node-name/user/proc1,/node-name/user/proc2,/node-name/user/proc3]
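
For illustration, here's how those built-in dispatchers could be applied to the three processes from the example above (broadcast, leastBusy and roundRobin are used elsewhere in this issue; the random and first method names are assumed from the dispatcher names):

    ProcessId everyone  = Dispatch.broadcast(pid1, pid2, pid3);   // every Process receives each message
    ProcessId quietest  = Dispatch.leastBusy(pid1, pid2, pid3);   // the Process with the smallest inbox
    ProcessId takeTurns = Dispatch.roundRobin(pid1, pid2, pid3);  // each message goes to the next in turn
    ProcessId anyOne    = Dispatch.random(pid1, pid2, pid3);      // a random Process per message
    ProcessId headOnly  = Dispatch.first(pid1, pid2, pid3);       // always the first Process in the list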

Back to registered processes

When you register a Process it now does one of two things:

  • If the Process is local-only, then it gets registered in an in-memory map of names to ProcessIds
  • If the Process is visible to the cluster, then it gets registered in a Redis map of names to ProcessIds

The result of calling Process.register(name) is still a ProcessId, but instead of it looking like this: /registered/<name> it will now look like this: /disp/reg/<name>. As you can probably see there is now a dispatcher for registered Processes called reg.

The default behaviour of this dispatcher is to get the full list of Processes that have been registered with a specific name, and dispatch to all of them (broadcast). This behaviour is more consistent overall, because it doesn't pass any judgement on who registered what, or when. It simply realises that there are multiple Processes registered with the same name, that you're trying to communicate with a Process by that name, and that therefore the message should go to all of them.

When you call Process.find(name) the system does a similar thing to before: it doesn't actually do any searching at that point, it merely returns /disp/reg/<name> - so as processes register or deregister, the number of possible destinations for a message increases and decreases dynamically.
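
As a minimal sketch of the new behaviour, with two cluster nodes registering under the same name (the mail-server name and inbox bodies are illustrative):

    // On node A
    ProcessId smtpA = spawn<string>("smtp", msg => Console.WriteLine("A: " + msg));
    register("mail-server", smtpA);          // added to the cluster-wide name map

    // On node B
    ProcessId smtpB = spawn<string>("smtp", msg => Console.WriteLine("B: " + msg));
    register("mail-server", smtpB);          // same name, second entry

    // On any node
    ProcessId pid = find("mail-server");     // just builds /disp/reg/mail-server
    tell(pid, "hello");                      // dispatched to both smtpA and smtpB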

The keen-eyed amongst you may realise that if you can get n processes registering themselves as 'a named thing', then you could implement high-availability strategies. To that end, you can combine a registered ProcessId with other dispatcher behaviour. For example:

    ProcessId pid = Dispatch.leastBusy(find("mail-server"));
    tell(pid, msg);

The pid variable above would look like this:

    /disp/least-busy/disp/reg/mail-server

This is actually a general feature of dispatchers: they can be combined. You can imagine that the reg dispatcher returns the list of registered 'mail-server' ProcessIds, and then the least-busy dispatcher finds out which of those mail-server processes has the smallest queue before dispatching the message.

You can take this even further and register a dispatcher. Remember, when you call register(name, pid) the pid is a ProcessId, and the special dispatcher ProcessIds are ProcessIds too. So you could do this:

    var pid1 = spawn("proc1", ... );
    var pid2 = spawn("proc2", ... );
    var pid3 = spawn("proc3", ... );

    ProcessId parcelPassing = Dispatch.roundRobin(pid1,pid2,pid3);

    var reg = register("pass-parcel", parcelPassing);

The value of reg would be:

    /disp/reg/pass-parcel

If you then did a series of tell calls against reg then the messages would be sent round-robin to pid1, pid2 and pid3. This has very similar functionality to routers without the need for a router Process.

If you think through the implications of that a little further: let's say you had two data-centres and you wanted an 'eventually consistent' system by sending the same message to both data-centres, but you wanted the least-busy of 3 nodes in each centre to receive the message. A node in each centre could register a least-busy dispatcher ProcessId under the same registered name, and because the default behaviour of the registered dispatcher is to broadcast, you'd get exactly the behaviour you wanted.
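
A rough sketch of that scenario (the orders name and the per-data-centre worker ProcessIds are illustrative):

    // A node in data centre 1 registers a least-busy dispatcher over its three workers
    register("orders", Dispatch.leastBusy(dc1Worker1, dc1Worker2, dc1Worker3));

    // A node in data centre 2 does the same, under the same name
    register("orders", Dispatch.leastBusy(dc2Worker1, dc2Worker2, dc2Worker3));

    // A sender anywhere: the reg dispatcher broadcasts to both registrations, and
    // each least-busy dispatcher then picks one of its own three workers
    tell(find("orders"), "new-order");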

One thing to note is that this isn't an aliveness system (see the Roles section below for that). A registered Process stays registered until:

  • You call deregisterById(pid)
  • You call kill(pid) - killing a process wipes its state, inbox and registrations. If you want to kill a process but maintain its cluster state then call shutdown(pid).

So if a registered Process is offline then its inbox will keep filling up until it comes back online - which facilitates eventually consistent behaviours.
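
To sketch those lifetime rules (the worker Process is illustrative):

    ProcessId worker = spawn<string>("worker", msg => Console.WriteLine(msg));
    register("worker", worker);

    deregisterById(worker);    // removes the registration, the Process keeps running

    // ...or, alternatively:
    kill(worker);              // stops the Process and wipes its state, inbox and registrations
    // shutdown(worker);       // stops the Process but keeps its cluster state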

Roles

Eventually consistent isn't always the desired behaviour; often you just want to find a Process that does 'a thing', and you want it to do that thing now. Roles facilitate that behaviour. Each node in the cluster must have a role name. Roles use the Process.ClusterNodes property to work out which member nodes are actually available (it's updated every second, and is at most 3 seconds out of date if a node has died).

If you had 10 mail-servers, you could find the least-busy SMTP process by doing something like this:

    ProcessId pid = Role.LeastBusy["mail-server"]["user"]["outbound"]["smtp"];
    tell(pid, email);

The first child, mail-server, is the role name (which you specify when you call Cluster.register(...) at the start of your app); the rest of it is a relative leaf /user/outbound/smtp that will refer to N processes in the mail-server role.

The problem with that ProcessId is that you need to know about the inner workings of the mail-server node to know that the smtp Process is on the leaf /user/outbound/smtp, and that means that the Process hierarchy for the mail-server can't ever change. However because pid is just a ProcessId the mail-server node itself could register it instead:

    register("smtp", Role.LeastBusy["mail-server"]["user"]["outbound"]["smtp"]);

Then any other node that wanted to send a message to the least-busy smtp Process could call:

    tell(find("smtp"), msg);

You'll also notice that the mail-server nodes themselves have control over how to route messages, whether it's least-busy, round-robin, etc. They can change their strategy without it affecting the sender applications.

Although this SMTP example isn't a great one, it should indicate how you can use registered names to represent a dynamically changing set of nodes and processes in the cluster.

De-registering by name

You can also wipe all registrations for a name:

    deregisterByName(name);

That will clear all registrations for the name specified. This is pretty brutal behaviour, because you don't know who else in the cluster has registered a Process and you're basically wiping their decision. You could use it as a type of leader-election system (by deregistering everyone else and registering yourself), but note that the process wouldn't be atomic, and is therefore not particularly bulletproof.
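
As a sketch of that (naive) leader-election idea - the leader name is illustrative, it assumes the code runs inside a Process so that Self is available, and because the two calls aren't atomic another node could register itself in between them:

    deregisterByName("leader");   // wipe everyone else's registrations for the name
    register("leader", Self);     // register this Process as the new leader

    // Other nodes then address whoever currently holds the name with:
    tell(find("leader"), "work-item");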

Existing code

So how will this affect existing code?

  • The signature to register has changed - there is no need for the flags or mailbox size argument any more (because the proxy has gone); see the sketch after this list
  • Any previous attempt to use this system for either broadcast or leader election would need to be reviewed. If you were doing this it was probably buggy anyway, but now it will be broadcast by default.
  • Process.Registered has gone, so if you're using that to build registered ProcessIds then you will need to use Process.find(...)
  • Process.deregister(...) is now Process.deregisterById(...) and Process.deregisterByName(...). Try to avoid using deregisterByName
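
As a rough before/after sketch of the register change (the old optional arguments shown are assumptions, included only to illustrate what has been dropped):

    // Before: registering created a proxy Process, so it took Process-style options
    // ProcessId reg = register(pid, "mail-server", ProcessFlags.PersistInbox, maxMailboxSize: 1000);

    // After: no proxy, so just the name and the ProcessId to register
    ProcessId reg = register("mail-server", pid);   // returns /disp/reg/mail-server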

This system is significantly more robust and powerful, so hopefully you'll find that the breaking changes are worth it.

Process system dispatch documentation
