distributed nodes that don't see each other #19

Closed
Licenser opened this Issue Jun 8, 2012 · 3 comments

Licenser commented Jun 8, 2012

I keep running into problems with distributed gproc when I start two nodes that cannot see each other at the time gproc starts, but that are later supposed to join together, with discovery set to all. I have the feeling it's a known issue, but I figured it wouldn't hurt to document it.

Example timeline:
n1 - boot
n1 - start gproc
n2 - boot
n2 - start gproc
n1 - net_adm:ping(n2)
-> the nodes do not join together properly.

This is essentially a netsplit issue, and it probably can't be resolved entirely, since there is no guarantee that conflicts between the two registries can be merged automatically. What would be really cool, though, is some kind of callback saying: hey, we have a (re)join from a split with side 1 and side 2, so that the application could work out the conflicts where possible.
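
To illustrate the kind of callback meant here, a minimal sketch follows. It is not part of gproc: the module name `rejoin_watch`, the `OnJoin` fun and the `conflicts/2` helper are all hypothetical. It relies only on `net_kernel:monitor_nodes/1`, which delivers `{nodeup, Node}` and `{nodedown, Node}` messages to the calling process, and it assumes each side's registry can be exported as a list of `{Name, Pid}` pairs.

```erlang
-module(rejoin_watch).
-export([start/1, conflicts/2]).

%% Hypothetical helper, not a gproc API.
%% Spawns a process that calls OnJoin(Node) each time a node becomes
%% visible, which covers both a first join and a rejoin after a split.
start(OnJoin) when is_function(OnJoin, 1) ->
    spawn(fun() ->
                  %% Subscribe this process to nodeup/nodedown messages.
                  _ = net_kernel:monitor_nodes(true),
                  loop(OnJoin)
          end).

loop(OnJoin) ->
    receive
        {nodeup, Node} ->
            OnJoin(Node),
            loop(OnJoin);
        {nodedown, _Node} ->
            loop(OnJoin);
        stop ->
            ok
    end.

%% Assumption: each side's registry is given as a [{Name, Pid}] list.
%% Returns the names that both sides registered to different processes,
%% i.e. the entries that cannot be merged automatically.
conflicts(Side1, Side2) ->
    [{Name, P1, P2} || {Name, P1} <- Side1,
                       {Name2, P2} <- Side2,
                       Name =:= Name2,
                       P1 =/= P2].
```

The application would still have to decide what to do with each conflicting name, for example keeping one process and stopping the other.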

uwiger (Owner) commented Jun 8, 2012

This is a problem with gen_leader and, consequently, with gproc.

I believe some of the gen_leader versions out there, not least those by vagabond and garrett-smith, can handle netsplits fairly well, but you'd need to check with them directly for details. Gproc would need some callback in order to resync, and some changes to its internal data structures to know what to do in case of conflicts.

uwiger (Owner) commented Jun 25, 2012

I have added gproc:bcast() and gproc:wide_await() to make it a bit easier to have multiple instances of local gproc services cooperating in a loose way. There may be other similar functions that could be added. Beyond that, I have no immediate plans to work on the global gproc part right now (not from lack of interest; I simply don't have the time).
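
As a rough sketch of how these two functions might be used, assuming the `gproc:bcast/3` and `gproc:wide_await/3` signatures from the gproc documentation and that gproc is running on every node involved: the module name `loose_gproc` and the keys `my_event` and `my_worker` are made up for illustration.

```erlang
-module(loose_gproc).
-export([announce/1, find_worker/1]).

%% Send Msg to every process that registered the local property
%% {p, l, my_event}, on this node and on all currently visible nodes.
%% The key name my_event is hypothetical.
announce(Msg) ->
    gproc:bcast([node() | nodes()], {p, l, my_event}, Msg).

%% Wait until some node has a process registered under the local name
%% {n, l, my_worker} and return its pid. The key name is hypothetical.
%% Assumption: wide_await/3 only accepts local unique names ({n, l, _}).
find_worker(Timeout) ->
    {Pid, _Value} = gproc:wide_await([node() | nodes()],
                                     {n, l, my_worker},
                                     Timeout),
    Pid.
```

Both calls work on locally registered keys on each node, so they sidestep the global registry and the gen_leader problem described above rather than solving it.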

Licenser commented Jun 25, 2012

Thanks mate, those are great improvements. I'll see if I can work with them to make my service more stable :) And no worries, I know the time problem. Sadly, I don't think I'm up to helping out here yet :(
