New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When creating a view, the creator should be our own address #54
Conversation
Actually, having the view creator and coordinator be different breaks at least the recently added in code in MERGE3 ("Only add view creators which are actually in the set as well").
Now A won't be a merge leader because it's not a view-creator, and C won't be a merge leader because it's not a coordinator; and if the ordering of addresses falls unfortunately then D will expect one of the others to do it. I wouldn't be surprised if there's other code that implicitly assumes the view creator is also the coordinator. I'll look at making another change to restore that property. |
No, it is necessary to have a creator which is not in the view: when A leaves A|6={A,B,C}, it'll create and install B|7={B,C} |
I see Util.shutdown() where a leaving member will "// inject view in which the shut down member is the only element", but I haven't found code that behaves in the way you describe. Probably I'm just missing it - could you point it out? But this seems dangerous, too. Suppose A and C leave at about the same time. Then I think you're saying that A would create B|7={B,C} while C would create B|7={A,B}. Is that right? (Not to mention that B might also be creating a B|7, eg if some other member D is joining at the same time...) Why isn't it a bad idea to have different views with the same view ID? |
Just returned from JBossWorld. Your pull requests affect code that has worked well for 10+ years, so understand that I'm very hesitant to change this, to make edge cases involving a protocol that nobody else uses (yet) work. I want to be 110% sure that I don't jeopardize mainstream code by fixing an edge case. This takes time. |
Welcome back. Just so that we're clear, only JGRP-1484 is SEQUENCER-specific. The open pull requests are not. I don't know whether this makes a difference to you. I had hoped that this change would be uncontroversial, but since it seems not to be... would it be useful if I opened a new ticket to cover it? I somehow feel as though there's more space for discussion over there than here on github! |
Well, the other pull request is regarding view creation, and that code is core and has worked well for years, so I'm hesitant to touch it. |
Of course it's sensible to be cautious about making changes to field-hardened code. On the other hand: even old code can contain old bugs; and if such a bug is found then we surely shouldn't be so afraid of change that we just have to leave the bug there forever? For what it's worth, my stress tests have been running with this fix and have not seen problems. (In view of their effectiveness at finding bugs, I think there's a very good chance that if anything was seriously broken by this, I'd have found out). What would it take to convince you that this fix is safe? |
I've just noticed JGRP-1002, saying that it would be a good thing if a post-merge coordinator was also a pre-merge coordinator. This pull request will have that effect (since the new coordinator is the merge leader, who was chosen from among the existing coordinators). |
What do you want to achieve with this pull request ?
So this doesn't fix a bug, correct ? I'm trying to gauge what the benefit of this change is, compared to the risks. |
Yes, if we have A|6={A,B,C}, then A leaving we'd end up with view B|7={B,C}. C would not create a view on its own as it's not the coordinator. C would ping A, and once B|7 is installed, B to send it a leave request. B would then create B|8={B}. |
Not necessarily: one of the existing coordinator will be picked to be the new coordinator, so only one partition will continue to have the same coordinator. E.g. if we have {A,B,C}, {M,N,O} and {X,Y,Z}, then the merge coordinators are A, M and X. Say we pick M (after sorting), then 2 partitions will change their coordinator (from A->M and from X->M) whereas one partition will continue to use the existing coordinator. |
I don't see any benefit of applying this pull request; as a matter of fact there's a lot of risk associated with it. I'd have to browse through a lot of code to see whether your change affects it. |
I don't think that I've seen this cause problems, but it seems surprising and dangerous for a node A to create a view whose ViewID says that its creator is B (say). This must open up the possibility of two nodes creating two different views with the same ViewID, with confusion to follow.
(I considered also changing this bit of code so that the local node was the coordinator. But I don't think that this ought to be necessary, so I've stuck with the smallest change that I think needs making).