Simplify and correct DefaultClusterEventService implementation #4154
Comments
The proposed solution is to have the cluster event service keep track of known topics/subscriptions and gossip only its own topics/subscriptions (the entire set) to other nodes. This greatly simplifies the implementation at the risk of being slightly inefficient (we always transmit all of our local subscriptions), but as we use very few topics and the gossip interval can be relatively high (on the order of seconds), this is most likely fine. To handle a partition or crash, we can also listen for cluster membership events and remove a dead node's subscriptions.
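A minimal sketch of that idea, assuming topics and member IDs are plain strings. The class `SubscriptionRegistry` and all of its method names are hypothetical and are not part of the existing cluster event service API. Each node replaces what it knows about a sender with the full set it receives, so no tombstones are needed, and a membership event drops a dead node's subscriptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of full-set gossip; not the actual implementation. */
final class SubscriptionRegistry {
  private final String localMemberId;
  // Topics this node is subscribed to.
  private final Set<String> localTopics = new HashSet<>();
  // Last full topic set received from each remote member.
  private final Map<String, Set<String>> remoteTopics = new HashMap<>();

  SubscriptionRegistry(String localMemberId) {
    this.localMemberId = localMemberId;
  }

  synchronized void subscribe(String topic) {
    localTopics.add(topic);
  }

  synchronized void unsubscribe(String topic) {
    localTopics.remove(topic);
  }

  /** Payload sent on every gossip round: the entire local set. */
  synchronized Set<String> gossipPayload() {
    return Set.copyOf(localTopics);
  }

  /** Overwrites whatever we knew about the sender with its current set. */
  synchronized void onGossip(String senderId, Set<String> topics) {
    if (senderId.equals(localMemberId)) {
      return; // ignore our own gossip
    }
    remoteTopics.put(senderId, Set.copyOf(topics));
  }

  /** Called from a (hypothetical) cluster membership listener. */
  synchronized void onMemberRemoved(String memberId) {
    remoteTopics.remove(memberId);
  }

  /** All members currently known to be subscribed to the topic. */
  synchronized Set<String> subscribers(String topic) {
    Set<String> result = new HashSet<>();
    if (localTopics.contains(topic)) {
      result.add(localMemberId);
    }
    remoteTopics.forEach((member, topics) -> {
      if (topics.contains(topic)) {
        result.add(member);
      }
    });
    return result;
  }
}
```

Because every gossip message carries the whole set, a subscribe, unsubscribe, re-subscribe sequence converges on the latest state without any per-subscription timestamps.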
4387: Refactor cluster event service r=deepthidevaki a=deepthidevaki

## Description

* Use member properties to propagate subscription info instead of custom gossip
* EventService can now only `broadcast`. Methods for unicast and send are removed from the interface.

## Related issues

closes #4154
closes #4307

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Expected behavior
The current `DefaultClusterEventService` implementation is complex, buggy, and difficult to test. It works as is:

So what we expect here is that if I subscribe to a topic, eventually all nodes in the cluster know about it through gossip; if I unsubscribe, all nodes also eventually know about it through gossip.
Actual behavior
It's possible that a tombstone subscription prevents us from subscribing to a topic. If I subscribe and then unsubscribe, other nodes hold a tombstone subscription for my memberId and my topic. If I re-subscribe, then as noted above, it is possible that my subscription is ignored. As we don't typically retry subscribing, other nodes may never learn about our subscription, because on gossip we don't resend the whole list of subscriptions, only the newly created ones.
Another bug is that a node, on rejoining the cluster, will never be sent existing subscriptions, as the other nodes only gossip "new" subscriptions to known members. This would work fine if we removed members from the `updateTimes` map, but we never do, so a new node rejoining with the same old ID will still have its old update time.
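The rejoin problem can be illustrated with a simplified model. This is only a sketch under assumptions: the class `IncrementalGossip`, its logical clock, and its method names are hypothetical, and only the `updateTimes` name comes from the existing code. It shows why a stale `updateTimes` entry hides existing subscriptions from a rejoining node, and how removing the entry on member removal would avoid that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of "gossip only new subscriptions"; not the real code. */
final class IncrementalGossip {
  // topic -> logical time at which the local subscription was created
  private final Map<String, Long> localSubscriptions = new LinkedHashMap<>();
  // memberId -> logical time of the last gossip sent to that member
  private final Map<String, Long> updateTimes = new HashMap<>();
  // Logical clock (an assumption; stands in for whatever timestamps are used).
  private long clock = 0;

  void subscribe(String topic) {
    localSubscriptions.put(topic, ++clock);
  }

  /** Returns only the subscriptions created since we last gossiped to the member. */
  List<String> gossipTo(String memberId) {
    long lastUpdate = updateTimes.getOrDefault(memberId, 0L);
    List<String> delta = new ArrayList<>();
    localSubscriptions.forEach((topic, createdAt) -> {
      if (createdAt > lastUpdate) {
        delta.add(topic);
      }
    });
    updateTimes.put(memberId, clock);
    return delta;
  }

  /** Forgetting the member means its next gossip starts from scratch. */
  void onMemberRemoved(String memberId) {
    updateTimes.remove(memberId);
  }
}
```

In this model, a member that restarts with the same ID receives an empty delta as long as its `updateTimes` entry survives. Once `onMemberRemoved` has cleared the entry, the next `gossipTo` call sends every existing subscription again.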