Implement the useUnreachableState flag #5

Closed
ktoso opened this issue Aug 7, 2020 · 0 comments · Fixed by #29
Labels: 1 - triaged (Task makes sense, is well defined, and is ready to be worked on), t:swim

Comments

ktoso commented Aug 7, 2020

We can operate in two modes: one with unreachability, and the classic failure detection mode, in which an unresponsive member is marked `.dead` directly.

The unreachable state pattern is not useful for most systems and should be disabled by default.

The implementation today always uses the unreachable state: it emits events about unreachable members and waits for someone to call confirm dead. We should only do this if `useUnreachableState` is true.

        /// Optional SWIM Protocol Extension: `SWIM.MemberStatus.unreachable`
        ///
        /// This is a custom extension to the standard SWIM statuses which first moves a member into the unreachable state,
        /// while still trying to ping it, and while awaiting a final "mark it `.dead` now" from an external system.
        ///
        /// This allows for collaboration between external and internal monitoring systems before committing a node as `.dead`.
        /// The `.unreachable` state IS gossiped throughout the cluster the same as alive/suspect are, while a `.dead` member is not gossiped anymore,
        /// as it is effectively removed from the membership. This allows for additional spreading of the unreachable observation throughout
        /// the cluster, as an observation, but not as an action (of removing the given member).
        ///
        /// From a protocol perspective, the `.unreachable` state is therefore equivalent to a `.suspect` member status.
        ///
        /// Unless you _know_ you need unreachability, do not enable this mode, as it requires additional actions to be taken
        /// to confirm a node as dead, complicating the failure detection and node pruning.
        ///
        /// By default this option is disabled, and the SWIM implementation behaves the same as documented in the papers,
        /// meaning that when a node remains unresponsive for too long it is marked as `.dead` immediately.
        public var useUnreachableState: Bool = false
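
To make the intended behavior concrete, here is a minimal sketch of how the flag could gate the transition. The types and functions below (`MemberStatus`, `Settings`, `statusAfterSuspicionTimeout`, `isGossiped`) are simplified, hypothetical stand-ins for illustration only, not the library's actual API.

```swift
// Hypothetical, simplified status type; the real SWIM statuses carry more data.
enum MemberStatus: Equatable {
    case alive
    case suspect
    case unreachable
    case dead
}

// Hypothetical settings holder; only the flag from this issue is modeled.
struct Settings {
    var useUnreachableState: Bool = false
}

/// Status a suspect member moves to once it has been unresponsive for too long.
/// With the flag off, the member is marked `.dead` right away (classic mode).
/// With the flag on, it parks in `.unreachable` until someone confirms it dead.
func statusAfterSuspicionTimeout(_ settings: Settings) -> MemberStatus {
    settings.useUnreachableState ? .unreachable : .dead
}

/// Whether a status is still spread through gossip in this sketch:
/// everything except `.dead`, which is effectively removed from the membership.
func isGossiped(_ status: MemberStatus) -> Bool {
    status != .dead
}
```

With the default settings the sketch marks a timed-out member `.dead` directly; only with `useUnreachableState` enabled does the member stay in `.unreachable`, still gossiped, until an explicit confirmation arrives.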

ktoso added the 1 - triaged and t:swim labels Aug 7, 2020
ktoso self-assigned this Aug 22, 2020
ktoso closed this as completed in #29 Aug 22, 2020
ktoso added a commit that referenced this issue Aug 22, 2020
…nce and more tests (#29)

* more Peer usage, rather than Node

* +swim #26 #5 implement unreachable on/off mode and more tests

* cleanup metadata rendering to not render Optional()

* +confirmDead pushed into the instance

* Pushed .confirmDead into the Instance logic