Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove nodes with read-only filesystems (#52680) #59138

Merged
merged 1 commit into from
Jul 7, 2020

Conversation

DaveCTurner
Copy link
Contributor

Today we do not allow a node to start if its filesystem is readonly, but
it is possible for a filesystem to become readonly while the node is
running. We don't currently have any infrastructure in place to make
sure that Elasticsearch behaves well if this happens. A node that cannot
write to disk may be poisonous to the rest of the cluster.

With this commit we periodically verify that nodes' filesystems are
writable. If a node fails these writability checks then it is removed
from the cluster and prevented from re-joining until the checks start
passing again.

Closes #45286

Today we do not allow a node to start if its filesystem is readonly, but
it is possible for a filesystem to become readonly while the node is
running. We don't currently have any infrastructure in place to make
sure that Elasticsearch behaves well if this happens. A node that cannot
write to disk may be poisonous to the rest of the cluster.

With this commit we periodically verify that nodes' filesystems are
writable. If a node fails these writability checks then it is removed
from the cluster and prevented from re-joining until the checks start
passing again.

Closes elastic#45286
@DaveCTurner DaveCTurner added :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. backport v7.9.0 labels Jul 7, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Cluster Coordination)

@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Jul 7, 2020
@DaveCTurner DaveCTurner merged commit 46c8d00 into elastic:7.x Jul 7, 2020
@DaveCTurner DaveCTurner deleted the 2020-07-07-ro-fs-handling-7x branch July 7, 2020 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport :Distributed/Cluster Coordination Cluster formation and cluster state publication, including cluster membership and fault detection. Team:Distributed Meta label for distributed team v7.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants