fix: prevent single replica proxies from staying unhealthy #12641

Merged
merged 1 commit into from
Mar 18, 2024

Conversation

deansheather
Member

In the peer healthcheck code, when an error pinging peers is detected, we write a "replicaErr" string with the error reason. However, if there were no peer replicas to ping, we returned early without resetting the string to empty. As a result, a replica whose peers were failing and then left would permanently show an error until a new peer appeared.

This also demotes DERP replica checking to a "warning" rather than an "error", which should prevent the primary from removing the proxy from the region map if DERP meshing is non-functional. Meshing can fail harmlessly when a peer is shutting down, so we don't want to disrupt everything when there isn't a real issue.
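A rough sketch of the warning-versus-error distinction follows. The `healthReport` type, its fields, and both functions are hypothetical, and the rule that the primary drops a proxy only when it reports errors is an assumption inferred from the description, not a confirmed detail of the implementation.

```go
package main

import "fmt"

// healthReport is a hypothetical report a proxy sends to the primary.
type healthReport struct {
	Errors   []string
	Warnings []string
}

// buildReport files a replica mesh failure under Warnings rather than
// Errors, mirroring the demotion described in this PR.
func buildReport(replicaErr string) healthReport {
	var r healthReport
	if replicaErr != "" {
		r.Warnings = append(r.Warnings, "DERP replica check: "+replicaErr)
	}
	return r
}

// inRegionMap encodes the assumed rule: the primary keeps a proxy in
// the region map unless the proxy reports at least one error.
func inRegionMap(r healthReport) bool {
	return len(r.Errors) == 0
}

func main() {
	r := buildReport("failed to mesh with replica-b")
	fmt.Println("warnings:", len(r.Warnings))
	fmt.Println("kept in region map:", inRegionMap(r))
}
```

Under these assumptions a mesh failure still reaches operators as a warning, but it no longer takes the proxy out of service.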

@deansheather deansheather merged commit cf50461 into main Mar 18, 2024
24 checks passed
@deansheather deansheather deleted the dean/avoid-replica-health-hang branch March 18, 2024 13:45
@github-actions github-actions bot locked and limited conversation to collaborators Mar 18, 2024