Add custom ListAndWatch error handler to netwatcher Informer #207

Merged
merged 4 commits into from
Apr 17, 2020

@Levovar Levovar commented Apr 8, 2020

Based on a recent client enhancement: kubernetes/kubernetes#87329

When an existing watcher thread fails with an unexpected error, netwatcher will shut itself down using this custom handler.
Kubernetes will promptly restart it, re-initializing the watches and thus ensuring overall netwatcher HA, even if this comes at the expense of some Pod restarts.
It has been observed that the Informer HA provided by the client library cannot tolerate prolonged API server failures, so this enhancement makes sure netwatcher always comes back when the API server returns.

Levovar and others added 3 commits April 17, 2020 12:57
…m Informer API.

When an existing watcher thread fails with an unexpected error, netwatcher will shut itself down.
Kubernetes will promptly restart it, re-initializing the watch and thus ensuring netwatcher HA.
The client-provided Informer HA cannot tolerate prolonged API server failures, so this enhancement is an added HA measure on top.
… queries.

Also printing the error messages returned by the netwatcher discovery queries, if any.
Levovar commented Apr 17, 2020

This might not be what the creators behind the API intended, but it seems to work beautifully

@Levovar Levovar merged commit 2c9d61c into master Apr 17, 2020
@Levovar Levovar deleted the netwatcher_healthcheck branch April 17, 2020 12:54