Ability to reconcile offline nodes #6124

Open
ZeroDeltaAlpha opened this issue Aug 2, 2022 · 0 comments

Hi,

We are currently trying to write a manager for Citus in Kubernetes. Adding worker nodes works great using the citus_add_node UDF.

The problem we are facing is that when a worker disappears (via scale-down or outright deletion), the drain and remove node functions stop working, because Citus needs to resolve the worker through DNS.

We have tried using preStop hooks, but according to the Kubernetes documentation these are run when the pod is terminated, which is too late for Citus: at that point the pod's networking endpoint has already been removed and the pod can no longer be resolved.

I'd love to chat through this, as I think it would be useful to establish how to recover from worker nodes being unreachable, and how to run Citus in a Kubernetes environment where workers can move and disappear.
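
To make the reconciliation idea concrete, here is a rough sketch of the comparison step a manager might run. It assumes the registered workers are read from the coordinator (for example with `SELECT nodename, nodeport FROM pg_dist_node`) and that the live workers come from the Kubernetes API. The `Worker` type, the `plan_reconcile` helper and the host names are hypothetical and only for illustration. The sketch only identifies stale workers; how to actually remove a worker that no longer resolves is the open question of this issue.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Worker(NamedTuple):
    """A worker as the manager sees it (hypothetical type for this sketch)."""
    host: str
    port: int


def plan_reconcile(
    registered: Iterable[Worker], live: Iterable[Worker]
) -> Tuple[List[Worker], List[Worker]]:
    """Compare Citus metadata with the pods that currently exist.

    registered: workers known to the coordinator (assumed to come from pg_dist_node).
    live: workers that currently exist in Kubernetes (assumed to come from the API).
    Returns (to_add, stale): workers to register, and workers that have vanished.
    """
    registered_set = set(registered)
    live_set = set(live)
    to_add = sorted(live_set - registered_set)
    stale = sorted(registered_set - live_set)
    return to_add, stale


# Hypothetical host names, for illustration only.
registered = [Worker("citus-worker-0", 5432), Worker("citus-worker-1", 5432)]
live = [Worker("citus-worker-0", 5432), Worker("citus-worker-2", 5432)]

to_add, stale = plan_reconcile(registered, live)
for worker in to_add:
    # These could be registered with citus_add_node, which already works for us.
    print("add:", worker.host, worker.port)
for worker in stale:
    # These are the problem case: the worker is gone and no longer resolves.
    print("stale:", worker.host, worker.port)
```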

Stack

Citus 11 docker on k3d
k3d version v5.4.1
k3s version v1.22.7-k3s1 (default)
