Lock acquisition hangs with kubedock 0.14.0 #58

Closed
thomascube opened this issue Oct 27, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@thomascube

After upgrading kubedock from 0.13.0 to 0.14.0 we're experiencing problems with the lease acquisition when using the --lock flag. The startup hangs after the following log message:

leaderelection.go: attempting to acquire leader lease esta-tekton-dev/kubedock-lock...

and the testcontainers fail because the Docker API is not available.

This seems like a comeback of #40. The Lease resource is again left with a non-empty holderIdentity field after kubedock terminates.

We're running kubedock on OpenShift and the environment didn't change. Downgrading to 0.13.0 solves the problem, so it must be related to a change in 0.14.0.

@joyrex2001 added the bug label Oct 27, 2023
@joyrex2001
Owner

> This seems like a comeback of #40. The Lease resource is again left with a non-empty holderIdentity field after kubedock terminates.

Startup fails with "timeout acquiring lock", right? And do you see "exit signal recieved, removing pods, configmaps and services" when kubedock stops?

It does indeed look like #40 is back.

@joyrex2001
Owner

In #40 I did not find the root cause, but reorganizing and rewriting the exit handler worked around it. Moving to Go 1.21 reintroduced the bug.

The root cause is that the main goroutine was killed before the exit handler had finished, which would crash the exit handler. Making the main goroutine wait indefinitely and moving the exit responsibility to the exit handler (which was the initial implementation/idea) makes sure all cancellations (including releasing the lock) are executed.

@joyrex2001
Owner

Released https://github.com/joyrex2001/kubedock/releases/tag/0.14.1 with this fix.

@thomascube
Author

Thanks for the fast fix! I can confirm that the problem is resolved with 0.14.1.
