change: graceful shutdown drains node before k3d container stops #1119

Merged
merged 2 commits into main from patch-1 on Feb 8, 2023

Conversation

arikmaor
Contributor

@arikmaor arikmaor commented Aug 4, 2022

This PR drains the node before stopping the container (in case of docker stop, k3d cluster stop, k3d node stop, or a system shutdown).

It allows services to shut down gracefully in this case; this is especially useful for databases that need to clear their lock files.

Related to the bug I opened

Implications

  1. sh is now the main process running in the image and is used to intercept incoming signals (SIGTERM etc.) more info
  2. Before the k3s process is killed, kubectl drain is called, causing the pods to be evicted. When the node is ready again, kubectl uncordon is called to allow pods to be rescheduled. more info (a rough sketch of this flow follows below)
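
For illustration, a minimal sketch of the trap-and-drain flow described above (not the exact script from this PR; the node name, k3s invocation, and wait loop are placeholder assumptions):

```sh
#!/bin/sh
# Minimal sketch of the signal-handling entrypoint idea (not the exact PR script).
# NODE_NAME and the k3s arguments are placeholders.
NODE_NAME="${NODE_NAME:-$(hostname)}"

graceful_stop() {
    # Evict pods first so workloads (e.g. databases) can shut down cleanly.
    kubectl drain "$NODE_NAME" --ignore-daemonsets --delete-emptydir-data --force || true
    kill -TERM "$K3S_PID"
    wait "$K3S_PID"
    exit 0
}

# sh is PID 1, so it receives SIGTERM from `docker stop` and forwards it.
trap graceful_stop TERM INT

/bin/k3s "$@" &
K3S_PID=$!

# Once the API is reachable again after a restart, allow pods back onto this node.
until kubectl uncordon "$NODE_NAME" >/dev/null 2>&1; do sleep 5; done

wait "$K3S_PID"
```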

arikmaor pushed a commit to tensorleap/helm-charts that referenced this pull request Aug 4, 2022
This commit should be reverted once
k3d-io/k3d#1119 is merged
yotamazriel approved these changes Aug 4, 2022
@iwilltry42
Member

Hi @arikmaor , thanks for your PR!
I like the simplicity of this solution 👍
However, I feel like we should make this configurable.
Without testing, I'd guess that it will significantly slow down the cluster stop process, right?
That may not be wanted in quite a few use-cases, e.g. CI or quick development cycles, where no data needs to be persisted.

E.g. one could introduce a flag --graceful to k3d cluster stop or a new command k3d cluster shutdown or just some environment variable, or something else.

What do you think?

@arikmaor
Contributor Author

arikmaor commented Aug 4, 2022

Without testing, I guess, that it will significantly slow down the cluster stop process, right?

I don't think so. I'm assuming you're calling docker stop internally, and according to the docs Docker will send a SIGTERM signal and wait 10s by default before force-killing.
You can set this timeout when you run docker stop with the --timeout parameter, which perhaps should be added to the k3d cluster/node stop commands.
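
For reference, a plain Docker CLI example (the container name here is just illustrative):

```sh
# Give the node's entrypoint up to 60 seconds to drain before Docker sends SIGKILL.
docker stop --timeout 60 k3d-mycluster-server-0
```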

That may not be wanted in quite a few use-cases, e.g. CI or quick development cycles, where no data needs to be persisted.

In the CI scenario, you can live with the graceful default of 10s or just do a kill.
In the dev scenario, I think you expect the pods to go down gracefully when running k3d cluster stop or when shutting down your workstation.

E.g. one could introduce a flag --graceful to k3d cluster stop or a new command k3d cluster shutdown or just some environment variable, or something else.

IMHO, graceful shutdown should be the default, but I guess that's debatable.
I think it should be the default because:

  1. To match docker and kubectl semantics, I would prefer a separate kill command to go along with stop.
  2. When doing a clean system shutdown, you expect all running processes to close properly and all data to be saved. This also applies to someone accidentally doing docker stop instead of k3d node stop. In those cases you must "speak the OS language" (handle signals) in order to respond properly.

arikmaor pushed a commit to tensorleap/helm-charts that referenced this pull request Aug 4, 2022
This commit should be reverted once
k3d-io/k3d#1119 is merged
@iwilltry42 iwilltry42 added this to the v5.5.0 milestone Aug 4, 2022
@iwilltry42
Member

Your arguments make total sense, thanks for taking the time.
I'll add it to the next milestone 👍

@arikmaor
Contributor Author

arikmaor commented Aug 4, 2022

10x cool :)

yotamazriel pushed a commit to tensorleap/helm-charts that referenced this pull request Aug 7, 2022
This commit should be reverted once
k3d-io/k3d#1119 is merged
@iwilltry42 iwilltry42 left a comment
Member

LGTM

@iwilltry42 iwilltry42 changed the title Drain node before container stops change: graceful shutdown drains node before k3d container stops Feb 8, 2023
@iwilltry42 iwilltry42 merged commit f354727 into k3d-io:main Feb 8, 2023
@iwilltry42
Member

Thank you @arikmaor ! 👍

@arikmaor
Contributor Author

arikmaor commented Mar 8, 2023

@all-contributors
please add @arikmaor for code

@allcontributors
Contributor

@arikmaor

I've put up a pull request to add @arikmaor! 🎉

@eshizhan

@iwilltry42 @arikmaor the agent does not have a kubeconfig, which causes an error on the agent: The connection to the server localhost:8080 was refused - did you specify the right host or port?

@arikmaor arikmaor deleted the patch-1 branch April 13, 2023 09:48
@arikmaor
Contributor Author

@iwilltry42 @arikmaor the agent does not have a kubeconfig, which causes an error on the agent: The connection to the server localhost:8080 was refused - did you specify the right host or port?

That can happen for all kinds of reasons...
How is this related to this PR?

@eshizhan

That PR uses the kubectl command, which makes the agent node emit an error log every 3 seconds, because k3s does not create a kubeconfig on agents by default. k3s-io/k3s#3862

@eshizhan

Reproduce:

k3d cluster create k3dtest -a 1
docker logs k3d-k3dtest-agent-0 -f
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The connection to the server localhost:8080 was refused - did you specify the right host or port?
The connection to the server localhost:8080 was refused - did you specify the right host or port?
...

@arikmaor
Contributor Author

@iwilltry42 perhaps the script should be tweaked a bit for some use case I did not foresee.
PTAL

@iwilltry42
Member

@arikmaor I think we may even need to revert the change for now.
I see the following problems:

  1. As @eshizhan noticed, it won't work on agent nodes, since the only available node-local kubeconfigs there don't have the necessary permissions
  2. We always need access to the Kubernetes API to call the drain command, which will make it fail if the servers are already down, e.g. when you just want to delete some orphaned agent node

I'm not sure how "graceful" we could make the shutdown by e.g. using ctr or crictl to node-locally stop/remove pods/containers.
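
For illustration, a node-local variant could look roughly like this (a hedged sketch; it assumes crictl is available in the node image and pointed at k3s' containerd socket):

```sh
# Stop running containers via the CRI, without needing the Kubernetes API.
# Sketch only: the socket path and timeout are assumptions.
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/k3s/containerd/containerd.sock
for ctr in $(crictl ps -q); do
    # `crictl stop --timeout` sends SIGTERM and waits before force-killing.
    crictl stop --timeout 30 "$ctr"
done
```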

k3d's stop commands could easily be handled from the client side... but not docker commands, unfortunately.

Do you have any idea?

@arikmaor
Contributor Author

arikmaor commented Apr 26, 2023

k3d's stop commands could easily be handled from the client side... but not docker commands, unfortunately.

Let's not give up just yet.
It's not just docker commands, it's also a system shutdown.

If you just run k3s directly:
shutting down k3s doesn't kill the running containers;
shutting down the machine allows all containers to shut down gracefully.

Shutting down the machine is the critical case IMO; you don't expect your containers to just drop dead, without a chance to save their data, when you do a clean shutdown.

@arikmaor
Contributor Author

@iwilltry42 is there a way of knowing inside k3d-entrypoint.sh if the current container is a server? (perhaps by reading some environment variables)
If so, perhaps we should only run the kubectl drain/uncordon commands when we know it's a server, and skip them otherwise.
Does that sound like the correct behavior to you?
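
For example, something along these lines (purely hypothetical; K3D_ROLE is an assumed variable name, the real mechanism may differ):

```sh
# Hypothetical sketch: only drain when this container hosts a server node.
# K3D_ROLE is an assumed variable; k3d may expose the role differently.
if [ "${K3D_ROLE:-}" = "server" ]; then
    kubectl drain "$NODE_NAME" --ignore-daemonsets --delete-emptydir-data --force || true
fi
```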

@iwilltry42
Member

@arikmaor yeah, we would know that, but that doesn't help with agent shutdown, right?
With a k3d-triggered stop we could make it work, but not with an externally induced one, since we don't know the order in which the nodes are stopped, so it may or may not work.

@arikmaor
Contributor Author

Mmm... I see what you mean (before, I didn't understand that an agent still runs pods, it just doesn't replicate the control plane).
But why exactly can't we drain an agent node? It's still a k3s node IIUC.

@iwilltry42
Member

Alright, so we have two major cases here:

k3d-induced shutdown of nodes

Everything's fine here, we can just run the drain command for all cluster nodes from a single server node.

Docker- or System-induced shutdown of nodes

We don't know in which order the nodes are being shut down, and whether it will be possible to drain the node before the container is killed.
E.g. we have one server and two agents. The System is rebooting. Agents are stopped already or in any shutdown state where the kubelet won't react to a drain task anymore, but in the shutdown sequence of the server we're still trying to drain them, which will fail.

Do you know of any way to drain nodes without access to the control-plane?

@arikmaor
Contributor Author

arikmaor commented May 7, 2023

Do you know of any way to drain nodes without access to the control-plane?

Draining is the recommended way.
I found this feature in k8s that tries to deal with this; I'm not sure if it's included/enabled in k3s.

I think that if someone's use case was broken by this PR, we should revert it.
In case of a revert, I'd appreciate a way of loading a custom entrypoint file without a warning during cluster creation 🙏

@eshizhan

@arikmaor @iwilltry42 any update? This causes the Docker log to grow very quickly and take up disk space.
Can we temporarily add 2>/dev/null to avoid this problem for now?

@iwilltry42 iwilltry42 modified the milestones: v5.5.0, v5.7.0, v5.4.6 Apr 3, 2024
@iwilltry42
Member

Moved #205 to v5.7.0 - with this, we can add the option for custom entrypoints and other lifecycle hooks that can execute scripts.
I'd then revert this PR but keep the script around, just disabled by default with an easy way to enable those "built-in" scripts.

@iwilltry42
Member

Ref #1420
