[systemd] Add value to LimitNOFILE due to performance problems #1735
When k3s is installed on an OS with high default ulimits, performance issues can be observed. This was discovered on CoreOS, where the default value is 1073741816. Symptoms include very slow file operations: for example, installing a Rook/Ceph cluster takes ~6 hours instead of ~10 minutes. A Google search for 'container LimitNOFILE' shows that most major projects already set this, including the (unused) containerd systemd unit found in this repository at /vendor/github.com/containerd/containerd/containerd.service. k3OS is not affected because the default there is already 1048576. See the description in coreos/fedora-coreos-tracker#329.
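To check whether a host is in the affected range, the open-files limit of a process can be read and compared against 1048576. The snippet below is only a sketch: the helper name `exceeds_expected` and the constant `EXPECTED_NOFILE` are made up for illustration, and it reports the limit of the Python process itself, which may differ from the limit the k3s service runs with.

```python
import resource

# Value the description above gives as the k3OS default.
EXPECTED_NOFILE = 1048576


def exceeds_expected(soft_limit, expected=EXPECTED_NOFILE):
    """Return True when a soft open-files limit is above the expected value."""
    # An unlimited setting is reported as resource.RLIM_INFINITY (-1 on Linux),
    # so handle it before the numeric comparison.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > expected


if __name__ == "__main__":
    # getrlimit returns a (soft, hard) tuple for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft=%d hard=%d above_expected=%s" % (soft, hard, exceeds_expected(soft)))
```

Run inside a container or from the same systemd unit to see the value that workloads actually inherit.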
FWIW, I've found that setting a |
Thanks for digging into this @stellirin! The bug seems gross. :( Is Python 2 the only culprit? I think the fix is okay, but just want to ping @ibuildthecloud for lgtm / thoughts |
I only know of this specific case, but kubernetes-sigs/kind#760 mentions issues with NFS and MySQL. I guess it requires a specific set of conditions, but it seems it is common enough that it justified the mitigation on the container daemon side. |
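One plausible mechanism, stated here as an assumption rather than something confirmed in this thread: some programs close every possible file descriptor before spawning a child, so the work per spawn grows with the limit (Python 2's `subprocess` with `close_fds=True` behaves this way). A rough sketch of the arithmetic, using the hypothetical helper `close_calls`:

```python
def close_calls(nofile_limit, first_fd=3):
    """Count close() attempts if every descriptor from first_fd up to the limit is tried.

    Descriptors 0-2 (stdin, stdout, stderr) are assumed to be left open.
    """
    return max(0, nofile_limit - first_fd)


coreos = close_calls(1073741816)  # default reported for CoreOS
k3os = close_calls(1048576)       # default reported for k3OS
ratio = coreos / float(k3os)      # relative cost per spawned child

if __name__ == "__main__":
    print("CoreOS: %d attempts, k3OS: %d attempts, ratio: %.0fx" % (coreos, k3os, ratio))
```

Under that assumption each child process costs roughly a thousand times more close attempts at the CoreOS default than at 1048576, which would fit the kind of slowdown described above.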
LGTM
The change seems good; the comment confuses me, though.
That seems like the comment is saying it should be set to 0. I'm fine with the value of 1048576 as that is the expected value anyhow. I didn't know it could go to 1073741816 |
Verified on k3s v1.18.2-rc4+k3s1