Set LimitNOFILE=1024:524288 for crio.service #7703
Comments
@polarathene, thank you for getting in touch! This does make sense. Users can always set the following:

```toml
[crio.runtime]
default_ulimits=[
  "nofile=1024:1048576",
]
```

if they really wish to override the inherited default.
/assign kwilczynski
Also, interestingly, we don't support […]
Speaking of Podman, any issues with how they set limits, @polarathene?
Many applications do not raise the soft limit automatically because it was never the responsibility of the application.
@champtar, this issue has a strong "ain't broken, don't fix it" vibe. Would this be what you are getting at? We can:

- keep the current override and document the defaults, or
- remove it and rely on the systemd defaults.

The former requires, perhaps, an update to the existing documentation that clearly explains the defaults and the side effects for some applications that need lower limits due to some of their design choices. The latter would be a rather lengthy deprecation process. We should be cautious not to break OpenShift with this sort of change.
Response to @champtar
It absolutely is. If the software needs more than the soft limit available, and the hard limit permits that, it should communicate that to the kernel. I have told you this once before, @champtar, but please see this link for why the soft limit is intended to stay at 1024. If you know everything you're running in your container is not impacted by that, you can raise the limit explicitly for that workload.

Cite me one distro that actually sets the system default soft limit above 1024. I don't think you'll find this to be the case with anything like Ubuntu, Debian, RHEL, Fedora, openSUSE, Arch Linux, Alpine, etc. If you do find something, it's more than likely niche / container-focused, which doesn't inspire much confidence.

You are trying to justify the container environment being inconsistent with a typical host because of a misconfiguration that has existed for a long time, simply because no one had the time to invest in resolving it. I have answered why it was a mistake: it caused various bugs, which I have linked you to, whose cause was non-obvious to troubleshoot.
These projects tend to follow Docker's own config changes in my experience. The PR for this project has very little context regarding the decision, nor review comments on it, which suggests that at the time they didn't really understand it enough to give feedback or question it (we get busy and don't always have time, so changes like this can slip through). The fact it hasn't changed since doesn't mean it didn't cause any problems. Look at the ones related to Docker that I've cited: these span many years, with some not realizing the cause, or implementing their own workaround/fix (some with bad advice). The Docker / containerd change was discussed for well over a year IIRC, with some maintainer feedback, before I pushed for it with sufficient evidence to justify it.
Question: if you have a container with three processes, how do you approach this?
If you need the soft limit raised for software deployed at workloads requiring the limit to be that high, then the software should handle this. Especially when it's software that's well funded and/or used heavily by businesses: why can't it implement the correct behaviour instead of negatively impacting other software by being lazy? A separate soft limit exists for a reason, and it should be respected.
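To make this concrete, here is a minimal sketch (in Python, to match the reproduction scripts later in this issue; not taken from any of the cited projects) of a process raising its own soft limit at startup, which is effectively what the Go runtime does implicitly:

```python
import resource

# Query the current soft and hard limits for open file descriptors.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

# A process that knows it needs many FDs can raise its own soft limit
# up to the hard limit; no elevated privileges are required for this.
if soft < hard:
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```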
Plenty of software can work in production just fine. If it's not, then as I've said already, that software is most likely only deployed in containerized environments and has depended upon misconfiguration of the container runtime to work without facing this issue. Outside of a container, it'd be stuck with 1024, or with a package that explicitly bundles a systemd unit (or similar), or documents the need for a higher soft limit (like you'll find with MongoDB and Kafka) if the software is incapable of raising one for some reason.

I'd love to know why the software can't do what Go does and just raise the soft limit implicitly to the hard limit at runtime (since that's what is unofficially expected from software like Envoy). Not all software will be compatible with that, but the software you're referring to generally knows when it needs that.

For production deployments, it's better that you are explicit about such a requirement when it's needed, rather than troubleshooting to identify it as the cause of a bug/regression (which is notably harder to track down when a limit is too high vs too low). You can use drop-in override configs for systemd (see the sketch below), or if you don't have the ability to manage the host at that level, have your service provider offer the ability to raise this default limit to meet your business needs. It shouldn't be imposed upon all businesses, though, when that software would have functioned correctly outside of a container.
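For illustration, such a drop-in override for `crio.service` might look like this (the directory is systemd's standard drop-in location; the file name and values here are placeholders, not recommendations):

```ini
# /etc/systemd/system/crio.service.d/10-limits.conf
[Service]
LimitNOFILE=1024:1048576
```

Apply it with `systemctl daemon-reload && systemctl restart crio.service`; the drop-in always takes precedence over the packaged unit, so package updates won't clobber it.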
Response to @kwilczynski
I am not that familiar with Podman beyond the name. I had a quick look over their GitHub repo, and found a Go file covering how they set limits; I assume that's only used in part of their stack.
If you have no reports from your users about issues running software like I've referenced, you don't need to do anything. You can also wait and see how Docker v25.0 and containerd v2.0 manage this change, to see if they revert it.
I have illustrated in the reproduction example above how it does regress, but most of your users are unlikely to notice, as it's not severely disruptive, and those who do know better can likely find out about the issue now and resolve it locally with an override.
So besides the minor regression from the higher limit, as @champtar notes, it's less likely to be severe, as no one has really reported issues that pinpointed the limit as the cause.
FYI a similar change exploded EKS for many large customers: awslabs/amazon-eks-ami#1551
@polarathene, you are answering without taking into account my answer at containerd/containerd#8924 (comment). The ulimit on the host has been set by pam_limits or systemd LimitNOFILE since forever, and in containers there was no need to raise the soft limit because it was high enough 99% of the time.

The question is not whether we can fix everything, it's: has anyone started to identify what will break (you clearly have not), opened the bugs and/or PRs and argued to have them merged, and have we given enough time?

@kwilczynski @polarathene, I would be OK with a good one-year minimum deprecation notice, with a blog post explaining why, posted on social media for awareness so people can start fixing images. Right now the docker/containerd plan is to make the change and see what breaks, without trying to prepare for it, or even mentioning it in the release notes! (https://github.com/moby/moby/releases/tag/v25.0.0 / https://github.com/containerd/containerd/releases/tag/v2.0.0-beta.1)
This is what will happen in many images if you lower the soft limit to 1024: everyone will have to add entrypoints, waiting for software to be "fixed".
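A sketch of the kind of entrypoint wrapper being described (the path and the value are placeholders, assuming the hard limit permits the raise):

```sh
#!/bin/sh
# Wrapper entrypoint: raise the soft limit before starting the real
# service, because the service does not raise it for itself.
ulimit -Sn 10240
exec /usr/local/bin/real-entrypoint "$@"
```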
It might only need 10240, but 1024 is a crazy low number
Again and again: outside of k8s you can set the ulimits just fine (pam_limits or systemd), and it's not because the systemd folks say that applications should raise the soft limit if needed that this is how things are done today. Instead of responding again with a pile of text, please go through the most-pulled images from docker.io and see for yourself the work in front of us.
Apologies for the verbosity again (not intentional). You may find the bullet-point list of insights about the software you mentioned worthwhile. On everything else, we're mostly going in circles, with neither of us likely to change our opinion on what is correct. I'm not particularly interested in continuing the discussion; I've cited a huge amount of resources for why the change is appropriate.
If you want to push for an implicit reliance on the soft limit being artificially high in container environments by default for your convenience, then do so with proper reproductions that demonstrate the failure in action. If you can do this with enough popular images / software impacted outside of a production context, you could most likely convince maintainers to revert the change, as it'd impact UX at a wider scale. Otherwise they may be better off documenting the difference in a container environment, and the potentially hard-to-troubleshoot breakage/regressions for software running in a container.
I did try; see this comment where I detail a reproduction related to the original motivation for the change. I did extensively look into the history, as best I could, for software that actually needed higher limits.
Because ideally they get spotted and fixed? That's kind of how bug reports work when the cause can be identified. I've seen projects with bug reports where no one knew how to track down the actual cause at the time to resolve it.
For reference, Docker set its own unit-file limits years ago, and other projects followed.
So what takeaways do we have from all that?
I'd like to also highlight this:
Even if the project itself removes the reliance on a raised soft limit, how do you want to properly test for that without trawling through the source code of various projects, or only caring about coverage of the happy paths?
I agree about the importance of communicating such a change.
For cri-o: if there are no bug reports like the ones the other projects that adopted the change received, then there really is no rush; just observe. There's plenty of information in this issue thread should a problem related to it arise 👍
It was done on some projects (often poorly) when it was the other way around. Why are you against software compatibility between a host and a container deployment? I often find containers are great at providing a reproduction environment for bug reports, but it doesn't help when defaults introduce bugs (like my local test-suite run failing when CI was fine).

Raising the soft limit when it's needed is often documented / supported for projects that have that need. You'd need to do it on the host, and that should carry over to the container IMO. Realistically, the demographic shifts from small projects / images to bigger ones deployed in a production context, as do the people behind all that. It's not a big ask for a business to configure the host to use a higher global soft limit default for containers, be that your company or a service host you rely on.
Yes, it's low. Why should it be different for container environments? Much of the software you cited is not run exclusively in containers and has support for raising the soft limit (or documents that need): where are all the complaints related to that? If you want to push for a slightly higher bump that disregards the concerns cited, the burden of evidence is on that case.
Raise that issue with the k8s devs? Surely they have the resources and capabilities to accommodate such a tunable?
Alternatively, if you're dependent upon the higher limit, use a drop-in override config. Updates won't mess with that, and your override will always have precedence. Easy for a local fix, and I'd expect a non-issue for production deployments (unless this is abstracted by a service provider; then raise an issue with them about this, since you're their customer and pay them to meet your business needs).
I ironically never expect my responses to be as verbose as they end up 🤷♂️ It's not intentional, sorry, but I don't have the time to condense it.

I saw your examples for the mentioned software in your linked containerd comment; you just pulled the image and checked if it increased limits at runtime implicitly. See the above part of my response for why that's not sufficient. You'd be better off demonstrating a reproduction of where it's an actual problem.

As stated, just because the soft limit is 1024 per process by default doesn't mean the software is limited to 1024 FDs. Software that spawns worker processes, like nginx, grants each worker 1024 FDs to work with; the limit is not shared with the parent process.

If you want to debate this, why should something like Python pay a startup cost iterating over a needlessly high limit (see the reproduction scripts in this issue)? Why is your concern with production deployments of software like databases and webservers such a big point of friction that you can't adjust the limits? (If it's due to k8s and the override support isn't viable for some reason, perhaps you should be focusing your time and attention on getting that resolved instead?)
Since the bot wants to reassign you and discussion died down here, I don't mind closing this issue as "not planned" 👍 I don't use cri-o myself.

There is risk in applying this change: it risks breaking the deployments of those that relied on the bad config. The limit currently configured isn't going to be as visible in impact (as opposed to `LimitNOFILE=infinity`).

For reference, here are the release notes for Docker v25 on this change. You can see that while they state what the previous value was, they also mention the drop-in override config support for those that need to apply this change, and that once the related containerd update lands it'll have a wider impact.
A friendly reminder that this issue had no activity for 30 days.
/remove-lifecycle stale
What happened?
I was recently made aware of this configuration line (contributed Oct 2016) in `cri-o/contrib/systemd/crio.service` (line 20 at commit 91816d7), which sets `LimitNOFILE=1048576`.
Quite a bit has changed since then, notably with the systemd v240 release in 2018Q4. Both the Docker and containerd projects have recently removed the line from their configs to rely on the `1024:524288` default systemd v240 provides (unless the system has been configured explicitly to some other value, which the system administrator may do when they know they need higher limits).

You can find insights related to those PRs, along with a third link to the Envoy project (as an example of popular software that presently does not raise its soft limit or document that requirement, but has depended upon this implicit config in the environment), where the linked comment details why the soft limit should be `1024` to avoid software incompatibility:

- Set `RLIMIT_NOFILE` (`LimitNOFILE`) to sensible defaults: moby/moby#45534
- Remove `LimitNOFILE` from `containerd.service`: containerd/containerd#8924

This issue is raised to suggest considering the same change.
Either:

- remove the line entirely, or
- keep `LimitNOFILE=1024:524288` with a contextual comment.

What did you expect to happen?
For `LimitNOFILE` to have a soft limit of `1024`, so that software running in a container operates with the same environment defaults as the host system.

Raising the default soft limit should be done explicitly by the admin, or via the process that needs it implicitly (see the Python reproduction below for an example of this).
How can we reproduce it (as minimally and precisely as possible)?
Commands
I am not familiar with `cri-o`, but the equivalent Docker commands demonstrate the difference (which for `LimitNOFILE=1048576` can be more subtle; for example `postsrsd` would be <500ms vs 8 minutes):
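A minimal version of such a comparison might look like this (the image choice and output values are illustrative):

```console
$ docker run --rm --ulimit nofile=1024:524288 alpine sh -c 'ulimit -Sn; ulimit -Hn'
1024
524288
$ docker run --rm --ulimit nofile=1048576:1048576 alpine sh -c 'ulimit -Sn; ulimit -Hn'
1048576
1048576
```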
`python_close_individual.py`:
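A plausible minimal version of this script, going by its name: close every FD up to the soft limit one call at a time, the slow path when the limit is raised (the timing output is illustrative):

```python
import os
import resource
import time

# Mimic daemonization code that closes every possible FD one at a time.
soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)

start = time.monotonic()
for fd in range(3, soft):
    try:
        os.close(fd)
    except OSError:
        pass  # FD wasn't open; most of them won't be.
print(f"closing up to {soft} FDs individually took {time.monotonic() - start:.2f}s")
```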
`python_close_range.py`:
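And a plausible counterpart using `os.closerange()`, which on Python 3.10+ maps to the `close_range(2)` syscall, so the cost no longer scales with the limit:

```python
import os
import time

start = time.monotonic()
# One call, regardless of how high RLIMIT_NOFILE is set.
os.closerange(3, os.sysconf("SC_OPEN_MAX"))
print(f"os.closerange took {time.monotonic() - start:.4f}s")
```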
Reproduction references:

- `ENABLE_SRS=1` causing high CPU usage with `postsrsd`: docker-mailserver/docker-mailserver#2722 (comment)

Anything else we need to know?
While containerd is yet to publish a release with this change AFAIK (it should be scheduled for v2.0), AWS eagerly adopted the change and promptly reverted it due to customer feedback, with some software failing to communicate a request for a higher soft limit (some AWS-specific software and Envoy are known examples).
AWS can provide a higher `LimitNOFILE` configuration if that better suits their users (despite the referenced `1024` soft limit concerns, or the difficult-to-troubleshoot issues with `LimitNOFILE=infinity`), but that should be a vendor decision while projects like `cri-o` actually fix the bug.

`LimitNOFILE=1048576` is not as bad as `LimitNOFILE=infinity`, however:

- `1,000x` less, but affected deployments would still be allocating `1,000x` more than they may need. The Java runtime was also identified as another culprit.
- `yum` (NOTE: PowerDNS had to work around a 6-hour image build time, however that was due to a `2^30`, not `2^20`, limit)
- `zypper` (NOTE: `LimitNOFILE=1048576` taking 30-60 minutes, could be much faster)
- `dnf`
CRI-O and Kubernetes version
N/A
OS version
N/A
Test reproduction environment was WSL2 (Ubuntu), but previously was Arch Linux and Fedora.
Additional environment details (AWS, VirtualBox, physical, etc.)