
A Pod becomes a memory hog in Kind (the issue seems Kind-specific) #2623

Closed
victor-sudakov opened this issue Feb 8, 2022 · 16 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@victor-sudakov

What happened:
A very simple pod turns into a memory hog on a 3-node Kind cluster. This sample haproxy pod (see the manifest below) rapidly consumes as much memory as limits permit and starts consuming swap (and if there is no limit configured, it easily consumes gigabytes of memory).

What you expected to happen:
I expect this particular image to consume about 150Mi of RAM, as it does in Minikube and in a kOps/AWS cluster.

How to reproduce it (as minimally and precisely as possible):
Please apply the attached manifest and watch the pod's memory usage.
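The manifest itself is in the attached test.yaml.txt. One way to watch the resulting pod's memory usage, assuming metrics-server is installed (it is not by default on kind) and a hypothetical pod name of haproxy-test:

# Pod name is hypothetical; substitute whatever test.yaml.txt actually creates.
kubectl top pod haproxy-test                  # requires metrics-server

# Without metrics-server, read the pod's cgroup memory counter directly
# (the path differs between cgroup v1 and v2):
kubectl exec haproxy-test -- sh -c \
  'cat /sys/fs/cgroup/memory.current 2>/dev/null || cat /sys/fs/cgroup/memory/memory.usage_in_bytes'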

Anything else we need to know?:
The problem is reproducible only in Kind, which is why I'm filing the bug here: it may be specific to Kind. Minikube and kOps are not affected.

Environment:

  • kind version: (use kind version): kind v0.11.1 go1.17.6 linux/amd64.
  • Kubernetes version: (use kubectl version): Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:01:33Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info): containerd version: 1407cab509ff0d96baa4f0eb6ff9980270e6e620.m, runc version: v1.0.3-0-gf46b6ba2, init version: de40ad0
  • OS (e.g. from /etc/os-release): Manjaro Linux Qonos 21.2.2

test.yaml.txt

victor-sudakov added the kind/bug label on Feb 8, 2022
@BenTheElder
Member

BenTheElder commented Feb 8, 2022

I think you will find the same bug in minikube --vm-driver=docker; the problem is that with nested containers we can't enforce resource limits (currently...). #877

This is a known limitation, and it is not trivial to fix. kind is not suitable for testing this sort of thing, unfortunately.

Further, if you add additional nodes, the host resources will be over-reported (duplicated for each node). See #877 and the issues linked from there.

@victor-sudakov
Author

victor-sudakov commented Feb 8, 2022

@BenTheElder no, I'm running this same manifest in Minikube v1.24.0 with the docker driver, and the container consumes just about 150Mi while the limit in the manifest is 512Mi. The problem is not with enforcing the limits; it's that the container runs amok under Kind. Can you observe/reproduce this behavior?

@BenTheElder
Member

This sounds like #2597.

I cannot check just this moment; will revisit.

@victor-sudakov
Author

I thought my Kind config could be useful:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: office
containerdConfigPatches:
  - |-
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."192.168.88.63:5000"]
      endpoint = ["http://192.168.88.63:5000"]
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          kubeletExtraArgs:
            node-labels: "ingress-ready=true"
    extraPortMappings:
      - containerPort: 80
        hostPort: 8080
        protocol: TCP
      - containerPort: 443
        hostPort: 8081
        protocol: TCP
  - role: worker
  - role: worker
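For reference, a cluster from this config would be created with something like the following; the file name is an assumption, and the node container names follow from name: office:

# Assuming the config above is saved as kind-office.yaml (file name is illustrative):
kind create cluster --config kind-office.yaml
# The "name: office" field makes the node containers office-control-plane,
# office-worker and office-worker2 on the Docker host.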

@BenTheElder
Member

Thanks, I will try to replicate this soon; I suspect it's a variation on #2597.

@victor-sudakov
Author

I have not seen this behavior under Kind with any images other than the haproxy image.

@aojea
Contributor

aojea commented Feb 17, 2022

hmm ... #760 ...

The applications seem to do big up-front memory allocations based on the number of file descriptors they have available.

Some applications preallocate based on the number of file descriptors; please check that thread to see if this is the same issue.
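A quick way to check for that pattern is to compare the file descriptor limit the workload actually sees with what the node passes down; the pod and cluster names below are assumptions:

# Limit as seen inside the haproxy container (pod name haproxy-test is hypothetical):
kubectl exec haproxy-test -- sh -c 'ulimit -n'

# Limit containerd runs with inside a kind node
# (node container name assumes the cluster is called "office"):
docker exec office-worker systemctl show containerd --property=LimitNOFILE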

@victor-sudakov
Author

@aojea setting fs.nr_open=1048576 on the host system before creating a Kind cluster seems to have helped. Thanks for the hint! How come #760 was merged two years ago and I still see the problem on a recent Kind installation?
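For anyone else hitting this, the host-side workaround described above looks roughly like the following; the persistence step and the file names are assumptions about a typical sysctl.d setup:

# Set the host-wide ceiling on open file descriptors before (re)creating the cluster:
sudo sysctl -w fs.nr_open=1048576

# Optionally make it persistent across reboots:
echo 'fs.nr_open=1048576' | sudo tee /etc/sysctl.d/99-nr-open.conf
sudo sysctl --system

# Recreate the kind cluster so the nodes inherit the new limit:
kind delete cluster --name office
kind create cluster --config kind-office.yaml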

@aojea
Contributor

aojea commented Feb 17, 2022

#1799

@aojea
Contributor

aojea commented Feb 17, 2022

I agree with containerd that we should not cap the limits, because everybody can have different ones for different reasons ... the other thing is that some software allocates a lot of memory based on them by default, or some distros have very big values ...

@aojea
Contributor

aojea commented Feb 17, 2022

/close

duplicate of #760

Thanks

@k8s-ci-robot
Contributor

@aojea: Closing this issue.

In response to this:

/close

duplicate of #760

Thanks

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@victor-sudakov
Author

@aojea if this bug report is a duplicate of #760 merged in 2019, why am I still seeing this problem in a recent installation of Kind? It's 2022 now.

@BenTheElder
Member

@aojea if this bug report is a duplicate of #760 merged in 2019, why am I still seeing this problem in a recent installation of Kind? It's 2022 now.

See Antonio's comments following your earlier question:

How come #760 was merged two years ago and I still see the problem on a recent Kind installation?

In more detail:

In #760 we set the file handle limit on containerd (which your pods run under) to match upstream containerd. They've since reverted that change and set limits to infinity, which we match, in #1799 mentioned above (and then commented on in #2623 (comment)).

https://github.com/containerd/containerd/blob/3122239ee50624a0159d0107e53f3a9fd612570f/containerd.service#L29-L33

https://www.freedesktop.org/software/systemd/man/systemd.exec.html#:~:text=LimitSTACK%3D%2C%20LimitCORE%3D%2C%20LimitRSS%3D%2C-,LimitNOFILE,-%3D%2C%20LimitAS%3D%2C%20LimitNPROC%3D%2C%20LimitMEMLOCK

Setting limits is problematic, because it's host/workload dependent. Ideally haproxy would not excessively consume file handles. I haven't had a chance to dig in and see what mitigations are available yet.

I might argue this is closer to #2597, where a library used by the NFS server behaved similarly poorly after systemd 240+ started raising the file handle limit: https://bugzilla.redhat.com/show_bug.cgi?id=1796545

That bug dates to 2020, and more recent NFS / underlying library versions should be patched (#1487 (comment)), but in #2597 the host is a bit more dated.

In this case it's less clear; the cause sits somewhere between host config, kind/containerd/systemd, and the workload.
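If a per-cluster workaround is preferable to changing a host sysctl, one option (purely a sketch, not an official kind recommendation; the cluster name "office" and the 1048576 value are assumptions) is to cap containerd's LimitNOFILE inside each node container:

# Repeat per node; restarting containerd briefly disrupts the pods on that node.
docker exec office-worker sh -c '
  mkdir -p /etc/systemd/system/containerd.service.d &&
  printf "[Service]\nLimitNOFILE=1048576\n" > /etc/systemd/system/containerd.service.d/limits.conf &&
  systemctl daemon-reload &&
  systemctl restart containerd
'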

@aojea
Contributor

aojea commented Feb 18, 2022

In this case it's less clear, somewhere between host config, kind/containerd/systemd, and the workload.

I think that's haproxy, isn't it?

This sample haproxy pod (see the manifest below) rapidly consumes as much memory as limits permit and starts consuming swap (and if there is no limit configured, it easily consumes gigabytes of memory).

@victor-sudakov
Author

I think that is haproxy, isn't it?

The same haproxy image behaves quite well under Minikube, for example, or in a real kOps-managed cluster. I've tried that path (docker-library/haproxy#179), but the expected reply there is "it's not a haproxy problem".
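For completeness, one commonly suggested haproxy-side mitigation is to give haproxy an explicit connection ceiling instead of letting it derive one from the inherited fd limit; a sketch of the fragment to merge into the global section of whatever haproxy.cfg the pod ships (1024 is an arbitrary value):

global
    maxconn 1024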
