Improve logging when cgroupfs mount fails #15999

Merged (1 commit into cilium:master, Jul 7, 2021)

Conversation

@chkhalt commented May 4, 2021

When the variable mounted is false, we avoid checking the value of cgroupInstance because, after the mount, it reflects an earlier state.

Fixes: #15997

@maintainer-s-little-helper bot added the dont-merge/needs-release-note-label label (The author needs to describe the release impact of these changes.) on May 4, 2021
@chkhalt marked this pull request as ready for review May 4, 2021 09:18
@chkhalt requested a review from a team as a code owner May 4, 2021 09:18
@ti-mo requested review from a team and jibi and removed request for a team May 5, 2021 09:12
@ti-mo (Contributor) commented May 5, 2021

Looks good to me, but I would appreciate someone more familiar with this taking a look.

@brb (Member) left a comment

Thanks. Could you add to your commit msg Fixes: e8ca92b45469 ("cgroups: Add stubs for non-Linux platforms"). It will also help to check to which Cilium versions the fix has to be backported.

Also, cc @joestringer, who added the code, to double check whether we are not missing anything with the fix.

@chkhalt (Author) commented May 5, 2021

Thanks. Could you add to your commit msg Fixes: e8ca92b45469 ("cgroups: Add stubs for non-Linux platforms"). It will also help to check to which Cilium versions the fix has to be backported.

Sure! It's Done ;)

@joestringer added the kind/cleanup (This includes no functional changes.) and release-note/misc (This PR makes changes that have no direct user impact.) labels on May 5, 2021
@maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label label on May 5, 2021
@joestringer (Member) left a comment

Hi Joao, can you add some additional context to the commit message about how you hit a problem here? It's not obvious to me at a glance why this is a problem. What environment did you use, what Cilium configuration, what was the behaviour that was wrong?

FYI, this goes back further than e8ca92b ("cgroups: Add stubs for non-Linux platforms"); that commit only moved code around. This bug likely goes all the way back to its initial introduction.

@chkhalt (Author) commented May 6, 2021

Hi Joao, can you add some additional context to the commit message about how you hit a problem here? It's not obvious to me at a glance why this is a problem. What environment did you use, what Cilium configuration, what was the behaviour that was wrong?

Sure! I will add more info to the commit message, but I would like to share something here to discuss.

I was trying to solve two issues about Cilium running on kind, but I first realized that, for some reason, Cilium was not able to derive a cgroup v2 sub-hierarchy as proposed by 866969.

I was following the "Getting Started Using Kind" guide (using the vagrant dev VMs with NETNEXT=1). After a few (unsuccessful) attempts, I changed the following:

  • I used --set hostServices.enabled=true on helm, because I found this code.

    cilium/daemon/cmd/daemon.go

    Lines 575 to 579 in efc15cd

    // Cgroup v2 hierarchy root used for attachment is different when running on Kind.
    runsOnKind := option.Config.EnableHostReachableServices &&
    strings.HasPrefix(node.GetProviderID(), "kind://")
    cgroups.CheckOrMountCgrpFS(option.Config.CGroupRoot, runsOnKind)

  • To avoid conflicts (as spotted in pkg/cgroups/cgroups.go#L72), I disabled the cgroup v1 net_cls and net_prio controllers by changing the kernel parameters in /etc/default/grub:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet cgroup_no_v1=net_cls cgroup_no_v1=net_prio"

Even so, the cgroup v2 sub-hierarchy was not created. So I decided to place some prints in the code and try to understand what was going wrong by looking at the logs; I followed the Building Container Images guide (building the master branch).

For this specific commit: if the cgroupv2 root is not mounted, the function IsMountFS will return false, false, nil because of this check:

if st.Dev == pst.Dev {
	// parent has the same dev -- not a mount point
	return false, false, nil
}
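
For context, here is a minimal standalone sketch of that heuristic, not the actual Cilium implementation (the path in main is only an example): a directory whose device ID matches its parent's is not a mount point.

package main

import (
	"fmt"
	"path/filepath"

	"golang.org/x/sys/unix"
)

// isMountPoint reports whether dir is a mount point by comparing its device
// ID with that of its parent directory, mirroring the check quoted above.
func isMountPoint(dir string) (bool, error) {
	var st, pst unix.Stat_t
	if err := unix.Stat(dir, &st); err != nil {
		return false, err
	}
	if err := unix.Stat(filepath.Dir(dir), &pst); err != nil {
		return false, err
	}
	// Same device as the parent means dir is not a mount point.
	return st.Dev != pst.Dev, nil
}

func main() {
	// "/run/cilium/cgroupv2" is only an example path for illustration.
	mounted, err := isMountPoint("/run/cilium/cgroupv2")
	fmt.Println(mounted, err)
}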

But after a successful mount (after which the function does not return early), the code will check whether cgroupInstance is false; as stated in the commit, that value now reflects an earlier state, so we should not check it:

if !mounted {
	if err := mountCgroup(); err != nil {
		return err
	}
}
if !cgroupInstance {
	return fmt.Errorf("Mount in the custom directory %s has a different filesystem than cgroup2", cgroupRoot)
}
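
For illustration, this is a sketch of how the snippet above could look after the change being discussed, returning right after the mount instead of falling through to the stale cgroupInstance check (the actual diff may differ):

if !mounted {
	// After a successful mount, cgroupInstance still holds the pre-mount
	// state, so return here rather than falling through to the check below.
	return mountCgroup()
}
if !cgroupInstance {
	return fmt.Errorf("Mount in the custom directory %s has a different filesystem than cgroup2", cgroupRoot)
}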

I'm not sure if I'm missing something, but IMHO this is the first thing we need to fix in order to set up the sub-hierarchy. The other point seems to be related to the path format, defined in:

// To fix this, we need to attach the program to cgroup v2 root of each Kind node.
// Pods on a Kind node belong to a cgroup v2 which name format is the following:
//
// 0::/system.slice/docker-$(SOME_NODE_ID).scope/kubelet/kubepods/burstable/pod-$(SOME_POD_ID)
//

In my tests, cilium-agent always receives something like 0::/docker/b8701606d997dacd7729dcbc9fef30aa69dedfdb0c5a9f7f3f0590d37072a11d/system.slice/containerd.service when it reads /proc/self/cgroup, and that makes the following check fail:

if strings.HasPrefix(line, "0::") { // cgroup v2 starts with "0::"
	re := regexp.MustCompile(`^.*kubelet`)
	p := path.Join(cgroupRoot, strings.Split(line, "::")[1])
	match := re.FindString(p)
	if match == "" {
		return "", fmt.Errorf("cannot find \"kubelet\" in cgroup v2 path: %q", p)
	}
	cgroupPath = match
}
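
To show concretely why that check fails for the path above, here is a small standalone sketch; the cgroupRoot value and the second, kubelet-style path are assumptions for illustration:

package main

import (
	"fmt"
	"path"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`^.*kubelet`)
	cgroupRoot := "/run/cilium/cgroupv2" // assumed mount root, for illustration only

	// Path observed on my Kind nodes (from /proc/self/cgroup): no "kubelet" component.
	observed := "/docker/b8701606d997dacd7729dcbc9fef30aa69dedfdb0c5a9f7f3f0590d37072a11d/system.slice/containerd.service"
	// Hypothetical path containing "kubelet", which the regexp expects to find.
	expected := "/kubelet/kubepods/burstable/pod-example"

	fmt.Printf("%q\n", re.FindString(path.Join(cgroupRoot, observed))) // "" -> triggers the error above
	fmt.Printf("%q\n", re.FindString(path.Join(cgroupRoot, expected))) // ".../kubelet" -> match
}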

Do you have any pointers or suggestions?

@aditighag (Member) commented May 6, 2021

Hi Joao,

Thanks for your contribution.

But after a successful mount (after which the function does not return early), the code will check whether cgroupInstance is false; as stated in the commit, that value now reflects an earlier state, so we should not check it.

Yes, this does look like a bug to me. Having said that, I'm not super familiar with the code path to explain why we haven't hit this issue before.
The commit message can be improved to better reflect the problem instead of implementation details though.

For your 2nd question about cgroup path, I don't have a setup available to confirm your findings. I'll try to replicate your workflow.

@chkhalt (Author) commented May 6, 2021

Hi @aditighag,

The commit message can be improved to better reflect the problem instead of implementation details though.

Yes, indeed! I will add more details without getting into the code.
@joestringer, should I remove the Fixes: e8ca92b45469 ("cgroups: Add stubs for non-Linux platforms") message?

For your 2nd question about cgroup path, I don't have a setup available to confirm your findings. I'll try to replicate your workflow.

Thanks ;)

@brb (Member) commented May 6, 2021

In my tests, cilium-agent always receives something like 0::/docker/b8701606d997dacd7729dcbc9fef30aa69dedfdb0c5a9f7f3f0590d37072a11d/system.slice/containerd.service when it reads /proc/self/cgroup, and that makes the following check fail:

@chkhalt Is it cilium-agent running on Kind? If yes, which Kind version and configuration?

@chkhalt (Author) commented May 6, 2021

In my tests, cilium-agent always receives something like 0::/docker/b8701606d997dacd7729dcbc9fef30aa69dedfdb0c5a9f7f3f0590d37072a11d/system.slice/containerd.service when it reads /proc/self/cgroup, and that makes the following check fail:

@chkhalt Is it cilium-agent running on Kind? If yes, which Kind version and configuration?

@brb, exactly! I'm using kind v0.10.0 (kindest/node:v1.20.2), installed with GO111MODULE="on" go get sigs.k8s.io/kind@v0.10.0, and containerd.io 1.4.4.

I started the cluster with kind create cluster --config kind-config.yaml, which contains:

apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
networking:
  disableDefaultCNI: true

I thought it could be something related to the kindest/node version, so I tried versions v1.19 and v1.18 without success.

@brb (Member) commented May 6, 2021

@chkhalt Interesting. I'm running Kind v0.10.0 on Arch Linux without any problems. Could you exec into all the nodes and paste the /proc/self/cgroup output? Also, what is your host OS? Maybe you could also include an equivalent of docker info for containerd?

@chkhalt (Author) commented May 6, 2021

@brb, I think I get the point. The pattern used in the cgroup path seems to be related to the cgroup driver in use: it can be cgroupfs or systemd (some info about this). There are some underlying details I haven't understood yet, but the relevant point for now is that we have at least two patterns.

I made a simple comparison between the vagrant dev machine and an Arch Linux machine I've installed, and the differences between our environments seem to be as follows:

# cilium vagrant (ubuntu)
vagrant@k8s1:~/go/src/github.com/cilium/cilium$ curl --unix-socket /var/run/docker.sock http://localhost/v1.24/info 2>/dev/null | jq . | grep -i cgroup
  "CgroupDriver": "cgroupfs",
  "CgroupVersion": "1",

# arch linux
[vagrant@arch ~]$ curl --unix-socket /var/run/docker.sock http://localhost/v1.24/info 2>/dev/null | jq . | grep -i cgroup
  "CgroupDriver": "systemd",
  "CgroupVersion": "2",
    "cgroupns"

Btw, on the Arch Linux machine I have the following (after kind create cluster):

[vagrant@arch ~]$ kubectl exec -n kube-system kube-proxy-ft87g -- cat /proc/self/cgroup
0::/kubelet/kubepods/besteffort/poda9c0c0da-d46c-48d6-828c-f2a5ebff7098/5a6ba781e461ce41736b1754890ba1b2cf71fbd13d5da039d56baef76763f37a

What about addressing this point in another issue?

@joestringer (Member) commented May 6, 2021

I think we're getting somewhere; if I understand correctly, it's something like:

If you deploy Cilium in a kind environment on ____ distro, by default the systemd cgroups driver is used which does not mount the cgroupsv2 filesystem in the location that Cilium expects. When Cilium attempts to check or mount the cgroupv2 filesystem, this fails with errors in the log like _______________. This commit fixes the issue by ensuring that when Cilium cannot find an existing cgroupsv2 filesystem, it can successfully mount one.

The change itself LGTM, I think with some text similar to the above in the commit message, the motivation should be clear.

@joestringer (Member) commented

If there's something off about the assumptions that Cilium is making about cgroup path patterns, then yes, I would suggest that can be investigated/solved separately; ideally we can also improve the unit tests to ensure that the core logic can handle different patterns.

@brb (Member) commented May 7, 2021

What about addressing this point in another issue?

@chkhalt Thanks for the info. The problem is that on the vagrant VM you have Docker running with cgroup v1, which is not expected to work. Could you create a separate issue for that?

@chkhalt (Author) commented May 7, 2021

@joestringer I have changed the issue message. I appreciate any suggestions ;)

@brb, will the new issue be related to "create cgroupv2 sub-hierarchy for cgroupv1 driver"?

@maintainer-s-little-helper bot added this to Needs backport from master in 1.10.3 on Jul 7, 2021
@maintainer-s-little-helper bot added this to Needs backport from master in 1.9.9 on Jul 7, 2021
@aanm (Member) commented Jul 7, 2021

test-me-please

@joestringer (Member) commented

k8s-1.21-kernel-4.9 failures are hitting the issue fixed in #16767. A rebase would fix those, but I think they're safe to ignore given that everything else is passing and the functional change here is pretty minimal.

CI 3.0 timing worked out a bit poorly with the merge of #16631. Basically, when this code was run, it did not include the cilium-agent code changes from that PR, but it runs against the latest version of the GitHub Actions from that PR. What this means is that the CI sets up taints saying that Cilium is not ready (so pods cannot be deployed), and cilium-agent should remove those taints to allow pod deployment to proceed. However, the code to do that is not present in this PR.

I think this is good to just merge. If it broke cgroup mounting then we would see one of the socket LB tests in ginkgo fail out.

@joestringer changed the title from "Returns mountCgroup() when mounted is false" to "Improve logging when cgroupfs mount fails" on Jul 7, 2021
@aditighag merged commit e3f9c61 into cilium:master on Jul 7, 2021
@chkhalt deleted the pr/fix_issue_15997 branch July 8, 2021 02:35
@maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.10 in 1.10.3 on Jul 8, 2021
@maintainer-s-little-helper bot moved this from Backport pending to v1.10 to Backport done to v1.10 in 1.10.3 on Jul 15, 2021
@maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.9 on Jul 15, 2021
@maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.9 on Jul 19, 2021
Development

Successfully merging this pull request may close these issues.

pkg/cgroups/cgroups_linux: cgrpCheckOrMountLocation can return error after mounting