This repository has been archived by the owner on May 12, 2021. It is now read-only.

Constrain the host sandbox cgroup for devices and cpusets by default when sandbox-cgroup-only selected #2793

Merged
merged 4 commits into from Jul 16, 2020


egernst
Member

@egernst egernst commented Jun 24, 2020

This PR pulls in a package from Kubernetes for working with CPUSets.

It is used to calculate the union of the containers' CPUSets to create a sandbox-level CPUSet.
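
To make the idea concrete, here is a minimal sketch of the union calculation. It deliberately uses only the Go standard library as a stand-in for the Kubernetes cpuset package, so the helper names (parseCPUSet, formatCPUSet, unionCPUSets) are hypothetical and are not the functions added by this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseCPUSet parses a Linux cpuset list such as "0-2,7" into a set of CPU IDs.
// An empty string is treated as an empty set, not as an error.
func parseCPUSet(s string) (map[int]struct{}, error) {
	set := map[int]struct{}{}
	s = strings.TrimSpace(s)
	if s == "" {
		return set, nil
	}
	for _, part := range strings.Split(s, ",") {
		bounds := strings.SplitN(strings.TrimSpace(part), "-", 2)
		lo, err := strconv.Atoi(bounds[0])
		if err != nil {
			return nil, fmt.Errorf("invalid cpuset element %q", part)
		}
		hi := lo
		if len(bounds) == 2 {
			if hi, err = strconv.Atoi(bounds[1]); err != nil {
				return nil, fmt.Errorf("invalid cpuset element %q", part)
			}
		}
		if hi < lo {
			return nil, fmt.Errorf("invalid cpuset range %q", part)
		}
		for cpu := lo; cpu <= hi; cpu++ {
			set[cpu] = struct{}{}
		}
	}
	return set, nil
}

// formatCPUSet renders a set of CPU IDs back into the compact list form,
// collapsing consecutive IDs into ranges.
func formatCPUSet(set map[int]struct{}) string {
	cpus := make([]int, 0, len(set))
	for cpu := range set {
		cpus = append(cpus, cpu)
	}
	sort.Ints(cpus)
	var parts []string
	for i := 0; i < len(cpus); {
		j := i
		for j+1 < len(cpus) && cpus[j+1] == cpus[j]+1 {
			j++
		}
		if i == j {
			parts = append(parts, strconv.Itoa(cpus[i]))
		} else {
			parts = append(parts, fmt.Sprintf("%d-%d", cpus[i], cpus[j]))
		}
		i = j + 1
	}
	return strings.Join(parts, ",")
}

// unionCPUSets returns the sandbox-level cpuset: the union of every
// container's cpuset string.
func unionCPUSets(containerSets []string) (string, error) {
	union := map[int]struct{}{}
	for _, cs := range containerSets {
		set, err := parseCPUSet(cs)
		if err != nil {
			return "", err
		}
		for cpu := range set {
			union[cpu] = struct{}{}
		}
	}
	return formatCPUSet(union), nil
}

func main() {
	result, err := unionCPUSets([]string{"0-1", "", "1,3-4"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result) // prints "0-1,3-4"
}
```

A container with no cpuset contributes nothing to the union in this sketch; whether the real code treats an unset cpuset that way is not stated in this thread.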

@auto-comment

auto-comment bot commented Jun 24, 2020

Thank you for raising your pull request. Please note that the main development of Kata Containers has moved to the 2.0-dev branch of https://github.com/kata-containers/kata-containers repository. The kata-containers/runtime repository is kept for 1.x release maintenance. Please check twice if your change should go to the 2.0-dev branch directly.

If it is strongly required for adding the change to Kata Containers 1.x releases, please ping @kata-containers/runtime to assign a dedicated developer to be responsible for porting the change to 2.0-dev branch. Thanks!

@egernst egernst requested a review from a team as a code owner June 25, 2020 05:36
Contributor

@jodh-intel jodh-intel left a comment

Thanks @egernst. Please could you change the log calls to remove your name :)

virtcontainers/pkg/cgroups/manager.go (outdated)
virtcontainers/pkg/cgroups/manager.go (outdated)
}).Debug("EGERNST")
}
m.Lock()
cgroups.CpusetCpus = cpuset
Contributor

No validation on cpuset? Can it be blank? Can we check the expected format, etc?

Member Author

An empty string is normal/expected. Validation is done at the sandbox level via the CPUSets package, and if the format were somehow broken, I expect we'd just get an error when we try to 'apply' the cgroup (see the error handling there).
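
For reference, the kind of check being asked about could look roughly like the sketch below. This is an illustration only, not what the PR does: the manager type, its cpusetCpus field and setCPUSet are hypothetical names, and the regular expression only checks the shape of the list (it does not verify that a range's start is below its end). An empty string is accepted, matching the behaviour described above.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// cpusetListRE matches comma-separated CPU IDs or ranges, e.g. "0-3,8".
var cpusetListRE = regexp.MustCompile(`^\d+(-\d+)?(,\d+(-\d+)?)*$`)

// manager is a hypothetical, simplified stand-in for a cgroup manager.
type manager struct {
	sync.Mutex
	cpusetCpus string
}

// setCPUSet stores the cpuset after a shape check. An empty string is
// valid and means "no cpuset constraint".
func (m *manager) setCPUSet(cpuset string) error {
	if cpuset != "" && !cpusetListRE.MatchString(cpuset) {
		return fmt.Errorf("malformed cpuset %q", cpuset)
	}
	m.Lock()
	defer m.Unlock()
	m.cpusetCpus = cpuset
	return nil
}

func main() {
	m := &manager{}
	for _, candidate := range []string{"0-3,8", "", "0-,x"} {
		if err := m.setCPUSet(candidate); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Printf("accepted: %q\n", candidate)
	}
}
```

The trade-off is duplication: if the sandbox already validates via the CPUSets package, a second check here mostly buys an earlier, clearer error.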

@egernst egernst changed the title sandbox: add function to calcule sandbox level cpuset WIP/RFC: sandbox: add function to calcule sandbox level cpuset Jun 26, 2020
@egernst egernst force-pushed the add-cpuset-calc-func branch 3 times, most recently from 9d442d3 to 78ed479 Compare June 29, 2020 16:14
@egernst
Member Author

egernst commented Jun 29, 2020

@devimc @amshinde @jodh-intel @jcvenegas updated - PTAL. This is verified using cgroupfs. I will begin testing for systemd based group handling, but that will involve more cgroup handling updates outside the scope of 'fixing' the sandbox constraints.

devimc
devimc previously requested changes Jun 29, 2020
@devimc devimc left a comment

thanks @egernst - but I think that for security reasons the device cgroup is a must

virtcontainers/sandbox.go (outdated)
virtcontainers/pkg/cgroups/manager.go (outdated)
@egernst
Member Author

egernst commented Jun 29, 2020

(still WIP until I have tests running in k8s)

@amshinde amshinde self-requested a review June 29, 2020 21:23
@egernst egernst changed the title WIP/RFC: sandbox: add function to calcule sandbox level cpuset Constrain the host sandbox cgroup for devices and cpusets by default when sandbox-cgroup-only selected Jun 29, 2020
@egernst
Member Author

egernst commented Jun 29, 2020

Tested with containerd/k8s and with Docker CLI with cgroupfs. This should be about ready from my perspective.

@egernst
Member Author

egernst commented Jul 1, 2020

@jodh-intel @bergwolf @devimc PTAL.

@devimc devimc left a comment

thanks, I left some questions

virtcontainers/sandbox.go
resources = *spec.Linux.Resources
// engine by default. The exception is for devices whitelist as well as sandbox-level
// CPUSet.
resources.Devices = spec.Linux.Resources.Devices

Okay, I understand why cpu and memory are not being honoured, but what about blockIO?

Member Author

@egernst egernst Jul 1, 2020


There are a couple of issues open for this ability now, and there's work in progress on the K8S side to add this. Are you okay with holding off on this subsystem until that discussion (both on the Kata side, [1], and the k8s side) completes?

[1] - #2160
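
The selection discussed in this thread (carry over the devices whitelist and the sandbox-level CPUSet, leave cpu, memory and, for now, blockIO unconstrained) can be sketched as follows. The types are simplified, hypothetical stand-ins loosely shaped like the OCI runtime spec's Linux resources, and sandboxResources is an illustrative name rather than the function in virtcontainers/sandbox.go.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the OCI runtime spec types.
type deviceRule struct {
	Allow  bool
	Type   string
	Access string
}

type cpuResources struct {
	Quota *int64
	Cpus  string
}

type memoryResources struct {
	Limit *int64
}

type resources struct {
	Devices []deviceRule
	CPU     *cpuResources
	Memory  *memoryResources
}

type linuxSpec struct {
	Resources *resources
}

type spec struct {
	Linux *linuxSpec
}

// sandboxResources builds the host-side sandbox cgroup constraints.
// Only the devices whitelist and the sandbox-level cpuset are carried
// over; cpu quota and memory limits are intentionally dropped.
func sandboxResources(s *spec, sandboxCPUSet string) resources {
	out := resources{}
	// Linux and Resources are optional in this sketch, so guard before
	// dereferencing them.
	if s != nil && s.Linux != nil && s.Linux.Resources != nil {
		out.Devices = s.Linux.Resources.Devices
	}
	if sandboxCPUSet != "" {
		out.CPU = &cpuResources{Cpus: sandboxCPUSet}
	}
	return out
}

func main() {
	limit := int64(1 << 30)
	s := &spec{Linux: &linuxSpec{Resources: &resources{
		Devices: []deviceRule{{Allow: true, Type: "c", Access: "rwm"}},
		Memory:  &memoryResources{Limit: &limit},
	}}}
	out := sandboxResources(s, "0-3")
	fmt.Println(len(out.Devices), out.CPU.Cpus, out.Memory == nil)
}
```

Adding blockIO later would be a matter of copying one more field across once the Kata and k8s discussions settle on the expected behaviour.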

@egernst egernst force-pushed the add-cpuset-calc-func branch 3 times, most recently from d0cffb4 to 97b41ce Compare July 1, 2020 17:51
@egernst
Member Author

egernst commented Jul 1, 2020

Updated to address feedback. PTAL @devimc et al

@devimc

devimc commented Jul 1, 2020

CI is not happy:

Detected TravisCI Environment

Found 4 commits between commit 97b41cefa00bf41326262f076c93689852cf2fb1 and branch master

ERROR: Commit 63dab32080bc2dac749b080e55080e0239c545b8: Failed to find subsystem in subject: "add method to get resulting cpuset for a sandbox"

ERROR: checkcommits failed. See the document below for help on formatting commits for the project.

@egernst
Member Author

egernst commented Jul 8, 2020

/test

@egernst
Member Author

egernst commented Jul 8, 2020

@chavafg @GabyCT any insights into this failure on the openSUSE test? http://jenkins.katacontainers.io/blue/organizations/jenkins/kata-containers-runtime-opensuse-15-PR/detail/kata-containers-runtime-opensuse-15-PR/1219/pipeline#log-3049

I expect my changes to only be exercised when the sandbox-cgroup-only flag is set.

@egernst egernst mentioned this pull request Jul 8, 2020
2 tasks
@chavafg
Contributor

chavafg commented Jul 9, 2020

not really, have restarted it

@egernst
Member Author

egernst commented Jul 9, 2020

@devimc @amshinde still need reviews if you have time....

@amshinde
Member

amshinde commented Jul 9, 2020

@jodh-intel Can you take another look?

@amshinde
Member

@bergwolf @jcvenegas Can you take a look at this PR?

@jcvenegas
Member

It is looking good. It would be great to update the Kata docs to document what is expected around cpusets, and why, so we can point users to it.

@amshinde
Member

@chavafg @GabyCT I restarted the failing CI jobs, but still see some of them failing.
Are these expected to fail? Can you take a look?

@chavafg
Contributor

chavafg commented Jul 10, 2020

I took a look, and it seems that the opensuse CI is only failing with this PR.
The ARM failure does not seem to be related to this PR, /cc @Pennyzct
I am not sure about the VFIO CI, but it seems it could be related to this PR: the VFIO CI of the tests repo is passing and looks very stable. /cc @cmaf

Member

@bergwolf bergwolf left a comment

lgtm!

@bergwolf
Member

@chavafg The PR should only have impacts on cpuset/devices cgroups related tests.

@egernst
Member Author

egernst commented Jul 14, 2020

@chavafg if you agree, can you help get this merged (skip the travis)?

@chavafg
Contributor

chavafg commented Jul 14, 2020

So I looked at the configuration for the opensuse CI and I see that we use SandboxCgroupOnly = true. This seems to be the issue as we are using this flag in the VFIO CI which also fails.

This is the error I see in the kata-runtime logs of the CI job:

Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.8716997Z" level=debug msg="Request to hypervisor to update vCPUs" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 cpus-sandbox=1 name=kata-runtime pid=54761 sandbox=4xsQsD4p3c2dojkSwCg4 source=virtcontainers subsystem=sandbox
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.8806391Z" level=debug msg="Sandbox CPUs: 1" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 name=kata-runtime pid=54761 sandbox=4xsQsD4p3c2dojkSwCg4 source=virtcontainers subsystem=sandbox
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.890858Z" level=debug msg="Request to hypervisor to update memory" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 memory-sandbox-size-byte=2147483648 name=kata-runtime pid=54761 sandbox=4xsQsD4p3c2dojkSwCg4 source=virtcontainers subsystem=sandbox
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.9044327Z" level=info msg="{\"QMP\": {\"version\": {\"qemu\": {\"micro\": 0, \"minor\": 0, \"major\": 5}, \"package\": \"kata-static\"}, \"capabilities\": [\"oob\"]}}" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 name=kata-runtime pid=54761 source=virtcontainers subsystem=qmp
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.9125341Z" level=info msg="{\"execute\":\"qmp_capabilities\"}" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 name=kata-runtime pid=54761 source=virtcontainers subsystem=qmp
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.9223659Z" level=info msg="{\"return\": {}}" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 name=kata-runtime pid=54761 source=virtcontainers subsystem=qmp
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.9304178Z" level=debug msg="Sandbox memory size: 2048 MB" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 name=kata-runtime pid=54761 sandbox=4xsQsD4p3c2dojkSwCg4 source=virtcontainers subsystem=sandbox
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.937722Z" level=info msg="New client" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 name=kata-runtime pid=54761 proxy=54784 source=virtcontainers subsystem=kata_agent url="unix:///run/vc/sbs/4xsQsD4p3c2dojkSwCg4/proxy.sock"
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.9461859Z" level=debug msg="sending request" arch=amd64 command=run container=4xsQsD4p3c2dojkSwCg4 name=grpc.OnlineCPUMemRequest pid=54761 req= source=virtcontainers subsystem=kata_agent
Jul 10 18:33:53 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:53.9595868Z" level=error msg="fatal error" arch=amd64 name=kata-runtime panic="runtime error: invalid memory address or nil pointer dereference" pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.0764491Z" level=error msg="heap profile: 0: 0 [2: 576] @ heap/1048576" arch=amd64 name=kata-runtime pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.0992272Z" level=error msg="0: 0 [0: 0] @ 0x5563c04c360f 0x5563c04c445d 0x5563c04c01db 0x5563c04bdf53 0x5563c04bdabc 0x5563c04c13bf 0x5563c04c10de 0x5563c04bdf95 0x5563c04bdabc 0x5563c04c4159 0x5563c04c445d 0x5563c04c01db 0x5563c04bdf53 0x5563c04bdabc 0x5563c04c4159 0x5563c04c445d 0x5563c04c01db 0x5563c04bdf53 0x5563c04bdabc 0x5563c04c4159 0x5563c04c445d 0x5563c04c01db 0x5563c04bdf53 0x5563c04bdabc 0x5563c04c162f 0x5563c04bdecf 0x5563c04bdabc 0x5563c04c4159 0x5563c04c445d 0x5563c04c01db 0x5563c04bdf53 0x5563c04bdabc" arch=amd64 name=kata-runtime pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.1061465Z" level=error msg="#\t0x5563c04c360e\tencoding/json.typeFields+0xb2e\t\t/usr/local/go/src/encoding/json/encode.go:1176" arch=amd64 name=kata-runtime pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.1129271Z" level=error msg="#\t0x5563c04c445c\tencoding/json.cachedTypeFields+0xec\t/usr/local/go/src/encoding/json/encode.go:1278" arch=amd64 name=kata-runtime pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.121212Z" level=error msg="#\t0x5563c04c01da\tencoding/json.newStructEncoder+0x3a\t/usr/local/go/src/encoding/json/encode.go:674" arch=amd64 name=kata-runtime pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.1324095Z" level=error msg="#\t0x5563c04bdf52\tencoding/json.newTypeEncoder+0x372\t/usr/local/go/src/encoding/json/encode.go:425" arch=amd64 name=kata-runtime pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.1393119Z" level=error msg="#\t0x5563c04bdabb\tencoding/json.typeEncoder+0x19b\t\t/usr/local/go/src/encoding/json/encode.go:381" arch=amd64 name=kata-runtime pid=54761 source=runtime
Jul 10 18:33:54 opensuse15-azuread71a0 kata-runtime[54761]: time="2020-07-10T18:33:54.1466495Z" level=error msg="#\t0x5563c04c13be\tencoding/json.newArrayEncoder+0x4e\t/usr/local/go/src/encoding/json/encode.go:797" arch=amd64 name=kata-runtime pid=54761 source=runtime

You can see the full log by downloading the kata-runtime from the artifacts of the CI job: http://jenkins.katacontainers.io/job/kata-containers-runtime-opensuse-15-PR/1221/artifact/artifacts/kata-runtime_06.gz

You should be able to reproduce in your environment using this flag and running the functional tests from the tests repo:

sudo -E PATH=$PATH make functional

Contributor

@jodh-intel jodh-intel left a comment

Thanks @egernst.

lgtm

Eric Ernst added 4 commits July 15, 2020 22:16
Calculate sandbox's CPUSet as the union of each of the container's
CPUSets.

Signed-off-by: Eric Ernst <eric@amperecomputing.com>
added for calculating union of cpusets

Signed-off-by: Eric Ernst <eric@amperecomputing.com>
add function for applying a CPUset change to a cgroup

Signed-off-by: Eric Ernst <eric@amperecomputing.com>
Allow for constraining the cpuset as well as the devices-whitelist. Revert
sandbox constraints for cpu/memory, as they break the K8S use case. Can
re-add behind a non-default flag in the future.

The sandbox CPUSet should be updated every time a container is created,
updated, or removed.

To facilitate this without rewriting the 'non constrained cgroup'
handling, let's add to the Sandbox's cgroupsUpdate function.

Fixes: #2792

Signed-off-by: Eric Ernst <eric@amperecomputing.com>
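
As a rough illustration of the "update on every container create, update, or remove" behaviour described in the last commit message, the sketch below recomputes a sandbox-level set of CPUs whenever the container set changes. The sandbox type, its fields, and the method names are hypothetical and much simpler than the real Sandbox; CPUs are held as already-parsed integer slices to keep the example short.

```go
package main

import (
	"fmt"
	"sort"
)

// sandbox is a hypothetical, simplified stand-in for the runtime's Sandbox.
type sandbox struct {
	containers map[string][]int // container ID -> CPUs it is pinned to
	cpuset     []int            // sandbox-level CPUs: union over all containers
}

func newSandbox() *sandbox {
	return &sandbox{containers: map[string][]int{}}
}

// recomputeCPUSet rebuilds the sandbox-level cpuset as the sorted union
// of every container's CPUs.
func (s *sandbox) recomputeCPUSet() {
	seen := map[int]bool{}
	for _, cpus := range s.containers {
		for _, cpu := range cpus {
			seen[cpu] = true
		}
	}
	union := make([]int, 0, len(seen))
	for cpu := range seen {
		union = append(union, cpu)
	}
	sort.Ints(union)
	s.cpuset = union
}

// setContainer covers both container creation and update.
func (s *sandbox) setContainer(id string, cpus []int) {
	s.containers[id] = cpus
	s.recomputeCPUSet()
}

// removeContainer drops a container and shrinks the sandbox cpuset
// if no other container still needs its CPUs.
func (s *sandbox) removeContainer(id string) {
	delete(s.containers, id)
	s.recomputeCPUSet()
}

func main() {
	s := newSandbox()
	s.setContainer("a", []int{0, 1})
	s.setContainer("b", []int{1, 4})
	fmt.Println(s.cpuset) // prints "[0 1 4]"
	s.removeContainer("a")
	fmt.Println(s.cpuset) // prints "[1 4]"
}
```

Recomputing the union from scratch on each change keeps removal simple: there is no need to track which container contributed which CPU.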
@egernst
Member Author

egernst commented Jul 16, 2020

addressed issue.

/test

@egernst
Member Author

egernst commented Jul 16, 2020

@chavafg PTAL?

@chavafg
Contributor

chavafg commented Jul 16, 2020

Thanks @egernst, ARM CI failure is unrelated and SLES CI is also having unrelated issues. We can merge this.

@ariel-adam

Fixes #2811

Labels
port-to-2.0 PRs that need to be ported to kata 2.0-dev branch

8 participants