
libcontainer: cgroups: add intel_rdt support in runc #447

Closed
wants to merge 2 commits

Conversation

xiaochenshen
Contributor

This PR fixes issue #433 (Proposal: Intel RDT/CAT cgroup support in runc/libcontainer)

Patch v2:
Rebased the code for "unified config file" according to #284.
Commit 273cbbb updates the specs for compiling; it is NOT necessary for this pull request
once opencontainers/runtime-spec#267 is merged.

Commit 9808367:
This PR fixes issue #433

Patch v1:
This version is for code review only, because the kernel patch is not upstream yet and the
specs change is under discussion in opencontainers/runtime-spec#267.

Commit b328258 updates the specs for compiling; it is NOT necessary for this pull request
once opencontainers/runtime-spec#267 is merged. I am not sure whether this will break the Jenkins build.

Commit febaf82:
This PR fixes issue #433

About Intel RDT/CAT feature:
Intel platforms with new Xeon CPUs support Resource Director Technology (RDT).
Intel Cache Allocation Technology (CAT) is a sub-feature of RDT. Currently the L3
cache is the only resource supported in RDT.

This feature provides a way for software to restrict cache allocation to a
defined 'subset' of the L3 cache, which may overlap with other 'subsets'.
The different subsets are identified by class of service (CLOS), and each CLOS
has a capacity bitmask (CBM).

More information can be found in section 17.16 of the Intel Software Developer's
Manual.

About intel_rdt cgroup:
Linux kernel 4.6 (or later) will introduce a new cgroup subsystem, 'intel_rdt',
with the kernel config CONFIG_INTEL_RDT.

The 'intel_rdt' cgroup manages L3 cache allocation. It has a file 'l3_cbm'
which represents the L3 cache capacity bitmask (CBM). The CBM must have
only contiguous bits set, and the number of bits set must be no more than the
maximum. The maximum number of bits in the CBM varies among supported Intel
platforms. The tasks belonging to a cgroup get to fill the portion of the L3
cache represented by the CBM. For example, if the maximum number of CBM bits
is 10 and the L3 cache size is 10 MB, each bit represents 1 MB of L3 cache
capacity.
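The per-bit arithmetic above can be sketched in Go. This is an illustrative snippet only, not runc code; cacheShareMB is a made-up helper name:

```go
package main

import (
	"fmt"
	"math/bits"
)

// cacheShareMB returns how many MB of L3 cache a given CBM grants,
// assuming each of maxBits CBM bits represents an equal slice of the cache.
func cacheShareMB(cbm uint64, maxBits int, cacheSizeMB float64) float64 {
	return cacheSizeMB * float64(bits.OnesCount64(cbm)) / float64(maxBits)
}

func main() {
	// Example from the text: max bits = 10, L3 size = 10 MB,
	// so each set bit represents 1 MB of capacity.
	fmt.Println(cacheShareMB(0x3, 10, 10)) // 2 bits of a 10-bit CBM -> 2 MB
}
```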

The root cgroup always has all the bits set in l3_cbm. Users can create more
cgroups with the mkdir syscall. By default, child cgroups inherit the CBM from
the parent. Users can change the CBM, specified in hex, for each cgroup.

For more information about intel_rdt cgroup:
https://lkml.org/lkml/2015/12/17/574

An example:
Root cgroup: intel_rdt.l3_cbm == 0xfffff, the max bits of CBM is 20
L3 cache size: 55 MB
This assigns 11 MB (1/5) of L3 cache to the child group:
$ /bin/echo 0xf > intel_rdt.l3_cbm
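The contiguous-bits rule described above can be checked with a small Go function. This is an illustrative sketch under the constraints stated in the text; isValidCBM is a made-up name, not the actual kernel or runc check:

```go
package main

import (
	"fmt"
	"math/bits"
)

// isValidCBM reports whether cbm is a non-zero run of contiguous set bits
// that fits within maxBits -- the constraint described for l3_cbm writes.
func isValidCBM(cbm uint64, maxBits int) bool {
	if cbm == 0 || bits.Len64(cbm) > maxBits {
		return false
	}
	// Strip trailing zeros; a contiguous mask then becomes 2^n - 1,
	// so adding 1 yields a power of two (mask & (mask+1) == 0).
	shifted := cbm >> uint(bits.TrailingZeros64(cbm))
	return shifted&(shifted+1) == 0
}

func main() {
	fmt.Println(isValidCBM(0xf, 20))     // true: 4 contiguous bits
	fmt.Println(isValidCBM(0x5, 20))     // false: bits are not contiguous
	fmt.Println(isValidCBM(0xff000, 20)) // true: contiguous, within 20 bits
}
```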

Signed-off-by: Xiaochen Shen xiaochen.shen@intel.com

@xiaochenshen
Contributor Author

I just ran some tests based on the native Cgroupfs in runc. I haven't tried SystemdCgroups.

If my understanding is correct, runc uses Cgroupfs by default with LinuxFactory.
Does anyone know how to test systemd-based cgroup functions in runc? Thank you.

@hqhq
Contributor

hqhq commented Dec 25, 2015

@xiaochenshen Currently you can only test systemd-cgroup through Docker.

@xiaochenshen
Contributor Author

hqhq commented
@xiaochenshen Currently you can only test systemd-cgroup through Docker.

@hqhq
My test bed: CentOS 7.2 + 4.3 kernel (with intel_rdt patches). I have tried:

  • Based on docker master branch.
  • Added the runc patches in this PR.
  • Added some intel_rdt cgroup hooks for docker run (the intel_rdt patch for Docker is WIP)

Both Systemd-Cgroup and Cgroupfs work for the intel_rdt cgroup. The basic test
results are as expected:

Systemd Cgroup:
$ docker daemon --exec-opt native.cgroupdriver=systemd
$ docker run --intelrdt-l3cbm=0xff -it busybox
$ cat /sys/fs/cgroup/intel_rdt/system.slice/docker-b6c68342c3551868385ee224a655c837ea9f4a89702c2e1ad935b5476d9c7b01.scope/intel_rdt.l3_cbm
000000ff

Cgroupfs:
$ docker daemon --exec-opt native.cgroupdriver=cgroupfs
$ docker run --intelrdt-l3cbm=0xff -it busybox
$ cat /sys/fs/cgroup/intel_rdt/docker/0c4961b3904d4d6a160f32913da63b126048a183366bfd116ec337ab9e50af06/intel_rdt.l3_cbm
000000ff

@LK4D4
Contributor

LK4D4 commented Jan 15, 2016

@xiaochenshen You shouldn't change vendored files (in the Godeps directory). The code for cgroups itself looks good. We can merge it in libcontainer without runc if you don't want to wait for the merge in the specs.

@xiaochenshen
Contributor Author

@xiaochenshen You shouldn't change vendored files (in the Godeps directory). The code for cgroups itself looks good. We can merge it in libcontainer without runc if you don't want to wait for the merge in the specs.

@LK4D4 The change in the Godeps directory is only for compiling without the merge in the specs.
Thank you for pointing out that we can merge only the libcontainer change. One more thing, as discussed in opencontainers/runtime-spec#267: it makes sense to wait for the kernel patch to be merged upstream, in case of any changes in the kernel.

@crosbymichael crosbymichael modified the milestone: 0.0.9 Feb 10, 2016
@crosbymichael
Member

@xiaochenshen can you rebase this now that we have the spec updated?

This patch is not necessary if this pull request is merged:
opencontainers/runtime-spec#267

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
This PR fixes issue opencontainers#433

@xiaochenshen
Contributor Author

@crosbymichael @hqhq
I have rebased the code for "unified config file" according to #284.
Current commits:
273cbbb
9808367

@hqhq
Contributor

hqhq commented Feb 14, 2016

This can't be merged if it depends on changes that have not landed in the specs, but you can add intel_rdt support to libcontainer easily without touching the specs and the runc part.

And IIUC, runc is an OCI-compliant implementation which can also support features outside the specs, so this could also be added to runC without specs changes. You just can't do it easily with the current runC implementation, which only takes configs from the specs now, but that could be changed IMO.

@crosbymichael crosbymichael modified the milestones: 0.0.9, 0.1.0 Feb 18, 2016
@cyphar cyphar removed this from the 0.1.0 milestone May 9, 2016
@xiaochenshen
Contributor Author

This PR can be closed, since the new PR #1198 is open.

@LK4D4
Contributor

LK4D4 commented Nov 21, 2016

@xiaochenshen thanks!

@LK4D4 LK4D4 closed this Nov 21, 2016
6 participants