perf/core: Share an event with multiple cgroups
As we can run many jobs (in containers) on a big machine, we want to
measure each job's performance during the run. To do that, a
perf_event can be associated with a cgroup so that it measures only
that cgroup.
However, such cgroup events need to be opened separately, which causes
significant overhead in event multiplexing during context switches
as well as resource consumption such as file descriptors and memory
footprint.
As a cgroup event is basically a cpu event, we can share a single cpu
event for multiple cgroups. All we need is a separate counter (and
two timing variables) for each cgroup. I added a hash table to map
from cgroup id to the attached cgroups.
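
A minimal user-space sketch of the per-cgroup state described above,
not the kernel implementation. All names (cgroup_node, cgroup_table,
cgroup_attach, cgroup_find) and the bucket count are hypothetical, and
the two timing variables are assumed to be enabled and running time.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_BUCKETS 16	/* hypothetical table size (2^4 buckets) */

/* Hypothetical per-cgroup state: one counter plus two timing values. */
struct cgroup_node {
	uint64_t id;			/* cgroup id, used as the hash key */
	uint64_t count;			/* events attributed to this cgroup */
	uint64_t time_enabled;		/* assumed meaning of the first timing */
	uint64_t time_running;		/* assumed meaning of the second timing */
	struct cgroup_node *next;	/* chain within one bucket */
};

struct cgroup_table {
	struct cgroup_node *buckets[NR_BUCKETS];
};

/* Multiplicative hash; the top 4 bits select one of the 16 buckets. */
static unsigned int hash_id(uint64_t id)
{
	return (unsigned int)((id * 0x9E3779B97F4A7C15ULL) >> 60);
}

/* Return the node for a cgroup id, or NULL if it was never attached. */
static struct cgroup_node *cgroup_find(struct cgroup_table *t, uint64_t id)
{
	struct cgroup_node *n;

	assert(t != NULL);
	for (n = t->buckets[hash_id(id)]; n; n = n->next)
		if (n->id == id)
			return n;
	return NULL;
}

/* Attach a cgroup id; attaching the same id twice returns the same node. */
static struct cgroup_node *cgroup_attach(struct cgroup_table *t, uint64_t id)
{
	struct cgroup_node *n = cgroup_find(t, id);
	unsigned int b = hash_id(id);

	if (n)
		return n;
	n = calloc(1, sizeof(*n));
	if (!n)
		return NULL;
	n->id = id;
	n->next = t->buckets[b];
	t->buckets[b] = n;
	return n;
}
```

The lookup runs on the context switch path, so a hash keyed by cgroup
id keeps it cheap even when many cgroups share one event.
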
With this change, the cpu event needs to calculate a delta of event
counter values when the cgroups of the current and the next task are
different, and it attributes the delta to the current task's cgroup.
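
The delta attribution can be sketched in user space as follows. This
is an illustration under stated assumptions, not the patch code: the
names (shared_event, cgroup_counter, cgroup_switch) are hypothetical,
a flat array stands in for the hash table, and counts that accrue
while an unattached cgroup runs are assumed to be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cgroup counter; a flat array stands in for the hash. */
struct cgroup_counter {
	uint64_t id;
	uint64_t count;
};

struct shared_event {
	uint64_t last_raw;	/* raw counter value at the last attribution */
	struct cgroup_counter *cgrps;
	size_t nr_cgrps;
};

static struct cgroup_counter *find_cgroup(struct shared_event *ev, uint64_t id)
{
	size_t i;

	for (i = 0; i < ev->nr_cgrps; i++)
		if (ev->cgrps[i].id == id)
			return &ev->cgrps[i];
	return NULL;
}

/* Called on a task switch with the current raw value of the cpu event. */
static void cgroup_switch(struct shared_event *ev, uint64_t raw_now,
			  uint64_t cur_id, uint64_t next_id)
{
	struct cgroup_counter *cur;

	assert(ev != NULL);
	if (cur_id == next_id)
		return;		/* same cgroup: keep accumulating */

	cur = find_cgroup(ev, cur_id);
	if (cur)
		cur->count += raw_now - ev->last_raw;
	/* Assumption: counts from cgroups that are not attached are dropped. */
	ev->last_raw = raw_now;
}
```

For example, starting from a raw value of 100, a switch from cgroup 1
to cgroup 2 at raw value 180 charges 80 to cgroup 1; a later switch
away from cgroup 2 at raw value 200 charges 20 to cgroup 2.
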
This patch adds two new ioctl commands to perf_event for light-weight
cgroup event counting (i.e. perf stat).
* PERF_EVENT_IOC_ATTACH_CGROUP - it takes a buffer consisting of a
64-bit array to attach the given cgroups. The first element is the
number of cgroups in the buffer, and the rest is a list of cgroup
ids whose cgroup info is added to the given event.
* PERF_EVENT_IOC_READ_CGROUP - it takes a buffer consisting of a 64-bit
array to get the event counter values. The first element is the size
of the array in bytes, and the second element is the cgroup id to
read. The rest is used to store the counter value and timings.
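
The two buffer layouts above could be prepared in user space roughly
like this. The helper names (build_attach_buf, init_read_buf) are
hypothetical, and the read buffer is assumed to hold three result
slots (the counter value followed by the two timings); the order and
number of result slots are not spelled out here. The ioctl calls
themselves are left out, since the command numbers come from this
patch's uapi header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * ATTACH buffer: [0] = number of cgroups, [1..nr] = cgroup ids.
 * The caller frees the returned buffer.
 */
static uint64_t *build_attach_buf(const uint64_t *ids, uint64_t nr)
{
	uint64_t *buf;

	assert(ids != NULL && nr > 0);
	buf = malloc((size_t)(nr + 1) * sizeof(*buf));
	if (!buf)
		return NULL;
	buf[0] = nr;
	memcpy(&buf[1], ids, (size_t)nr * sizeof(*buf));
	return buf;
}

/* Assumed READ layout: size, cgroup id, then count and two timings. */
#define READ_BUF_ELEMS 5

/* READ buffer: [0] = array size in bytes, [1] = cgroup id, rest zeroed. */
static void init_read_buf(uint64_t *buf, uint64_t cgrp_id)
{
	assert(buf != NULL);
	memset(buf, 0, READ_BUF_ELEMS * sizeof(*buf));
	buf[0] = READ_BUF_ELEMS * sizeof(*buf);
	buf[1] = cgrp_id;
}
```

With a buffer prepared this way, the expected usage is presumably
ioctl(fd, PERF_EVENT_IOC_ATTACH_CGROUP, buf) once per event, followed
by one PERF_EVENT_IOC_READ_CGROUP call per cgroup to be read.
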
This attaches all cgroups in a single syscall. I deliberately didn't
add a DETACH command to keep the implementation simple. The
attached cgroup nodes are deleted when the file descriptor of the
perf_event is closed.
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>