update stats.go,Update "docker stats " calculations #13627

Closed

Liuyanglong

bug fix #13626

While reading the source code for "docker stats", I found that the current calculation is:
cpuPercent = (cpuDelta / systemDelta) * float64(len(v.CpuStats.CpuUsage.PercpuUsage))
https://github.com/docker/docker/blob/0d445685b8d628a938790e50517f3fb949b300e0/api/client/stats.go#L199

I do not understand why the result is multiplied by float64(len(v.CpuStats.CpuUsage.PercpuUsage)); (cpuDelta / systemDelta) on its own looks correct.

cpuDelta = float64(v.CpuStats.CpuUsage.TotalUsage - previousCPU)
systemDelta = float64(v.CpuStats.SystemUsage - previousSystem)

So total CPU delta / total system delta is already correct and does not need to be multiplied by the number of CPU cores.

@GordonTheTurtle

Please sign your commits following these rules:
https://github.com/docker/docker/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:

$ git clone -b "Liuyanglong-patch-1" git@github.com:Liuyanglong/docker.git somewhere
$ cd somewhere
$ git commit --amend -s --no-edit
$ git push -f

Amending updates the existing PR. You DO NOT need to open a new one.

@Liuyanglong Liuyanglong changed the title update stats.go update stats.go,Update "docker stats " calculations Jun 1, 2015
@coolljt0725
Contributor

I think cpuPercent = (cpuDelta / systemDelta) * float64(len(v.CpuStats.CpuUsage.PercpuUsage)) is right: cpuDelta is the container's CPU time summed over all cores, and systemDelta is the host's CPU time summed over all cores, so (cpuDelta / systemDelta) is the average CPU usage per core. It therefore needs to be multiplied by the number of CPU cores to get the total CPU usage.
So I think this should be closed. Ping @thaJeztah

@thaJeztah
Member

total time consumed across all cores, so (cpuDelta / systemDelta) is the average CPU usage per core, so it needs to be multiplied by the number of CPU cores to calculate the total CPU usage.

Agreed; for example, on a 4-core system, cpu usage can be anywhere between 0 and 400%, so it has to be multiplied by the number of cores.

I'm going to close this PR, because I don't think this change is correct, but feel free to comment if you think I closed this PR by mistake.
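
To make the 4-core example concrete, here is a minimal Go sketch of the calculation under discussion. It is an illustration, not the actual Docker code: the function name `cpuPercent`, its parameters, and the sample numbers are hypothetical, and the trailing `* 100.0` is an assumption added so the result reads as a percentage (the expression quoted in this thread stops at the core-count multiplier).

```go
package main

import "fmt"

// cpuPercent is a hypothetical helper mirroring the formula discussed here.
//   cpuDelta:    container CPU time used between two samples (ns, summed over all cores)
//   systemDelta: host CPU time accumulated between the same samples (ns, summed over all cores)
//   numCPUs:     number of cores (len(PercpuUsage) in the quoted expression)
func cpuPercent(cpuDelta, systemDelta float64, numCPUs int) float64 {
	if cpuDelta <= 0 || systemDelta <= 0 {
		return 0
	}
	// cpuDelta/systemDelta is the share of the whole host's capacity;
	// multiplying by numCPUs rescales it so that 100% means one full core.
	// The * 100.0 is an assumption (see the lead-in).
	return (cpuDelta / systemDelta) * float64(numCPUs) * 100.0
}

func main() {
	const second = 1e9 // nanoseconds

	// Made-up sample: 4 cores over a 1 s interval, so the host accumulates 4 s of CPU time.
	// The container kept two cores fully busy, i.e. it used 2 s of CPU time.
	fmt.Printf("%.1f%%\n", cpuPercent(2*second, 4*second, 4)) // 200.0%

	// Without the multiplier the same sample reads as 50% of the whole host.
	fmt.Printf("%.1f%%\n", (2*second)/(4*second)*100.0) // 50.0%
}
```

Both numbers describe the same load; the multiplier only changes the scale from "share of the whole host" to "100% per core", which is why values up to 400% are possible on four cores.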

@thaJeztah thaJeztah closed this Jun 1, 2015
@Liuyanglong Liuyanglong deleted the Liuyanglong-patch-1 branch June 2, 2015 01:31
@HuKeping
Contributor

Do we need to add this to the docs? I've been asked many times about Docker reporting CPU usage above 100%.

@thaJeztah
Member

@HuKeping I think it's fairly standard practice to report > 100% for multiple CPUs, but I'm not against adding it to the docs, so feel free to create a PR

@neeleshkorade

@thaJeztah I have a couple of follow-up questions on this. We are using cpuset-cpus with docker. To get CPU utilization, would the multiplier float64(len(v.CpuStats.CpuUsage.PercpuUsage)) as used above still hold in our case?

Another question I have is that even though we are using cpuset-cpus, in the cgroups files, we see more than the set number of CPUs having non-zero usage. For example, on a host of 24 cores and (cpuset-cpus=1,2), more than 2 cores show usage > 0. Why is that so? Shouldn't only two (specific) cores be used for the container in this case?

@chengyli

chengyli commented Jul 7, 2018

Sorry to ask a question on this closed PR, but I have a question about the algorithm used to calculate the CPU usage of a container.

Since cpuacct.usage is defined as CPU time in nanoseconds, why not calculate the container CPU usage as cpuPercent = (cpuDelta / nanoSecondsPerSecond) * 100.0? Why must SystemUsage be read from /proc/stat?

/sys/fs/cgroup/cpuacct.usage gives the CPU time (in nanoseconds) obtained
by this group which is essentially the CPU time obtained by all the tasks
in the system.

@wenlxie
Contributor

wenlxie commented Sep 17, 2019

I am also confused about why SystemUsage needs to be read from /proc/stat.
What's the difference between SystemUsage and DeltaTime * cores?
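
One way to compare the two approaches asked about above is the following Go sketch. It rests on a stated assumption: that the host counter includes idle time, so that over an interval it advances by roughly the elapsed wall-clock time multiplied by the number of cores. The function names and sample values are hypothetical, and the `* 100.0` scaling is assumed as well.

```go
package main

import "fmt"

// percentFromSystem uses the host counter delta, as in the formula from this thread.
func percentFromSystem(cpuDelta, systemDelta float64, numCPUs int) float64 {
	return (cpuDelta / systemDelta) * float64(numCPUs) * 100.0
}

// percentFromWallClock divides by the elapsed wall-clock time between samples instead.
func percentFromWallClock(cpuDelta, elapsedNanos float64) float64 {
	return (cpuDelta / elapsedNanos) * 100.0
}

func main() {
	const numCPUs = 8
	elapsed := 2e9 // 2 s between the two samples, in nanoseconds

	// Assumption: the host counter covers busy and idle time on every core,
	// so it advances by elapsed * numCPUs.
	systemDelta := elapsed * numCPUs

	cpuDelta := 3e9 // the container used 3 s of CPU time in that interval

	fmt.Println(percentFromSystem(cpuDelta, systemDelta, numCPUs)) // 150
	fmt.Println(percentFromWallClock(cpuDelta, elapsed))           // 150
}
```

Under that assumption the two formulas agree. Note that dividing by a fixed nanoSecondsPerSecond, rather than by the measured interval, is only equivalent when the samples are exactly one second apart.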

@BLasan

BLasan commented Sep 23, 2021

@thaJeztah Hi, when calculating CPU statistics, which file is the system CPU usage read from? Is it just the cpu value for the system in the /proc/stat file?

@thaJeztah
Member

Stats are returned by the containerd API;

cs, err := daemon.containerd.Stats(context.Background(), c.ID)

m, err := p.(containerd.Task).Metrics(ctx)
if err != nil {
	return nil, err
}
v, err := typeurl.UnmarshalAny(m.Data)
if err != nil {
	return nil, err
}
return libcontainerdtypes.InterfaceToStats(m.Timestamp, v), nil

This uses the cgroups module to collect the stats (it contains implementations for cgroups v1 and v2); https://github.com/containerd/cgroups

I don't know exactly what is used there off the top of my head, but perhaps that code allows you to find the details

@BLasan

BLasan commented Sep 23, 2021

Thanks! I'll go through the code. I hope the same approach is used to monitor network I/O as well.

@thaJeztah
Member

There's some code in the stats collector for networking;

moby/daemon/stats.go

Lines 156 to 158 in 7b9275c

if stats.Networks, err = daemon.getNetworkStats(container); err != nil {
	return nil, err
}

Oh, I realised there's some code related to system CPU in the collector code as well, in case it's relevant to your question;

systemUsage, err := s.getSystemCPUUsage()

@BLasan

BLasan commented Sep 23, 2021

Yeah, but the question is which file getSystemCPUUsage reads. Does it mean the sum of all CPU time on the system (user + system + idle + irq + softirq, etc.)? By the way, thanks for pointing out these lines of code. I will inspect them and try to figure it out.

@segaura

segaura commented Sep 27, 2022

Yeah, but the question is which file getSystemCPUUsage reads. Does it mean the sum of all CPU time on the system (user + system + idle + irq + softirq, etc.)? By the way, thanks for pointing out these lines of code. I will inspect them and try to figure it out.

Looking at

https://github.com/moby/moby/blob/2b70006e3bfa492b8641ff443493983d832955f4/daemon/stats/collector_unix.go

I see this declaration

// getSystemCPUUsage returns the host system's cpu usage in
// nanoseconds. An error is returned if the format of the underlying
// file does not match.
//
// Uses /proc/stat defined by POSIX. Looks for the cpu
// statistics line and then sums up the first seven fields
// provided. See `man 5 proc` for details on specific field
// information.
func (s *Collector) getSystemCPUUsage() (uint64, error) {

and the sum of the first seven /proc/stat columns (user, nice, system, idle, iowait, irq, softirq) is exactly what you guessed: it covers user, kernel, idle, and every other kind of time.
Am I wrong?
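
For illustration, the summation described in that comment could be sketched in Go as follows. This is not the moby implementation: `systemCPUNanos` is a hypothetical helper that takes the file contents as a string, the sample input is made up, and the value of 100 clock ticks per second is an assumption (the real value depends on the system).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

const nanoSecondsPerSecond = 1e9

// systemCPUNanos is a hypothetical helper: it finds the aggregate "cpu" line in
// /proc/stat-style text, sums its first seven numeric fields (clock ticks), and
// converts the total to nanoseconds.
func systemCPUNanos(procStat string, ticksPerSecond uint64) (uint64, error) {
	for _, line := range strings.Split(procStat, "\n") {
		fields := strings.Fields(line)
		// Skip blank lines and per-core lines such as "cpu0".
		if len(fields) == 0 || fields[0] != "cpu" {
			continue
		}
		if len(fields) < 8 {
			return 0, fmt.Errorf("expected at least 7 cpu fields, got %d", len(fields)-1)
		}
		var ticks uint64
		for _, f := range fields[1:8] {
			v, err := strconv.ParseUint(f, 10, 64)
			if err != nil {
				return 0, fmt.Errorf("bad cpu field %q: %w", f, err)
			}
			ticks += v
		}
		return ticks * nanoSecondsPerSecond / ticksPerSecond, nil
	}
	return 0, fmt.Errorf("no aggregate cpu line found")
}

func main() {
	// Made-up sample in the /proc/stat layout.
	sample := "cpu  100 0 50 800 20 10 20 0 0 0\ncpu0 100 0 50 800 20 10 20 0 0 0\n"

	// Assumption: 100 clock ticks per second.
	ns, err := systemCPUNanos(sample, 100)
	if err != nil {
		panic(err)
	}
	fmt.Println(ns) // 1000 ticks at 100 ticks/s = 10 s = 10000000000 ns
}
```

Because idle time is among the summed fields, the result grows with elapsed time on every core whether or not the host is busy.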
