Invalid CPU-related output (CPU usage in the tens of thousands percent) #8

Closed
RaduDumi opened this issue Oct 28, 2013 · 1 comment

@RaduDumi

Hi,

I've run this command:

pidstat -h -r -u -l -C java 1 >> pidstat.txt

The file pidstat.txt is about 214 KB (about 1600 lines) and starts with:

Linux 2.6.32-358.11.1.el6.x86_64 (hadoop-164)   10/28/2013      _x86_64_        (2 CPU)

There are some timestamps for which the sum of CPU usage for all processes is slightly over 200%. (Maybe it has something to do with 4.4 in http://sebastien.godard.pagesperso-orange.fr/faq.html#pidstat)

But there is one timestamp for which the sum of CPU usage is way over 200%. Somewhere in the middle of the file I've found this:

#      Time       PID    %usr %system  %guest    %CPU   CPU  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
 1382954870      9070    0.00    0.00    0.00    0.00     1      2.00      0.00 2760612 142676   3.64  /usr/java/default/bin/java -Dproc_namenode -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -Dcom.sun.m
 1382954870      9220    0.00    0.00    0.00    0.00     1      2.00      0.00 2716832 222168   5.66  /usr/java/default/bin/java -Dproc_secondarynamenode -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -D
 1382954870      9313    1.00    1.00    0.00    2.00     1      0.00      0.00 2780532 153112   3.90  /usr/java/default/bin/java -Dproc_jobtracker -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -Dcom.sun
 1382954870     16199    0.00    0.00    0.00    0.00     1      3.00      0.00 1595236  91828   2.34  /usr/java/default//bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/home/mmmmm/hhh

#      Time       PID    %usr %system  %guest    %CPU   CPU  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
 1382954871      9070    1.00    1.00    0.00    2.00     1     35.00      0.00 2760612 142740   3.64  /usr/java/default/bin/java -Dproc_namenode -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -Dcom.sun.m

#      Time       PID    %usr %system  %guest    %CPU   CPU  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
 1382954872      9220    1.00    0.00    0.00    1.00     1      0.00      0.00 2716832 222168   5.66  /usr/java/default/bin/java -Dproc_secondarynamenode -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -D
 1382954872     16199 33162.00 9627.00    0.00 42789.00     1  62477.00      0.00 1595236  91828   2.34  /usr/java/default//bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/home/mmmmm/hhh
 1382954872     16267 68875.00 10469.00    0.00 79344.00     0 253793.00      0.00 1662464 115128   2.93  /usr/java/default//bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/home/mmmmm/hhh

#      Time       PID    %usr %system  %guest    %CPU   CPU  minflt/s  majflt/s     VSZ    RSS   %MEM  Command

#      Time       PID    %usr %system  %guest    %CPU   CPU  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
 1382954874      9313    1.00    0.00    0.00    1.00     1      0.00      0.00 2780532 153112   3.90  /usr/java/default/bin/java -Dproc_jobtracker -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -Dcom.sun

#      Time       PID    %usr %system  %guest    %CPU   CPU  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
 1382954875      9070    1.00    0.00    0.00    1.00     1      0.00      0.00 2760612 142740   3.64  /usr/java/default/bin/java -Dproc_namenode -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -Dcom.sun.m
 1382954875      9313    0.00    0.00    0.00    0.00     1      2.00      0.00 2780532 153112   3.90  /usr/java/default/bin/java -Dproc_jobtracker -Xmx2048m -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote -Dcom.sun
 1382954875      9481    1.00    0.00    0.00    1.00     1      2.00      0.00 2184424  77812   1.98  /usr/java/jdk1.7.0_21/bin/java -Dquest7.cid=/opt/aaaaaaa/mmmmmmmmmm -Duser.timezone= -Dquest7.root=/opt/aaaaaaa/mmmmm -classpat
 1382954875     16267    1.00    0.00    0.00    1.00     0      3.00      0.00 1662464 115128   2.93  /usr/java/default//bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -XX:+UseConcMarkSweepGC -Dhbase.log.dir=/home/mmmmm/hhh

The CPU-related values shown for processes 16199 and 16267 at timestamp 1382954872 are way out of range (CPU usage 42789.00% and 79344.00%, respectively).
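For anyone who wants to locate such rows in a long capture, here is a minimal sketch that flags rows whose %CPU exceeds the number of CPUs times 100%. It assumes `pidstat -h` output with a `#`-prefixed header row like the one above; the function name `find_out_of_range` is hypothetical, and the sample rows are copied from this report with the command column shortened to the binary path.

```python
# Hypothetical helper: flag pidstat -h rows whose %CPU is above ncpu * 100.
# Column positions are read from the '#' header row, since the layout
# differs between sysstat versions.

def find_out_of_range(lines, ncpu):
    """Return (time, pid, cpu_pct) for rows whose %CPU exceeds ncpu * 100."""
    limit = ncpu * 100.0
    columns = None
    flagged = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "#":
            columns = fields[1:]  # header row: column names follow the '#'
            continue
        if columns is None or len(fields) < len(columns):
            continue  # banner line or truncated row
        row = dict(zip(columns, fields))  # extra command tokens are ignored
        try:
            cpu_pct = float(row["%CPU"])
        except (KeyError, ValueError):
            continue
        if cpu_pct > limit:
            flagged.append((row["Time"], row["PID"], cpu_pct))
    return flagged


# Rows from the report above, command column shortened for readability.
SAMPLE = """\
#      Time       PID    %usr %system  %guest    %CPU   CPU  minflt/s  majflt/s     VSZ    RSS   %MEM  Command
 1382954872      9220    1.00    0.00    0.00    1.00     1      0.00      0.00 2716832 222168   5.66  /usr/java/default/bin/java
 1382954872     16199 33162.00 9627.00    0.00 42789.00     1  62477.00      0.00 1595236  91828   2.34  /usr/java/default//bin/java
 1382954872     16267 68875.00 10469.00    0.00 79344.00     0 253793.00      0.00 1662464 115128   2.93  /usr/java/default//bin/java
"""

flagged = find_out_of_range(SAMPLE.splitlines(), ncpu=2)
for time, pid, cpu_pct in flagged:
    print(time, pid, cpu_pct)
```

On a real capture the same function could be fed the lines of pidstat.txt, with `ncpu` taken from the banner line (2 CPU here).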

@sysstat
Owner

sysstat commented Oct 31, 2013

There are several things to consider.

The sum of CPU utilization for the tasks running on a given CPU can exceed
100%. This is because the tasks have not necessarily spent the whole time
interval attached to that CPU (this is the answer given to question 4.4
in the FAQ).

The sum of %CPU for all tasks running on all CPUs can also exceed 100%
(in fact, it can reach N * 100%, where N is the number of CPUs available
on your machine). Consider this sample output from my machine (8 CPUs
with 2 compute-intensive tasks running):

09:24:29 PM   UID   PID    %usr %system  %guest    %CPU   CPU  Command
09:24:30 PM     0   804    0.00    1.00    0.00    1.00     7  Xorg
09:24:30 PM   991  1382  100.00    0.00    0.00  100.00     2  simap_5.10_x86_
09:24:30 PM     0  2012    1.00    0.00    0.00    1.00     4  boinc_gui
09:24:30 PM   991  2806  100.00    0.00    0.00  100.00     3  simap_5.10_x86_

The sum of %CPU here is 200% (well, 202% to be accurate). To get an
average CPU utilization among all processors, use option -I. The output
here would be:

09:24:56 PM   UID   PID    %usr %system  %guest    %CPU   CPU  Command
09:24:57 PM     0   804    0.00    1.00    0.00    0.12     7  Xorg
09:24:57 PM   991  1382  100.00    0.00    0.00   12.41     2  simap_5.10_x86_
09:24:57 PM     0  2012    1.00    0.00    0.00    0.12     4  boinc_gui
09:24:57 PM   991  2806  100.00    0.00    0.00   12.41     3  simap_5.10_x86_

and the sum of %CPU no longer exceeds 100%.
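The arithmetic behind option -I can be sketched as follows. This is only an illustration, assuming that -I divides each task's %CPU by the total number of processors, as the pidstat man page describes; the helper name `per_cpu_average` is hypothetical, and the input values are the %CPU figures from the first sample (the one taken without -I).

```python
# Illustration only: divide each task's %CPU by the processor count,
# which is what option -I is documented to do.

def per_cpu_average(cpu_pct, ncpu):
    """Return a task's %CPU averaged over all processors."""
    return cpu_pct / ncpu


ncpu = 8  # the sample machine has 8 CPUs
raw = {804: 1.00, 1382: 100.00, 2012: 1.00, 2806: 100.00}  # %CPU without -I

normalized = {pid: per_cpu_average(pct, ncpu) for pid, pct in raw.items()}
total = sum(normalized.values())

for pid, pct in normalized.items():
    print(pid, pct)
print("sum:", total)
```

The second sample was taken over a different interval, so its measured values (12.41 rather than 12.5) differ slightly from this computation.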

Regarding the %CPU values such as 42789.00%, it looks like some counters
have overflowed somewhere. I'm not really sure which ones are affected.
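To show in general terms how a counter problem can turn into an absurd percentage, here is a small sketch. It is not a diagnosis of this report: it assumes only that a percentage is derived from the difference between two samples of a cumulative tick counter, and it emulates unsigned 64-bit subtraction. The names `pct_from_counters` and `is_plausible` and all the numbers are hypothetical.

```python
# Generic illustration, not pidstat's actual code: a percentage computed
# from the delta of a cumulative tick counter between two samples.

MASK64 = (1 << 64) - 1


def pct_from_counters(prev_ticks, curr_ticks, interval_ticks):
    """Percentage of the interval spent, using unsigned 64-bit arithmetic."""
    delta = (curr_ticks - prev_ticks) & MASK64  # wraps if curr < prev
    return delta / interval_ticks * 100.0


def is_plausible(pct, ncpu):
    """A task cannot use more than ncpu * 100% of CPU time."""
    return 0.0 <= pct <= ncpu * 100.0


# Normal case: 50 ticks consumed during a 100-tick interval.
normal = pct_from_counters(1000, 1050, 100)

# Faulty case: the current sample is lower than the previous one, so the
# unsigned subtraction wraps around and the result is enormous.
wrapped = pct_from_counters(1050, 1000, 100)

print(normal, is_plausible(normal, ncpu=2))
print(wrapped, is_plausible(wrapped, ncpu=2))
```

A plausibility check of this kind would at least make it possible to discard or mark such samples instead of printing them.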

Sebastien GODARD <sysstat [at] orange.fr>
http://sebastien.godard.pagesperso-orange.fr/

@sysstat sysstat closed this as completed Jan 6, 2014