Permalink
Fetching contributors…
Cannot retrieve contributors at this time
210 lines (184 sloc) 11.1 KB
Demonstrations of ext4slower, the Linux eBPF/bcc version.
ext4slower shows ext4 reads, writes, opens, and fsyncs, slower than a threshold.
For example:
# ./ext4slower
Tracing ext4 operations slower than 10 ms
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
06:35:01 cron 16464 R 1249 0 16.05 common-auth
06:35:01 cron 16463 R 1249 0 16.04 common-auth
06:35:01 cron 16465 R 1249 0 16.03 common-auth
06:35:01 cron 16465 R 4096 0 10.62 login.defs
06:35:01 cron 16464 R 4096 0 10.61 login.defs
06:35:01 cron 16463 R 4096 0 10.63 login.defs
06:35:01 cron 16465 R 2972 0 18.52 pam_env.conf
06:35:01 cron 16464 R 2972 0 18.51 pam_env.conf
06:35:01 cron 16463 R 2972 0 18.49 pam_env.conf
06:35:01 dumpsystemstat 16473 R 128 0 12.58 date
06:35:01 debian-sa1 16474 R 283 0 12.66 sysstat
06:35:01 debian-sa1 16474 R 128 0 10.39 sa1
06:35:01 dumpsystemstat 16491 R 128 0 13.22 ifconfig
06:35:01 DumpThreads 16534 R 128 0 12.78 cut
06:35:01 cron 16545 R 128 0 14.76 sendmail
06:35:01 sendmail 16545 R 274 0 10.88 dynamicmaps.cf
06:35:02 postdrop 16546 R 118 0 32.94 Universal
06:35:02 pickup 9574 R 118 0 21.02 localtime
[...]
This shows various system tasks reading from ext4. The high latency here is
due to disk I/O, as I had just evicted the file system cache for this example.
This "latency" is measured from when the operation was issued from the VFS
interface to the file system, to when it completed. This spans everything:
block device I/O (disk I/O), file system CPU cycles, file system locks, run
queue latency, etc. This is a better measure of the latency suffered by
applications reading from the file system than measuring this down at the
block device interface.
Note that this only traces the common file system operations previously
listed: other file system operations (eg, inode operations including
getattr()) are not traced.
The threshold can be provided as an argument. Eg, I/O slower than 1 ms:
# ./ext4slower 1
Tracing ext4 operations slower than 1 ms
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
06:49:17 bash 3616 R 128 0 7.75 cksum
06:49:17 cksum 3616 R 39552 0 1.34 [
06:49:17 cksum 3616 R 96 0 5.36 2to3-2.7
06:49:17 cksum 3616 R 96 0 14.94 2to3-3.4
06:49:17 cksum 3616 R 10320 0 6.82 411toppm
06:49:17 cksum 3616 R 65536 0 4.01 a2p
06:49:17 cksum 3616 R 55400 0 8.77 ab
06:49:17 cksum 3616 R 36792 0 16.34 aclocal-1.14
06:49:17 cksum 3616 R 15008 0 19.31 acpi_listen
06:49:17 cksum 3616 R 6123 0 17.23 add-apt-repository
06:49:17 cksum 3616 R 6280 0 18.40 addpart
06:49:17 cksum 3616 R 27696 0 2.16 addr2line
06:49:17 cksum 3616 R 58080 0 10.11 ag
06:49:17 cksum 3616 R 906 0 6.30 ec2-meta-data
06:49:17 cksum 3616 R 6320 0 10.00 animate.im6
06:49:17 cksum 3616 R 5680 0 18.69 anytopnm
06:49:17 cksum 3616 R 2671 0 20.27 apport-bug
06:49:17 cksum 3616 R 12566 0 16.72 apport-cli
06:49:17 cksum 3616 R 1622 0 7.95 apport-unpack
06:49:17 cksum 3616 R 10440 0 2.37 appres
06:49:17 cksum 3616 R 48112 0 5.42 whatis
06:49:17 cksum 3616 R 14832 0 6.24 apt
06:49:17 cksum 3616 R 65536 0 24.74 apt-cache
06:49:17 cksum 3616 R 27264 0 1.68 apt-cdrom
06:49:17 cksum 3616 R 23224 0 5.31 apt-extracttemplates
06:49:17 cksum 3616 R 65536 0 8.08 apt-ftparchive
06:49:17 cksum 3616 R 65536 128 2.92 apt-ftparchive
06:49:17 cksum 3616 R 65536 0 9.58 aptitude-curses
06:49:17 cksum 3616 R 65536 128 44.25 aptitude-curses
06:49:17 cksum 3616 R 65536 384 1.69 aptitude-curses
[...]
This time a cksum(1) command can be seen reading various files (from /usr/bin).
A threshold of 0 will trace all operations. Warning: the output will be
verbose, as it will include all file system cache hits.
# ./ext4slower 0
Tracing ext4 operations
TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME
06:58:05 supervise 1884 O 0 0 0.00 status.new
06:58:05 supervise 1884 W 18 0 0.02 status.new
06:58:05 supervise 1884 O 0 0 0.00 status.new
06:58:05 supervise 1884 W 18 0 0.01 status.new
06:58:05 supervise 15817 O 0 0 0.00 run
06:58:05 supervise 15817 R 92 0 0.00 run
06:58:05 supervise 15817 O 0 0 0.00 bash
06:58:05 supervise 15817 R 128 0 0.00 bash
06:58:05 supervise 15817 R 504 0 0.00 bash
06:58:05 supervise 15817 R 28 0 0.00 bash
06:58:05 supervise 15817 O 0 0 0.00 ld-2.19.so
06:58:05 supervise 15817 R 64 0 0.00 ld-2.19.so
06:58:05 supervise 15817 R 392 0 0.00 ld-2.19.so
06:58:05 run 15817 O 0 0 0.00 ld.so.cache
06:58:05 run 15817 O 0 0 0.00 libtinfo.so.5.9
06:58:05 run 15817 R 832 0 0.00 libtinfo.so.5.9
06:58:05 run 15817 O 0 0 0.00 libdl-2.19.so
06:58:05 run 15817 R 832 0 0.00 libdl-2.19.so
06:58:05 run 15817 O 0 0 0.00 libc-2.19.so
06:58:05 run 15817 R 832 0 0.00 libc-2.19.so
06:58:05 supervise 1876 O 0 0 0.00 status.new
06:58:05 supervise 1876 W 18 0 0.01 status.new
06:58:05 supervise 1895 O 0 0 0.00 status.new
06:58:05 supervise 1895 W 18 0 0.02 status.new
06:58:05 supervise 1876 O 0 0 0.00 status.new
06:58:05 supervise 1876 W 18 0 0.01 status.new
06:58:05 supervise 1872 O 0 0 0.00 status.new
06:58:05 supervise 1872 W 18 0 0.02 status.new
06:58:05 supervise 1895 O 0 0 0.00 status.new
06:58:05 supervise 1895 W 18 0 0.01 status.new
06:58:05 supervise 15818 R 92 0 0.00 run
06:58:05 supervise 15818 O 0 0 0.00 bash
06:58:05 supervise 15818 R 128 0 0.00 bash
06:58:05 supervise 15818 R 504 0 0.00 bash
06:58:05 supervise 15818 R 28 0 0.00 bash
06:58:05 supervise 15818 O 0 0 0.00 ld-2.19.so
06:58:05 supervise 15818 R 64 0 0.00 ld-2.19.so
06:58:05 supervise 15818 R 392 0 0.00 ld-2.19.so
06:58:05 supervise 15818 O 0 0 0.00 run
06:58:05 supervise 1888 O 0 0 0.00 status.new
06:58:05 supervise 1888 W 18 0 0.01 status.new
06:58:05 supervise 1888 O 0 0 0.00 status.new
06:58:05 supervise 1888 W 18 0 0.02 status.new
06:58:05 supervise 15822 R 119 0 0.00 run
06:58:05 supervise 15822 O 0 0 0.00 bash
06:58:05 supervise 15822 R 128 0 0.00 bash
06:58:05 supervise 15822 R 504 0 0.00 bash
06:58:05 supervise 15822 R 28 0 0.00 bash
06:58:05 supervise 15822 O 0 0 0.00 ld-2.19.so
06:58:05 supervise 15822 R 64 0 0.00 ld-2.19.so
06:58:05 supervise 15822 R 392 0 0.00 ld-2.19.so
06:58:05 supervise 1892 O 0 0 0.00 status.new
06:58:05 supervise 1892 W 18 0 0.02 status.new
06:58:05 supervise 1892 O 0 0 0.00 status.new
06:58:05 supervise 1892 W 18 0 0.02 status.new
06:58:05 supervise 15820 O 0 0 0.00 run
[...]
The output now includes open operations ("O"), and writes ("W").
A -j option will print just the fields (parsable output, csv):
# ./ext4slower -j 1
ENDTIME_us,TASK,PID,TYPE,BYTES,OFFSET_b,LATENCY_us,FILE
127200712278,bash,17225,R,128,0,14329,cksum
127200722986,cksum,17225,R,3274,0,8368,command-not-found
127200735581,cksum,17225,R,65536,0,10903,libbfd-2.24-system.so
127200738482,cksum,17225,R,65536,131072,2419,libbfd-2.24-system.so
127200749226,cksum,17225,R,65536,655360,8995,libbfd-2.24-system.so
127200776273,cksum,17225,R,55080,0,25297,libbind9.so.90.0.9
127200784688,cksum,17225,R,65536,0,7873,libblas.so.3.0
127200787551,cksum,17225,R,65536,131072,2386,libblas.so.3.0
127200795524,cksum,17225,R,18624,0,4947,libcpupower.so.3.13.0-49
127200802073,cksum,17225,R,65536,0,6410,libcwidget.so.3.0.0
127200808718,cksum,17225,R,65536,131072,6181,libcwidget.so.3.0.0
127200829518,cksum,17225,R,65536,0,14213,libdns.so.100.2.2
127200832916,cksum,17225,R,65536,131072,2911,libdns.so.100.2.2
127200841044,cksum,17225,R,65536,655360,6376,libdns.so.100.2.2
127200853646,cksum,17225,R,956,0,1022,libdumbnet.la
127200857814,cksum,17225,R,61096,0,4111,libdumbnet.so.1.0.1
127200869655,cksum,17225,R,65536,0,11252,libgettextlib-0.18.3.so
127200872985,cksum,17225,R,65536,131072,2882,libgettextlib-0.18.3.so
127200883063,cksum,17225,R,65536,0,9661,libgettextsrc-0.18.3.so
127200884767,cksum,17225,R,65536,131072,1251,libgettextsrc-0.18.3.so
127200904830,cksum,17225,R,65536,0,19571,libgirepository-1.0.so.1.0.0
127200906354,cksum,17225,R,65536,131072,1080,libgirepository-1.0.so.1.0.0
127200936047,cksum,17225,R,65536,0,28674,libGraphicsMagick.a
127200939091,cksum,17225,R,65536,131072,2576,libGraphicsMagick.a
127200947295,cksum,17225,R,65536,655360,6463,libGraphicsMagick.a
127200958793,cksum,17225,R,65536,1966080,7034,libGraphicsMagick.a
[...]
This may be useful for visualizing with another tool, for example, for
producing a scatter plot of ENDTIME vs LATENCY, to look for time-based
patterns.
USAGE message:
# ./ext4slower -h
usage: ext4slower [-h] [-j] [-p PID] [min_ms]
Trace common ext4 file operations slower than a threshold
positional arguments:
min_ms minimum I/O duration to trace, in ms (default 10)
optional arguments:
-h, --help show this help message and exit
-j, --csv just print fields: comma-separated values
-p PID, --pid PID trace this PID only
examples:
./ext4slower # trace operations slower than 10 ms (default)
./ext4slower 1 # trace operations slower than 1 ms
./ext4slower -j 1 # ... 1 ms, parsable output (csv)
./ext4slower 0 # trace all operations (warning: verbose)
./ext4slower -p 185 # trace PID 185 only