Permalink
| Demonstrations of ext4slower, the Linux eBPF/bcc version. | |
| ext4slower shows ext4 reads, writes, opens, and fsyncs, slower than a threshold. | |
| For example: | |
| # ./ext4slower | |
| Tracing ext4 operations slower than 10 ms | |
| TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME | |
| 06:35:01 cron 16464 R 1249 0 16.05 common-auth | |
| 06:35:01 cron 16463 R 1249 0 16.04 common-auth | |
| 06:35:01 cron 16465 R 1249 0 16.03 common-auth | |
| 06:35:01 cron 16465 R 4096 0 10.62 login.defs | |
| 06:35:01 cron 16464 R 4096 0 10.61 login.defs | |
| 06:35:01 cron 16463 R 4096 0 10.63 login.defs | |
| 06:35:01 cron 16465 R 2972 0 18.52 pam_env.conf | |
| 06:35:01 cron 16464 R 2972 0 18.51 pam_env.conf | |
| 06:35:01 cron 16463 R 2972 0 18.49 pam_env.conf | |
| 06:35:01 dumpsystemstat 16473 R 128 0 12.58 date | |
| 06:35:01 debian-sa1 16474 R 283 0 12.66 sysstat | |
| 06:35:01 debian-sa1 16474 R 128 0 10.39 sa1 | |
| 06:35:01 dumpsystemstat 16491 R 128 0 13.22 ifconfig | |
| 06:35:01 DumpThreads 16534 R 128 0 12.78 cut | |
| 06:35:01 cron 16545 R 128 0 14.76 sendmail | |
| 06:35:01 sendmail 16545 R 274 0 10.88 dynamicmaps.cf | |
| 06:35:02 postdrop 16546 R 118 0 32.94 Universal | |
| 06:35:02 pickup 9574 R 118 0 21.02 localtime | |
| [...] | |
| This shows various system tasks reading from ext4. The high latency here is | |
| due to disk I/O, as I had just evicted the file system cache for this example. | |
| This "latency" is measured from when the operation was issued from the VFS | |
| interface to the file system, to when it completed. This spans everything: | |
| block device I/O (disk I/O), file system CPU cycles, file system locks, run | |
| queue latency, etc. This is a better measure of the latency suffered by | |
| applications reading from the file system than measuring this down at the | |
| block device interface. | |
| Note that this only traces the common file system operations previously | |
| listed: other file system operations (eg, inode operations including | |
| getattr()) are not traced. | |
| The threshold can be provided as an argument. Eg, I/O slower than 1 ms: | |
| # ./ext4slower 1 | |
| Tracing ext4 operations slower than 1 ms | |
| TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME | |
| 06:49:17 bash 3616 R 128 0 7.75 cksum | |
| 06:49:17 cksum 3616 R 39552 0 1.34 [ | |
| 06:49:17 cksum 3616 R 96 0 5.36 2to3-2.7 | |
| 06:49:17 cksum 3616 R 96 0 14.94 2to3-3.4 | |
| 06:49:17 cksum 3616 R 10320 0 6.82 411toppm | |
| 06:49:17 cksum 3616 R 65536 0 4.01 a2p | |
| 06:49:17 cksum 3616 R 55400 0 8.77 ab | |
| 06:49:17 cksum 3616 R 36792 0 16.34 aclocal-1.14 | |
| 06:49:17 cksum 3616 R 15008 0 19.31 acpi_listen | |
| 06:49:17 cksum 3616 R 6123 0 17.23 add-apt-repository | |
| 06:49:17 cksum 3616 R 6280 0 18.40 addpart | |
| 06:49:17 cksum 3616 R 27696 0 2.16 addr2line | |
| 06:49:17 cksum 3616 R 58080 0 10.11 ag | |
| 06:49:17 cksum 3616 R 906 0 6.30 ec2-meta-data | |
| 06:49:17 cksum 3616 R 6320 0 10.00 animate.im6 | |
| 06:49:17 cksum 3616 R 5680 0 18.69 anytopnm | |
| 06:49:17 cksum 3616 R 2671 0 20.27 apport-bug | |
| 06:49:17 cksum 3616 R 12566 0 16.72 apport-cli | |
| 06:49:17 cksum 3616 R 1622 0 7.95 apport-unpack | |
| 06:49:17 cksum 3616 R 10440 0 2.37 appres | |
| 06:49:17 cksum 3616 R 48112 0 5.42 whatis | |
| 06:49:17 cksum 3616 R 14832 0 6.24 apt | |
| 06:49:17 cksum 3616 R 65536 0 24.74 apt-cache | |
| 06:49:17 cksum 3616 R 27264 0 1.68 apt-cdrom | |
| 06:49:17 cksum 3616 R 23224 0 5.31 apt-extracttemplates | |
| 06:49:17 cksum 3616 R 65536 0 8.08 apt-ftparchive | |
| 06:49:17 cksum 3616 R 65536 128 2.92 apt-ftparchive | |
| 06:49:17 cksum 3616 R 65536 0 9.58 aptitude-curses | |
| 06:49:17 cksum 3616 R 65536 128 44.25 aptitude-curses | |
| 06:49:17 cksum 3616 R 65536 384 1.69 aptitude-curses | |
| [...] | |
| This time a cksum(1) command can be seen reading various files (from /usr/bin). | |
| A threshold of 0 will trace all operations. Warning: the output will be | |
| verbose, as it will include all file system cache hits. | |
| # ./ext4slower 0 | |
| Tracing ext4 operations | |
| TIME COMM PID T BYTES OFF_KB LAT(ms) FILENAME | |
| 06:58:05 supervise 1884 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1884 W 18 0 0.02 status.new | |
| 06:58:05 supervise 1884 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1884 W 18 0 0.01 status.new | |
| 06:58:05 supervise 15817 O 0 0 0.00 run | |
| 06:58:05 supervise 15817 R 92 0 0.00 run | |
| 06:58:05 supervise 15817 O 0 0 0.00 bash | |
| 06:58:05 supervise 15817 R 128 0 0.00 bash | |
| 06:58:05 supervise 15817 R 504 0 0.00 bash | |
| 06:58:05 supervise 15817 R 28 0 0.00 bash | |
| 06:58:05 supervise 15817 O 0 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 15817 R 64 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 15817 R 392 0 0.00 ld-2.19.so | |
| 06:58:05 run 15817 O 0 0 0.00 ld.so.cache | |
| 06:58:05 run 15817 O 0 0 0.00 libtinfo.so.5.9 | |
| 06:58:05 run 15817 R 832 0 0.00 libtinfo.so.5.9 | |
| 06:58:05 run 15817 O 0 0 0.00 libdl-2.19.so | |
| 06:58:05 run 15817 R 832 0 0.00 libdl-2.19.so | |
| 06:58:05 run 15817 O 0 0 0.00 libc-2.19.so | |
| 06:58:05 run 15817 R 832 0 0.00 libc-2.19.so | |
| 06:58:05 supervise 1876 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1876 W 18 0 0.01 status.new | |
| 06:58:05 supervise 1895 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1895 W 18 0 0.02 status.new | |
| 06:58:05 supervise 1876 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1876 W 18 0 0.01 status.new | |
| 06:58:05 supervise 1872 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1872 W 18 0 0.02 status.new | |
| 06:58:05 supervise 1895 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1895 W 18 0 0.01 status.new | |
| 06:58:05 supervise 15818 R 92 0 0.00 run | |
| 06:58:05 supervise 15818 O 0 0 0.00 bash | |
| 06:58:05 supervise 15818 R 128 0 0.00 bash | |
| 06:58:05 supervise 15818 R 504 0 0.00 bash | |
| 06:58:05 supervise 15818 R 28 0 0.00 bash | |
| 06:58:05 supervise 15818 O 0 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 15818 R 64 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 15818 R 392 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 15818 O 0 0 0.00 run | |
| 06:58:05 supervise 1888 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1888 W 18 0 0.01 status.new | |
| 06:58:05 supervise 1888 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1888 W 18 0 0.02 status.new | |
| 06:58:05 supervise 15822 R 119 0 0.00 run | |
| 06:58:05 supervise 15822 O 0 0 0.00 bash | |
| 06:58:05 supervise 15822 R 128 0 0.00 bash | |
| 06:58:05 supervise 15822 R 504 0 0.00 bash | |
| 06:58:05 supervise 15822 R 28 0 0.00 bash | |
| 06:58:05 supervise 15822 O 0 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 15822 R 64 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 15822 R 392 0 0.00 ld-2.19.so | |
| 06:58:05 supervise 1892 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1892 W 18 0 0.02 status.new | |
| 06:58:05 supervise 1892 O 0 0 0.00 status.new | |
| 06:58:05 supervise 1892 W 18 0 0.02 status.new | |
| 06:58:05 supervise 15820 O 0 0 0.00 run | |
| [...] | |
| The output now includes open operations ("O"), and writes ("W"). | |
| A -j option will print just the fields (parsable output, csv): | |
| # ./ext4slower -j 1 | |
| ENDTIME_us,TASK,PID,TYPE,BYTES,OFFSET_b,LATENCY_us,FILE | |
| 127200712278,bash,17225,R,128,0,14329,cksum | |
| 127200722986,cksum,17225,R,3274,0,8368,command-not-found | |
| 127200735581,cksum,17225,R,65536,0,10903,libbfd-2.24-system.so | |
| 127200738482,cksum,17225,R,65536,131072,2419,libbfd-2.24-system.so | |
| 127200749226,cksum,17225,R,65536,655360,8995,libbfd-2.24-system.so | |
| 127200776273,cksum,17225,R,55080,0,25297,libbind9.so.90.0.9 | |
| 127200784688,cksum,17225,R,65536,0,7873,libblas.so.3.0 | |
| 127200787551,cksum,17225,R,65536,131072,2386,libblas.so.3.0 | |
| 127200795524,cksum,17225,R,18624,0,4947,libcpupower.so.3.13.0-49 | |
| 127200802073,cksum,17225,R,65536,0,6410,libcwidget.so.3.0.0 | |
| 127200808718,cksum,17225,R,65536,131072,6181,libcwidget.so.3.0.0 | |
| 127200829518,cksum,17225,R,65536,0,14213,libdns.so.100.2.2 | |
| 127200832916,cksum,17225,R,65536,131072,2911,libdns.so.100.2.2 | |
| 127200841044,cksum,17225,R,65536,655360,6376,libdns.so.100.2.2 | |
| 127200853646,cksum,17225,R,956,0,1022,libdumbnet.la | |
| 127200857814,cksum,17225,R,61096,0,4111,libdumbnet.so.1.0.1 | |
| 127200869655,cksum,17225,R,65536,0,11252,libgettextlib-0.18.3.so | |
| 127200872985,cksum,17225,R,65536,131072,2882,libgettextlib-0.18.3.so | |
| 127200883063,cksum,17225,R,65536,0,9661,libgettextsrc-0.18.3.so | |
| 127200884767,cksum,17225,R,65536,131072,1251,libgettextsrc-0.18.3.so | |
| 127200904830,cksum,17225,R,65536,0,19571,libgirepository-1.0.so.1.0.0 | |
| 127200906354,cksum,17225,R,65536,131072,1080,libgirepository-1.0.so.1.0.0 | |
| 127200936047,cksum,17225,R,65536,0,28674,libGraphicsMagick.a | |
| 127200939091,cksum,17225,R,65536,131072,2576,libGraphicsMagick.a | |
| 127200947295,cksum,17225,R,65536,655360,6463,libGraphicsMagick.a | |
| 127200958793,cksum,17225,R,65536,1966080,7034,libGraphicsMagick.a | |
| [...] | |
| This may be useful for visualizing with another tool, for example, for | |
| producing a scatter plot of ENDTIME vs LATENCY, to look for time-based | |
| patterns. | |
| USAGE message: | |
| # ./ext4slower -h | |
| usage: ext4slower [-h] [-j] [-p PID] [min_ms] | |
| Trace common ext4 file operations slower than a threshold | |
| positional arguments: | |
| min_ms minimum I/O duration to trace, in ms (default 10) | |
| optional arguments: | |
| -h, --help show this help message and exit | |
| -j, --csv just print fields: comma-separated values | |
| -p PID, --pid PID trace this PID only | |
| examples: | |
| ./ext4slower # trace operations slower than 10 ms (default) | |
| ./ext4slower 1 # trace operations slower than 1 ms | |
| ./ext4slower -j 1 # ... 1 ms, parsable output (csv) | |
| ./ext4slower 0 # trace all operations (warning: verbose) | |
| ./ext4slower -p 185 # trace PID 185 only |