Linux Disk Usage Tools Compared: QDirStat vs. K4DirStat vs. Baobab vs. Filelight vs. ncdu #97

Closed
shundhammer opened this issue Apr 6, 2019 · 1 comment

shundhammer commented Apr 6, 2019

Screenshots

[Screenshots: QDirStat, K4DirStat, Baobab, Filelight, ncdu]

Performance

All the programs had to scan my /work directory on my (normal rotational, non-SSD) disk, both with cleared kernel directory caches and with caches filled from previous program runs.

To clear those caches on Linux, start a root shell and then enter

echo 3 >/proc/sys/vm/drop_caches
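If you want to script this step, here is a minimal Python sketch. It assumes Linux and root privileges; the function name `drop_caches` and its `path` and `mode` parameters are my own choices for illustration. It calls `os.sync()` first because the kernel can only drop clean cache entries.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"  # Linux only; writing requires root


def drop_caches(path=DROP_CACHES, mode="3"):
    """Ask the kernel to drop page cache, dentries and inodes (mode 3)."""
    os.sync()  # flush dirty pages first; only clean cache entries can be dropped
    with open(path, "w") as f:
        f.write(mode + "\n")
```

The `path` parameter exists only so the function can be tried against an ordinary file without root.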

In all cases, the test procedure was:

  • start the program once (without timing) to make sure all shared libraries are loaded and auxiliary processes are started (this might be important for the more complex desktop environments like KDE and GNOME)

  • drop caches

  • start the program, this time recording the time until the directory tree is fully loaded and the program is ready for user input

  • exit the program

  • leave caches as they are (i.e. the directory tree is cached by the kernel now)

  • start with timing and exit

  • start with timing and exit

  • start with timing and exit

  • drop caches

  • start with timing and exit

  • drop caches

  • start with timing and exit
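For a program that exits on its own (like `du`), the procedure above could be automated roughly like this. This is only a sketch: the names `SCHEDULE` and `time_runs` are made up for this example, and `drop_caches` stands for any callable that clears the kernel caches. The GUI programs in this comparison were not timed this way, since they keep running after the tree is loaded.

```python
import subprocess
import time

# Cache state for each timed run, in the order used in this comparison.
SCHEDULE = ["cold", "hot", "hot", "hot", "cold", "cold"]


def time_runs(cmd, drop_caches, schedule=SCHEDULE):
    """Run cmd once untimed, then once per schedule entry; return seconds."""
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)  # warm-up run
    results = []
    for state in schedule:
        if state == "cold":
            drop_caches()  # caller-supplied; needs root on a real system
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        results.append((state, time.perf_counter() - start))
    return results
```

A call might look like `time_runs(["du", "-hs", "/work"], drop_caches)`. A "hot" run simply relies on the caches filled by the run before it.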

QDirStat and K4DirStat both display the elapsed time for directory reading once it is complete. For the other programs, a stopwatch was used, which of course makes the timing somewhat less accurate.

Benchmark Results: /work

/work is an ext4 filesystem with 230 GB / 216k items on a Samsung 1 TB 7200 rpm disk.

Update 2019-04-13: Added results from latest QDirStat with performance improvement

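All times are in seconds.

| Program | Version | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 |
|---|---|---|---|---|---|---|---|
| Cache | | cold | hot | hot | hot | cold | cold |
| qdirstat | 07235ec | 25.8 | 1.5 | 1.5 | 1.6 | 25.3 | 25.2 |
| qdirstat -s | 07235ec | 24.9 | 1.5 | 1.5 | 1.5 | 24.7 | 24.8 |
| qdirstat | 1.5 | 32.3 | 1.9 | 2.0 | 2.0 | 33.0 | 33.0 |
| qdirstat -s | 1.5 | 31.4 | 1.5 | 1.8 | 1.8 | 32.0 | 31.7 |
| k4dirstat | 3.1.3 | 35.2 | 2.4 | 2.3 | 2.4 | 34.3 | 34.2 |
| baobab | 3.28.0 | 24.6 | 3.5 | 3.5 | 3.4 | 24.7 | 24.1 |
| filelight | 4.17 | 19.8 | 1.1 | 1.2 | 1.1 | 19.2 | 19.5 |
| ncdu | 1.12 | 18.6 | 1.5 | 1.2 | 1.3 | 19.0 | 19.0 |
| du -hs | 8.28 | 17.9 | 0.5 | 0.5 | 0.5 | 18.0 | 18.0 |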

The exact command for du -hs was time du -hs /work, so the timing was more accurate than with a stopwatch.

Don't get hung up on fractions of a second in the results for baobab, filelight and ncdu: operating a manual stopwatch with one hand on the keyboard and the other on the stopwatch isn't all that accurate.

Benchmark Conclusions

du -hs doesn't do much; it doesn't have a user interface. So its time can safely be considered the practical lower bound for this task: traverse an entire directory tree, open each directory in sequence (using the opendir() / readdir() / closedir() library calls) and obtain detailed information from the filesystem for each file or directory encountered (using the stat() or lstat() syscalls).
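To make that minimum concrete, here is a rough Python sketch of such a traversal. It is not how du is implemented; the function name `tree_usage` is made up, there is no error handling, and unlike du it counts hard-linked files once per link and does not stop at mount points.

```python
import os


def tree_usage(root):
    """Return (bytes allocated on disk, item count) for the tree below root."""
    st = os.lstat(root)
    total, items = st.st_blocks * 512, 1
    stack = [root]
    while stack:
        with os.scandir(stack.pop()) as entries:  # opendir/readdir/closedir
            for entry in entries:
                st = entry.stat(follow_symlinks=False)  # lstat() semantics
                total += st.st_blocks * 512  # st_blocks is in 512-byte units
                items += 1
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return total, items
```

Everything a GUI tool does on top of this loop (building a tree in memory, sorting, updating the display) adds to the times shown in the table.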

ncdu comes close. Since it uses a text-based (ncurses) user interface, it doesn't have much overhead for GUI stuff. On the other hand, it also can't do very much (but it can delete the selected file).

Filelight is really fast. In particular, re-reading the same directory appears to be faster than with ncdu (but this might be attributed to the inaccuracies of using a manual stopwatch). According to the output of ps, it uses 3 threads (and thus up to 3 CPU cores); however, since this workload is largely I/O bound, it is not obvious how this helps.

Baobab is fast for uncached reads, but surprisingly slow for the cached ones. But it doesn't keep any information about individual files in memory, only the directories with the sums. That's why it only offers to delete entire subdirectories, but not individual files (duh!).

K4DirStat is a little slower than QDirStat 1.5, even though both use the same directory reading code (inherited from KDirStat). This might be because of more display updates and because of re-sorting the displayed tree all the time during reading. It is significantly slower than the latest QDirStat with the performance improvements, though.

QDirStat becomes a little faster with the -s (--slow-updates) command line option, which was designed for remote X connections that have become very slow with Qt5 (due to always using a pixel buffer that has to be transferred over the network connection instead of X protocol draw primitives like XDrawString()). But the difference is really negligible.

The latest QDirStat from Git master got a considerable performance improvement from two changes: it uses fstatat() instead of lstat(), and it sorts the directory entries by i-node number before that call, so the corresponding i-nodes can be read sequentially with minimized disk seek times (which has no effect on SSDs, though).
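The idea can be illustrated in a few lines of Python. This is only a sketch of the technique for a single directory, not QDirStat's actual C++ code; the function name `stat_dir_sorted_by_inode` is made up, and it assumes Linux, where `os.stat()` accepts a `dir_fd` argument.

```python
import os


def stat_dir_sorted_by_inode(path):
    """lstat every entry of one directory in ascending i-node order."""
    with os.scandir(path) as entries:
        # entry.inode() comes from readdir()'s d_ino, so no stat() is needed yet
        by_inode = sorted(entries, key=lambda e: e.inode())
    dir_fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # stat() with dir_fd is implemented with fstatat() where available
        return [(e.name, os.stat(e.name, dir_fd=dir_fd, follow_symlinks=False))
                for e in by_inode]
    finally:
        os.close(dir_fd)
```

The directory is first read completely, which costs no extra disk access per entry, and only then are the i-nodes fetched in ascending order instead of in readdir() order.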

Features

Feature QDirStat K4DirStat Baobab Filelight ncdu du
Show tree total size + + + + + +
Show subtree size + + + + + +
Show size of individual files + + + + +
Stop at mounted filesystems + + + + +
Exclude rules + + + +
Show treemap + +
Show some other graph + +
Delete a file + + + +
Delete a directory / subtree + + + + +
Open directory in filemanager + + + +
Custom cleanup actions + +
File type view +
File size histogram view +
Package manager support +
Proper Btrfs subvol handling + ? ? ? ? ?

Quirks and Oddities

Size Units

Baobab shows all sizes in 1000-based units, not 1024-based like all the others. That's why the sizes appear to be different (but they really are not).

| Unit | 1000-based | 1024-based |
|---|---|---|
| 1 KB | 1000 B | 1024 B |
| 1 MB | 1,000,000 B | 1,048,576 B |
| 1 GB | 1,000,000,000 B | 1,073,741,824 B |
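To see how much the two conventions differ for a tree the size of /work, here is a small Python sketch. The function `format_size` is made up for this example and is not taken from any of the programs compared here.

```python
def format_size(num_bytes, base=1024):
    """Format a byte count with 1024-based (default) or 1000-based units."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        if value < base or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= base


size = 230 * 1024**3           # 230 GB in 1024-based units
print(format_size(size))        # 230.0 GB
print(format_size(size, 1000))  # 247.0 GB
```

So the same tree shows up as about 7% larger in a program that uses 1000-based units.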

Versions Used

All running on Xubuntu 18.04.2 LTS with all the latest updates.

| Program | Version |
|---|---|
| QDirStat | Git master (07235ec, post-1.5) |
| QDirStat | 1.5 |
| K4DirStat | 3.1.3-1 |
| Baobab | 3.28.0-1 |
| Filelight | 4:17.12.3-0 |
| ncdu | 1.12-1 |
| du | coreutils-8.28 |

Hardware

  • Intel Core i7 870 2.93 GHz
  • 16 GB RAM
  • 2 * HD Samsung SpinPoint F3 (HD103SJ) 1 TB, 7200 rpm
  • Samsung SSD 860 250 GB (not used in this test)
@shundhammer shundhammer added the doc label Apr 6, 2019

Feel free to comment on this issue. That is the main reason why I used this issue tracker instead of the GitHub wiki (the other being that the wiki does not support the convenient screenshot uploading that the issue tracker has).
