nocache - minimize filesystem caching effects
The nocache tool tries to minimize the effect an application has on
the Linux file system cache. This is done by intercepting the open
and close system calls and calling
posix_fadvise with the
POSIX_FADV_DONTNEED parameter. Because the library remembers which
pages (i.e., 4K blocks of the file) were already in the file system cache
when the file was opened, these will not be marked as "don't need",
because other applications might need them, although they are not
actively used (think: hot standby).
Use case: backup processes that should not interfere with the present state of the cache.
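The selective behaviour described above can be sketched in a few lines. This is an illustration in Python, not the library's actual C code: the resident_at_open list and both function names are made up for the example, and how the residency information is collected when the file is opened is not shown.

```python
import mmap
import os
import tempfile

PAGE_SIZE = mmap.PAGESIZE  # typically 4096 on Linux


def ranges_to_drop(resident_at_open):
    """Return (first_page, page_count) runs that were NOT cached at open time.

    `resident_at_open` is a hypothetical list of booleans, one per page,
    recorded when the file was opened.
    """
    runs = []
    start = None
    for index, was_cached in enumerate(resident_at_open):
        if not was_cached and start is None:
            start = index                    # a new run of droppable pages begins
        elif was_cached and start is not None:
            runs.append((start, index - start))
            start = None
    if start is not None:                    # the run extends to the end of the file
        runs.append((start, len(resident_at_open) - start))
    return runs


def advise_dontneed(fd, resident_at_open):
    """Mark only the pages that were not cached before as "don't need"."""
    for first_page, count in ranges_to_drop(resident_at_open):
        os.posix_fadvise(fd, first_page * PAGE_SIZE, count * PAGE_SIZE,
                         os.POSIX_FADV_DONTNEED)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\0" * (6 * PAGE_SIZE))
        tmp.flush()
        os.fsync(tmp.fileno())  # write back first: dirty pages are not discarded
        # Pretend pages 0, 1 and 4 were already cached when the file was opened.
        advise_dontneed(tmp.fileno(), [True, True, False, False, True, False])
```

Pages that were cached before the application touched the file are skipped, so only the pages the application itself pulled in are released.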
Installation and Usage
Run make. Then, prepend
./nocache to your command:
./nocache cp -a ~/ /mnt/backup/home-$(hostname)
make install will install the shared library, man
pages and the command-line tools under
/usr/local. You can specify an alternate prefix by using
make install PREFIX=/usr.
Debian packages are available, see https://packages.qa.debian.org/n/nocache.html.
Please note that
nocache will only build on a system that has
support for the
posix_fadvise syscall and exposes it, too. This
should be the case on most modern Unices, but kfreebsd notably has no
support for this as of now.
For testing purposes, I included two small tools:
cachedel calls posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) on the file argument. Thus, if the file is not accessed by any other application, the pages will be evicted from the fs cache. Specifying -n will repeat the syscall the given number of times, which can be useful in some circumstances (see below).
cachestats has three modes: In quiet mode (-q), the exit status is 0 (success) if the file is fully cached. In normal mode, the number of cached vs. not-cached pages is printed. In verbose mode (-v), an actual map is printed out, where each page that is present in the cache is marked with x.
It looks like this:
$ cachestats -v ~/somefile.mp3
pages in cache: 85/114 (74.6%) [filesize=453.5K, pagesize=4K]
cache map:
 0: |x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|
32: |x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|
64: |x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x| | | | | | | | | | | | |
96: | | | | | | | | | | | | | | | | | |x|
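To make the map easier to read, here is a small sketch that renders a per-page residency list in a layout loosely modelled on the output above. It is not the cachestats source: the function names are invented, the exact spacing is an approximation, and the residency list is taken as given rather than queried from the kernel.

```python
def cache_summary(resident):
    """Build the "pages in cache" line from a list of per-page booleans."""
    cached = sum(resident)
    total = len(resident)
    percent = 100.0 * cached / total if total else 0.0
    return f"pages in cache: {cached}/{total} ({percent:.1f}%)"


def cache_map(resident, per_row=32):
    """Render one text row per `per_row` pages; cached pages are marked with x."""
    rows = []
    for start in range(0, len(resident), per_row):
        cells = "|".join("x" if page else " " for page in resident[start:start + per_row])
        # The row label is the index of the first page in that row.
        rows.append(f"{start}: |{cells}|")
    return rows


if __name__ == "__main__":
    # Same shape as the sample above: 114 pages, the first 84 and the last one cached.
    sample = [True] * 84 + [False] * 29 + [True]
    print(cache_summary(sample))
    print("\n".join(cache_map(sample)))
```

Each row covers 32 pages, so the label at the start of a row is the number of the first page it describes.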
Also, you can use
vmstat 1 to view cache statistics.
For debugging purposes, you can specify a filename that
nocache should log
debugging messages to via the
-D command line switch, e.g. use
nocache -D /tmp/nocache.log …. Note that for simple testing the file
might be a good choice.
Without nocache, the file will be fully cached when you copy it:
$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%) [filesize=7776.2K, pagesize=4K]
$ cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 1945/1945 (100.0%) [filesize=7776.2K, pagesize=4K]
With nocache, the original caching state will be preserved:
$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%) [filesize=7776.2K, pagesize=4K]
$ ./nocache cp ~/file.mp3 /tmp
$ ./cachestats ~/file.mp3
pages in cache: 154/1945 (7.9%) [filesize=7776.2K, pagesize=4K]
The pre-loaded library tries really hard to catch all system calls
that open or close a file. This happens by "hijacking" the libc
functions that wrap the actual system calls. In some cases, this may
fail, for example because the application does some clever wrapping.
(That is the reason why
__openat_2 is defined: GNU
tar uses this
instead of a regular openat.)
However, since the actual
fadvise calls are performed right before
the file descriptor is closed, this may not happen if descriptors are left
open when the application exits, although the destructor tries to do that.
There are timing issues to consider, as well. If you consider
nocache cat <file>, in most (all?) cases the cache will not be restored. For
discussion and possible solutions see http://lwn.net/Articles/480930/.
My experience showed that in many cases you could "fix" this by doing
the posix_fadvise call twice. For both tools, nocache and
cachedel, you can specify the number using
-n, like so:
$ nocache -n 2 cat ~/file.mp3
This actually only sets the environment variable NOCACHE_NR_FADVISE
to the specified value, and the shared library reads out this value.
If test number 3 in
t/basic.t fails, then try increasing this number
until it works, e.g.:
$ env NOCACHE_NR_FADVISE=2 make test
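As a rough illustration of the repeat mechanism, the following Python sketch reads NOCACHE_NR_FADVISE and issues the advice that many times. The function names, the fallback value of 1 and the handling of invalid values are assumptions made for this example; the real library is written in C and may behave differently.

```python
import os
import tempfile


def fadvise_repeat_count(environ=None):
    """Read NOCACHE_NR_FADVISE; fall back to 1 (an assumed default) if unset or invalid."""
    if environ is None:
        environ = os.environ
    try:
        count = int(environ.get("NOCACHE_NR_FADVISE", "1"))
    except ValueError:
        return 1
    return max(count, 1)


def drop_whole_file(fd, repeat):
    """Issue the DONTNEED advice `repeat` times for the whole file."""
    for _ in range(repeat):
        # Offset 0 with length 0 addresses the file from start to end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"example data")
        tmp.flush()
        os.fsync(tmp.fileno())  # write back first: dirty pages are not discarded
        drop_whole_file(tmp.fileno(), fadvise_repeat_count())
```

Passing the count through the environment means a preloaded library can pick it up without any change to the wrapped program's command line.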
One could also argue that when pages are kept, the kernel considers them hot, and decide that the overhead of allocating one byte per page for mincore, plus the mincore calls themselves, is not worth it when the kernel keeps some pages anyway.
In this case you can either run nocache with
-f or set the
NOCACHE_FLUSHALL environment variable to 1, e.g.:
$ nocache -f cat ~/file.mp3
$ env NOCACHE_FLUSHALL=1 make test
If you have a fairly recent Linux kernel, you can also try to simply cage a cache-intensive application in a memory-limited cgroup. For example, I use the following command in the startup part of my backup script:
sudo env ppid=$$ sh -c '
    cgcreate -g memory:backup ;
    cgcreate -g blkio:backup ;
    echo 256M > /sys/fs/cgroup/memory/backup/memory.limit_in_bytes ;
    echo 100 > /sys/fs/cgroup/blkio/backup/blkio.weight ;
    echo $ppid > /sys/fs/cgroup/memory/backup/tasks ;
    echo $ppid > /sys/fs/cgroup/blkio/backup/tasks ;
'
Most of the application logic is from Tobias Oetiker's patch for
rsync http://insights.oetiker.ch/linux/fadvise.html. Note however,
rsync uses sockets, so if you try a
nocache rsync, only
the local process will be intercepted.