Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hard link mode can corrupt .d files in the cache #599

Closed
waldheinz opened this issue May 28, 2020 · 5 comments
Closed

Hard link mode can corrupt .d files in the cache #599

waldheinz opened this issue May 28, 2020 · 5 comments
Labels
bug Does not work as intended/documented
Milestone

Comments

@waldheinz
Copy link

waldheinz commented May 28, 2020

How to reproduce

  1. Given three source files,

a.cpp

#include "a.hpp"

int main() {
  return 0;
}

a.hpp:

#include "b.hpp"

b.hpp

// hi there!
  1. Compile that code with ccache, something like:
ccache -o a.o -MD -MF a.m a.cpp
  1. Delete the b.hpp file, remove the relevant include in a.hpp and compile again. Ccache considers this a direct hit, handing out the cached object file and a bogus a.m file, still referencing the deleted b.hpp file.

Please note that the steps above are a minimal contrived example of what I think is going on, see below for a full log of a real instance of this issue.

Actual behavior

Here is a log of the second compilation of a source file as triggered by our build system, after switching branches in git to a state where a header was removed as in the example above:

[2020-05-28T13:55:18.428661 26081] === CCACHE 3.7.9 STARTED =========================================
[2020-05-28T13:55:18.428661 26081] Config: (default) base_dir = 
[2020-05-28T13:55:18.428661 26081] Config: (environment) cache_dir = /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache
[2020-05-28T13:55:18.428661 26081] Config: (default) cache_dir_levels = 2
[2020-05-28T13:55:18.428661 26081] Config: (default) compiler = 
[2020-05-28T13:55:18.428661 26081] Config: (default) compiler_check = mtime
[2020-05-28T13:55:18.428661 26081] Config: (default) compression = false
[2020-05-28T13:55:18.428661 26081] Config: (default) compression_level = 6
[2020-05-28T13:55:18.428661 26081] Config: (default) cpp_extension = 
[2020-05-28T13:55:18.428661 26081] Config: (/home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/ccache.conf) debug = true
[2020-05-28T13:55:18.428661 26081] Config: (/home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/ccache.conf) depend_mode = true
[2020-05-28T13:55:18.428661 26081] Config: (default) direct_mode = true
[2020-05-28T13:55:18.428661 26081] Config: (default) disable = false
[2020-05-28T13:55:18.428661 26081] Config: (default) extra_files_to_hash = 
[2020-05-28T13:55:18.428661 26081] Config: (environment) hard_link = true
[2020-05-28T13:55:18.428661 26081] Config: (default) hash_dir = true
[2020-05-28T13:55:18.428661 26081] Config: (default) ignore_headers_in_manifest = 
[2020-05-28T13:55:18.428661 26081] Config: (default) keep_comments_cpp = false
[2020-05-28T13:55:18.428661 26081] Config: (default) limit_multiple = 0.8
[2020-05-28T13:55:18.428661 26081] Config: (default) log_file = 
[2020-05-28T13:55:18.428661 26081] Config: (default) max_files = 0
[2020-05-28T13:55:18.428661 26081] Config: (/home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/ccache.conf) max_size = 5.0G
[2020-05-28T13:55:18.428661 26081] Config: (default) path = 
[2020-05-28T13:55:18.428661 26081] Config: (default) pch_external_checksum = false
[2020-05-28T13:55:18.428661 26081] Config: (default) prefix_command = 
[2020-05-28T13:55:18.428661 26081] Config: (default) prefix_command_cpp = 
[2020-05-28T13:55:18.428661 26081] Config: (default) read_only = false
[2020-05-28T13:55:18.428661 26081] Config: (default) read_only_direct = false
[2020-05-28T13:55:18.428661 26081] Config: (default) recache = false
[2020-05-28T13:55:18.428661 26081] Config: (default) run_second_cpp = true
[2020-05-28T13:55:18.428661 26081] Config: (default) sloppiness = 
[2020-05-28T13:55:18.428661 26081] Config: (default) stats = true
[2020-05-28T13:55:18.428661 26081] Config: (default) temporary_dir = 
[2020-05-28T13:55:18.428661 26081] Config: (default) umask = 
[2020-05-28T13:55:18.428684 26081] Command line: /nix/store/1y4cahmbqafljv92vc8qrxymj260wq1y-ccache-3.7.9/bin/ccache /nix/store/25g1sxj2fpyfrwfa3vzcplqbi2xnvjhw-gcc-9.3.0/bin/gcc -x c++ -c src/mdi-common/src/mdi-common/api/lua.cpp -o .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.o -MD -MF .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.m -std=c++17 -pthread -fno-omit-frame-pointer -fPIC -fsanitize=address -ggdb3 -Og -pipe -U_FORTIFY_SOURCE -fstrict-aliasing -Wall -Wdouble-promotion -Wduplicated-cond -Werror -Wextra -Wno-unused-parameter -Wnull-dereference -Wstrict-aliasing=2 -Wuninitialized -Wunused-macros -I src/mdi-common/src -I //home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground/include -I //home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground/include/json-c -I /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground/include -isystem .build/playground/mdi-common/protobuf -isystem .build/playground/mdi-common/wayland -isystem /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground/include -DDEFAULT_PREFIX="/home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground" -DDEFAULT_PREFIX="/home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground" -DDEFAULT_PREFIX="/home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground" -DDEFAULT_PREFIX="/home/trem/Arbeitsplatz/Meetwise/mdi/mdi/playground" -B/nix/store/244yvw8fnxdv6zfkq2c6jqmw35jg6ib7-gcc-9.3.0-lib/lib -B/nix/store/nwsn18fysga1n5s0bj4jp4wfwvlbx8b1-glibc-2.30/lib/ -idirafter /nix/store/796fpc57zsb4x57krgcdspxk0vp2rykq-glibc-2.30-dev/include -idirafter /nix/store/yzgp666pca54iaj74f6yd7l56pvyjlnr-ccache-links/lib/gcc/x86_64-unknown-linux-gnu/9.3.0/include-fixed -B/nix/store/cixpn8i8l8ikgdnsh9qaqhvw73mz7hbi-ccache-links-wrapper-/bin/
[2020-05-28T13:55:18.428689 26081] Hostname: squirrel
[2020-05-28T13:55:18.428698 26081] Working directory: /home/trem/Arbeitsplatz/Meetwise/mdi/mdi
[2020-05-28T13:55:18.428780 26081] Source file: src/mdi-common/src/mdi-common/api/lua.cpp
[2020-05-28T13:55:18.428781 26081] Dependency file: .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.m
[2020-05-28T13:55:18.428781 26081] Object file: .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.o
[2020-05-28T13:55:18.428826 26081] Hashing CWD /home/trem/Arbeitsplatz/Meetwise/mdi/mdi
[2020-05-28T13:55:18.428842 26081] Trying direct lookup
[2020-05-28T13:55:18.428921 26081] Looking for object file hash in /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/f/5/72d206323eb782644bec9c8478f2a1-19746.manifest
[2020-05-28T13:55:18.439709 26081] Failed to stat src/mdi-common/src/mdi-common/test.hpp: No such file or directory
[2020-05-28T13:55:18.444384 26081] Got object file hash from manifest
[2020-05-28T13:55:18.444473 26081] Unlink .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.o via .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.o.ccache.rm.tmp
[2020-05-28T13:55:18.444486 26081] Created from cache: /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/5/3/3359699a692dd1ce9d9db41da0b7de-5155242.o -> .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.o
[2020-05-28T13:55:18.444539 26081] Copying /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/5/3/3359699a692dd1ce9d9db41da0b7de-5155242.d to .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.m via .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.m.squirrel.26081.O7NECj (uncompressed)
[2020-05-28T13:55:18.444708 26081] Created from cache: /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/5/3/3359699a692dd1ce9d9db41da0b7de-5155242.d -> .build/playground/mdi-common/objects/mdi-common/api/lua.cpp.m
[2020-05-28T13:55:18.444724 26081] Succeeded getting cached result
[2020-05-28T13:55:18.444746 26081] Result: cache hit (direct)
[2020-05-28T13:55:18.444785 26081] Acquired lock /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/5/stats.lock
[2020-05-28T13:55:18.444855 26081] Releasing lock /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/5/stats.lock
[2020-05-28T13:55:18.444856 26081] Unlink /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/5/stats.lock

Expected behavior

Per log output from above:

[2020-05-28T13:55:18.439709 26081] Failed to stat src/mdi-common/src/mdi-common/test.hpp: No such file or directory

Ccache should detect a cache miss and compile the code again.

Environment

ccache version 3.7.9

@waldheinz waldheinz added the bug Does not work as intended/documented label May 28, 2020
@jrosdahl
Copy link
Member

Thanks for the bug report.

I can't reproduce the bug from the description, though. It doesn't seem likely that such a standard case would be badly handled by ccache (since somebody would have reported it already), so my guess is that there has to be something more subtle in play. But maybe the use of the depend mode makes it subtle or at least an uncommon enough setup.

[2020-05-28T13:55:18.439709 26081] Failed to stat src/mdi-common/src/mdi-common/test.hpp: No such file or directory

The log line you pointed out does not by itself tell much about if it's a bug or not. It says that a header file was not found for one of the result entries when looking for a match in the manifest file, but a manifest file can store several result entries and the subsequent log line (Got object file hash from manifest) implies that another match was found. So the question is what that match was.

It's of interest to take a look at the manifest file. In this case ccache --dump-manifest /home/trem/Arbeitsplatz/Meetwise/mdi/mdi/.build/ccache/f/5/72d206323eb782644bec9c8478f2a1-19746.manifest will list the manifest result entries, their associated headers and what their hashes should be.

For debugging purposes the ideal set of information to work with would be:

  1. <obj>.ccache-* files (from CCACHE_DEBUG=1) for the first run.
  2. <obj>.ccache-* files (from CCACHE_DEBUG=1) for the second run where the bug appears.
  3. The manifest file.
  4. Hashes or contents of the include files in question.

For reference, here is a script I wrote when trying to reproduce the bug with ccache 3.7.9:

#!/bin/sh

set -eu

tmpdir=$(mktemp -d -p .)
trap "cd /; rm -rf $tmpdir" EXIT
cd $tmpdir

export CCACHE_DEBUG=1
export CCACHE_DEPEND=1
export CCACHE_DIR=ccache.dir
export CCACHE_HARDLINK=1
export CCACHE_SLOPPINESS=include_file_mtime,include_file_ctime

ccache=${CCACHE:-ccache}
cc=${CC:-cc}

(echo '#include "a.hpp"'; echo 'int main(){return 0;}') >a.cpp
echo '#include "b.hpp"' >a.hpp
echo '// hi there!' >b.hpp

$ccache $cc -c -o a.o -MD -MF a.m a.cpp
grep Result a.o.ccache-log
cat a.m

echo
rm b.hpp
echo >a.hpp

$ccache $cc -c -o a.o -MD -MF a.m a.cpp
grep Result a.o.ccache-log
cat a.m

Output for me:

[2020-05-28T21:38:57.285351 1384599] Result: cache miss
a.o: a.cpp /usr/include/stdc-predef.h a.hpp b.hpp

[2020-05-28T21:38:57.308242 1384607] Result: cache miss
a.o: a.cpp /usr/include/stdc-predef.h a.hpp

@waldheinz
Copy link
Author

Thanks for the fast response. I agree that it seems unlinkely that such a bug goes unnoticed for any extended amount of time, yet I was able to reproduce the result I consider bogus by simply removing the three lines dealing with the tmpdir in the reproducer you provided (thanks for that, I should really have done this).

I see expected output for the first run:

$ ./repro.sh 
[2020-05-29T10:52:58.556795 24363] Result: cache miss
a.o: a.cpp \
 /nix/store/796fpc57zsb4x57krgcdspxk0vp2rykq-glibc-2.30-dev/include/stdc-predef.h \
 a.hpp b.hpp

[2020-05-29T10:52:58.593491 24383] Result: cache miss
a.o: a.cpp \
 /nix/store/796fpc57zsb4x57krgcdspxk0vp2rykq-glibc-2.30-dev/include/stdc-predef.h \
 a.hpp

But for the second run, without touching anything:

$ ./repro.sh 
[2020-05-29T10:53:54.147333 25270] Result: cache hit (direct)
a.o: a.cpp \
 /nix/store/796fpc57zsb4x57krgcdspxk0vp2rykq-glibc-2.30-dev/include/stdc-predef.h \
 a.hpp

[2020-05-29T10:53:54.151951 25275] Result: cache hit (direct)
a.o: a.cpp \
 /nix/store/796fpc57zsb4x57krgcdspxk0vp2rykq-glibc-2.30-dev/include/stdc-predef.h \
 a.hpp

Note that the first cat a.m output here misses the b.hpp header, which IMHO should be present.

@jrosdahl
Copy link
Member

jrosdahl commented May 29, 2020

Thanks. Now I understand.

The issue is not the depend mode, it's the hard link mode. I would guess that the hard link mode is rarely used, which explains why the issue hasn't been reported before.

What happens is this:

  1. The compiler writes a.m including b.hpp and ccache hard links a.m into cache entry 1. a.m and cache entry 1's .d file now share i-nodes.
  2. The compiler writes new content without b.hpp to a.m and ccache hard links a.m into cache entry 2. a.m, cache entry 1 and cache entry 2 now share i-nodes, all of which of course contain the new content without b.hpp.

Thus the second compilation corrupts cache entry 1. This is an instance of the warning present for the hard_link configuration option:

Warning | Do not enable this option unless you are aware of the consequences. Using hard links may be slightly faster in some situations, but there are several pitfalls since the resulting object file will share i-node with the cached object file:

  1. If the resulting object file is modified in any way, the cached object file will be modified as well. For instance, if you run strip object.o or echo >object.o, you will corrupt the cache.

The compiler always unlinks the object file before writing a new one, but it doesn't do that for the dependency file, which is news to me.

So the quick fix is to not use the hard link mode since it's dangerous. A slow fix is to wait for ccache 4.0 which won't store dependency files as hard links and also will detect size changes of hard-linked object files (a46fd5b).

Edit: I will consider making 3.7.10 never use hard links for the .d file.

@jrosdahl jrosdahl added this to the 3.7.10 milestone May 29, 2020
jrosdahl added a commit to jrosdahl/ccache that referenced this issue May 29, 2020
The compiler unlinks the destination object file before writing, but it
apparently doesn’t do that for dependency files. This means that
compilation can corrupt a .d file that shares i-node with a cached .d
file when using the hard link mode. Here is a scenario where this can
happen:

1. There is a test.c which includes test.h.
2. When test.c is compiled, the compiler writes test.d which mentions
   test.h and ccache hard links test.d into cache entry 1. test.d and
   cache entry 1's .d file now share i-nodes.
3. The include of test.h is removed from test.c.
4. When test.c is compiled again the compiler overwrites test.d with new
   content without test.h and ccache hard links test.d into cache entry
   2. test.d, cache entry 1 and cache entry 2 now share i-nodes, all of
   which contain the new content without test.h.

Since we can’t be sure how the compiler behaves for other types of files
(.dwo, .cov, etc.), only try to to hard link object files.

Fixes ccache#599.
jrosdahl added a commit to jrosdahl/ccache that referenced this issue May 29, 2020
The compiler unlinks the destination object file before writing, but it
apparently doesn’t do that for dependency files. This means that
compilation can corrupt a .d file that shares i-node with a cached .d
file when using the hard link mode. Here is a scenario where this can
happen:

1. There is a test.c which includes test.h.
2. When test.c is compiled, the compiler writes test.d which mentions
   test.h and ccache hard links test.d into cache entry 1. test.d and
   cache entry 1's .d file now share i-nodes.
3. The include of test.h is removed from test.c.
4. When test.c is compiled again the compiler overwrites test.d with new
   content without test.h and ccache hard links test.d into cache entry
   2. test.d, cache entry 1 and cache entry 2 now share i-nodes, all of
   which contain the new content without test.h.

Since we can’t be sure how the compiler behaves for other types of files
(.dwo, .cov, etc.), only try to to hard link object files.

Fixes ccache#599.

(cherry picked from commit 443afc1)
@jrosdahl
Copy link
Member

Fixed on 3.7-maint now.

@jrosdahl jrosdahl changed the title False cache hit when using -MD False cache hit when using "-MD" May 29, 2020
@jrosdahl jrosdahl changed the title False cache hit when using "-MD" Hard link mode can corrupt .d files in the cache May 29, 2020
@waldheinz
Copy link
Author

Thanks! I always falsely assumed that the hardlink mode only influences files handed out of the cache (and I made sure we do not mess with those). That it also influences how files get into the cache was news to me. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Does not work as intended/documented
Projects
None yet
Development

No branches or pull requests

2 participants