Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clang relies on __PRETTY_FUNCTION__ and asserts #11

Closed
peterwaller-arm opened this issue Oct 29, 2021 · 8 comments · Fixed by #22
Closed

Clang relies on __PRETTY_FUNCTION__ and asserts #11

peterwaller-arm opened this issue Oct 29, 2021 · 8 comments · Fixed by #22

Comments

@peterwaller-arm
Copy link
Contributor

peterwaller-arm commented Oct 29, 2021

IIRC, we observed that use of __PRETTY_FUNCTION__ resulted in binaries which depended on the full path of the sources. In order to make the binaries reproducible, we have a workaround:

# Prevent paths leaking through __PRETTY_FUNCTION__
COMPILER_FLAGS+="-Wno-builtin-macro-redefined '-D__PRETTY_FUNCTION__=__func__' "

Unfortunately the resulting clang binary is completely broken and refuses to compile anything:

https://llvm.org/doxygen/TypeName_8h_source.html

It asserts on assert(!Name.empty() && "Unable to find the template parameter!");. I suspect that func doesn't contain the template type parameter information in it.

We'll need to find a workaround for this or stop overriding __PRETTY_FUNCTION__ and therefore have binaries which can only be reproduced if built at a specific absolute path.

@peterwaller-arm
Copy link
Contributor Author

From a quick grep, I suspect paths are leaking into the binary through this, in case it gives us inspiration for a workaround or fix to clang:
https://github.com/llvm/llvm-project/blob/d0ec4a8ed9a39b4e3a35e0826b63ac6c5bc21da1/clang/lib/AST/TypePrinter.cpp#L1327

@peterwaller-arm
Copy link
Contributor Author

Another approach would be to do something to ensure the compiler gets fed with relative paths contained within -I and the cpp paths, rather than absolute paths as are currently passed.

See also, cmake FAQ Why does CMake use full paths, or can I copy my build tree?.

So it doesn't look like we can make cmake help us here. It used to have CMAKE_USE_RELATIVE_PATHS but that was removed.

Fortuitously, though, it seems that we can use CCACHE_BASEDIR=$SOURCE_ROOT -- with that, ccache rewrites the paths to be relative to the source root, and then the paths do not leak into the binary.

This seems like a pretty decent workaround, but it does force anyone who wants to reproduce our binaries to use ccache.

Looking at the clang logic, I don't think the full path leaking into the binary is a bug, but it looks like possibly a feature request to make it respect -ffile-prefix-map. That doesn't look too hard to implement in principle -- the same stuff is used to format anonymous types for the debug prefix map. All the logic is there to do it, it just needs wiring together.

@peterwaller-arm
Copy link
Contributor Author

On second thought it looks like there is an -fmacro-prefix-map, which sounds like it's intended to do this sort of thing, but it fails in this case. So that seems more bug-ish:

https://github.com/llvm/llvm-project/blob/2d83392a88572e3126fe78e65d522bdd1a8a0ca3/clang/include/clang/Driver/Options.td#L2898-L2899

@veselink1
Copy link
Member

Unfortunately the resulting clang binary is completely broken and refuses to compile anything.

I'm very surprised and troubled by this.

I would prefer to think that -fmacro-prefix-map is supposed to cover this and that this is a bug,
Repro: https://godbolt.org/z/W1dGGoMhW

The GCC doc (https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html) states:

When preprocessing files residing in directory old, expand the FILE and BASE_FILE macros as if the files resided in directory new instead.

This seems to have been implemented in clang for compatibility with GCC (https://reviews.llvm.org/D49466) and hence the implementation only takes these two macros in consideration. PRETTY_FUNCTION does not include file names in GCC though. So if this is a bug, I wouldn't know if it is a bug in -fmacro-prefix-map or in __PRETTY_FUNCTION__ (assuming that compatibility between clang and GCC is desired).

Is patching TypeName.h an option? The file hasn't been changed in 6 years. It would appear that "UNKNOWN_TYPE" is a valid return value. Even better, we could use typeid(T).name() to get the type name in another way (do the builds have RTTI enabled?).

Patching the absolute paths before passing to the compiler seems to me to be the cleanest solution.
Would something like if [ -f $ARG ]; then ARG="$(realpath --relative-to="${PWD}" "$ARG")" fi work in this case?

@peterwaller-arm
Copy link
Contributor Author

Would something like if [ -f $ARG ]; then ARG="$(realpath --relative-to="${PWD}" "$ARG")" fi work in this case?

Maybe, except that you also have to handle -I paths as well (because otherwise header paths end up similarly in the binary with abspaths). I'm not convinced we can roll this ourselves quickly, and we'd end up having a compiler-wrapper. It feels better to use ccache since at least it is well developed compiler wrapper, if someone doesn't want to actually use the cache then it's should be possible to use CCACHE_DISABLE=1 in conjunction with CCACHE_BASEDIR.

@veselink1
Copy link
Member

veselink1 commented Oct 30, 2021

FWIW, we don't need to handle the conversion perfectly. In fact, we only need to patch the source .cpp file path, because that's what gets embedded into the compiled output. (Edit: This is incorrect, paths could leak through headers).
Revised proposal:

if [[ -f $ARG && ${ARG: -4} == ".cpp" ]]; then

But the Ccache solution might be simpler still.

@veselink1
Copy link
Member

Bug report for Clang: https://bugs.llvm.org/show_bug.cgi?id=52360

peterwaller-arm added a commit that referenced this issue Oct 31, 2021
Turns out paths continue to leak into binaries so long as the compiler
is fed with absolute paths, even if we use CCACHE_BASEDIR.

So rewrite them in build.ninja.

Also, fix llvm-config, which also captures the build directory.
@peterwaller-arm
Copy link
Contributor Author

I've found a ccache issue -- it fails to apply CCACHE_BASEDIR under some circumstances, for example:

[2021-10-31T15:01:10.767668 3911986] Executing /r/clang-12-thinltopgo/bin/clang++ -ffunction-sections -fdata-sections -mno-outline -no-canonical-prefixes -ffile-prefix-map=/tfs/20210112-00873T140753-2ed914cb7e9c073= -fmacro-prefix-map=/tfs/20210112-00873T140753-2ed914cb7e9c073= -fno-ident -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -w -ffunction-sections -fdata-sections -O3 -fno-exceptions -fno-rtti -std=c++14 -S -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/MC/MCParser -I../llvm/lib/MC/MCParser -Iinclude -I../llvm/include -UNDEBUG -E ../llvm/lib/MC/MCParser/AsmParser.cpp
[2021-10-31T15:01:10.954470 3911986] Found unsupported .incbin directive in source code
[2021-10-31T15:01:10.954718 3911986] Failed; falling back to running the real compiler
[2021-10-31T15:01:10.954727 3911986] Executing /r/clang-12-thinltopgo/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Ilib/MC/MCParser -I/tfs/20210112-00873T140753-2ed914cb7e9c073/llvm/lib/MC/MCParser -Iinclude -I/tfs/20210112-00873T140753-2ed914cb7e9c073/llvm/include -ffunction-sections -fdata-sections -mno-outline -no-canonical-prefixes -ffile-prefix-map=/tfs/20210112-00873T140753-2ed914cb7e9c073= -fmacro-prefix-map=/tfs/20210112-00873T140753-2ed914cb7e9c073= -fno-ident -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -w -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -fno-exceptions -fno-rtti -UNDEBUG -std=c++14 -MD -MT lib/MC/MCParser/CMakeFiles/LLVMMCParser.dir/AsmParser.cpp.o -MF lib/MC/MCParser/CMakeFiles/LLVMMCParser.dir/AsmParser.cpp.o.d -o lib/MC/MCParser/CMakeFiles/LLVMMCParser.dir/AsmParser.cpp.o -c /tfs/20210112-00873T140753-2ed914cb7e9c073/llvm/lib/MC/MCParser/AsmParser.cpp -S -o test.S

So when it 'falls back to running the real compiler', it does not rewrite the paths. So that's a little unfortunate. It may be intentional that the rewrite does not take place in that case.

I've taken to fixing the issue by applying a rewrite to build.ninja in 8920f91.

peterwaller-arm added a commit that referenced this issue Oct 31, 2021
Turns out paths continue to leak into binaries so long as the compiler
is fed with absolute paths, even if we use CCACHE_BASEDIR.

So rewrite them in build.ninja.

Also, fix llvm-config, which also captures the build directory.
peterwaller-arm added a commit that referenced this issue Nov 5, 2021
Turns out paths continue to leak into binaries so long as the compiler
is fed with absolute paths, even if we use CCACHE_BASEDIR.

So rewrite them in build.ninja.

Also, fix llvm-config, which also captures the build directory.

Signed-off-by: Peter Waller <peter.waller@arm.com>
peterwaller-arm added a commit that referenced this issue Nov 18, 2021
Turns out paths continue to leak into binaries so long as the compiler
is fed with absolute paths, even if we use CCACHE_BASEDIR.

So rewrite them in build.ninja.

Also, fix llvm-config, which also captures the build directory.

Signed-off-by: Peter Waller <peter.waller@arm.com>
peterwaller-arm added a commit that referenced this issue Nov 26, 2021
Turns out paths continue to leak into binaries so long as the compiler
is fed with absolute paths, even if we use CCACHE_BASEDIR.

So rewrite them in build.ninja.

Also, fix llvm-config, which also captures the build directory.

Signed-off-by: Peter Waller <peter.waller@arm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants