clang::ASTWriter can create a crashing PCH if an incorrect hasErrors value is passed #53952

Closed
TestingPlant opened this issue Feb 19, 2022 · 9 comments
Labels: clang:codegen, crash (Prefer [crash-on-valid] or [crash-on-invalid]), good first issue (https://github.com/llvm/llvm-project/contribute)

Comments

@TestingPlant

With the following code:

#include "clang/Frontend/ASTUnit.h"
#include "clang/Serialization/ASTWriter.h"
#include "clang/Serialization/InMemoryModuleCache.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Bitstream/BitstreamWriter.h"
#include <fstream>
#include <memory>
#include <sstream>
#include <string>

int main() {
	std::ifstream codeFile("input.cc");
	std::stringstream code;
	code << codeFile.rdbuf();

	const std::unique_ptr<clang::ASTUnit> astUnit = clang::tooling::buildASTFromCode(code.str());

	const bool hasErrors = false; // This will not cause a crash if this is true
	llvm::SmallString<128> pchData;
	llvm::BitstreamWriter pchDataStream(pchData);
	clang::InMemoryModuleCache moduleCache;
	clang::ASTWriter astWriter(pchDataStream, pchData, moduleCache, {}, false);
	astWriter.WriteAST(astUnit->getSema(), "", nullptr, "", hasErrors);

	std::ofstream file("input.gch", std::ios::binary | std::ios::out);
	file << static_cast<std::string>(pchData);
}

and the below file:

// input.cc
int main() {
	foo(FOO);
}

clang will crash when using the input.gch file generated from the code.

Output:

$ ./a.out                         
input.cc:3:6: error: use of undeclared identifier 'FOO'    
        foo(FOO);                                          
            ^
$ clang input.gch
/tmp/input.cc:3:2: error: cannot compile this l-value expression yet
        foo(FOO);
        ^~~
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /usr/bin/clang-13 -cc1 -triple x86_64-pc-linux-gnu -emit-obj -mrelax-all --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name input.gch -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu x86-64 -tune-cpu generic -debugger-tuning=gdb -fcoverage-compilation-dir=/tmp -resource-dir /usr/lib/clang/13.0.1 -fdebug-compilation-dir=/tmp -ferror-limit 19 -stack-protector 2 -fgnuc-version=4.2.1 -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/input-11f595.o -x precompiled-header input.gch
1.	<eof> parser at end of file
2.	/tmp/input.cc:2:5: LLVM IR generation of declaration 'main'
3.	/tmp/input.cc:2:5: Generating code for declaration 'main'
 #0 0x00007fd23586cea7 (/usr/lib/libLLVM-13.so+0xba6ea7)
 #1 0x00007fd23586a6a6 (/usr/lib/libLLVM-13.so+0xba46a6)
 #2 0x00007fd234920da0 __restore_rt sigaction.c:0:0
 #3 0x00007fd235a39b48 llvm::PointerType::get(llvm::Type*, unsigned int) (/usr/lib/libLLVM-13.so+0xd73b48)
 #4 0x00007fd23cdec3bd clang::CodeGen::CodeGenFunction::EmitUnsupportedLValue(clang::Expr const*, char const*) (/usr/lib/libclang-cpp.so.13+0x180e3bd)
 #5 0x00007fd23ce020e5 clang::CodeGen::CodeGenFunction::EmitLValue(clang::Expr const*) (/usr/lib/libclang-cpp.so.13+0x18240e5)
 #6 0x00007fd23ce12222 clang::CodeGen::CodeGenFunction::EmitCallee(clang::Expr const*) (/usr/lib/libclang-cpp.so.13+0x1834222)
 #7 0x00007fd23ce1266b clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/usr/lib/libclang-cpp.so.13+0x183466b)
 #8 0x00007fd23ce1dbbe (/usr/lib/libclang-cpp.so.13+0x183fbbe)
 #9 0x00007fd23ce57e52 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/lib/libclang-cpp.so.13+0x1879e52)
#10 0x00007fd23ce11cdf clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/lib/libclang-cpp.so.13+0x1833cdf)
#11 0x00007fd23ce11eb6 clang::CodeGen::CodeGenFunction::EmitIgnoredExpr(clang::Expr const*) (/usr/lib/libclang-cpp.so.13+0x1833eb6)
#12 0x00007fd23cf1f972 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/lib/libclang-cpp.so.13+0x1941972)
#13 0x00007fd23cf20822 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/lib/libclang-cpp.so.13+0x1942822)
#14 0x00007fd23cf78446 clang::CodeGen::CodeGenFunction::EmitFunctionBody(clang::Stmt const*) (/usr/lib/libclang-cpp.so.13+0x199a446)
#15 0x00007fd23cf98d40 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/usr/lib/libclang-cpp.so.13+0x19bad40)
#16 0x00007fd23cfa1d78 clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/lib/libclang-cpp.so.13+0x19c3d78)
#17 0x00007fd23cf9fb75 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/lib/libclang-cpp.so.13+0x19c1b75)
#18 0x00007fd23cfc01bf (/usr/lib/libclang-cpp.so.13+0x19e21bf)
#19 0x00007fd23d00f5a4 (/usr/lib/libclang-cpp.so.13+0x1a315a4)
#20 0x00007fd23cf2c779 (/usr/lib/libclang-cpp.so.13+0x194e779)
#21 0x00007fd23d3de019 (/usr/lib/libclang-cpp.so.13+0x1e00019)
#22 0x00007fd23d3683ab non-virtual thunk to clang::ASTReader::StartTranslationUnit(clang::ASTConsumer*) (/usr/lib/libclang-cpp.so.13+0x1d8a3ab)
#23 0x00007fd23bfcc0f6 clang::ParseAST(clang::Sema&, bool, bool) (/usr/lib/libclang-cpp.so.13+0x9ee0f6)
#24 0x00007fd23d5548b9 clang::FrontendAction::Execute() (/usr/lib/libclang-cpp.so.13+0x1f768b9)
#25 0x00007fd23d4fabbf clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/lib/libclang-cpp.so.13+0x1f1cbbf)
#26 0x00007fd23d59ff20 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/lib/libclang-cpp.so.13+0x1fc1f20)
#27 0x000055fe56ac784c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/bin/clang-13+0x1684c)
#28 0x000055fe56ac9c2d (/usr/bin/clang-13+0x18c2d)
#29 0x000055fe56abe185 main (/usr/bin/clang-13+0xd185)
#30 0x00007fd23490bb25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#31 0x000055fe56ac048e _start (/usr/bin/clang-13+0xf48e)
clang-13: error: unable to execute command: Segmentation fault (core dumped)
clang-13: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 13.0.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang-13: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.
@TestingPlant TestingPlant changed the title from "clang::ASTWriter can create a crahsing PCH if an incorrect hasErrors value is passed" to "clang::ASTWriter can create a crashing PCH if an incorrect hasErrors value is passed" Feb 19, 2022
@EugeneZelenko EugeneZelenko added clang:codegen crash Prefer [crash-on-valid] or [crash-on-invalid] and removed new issue labels Feb 19, 2022
@llvmbot
Collaborator

llvmbot commented Feb 19, 2022

@llvm/issue-subscribers-clang-codegen

@vgvassilev vgvassilev added the good first issue https://github.com/llvm/llvm-project/contribute label Feb 20, 2022
@Aadi-Mittal-2004

Hey there, I am new to open source. Could you please explain the issue in detail so I can start working on it?

@phyBrackets
Member

Hey @Aadi-Mittal-2004, if you are new to LLVM or open source in general, I'd suggest going through the LLVM contribution guidelines, which you can find here: https://llvm.org/docs/Contributing.html

About the issue:
The problem is that passing an incorrect hasErrors value to ASTWriter::WriteAST can produce a crashing PCH (precompiled header). When hasErrors is set to false but the source code actually contains errors, the ASTWriter may generate an invalid PCH that crashes clang when the PCH is later used for compilation. When hasErrors is set to true, the ASTWriter records that the AST was written with compiler errors. The generated PCH then carries information about the errors encountered during semantic analysis, so it will probably not cause a crash when it is used later.
So I think we need to check and handle the hasErrors state correctly before any major action.
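
To make the mismatch concrete, here is a minimal, self-contained sketch of the idea. It is a toy model, not clang's code: `FakeDiagnostics`, `FakePch`, `Outcome`, `writePch`, `usePch`, and `demo` are all hypothetical names, and the consumer's behaviour (skip code generation when the flag is set) is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the diagnostics collected while parsing.
struct FakeDiagnostics {
  bool uncompilableError;
};

// Hypothetical stand-in for a serialized AST. It records the flag the caller
// passed to the writer and whether the AST really contains broken nodes.
struct FakePch {
  bool flaggedAsErroneous;
  bool containsInvalidNodes;
};

enum class Outcome { Ok, NoCodegen, Crash };

// Models a writer that trusts the caller-supplied flag without checking it.
FakePch writePch(const FakeDiagnostics &diags, bool hasErrors) {
  return FakePch{hasErrors, diags.uncompilableError};
}

// Models a consumer that relies on the stored flag alone (an assumption).
Outcome usePch(const FakePch &pch) {
  if (pch.flaggedAsErroneous)
    return Outcome::NoCodegen; // flag set: code generation is not attempted
  if (pch.containsInvalidNodes)
    return Outcome::Crash; // flag was wrong: broken nodes reach code generation
  return Outcome::Ok;
}

// Walks through the scenarios from the explanation above.
void demo() {
  const FakeDiagnostics broken{true};
  const FakeDiagnostics clean{false};
  // Hard-coded false despite errors: the situation reported in this issue.
  assert(usePch(writePch(broken, false)) == Outcome::Crash);
  // Flag derived from the diagnostics: the consumer can protect itself.
  assert(usePch(writePch(broken, broken.uncompilableError)) == Outcome::NoCodegen);
  assert(usePch(writePch(clean, clean.uncompilableError)) == Outcome::Ok);
}
```

Calling `demo()` from a `main` function runs the three scenarios. The point of the sketch is only that the flag should be derived from the diagnostics state rather than hard-coded by the caller.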

@rajkumarananthu
Contributor

Hi,

If no one is working on this issue, I would like to take it up. Could someone please assign the issue to me? As I am not in the contributor list, I am not able to assign it to myself.

@phyBrackets Thanks for the contributing guidelines and the detailed explanation of the scenario here.

Thanks
Rajkumar Ananthu.

@rajkumarananthu
Contributor

rajkumarananthu commented Sep 22, 2023

Hi @TestingPlant @phyBrackets @danix800

I am trying to reproduce the issue and find its root cause, but I am stuck on a linker error. Can any of you help me figure this out?

I am new to the LLVM project. I managed to build clang and LLVM properly, and I tried a few other things with the build to get more exposure to how things work.

But when I try to compile and link the code given above, I get the following linker errors:

/usr/bin/ld: /tmp/cch79L0i.o:(.data.rel+0x0): undefined reference to `llvm::EnableABIBreakingChecks'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::MallocAllocator::Deallocate(void const*, unsigned long, unsigned long)':
test.cc:(.text._ZN4llvm15MallocAllocator10DeallocateEPKvmm[_ZN4llvm15MallocAllocator10DeallocateEPKvmm]+0x2f): undefined reference to `llvm::deallocate_buffer(void*, unsigned long, unsigned long)'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::SmallString<128u>::operator std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >() const':
test.cc:(.text._ZNK4llvm11SmallStringILj128EEcvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEv[_ZNK4llvm11SmallStringILj128EEcvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEv]+0x38): undefined reference to `llvm::SmallVectorBase<unsigned long>::size() const'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::SmallVectorTemplateCommon<char, void>::end()':
test.cc:(.text._ZN4llvm25SmallVectorTemplateCommonIcvE3endEv[_ZN4llvm25SmallVectorTemplateCommonIcvE3endEv]+0x28): undefined reference to `llvm::SmallVectorBase<unsigned long>::size() const'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::SmallVectorTemplateCommon<std::unique_ptr<clang::PCHContainerReader, std::default_delete<clang::PCHContainerReader> >, void>::end()':
test.cc:(.text._ZN4llvm25SmallVectorTemplateCommonISt10unique_ptrIN5clang18PCHContainerReaderESt14default_deleteIS3_EEvE3endEv[_ZN4llvm25SmallVectorTemplateCommonISt10unique_ptrIN5clang18PCHContainerReaderESt14default_deleteIS3_EEvE3endEv]+0x28): undefined reference to `llvm::SmallVectorBase<unsigned int>::size() const'

I am using the following command to compile & generate the executable for the input C++ program: (bld_ninja is my build directory)

g++ -I clang/include -I bld_ninja/tools/clang/include -I bld_ninja/include -I llvm/include test.cc $CLANGLIBS -std=c++17   -fno-exceptions -funwind-tables -fno-rtti -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_LIBCPP_ENABLE_HARDENED_MODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -L bld_ninja/lib

$CLANGLIBS lists the clang libraries in the correct link order:

export CLANGLIBS="-lclangTooling -lclangFrontendTool -lclangFrontend -lclangDriver -lclangSerialization -lclangCodeGen -lclangParse -lclangSema -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore -lclangAnalysis -lclangARCMigrate -lclangRewrite -lclangRewriteFrontend -lclangEdit -lclangAST -lclangLex -lclangBasic -lcurses"

Based on the errors above, I assume the LLVM libraries are not being linked properly, and I am not sure in which order the LLVM libraries have to be listed. I tried using llvm-config to get the list, but that did not solve my problem.

Can anyone help me with this?

Thanks
Rajkumar Ananthu

@phyBrackets
Member

Hi, I'm not exactly sure. Did you try building with $(llvm-config --cxxflags) $(llvm-config --ldflags)? Or you might want to use -DLLVM_DISABLE_ABI_BREAKING_CHECKS_ENFORCING=OFF

@rajkumarananthu
Contributor

@phyBrackets I have tried this, but the issue is still the same. I have also tried passing absolute paths, with the same result.

@rajkumarananthu
Contributor

Hi Team,

I more or less followed this thread: https://stackoverflow.com/questions/8607432/link-fails-with-clang-llvm-using-g

and used this makefile to compile: https://github.com/loarabia/Clang-tutorial/blob/master/makefile

Then, for the further linker errors I was getting, I followed the LLVM thread https://discourse.llvm.org/t/undefined-reference-only-when-including-astmatchers/67687 and added libclang-cpp.so to the $CLANGLIBS list, which solved my issue.

Now I am able to reproduce the issue. I will work on finding the root cause and post any updates here.

Thank you for the time and support!

@rajkumarananthu
Contributor

rajkumarananthu commented Oct 3, 2023

Hi Team,

I was able to find the root cause: hasErrors has to be true when there is a compilation error in the astUnit. Because it is false here, the generated PCH crashes.

I can avoid this in two ways:

  1. Throw a diagnostic error if the hasErrors value received by ASTWriter::WriteAST() is not equal to Sema.PP.getDiagnostics().hasUncompilableErrorOccurred()
  2. Or force the ASTHasCompilerErrors member variable inside the ASTWriter class to the correct value, which is Sema.PP.getDiagnostics().hasUncompilableErrorOccurred()

I am not sure which is better. Any ideas or suggestions? Forcing ASTHasCompilerErrors may not be right in some cases, but this kind of scenario can arise whenever the hasErrors value given to WriteAST is wrong.

One more thing: are we still following the Phabricator review process, or have we moved to pull requests?

Thanks
Rajkumar Ananthu.

rajkumarananthu added a commit to rajkumarananthu/llvm-project that referenced this issue Oct 3, 2023
Issue llvm#53952 reports that clang produces a crashing PCH file when hasErrors is passed incorrectly to the WriteAST method.

To fix the issue, I have added an assertion to make sure the given value of ASTHasCompilerErrors matches the Preprocessor diagnostics. This assertion is triggered in Debug builds.

For release builds, based on the conditional check, forcefully set the ASTHasCompilerErrors member variable to a valid value from the Preprocessor.
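
The approach described in this commit message can be sketched roughly as follows. This is a self-contained illustration loosely based on the description, not the actual patch: `FakeDiagnostics` and `FakeAstWriter` are hypothetical stand-ins for the real clang classes, and only the flag handling is modelled.

```cpp
#include <cassert>

// Hypothetical stand-in for the preprocessor's diagnostics state.
struct FakeDiagnostics {
  bool uncompilableError;
  bool hasUncompilableErrorOccurred() const { return uncompilableError; }
};

// Hypothetical stand-in for the writer; only the flag handling is modelled.
class FakeAstWriter {
public:
  void writeAst(const FakeDiagnostics &diags, bool hasErrors) {
    const bool actual = diags.hasUncompilableErrorOccurred();
    // Debug builds: catch a caller that passes a value which disagrees
    // with the diagnostics.
    assert(hasErrors == actual && "hasErrors disagrees with diagnostics");
    // Release builds (NDEBUG defined): the assert compiles away, so the
    // member is set from the diagnostics regardless of what was passed.
    (void)hasErrors;
    astHasCompilerErrors = actual;
  }

  bool astHasCompilerErrors = false;
};
```

With this shape, a mismatched call such as `writeAst(diagsWithErrors, false)` aborts in a debug build, while a release build silently records `true`. Whether silently overriding the caller is desirable is the open question raised in the comment above.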
kazutakahirata added a commit that referenced this issue Oct 5, 2023
…ber variable correctly based on the PP diagnostics. (#68127)"

This reverts commit a50e63b.

With clang-14.0.6 as the host compiler, I'm getting:

ld.lld: error: undefined symbol: clang::ASTWriter::WriteAST(clang::Sema&, llvm::StringRef, clang::Module*, llvm::StringRef, bool, bool)
>>> referenced by ASTUnit.cpp
>>>               ASTUnit.cpp.o:(clang::ASTUnit::serialize(llvm::raw_ostream&)) in archive lib/libclangFrontend.a
AaronBallman added a commit that referenced this issue Oct 6, 2023
…rors member variable correctly based on the PP diagnostics. (#68127)""

This reverts commit a6acf3f and
relands a50e63b. The original revert
was done by mistake.