Introduce llvm/gwp-asan #45226

Merged
merged 13 commits into from
Feb 9, 2023

Conversation

hanfei1991
Member

@hanfei1991 hanfei1991 commented Jan 12, 2023

Reimplementation of #36826.

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Introduce GWP-ASan as implemented by the LLVM runtime. This closes #27039.

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@robot-clickhouse robot-clickhouse added the pr-improvement (Pull request with some product improvements) and submodule changed (At least one submodule changed in this PR.) labels Jan 12, 2023
@hanfei1991 hanfei1991 marked this pull request as ready for review January 12, 2023 15:00
@hanfei1991 hanfei1991 changed the title from "introduce llvm/gwp-asan" to "[WIP] introduce llvm/gwp-asan" Jan 12, 2023
@qoega
Member

qoega commented Jan 12, 2023

Is it enabled by default? It means in current stateful/integration/stress it was not triggered?
And do you have an example how report will look like? We may need to adapt scripts that try to find Sanitizer reports.

@hanfei1991
Member Author

Is it enabled by default? It means in current stateful/integration/stress it was not triggered? And do you have an example how report will look like? We may need to adapt scripts that try to find Sanitizer reports.

It's supposed to be enabled by default. Why isn't it triggered in the stateful/integration/stress tests? I'm not familiar with these tests, so please let me know.

Yes, I'm preparing some illustrations and will refine them soon.

@hanfei1991
Member Author

#40410

@hanfei1991 hanfei1991 changed the title from "[WIP] introduce llvm/gwp-asan" to "Introduce llvm/gwp-asan" Jan 18, 2023
@nickitat nickitat self-assigned this Jan 18, 2023
@nickitat nickitat added the pr-performance (Pull request with some performance improvements) label Jan 19, 2023
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 removed the pr-performance (Pull request with some performance improvements) label Jan 20, 2023
@nickitat
Member

imo we should enable it for debug and release checks, but not for binaries we'll ship to users. it doesn't make a lot of sense anyway, because we don't have infrastructure like Chrome has for sending and processing these reports.
have you checked default parameters? maybe we could afford to give gwp-asan more memory or increase sampling rate to increase success probability?

@tavplubix
Member

but not for binaries we'll ship to users

I don't agree. The idea of gwp-asan is that it can be enabled on production. For tests we have ordinary asan.

it doesn't make a lot of sense anyway, because we don't have infrastructure like Chrome has for sending and processing these reports

We have an integration with Sentry. But I think it's totally fine if users simply create issues in our repo.

@nickitat
Member

nickitat commented Jan 24, 2023

I don't agree. The idea of gwp-asan is that it can be enabled on production. For tests we have ordinary asan.

browser production specifically. Chrome isn't so compute intensive.

profit of using this tool is not yet confirmed. doc says that currently they track only allocations <= page size, so it is really questionable.

but it is simple to resolve our argument - let Alexey decide )

@alexey-milovidov
Member

alexey-milovidov commented Jan 26, 2023

I'd agree with @tavplubix: the original idea is to enable it in production.
If it has <1% overhead, it is worth it.

The increased crash rate is also worth it because crashing is memory-safe while corrupting memory isn't.
Also, I expect it will uncover only very rare corner cases.

@alexey-milovidov
Member

Could you please add something into trap.cpp, so we can easily test how the GWP-ASan works on dev machines?

Member

@alexey-milovidov alexey-milovidov left a comment

The code LGTM. Nice and contained.

@hanfei1991 hanfei1991 marked this pull request as draft January 26, 2023 13:48
@hanfei1991
Member Author

Could you please add something into trap.cpp, so we can easily test how the GWP-ASan works on dev machines?

I found that it is not working yet.

WIP

@hanfei1991 hanfei1991 marked this pull request as ready for review January 30, 2023 16:09
@hanfei1991
Member Author

Could you please add something into trap.cpp, so we can easily test how the GWP-ASan works on dev machines?

I ran select trap('use after free') 20,000 times, and a SIGSEGV was captured:

2023.01.30 17:05:21.203746 [ 3461259 ] {} <Trace> BaseDaemon: Received signal 11
2023.01.30 17:05:21.203926 [ 3478690 ] {} <Fatal> BaseDaemon: ########################################
2023.01.30 17:05:21.203967 [ 3478690 ] {} <Fatal> BaseDaemon: (version 22.13.1.1, build id: BAA7548756AE5AEA2F61910F246D99A036B2C359) (from thread 3461260) (query_id: 925bda95-803b-4e52-b4d8-3d468efe3981) (query: select trap('use after free');) Received signal Segmentation fault (11)
2023.01.30 17:05:21.203988 [ 3478690 ] {} <Fatal> BaseDaemon: Address: 0x7fb265660000 Access: write. Attempted access has violated the permissions assigned to the memory area.
2023.01.30 17:05:21.204006 [ 3478690 ] {} <Fatal> BaseDaemon: Stack trace: 0x1058a4fc 0xbd6720a 0xbd66ece 0x154394f1 0x15439cd9 0x1543ac69 0x15a43997 0x15a4316a 0x15c0e6e0 0x15c1b08f 0x15c0fbc9 0x15c175ea 0x15c058d5 0x15be8f10 0x15bf32e1 0x15bf844f 0x1637c5cf 0x163767ff 0x16372243 0x1636ff71 0x163ec13a 0x163ea0bf 0x1633da33 0x1633cd73 0x16693ab8 0x16691c16 0x1726c1e4 0x1727b359 0x196b3ca7 0x196b418d 0x19824c07 0x198227e3 0x7fb265800b43 0x7fb265892a00
2023.01.30 17:05:21.645462 [ 3478690 ] {} <Fatal> BaseDaemon: 3. DB::FunctionTrap::executeImpl(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long) const @ 0x1058a4fc in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.000379 [ 3461450 ] {} <Trace> AsynchronousMetrics: MemoryTracking: was 370.07 MiB, peak 407.74 MiB, free memory in arenas 364.00 KiB, will set to 832.15 MiB (RSS), difference: 462.08 MiB
2023.01.30 17:05:22.066063 [ 3478690 ] {} <Fatal> BaseDaemon: 4. DB::IFunction::executeImplDryRun(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long) const @ 0xbd6720a in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.485915 [ 3478690 ] {} <Fatal> BaseDaemon: 5. DB::FunctionToExecutableFunctionAdaptor::executeDryRunImpl(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long) const @ 0xbd66ece in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.492873 [ 3478690 ] {} <Fatal> BaseDaemon: 6. ./build/./src/Functions/IFunction.cpp:0: DB::IExecutableFunction::executeWithoutLowCardinalityColumns(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long, bool) const @ 0x154394f1 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.499199 [ 3478690 ] {} <Fatal> BaseDaemon: 7.1. inlined from ./build/./contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:115: intrusive_ptr
2023.01.30 17:05:22.499214 [ 3478690 ] {} <Fatal> BaseDaemon: 7.2. inlined from ./build/./contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:122: boost::intrusive_ptr<DB::IColumn const>::operator=(boost::intrusive_ptr<DB::IColumn const>&&)
2023.01.30 17:05:22.499227 [ 3478690 ] {} <Fatal> BaseDaemon: 7.3. inlined from ./build/./src/Common/COW.h:136: COW<DB::IColumn>::immutable_ptr<DB::IColumn>::operator=(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&&)
2023.01.30 17:05:22.499238 [ 3478690 ] {} <Fatal> BaseDaemon: 7. ./build/./src/Functions/IFunction.cpp:302: DB::IExecutableFunction::executeWithoutSparseColumns(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long, bool) const @ 0x15439cd9 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.505820 [ 3478690 ] {} <Fatal> BaseDaemon: 8. ./build/./src/Functions/IFunction.cpp:0: DB::IExecutableFunction::execute(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long, bool) const @ 0x1543ac69 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.578952 [ 3478690 ] {} <Fatal> BaseDaemon: 9. ./build/./src/Interpreters/ActionsDAG.cpp:0: DB::ActionsDAG::addFunctionImpl(std::__1::shared_ptr<DB::IFunctionBase const> const&, std::__1::vector<DB::ActionsDAG::Node const*, std::__1::allocator<DB::ActionsDAG::Node const*>>, std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, bool) @ 0x15a43997 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.649428 [ 3478690 ] {} <Fatal> BaseDaemon: 10. ./build/./src/Interpreters/ActionsDAG.cpp:0: DB::ActionsDAG::addFunction(std::__1::shared_ptr<DB::IFunctionOverloadResolver> const&, std::__1::vector<DB::ActionsDAG::Node const*, std::__1::allocator<DB::ActionsDAG::Node const*>>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) @ 0x15a4316a in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.674120 [ 3478690 ] {} <Fatal> BaseDaemon: 11. ./build/./src/Interpreters/ActionsVisitor.cpp:0: DB::ScopeStack::addFunction(std::__1::shared_ptr<DB::IFunctionOverloadResolver> const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) @ 0x15c0e6e0 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.703083 [ 3478690 ] {} <Fatal> BaseDaemon: 12.1. inlined from ./build/./contrib/llvm-project/libcxx/include/string:1499: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__is_long[abi:v15000]() const
2023.01.30 17:05:22.703116 [ 3478690 ] {} <Fatal> BaseDaemon: 12.2. inlined from ./build/./contrib/llvm-project/libcxx/include/string:2333: ~basic_string
2023.01.30 17:05:22.703125 [ 3478690 ] {} <Fatal> BaseDaemon: 12. ./build/./src/Interpreters/ActionsVisitor.h:184: DB::ActionsMatcher::Data::addFunction(std::__1::shared_ptr<DB::IFunctionOverloadResolver> const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) @ 0x15c1b08f in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.729203 [ 3478690 ] {} <Fatal> BaseDaemon: 13.1. inlined from ./build/./contrib/llvm-project/libcxx/include/string:1499: std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__is_long[abi:v15000]() const
2023.01.30 17:05:22.729232 [ 3478690 ] {} <Fatal> BaseDaemon: 13.2. inlined from ./build/./contrib/llvm-project/libcxx/include/string:2333: ~basic_string
2023.01.30 17:05:22.729256 [ 3478690 ] {} <Fatal> BaseDaemon: 13. ./build/./src/Interpreters/ActionsVisitor.cpp:1312: DB::ActionsMatcher::visit(DB::ASTFunction const&, std::__1::shared_ptr<DB::IAST> const&, DB::ActionsMatcher::Data&) @ 0x15c0fbc9 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.753967 [ 3478690 ] {} <Fatal> BaseDaemon: 14. ./build/./src/Interpreters/ActionsVisitor.cpp:832: DB::ActionsMatcher::visit(DB::ASTExpressionList&, std::__1::shared_ptr<DB::IAST> const&, DB::ActionsMatcher::Data&) @ 0x15c175ea in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.767804 [ 3461330 ] {} <Debug> DNSResolver: Updating DNS cache
2023.01.30 17:05:22.767844 [ 3461330 ] {} <Debug> DNSResolver: Updated DNS cache
2023.01.30 17:05:22.782038 [ 3478690 ] {} <Fatal> BaseDaemon: 15. ./build/./src/Interpreters/InDepthNodeVisitor.h:78: DB::InDepthNodeVisitor<DB::ActionsMatcher, true, false, std::__1::shared_ptr<DB::IAST> const>::doVisit(std::__1::shared_ptr<DB::IAST> const&) @ 0x15c058d5 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.809896 [ 3478690 ] {} <Fatal> BaseDaemon: 16.1. inlined from ./build/./src/Interpreters/InDepthNodeVisitor.h:0: void DB::InDepthNodeVisitor<DB::ActionsMatcher, true, false, std::__1::shared_ptr<DB::IAST> const>::visitImplMain<false>(std::__1::shared_ptr<DB::IAST> const&)
2023.01.30 17:05:22.809929 [ 3478690 ] {} <Fatal> BaseDaemon: 16.2. inlined from ./build/./src/Interpreters/InDepthNodeVisitor.h:51: void DB::InDepthNodeVisitor<DB::ActionsMatcher, true, false, std::__1::shared_ptr<DB::IAST> const>::visitImpl<false>(std::__1::shared_ptr<DB::IAST> const&)
2023.01.30 17:05:22.809939 [ 3478690 ] {} <Fatal> BaseDaemon: 16.3. inlined from ./build/./src/Interpreters/InDepthNodeVisitor.h:32: DB::InDepthNodeVisitor<DB::ActionsMatcher, true, false, std::__1::shared_ptr<DB::IAST> const>::visit(std::__1::shared_ptr<DB::IAST> const&)
2023.01.30 17:05:22.809952 [ 3478690 ] {} <Fatal> BaseDaemon: 16. ./build/./src/Interpreters/ExpressionAnalyzer.cpp:563: DB::ExpressionAnalyzer::getRootActions(std::__1::shared_ptr<DB::IAST> const&, bool, std::__1::shared_ptr<DB::ActionsDAG>&, bool) @ 0x15be8f10 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.845307 [ 3478690 ] {} <Fatal> BaseDaemon: 17.1. inlined from ./build/./contrib/llvm-project/libcxx/include/__memory/shared_ptr.h:701: ~shared_ptr
2023.01.30 17:05:22.845341 [ 3478690 ] {} <Fatal> BaseDaemon: 17. ./build/./src/Interpreters/ExpressionAnalyzer.cpp:1535: DB::SelectQueryExpressionAnalyzer::appendSelect(DB::ExpressionActionsChain&, bool) @ 0x15bf32e1 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.882993 [ 3478690 ] {} <Fatal> BaseDaemon: 18. ./build/./src/Interpreters/ExpressionAnalyzer.cpp:2080: DB::ExpressionAnalysisResult::ExpressionAnalysisResult(DB::SelectQueryExpressionAnalyzer&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, bool, bool, bool, std::__1::shared_ptr<DB::FilterDAGInfo> const&, std::__1::shared_ptr<DB::FilterDAGInfo> const&, DB::Block const&) @ 0x15bf844f in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.931446 [ 3478690 ] {} <Fatal> BaseDaemon: 19. ./build/./src/Interpreters/InterpreterSelectQuery.cpp:840: DB::InterpreterSelectQuery::getSampleBlockImpl() @ 0x1637c5cf in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:22.975258 [ 3478690 ] {} <Fatal> BaseDaemon: 20. ./build/./src/Interpreters/InterpreterSelectQuery.cpp:673: DB::InterpreterSelectQuery::InterpreterSelectQuery(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context> const&, std::__1::optional<DB::Pipe>, std::__1::shared_ptr<DB::IStorage> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::PreparedSets>)::$_2::operator()(bool) const @ 0x163767ff in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:23.000365 [ 3461450 ] {} <Trace> AsynchronousMetrics: MemoryTracking: was 832.15 MiB, peak 832.15 MiB, free memory in arenas 488.00 KiB, will set to 908.82 MiB (RSS), difference: 76.67 MiB
2023.01.30 17:05:23.017808 [ 3478690 ] {} <Fatal> BaseDaemon: 21.1. inlined from ./build/./contrib/llvm-project/libcxx/include/__memory/shared_ptr.h:815: std::__1::shared_ptr<DB::Context>::operator->[abi:v15000]() const
2023.01.30 17:05:23.017836 [ 3478690 ] {} <Fatal> BaseDaemon: 21. ./build/./src/Interpreters/InterpreterSelectQuery.cpp:680: DB::InterpreterSelectQuery::InterpreterSelectQuery(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context> const&, std::__1::optional<DB::Pipe>, std::__1::shared_ptr<DB::IStorage> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, std::__1::shared_ptr<DB::PreparedSets>) @ 0x16372243 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:23.060823 [ 3478690 ] {} <Fatal> BaseDaemon: 22.1. inlined from ./build/./contrib/llvm-project/libcxx/include/optional:260: ~__optional_destruct_base
2023.01.30 17:05:23.060855 [ 3478690 ] {} <Fatal> BaseDaemon: 22. ./build/./src/Interpreters/InterpreterSelectQuery.cpp:198: DB::InterpreterSelectQuery::InterpreterSelectQuery(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&) @ 0x1636ff71 in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:23.074620 [ 3478690 ] {} <Fatal> BaseDaemon: 23. ./build/./src/Interpreters/InterpreterSelectWithUnionQuery.cpp:257: DB::InterpreterSelectWithUnionQuery::buildCurrentChildInterpreter(std::__1::shared_ptr<DB::IAST> const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&) @ 0x163ec13a in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:23.086657 [ 3478690 ] {} <Fatal> BaseDaemon: 24. ./build/./src/Interpreters/InterpreterSelectWithUnionQuery.cpp:150: DB::InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(std::__1::shared_ptr<DB::IAST> const&, std::__1::shared_ptr<DB::Context>, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>> const&) @ 0x163ea0bf in /home/hanfei/ClickHouse/build/programs/clickhouse
2023.01.30 17:05:23.091620 [ 3478690 ] {} <Fatal> BaseDaemon: 25.1. inlined from ./build/./contrib/llvm-project/libcxx/include/vector:434: ~vector
2023.01.30 17:05:23.091634 [ 3478690 ] {} <Fatal> BaseDaemon: 25. ./build/./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:714: std::__1::__unique_if<DB::InterpreterSelectWithUnionQuery>::__unique_single std::__1::make_unique[abi:v15000]<DB::InterpreterSelectWithUnionQuery, std::__1::shared_ptr<DB::IAST>&, std::__1::shared_ptr<DB::Context>&, DB::SelectQueryOptions const&>(std::__1::shared_ptr<DB::IAST>&, std::__1::shared_ptr<DB::Context>&, DB::SelectQueryOptions const&) @ 0x1633da33 in /home/hanfei/ClickHouse/build/programs/clickhouse
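
For readers unfamiliar with how a guard-page allocator turns a use-after-free into the SIGSEGV reported above ("Attempted access has violated the permissions assigned to the memory area"), here is a minimal sketch. It is not the llvm/gwp-asan code and not the trap function from this PR: the function name signalFromUseAfterFree is hypothetical, it assumes Linux, and it runs the faulting write in a forked child so the calling process survives.

```cpp
#include <signal.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical demo helper (assumption, not part of ClickHouse or LLVM).
// A child process "allocates" one page, "frees" it by revoking all access
// with mprotect, and then writes to it. Returns the signal that killed the
// child, 0 if it exited normally, or -1 if fork/waitpid failed.
inline int signalFromUseAfterFree()
{
    const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        // Make sure the default action (terminate) applies in the child.
        signal(SIGSEGV, SIG_DFL);

        void * mem = mmap(nullptr, page_size, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED)
            _exit(2);

        volatile char * p = static_cast<volatile char *>(mem);
        p[0] = 1; // valid write while the slot is "allocated"

        // "Free": keep the address range reserved but make it inaccessible.
        if (mprotect(mem, page_size, PROT_NONE) != 0)
            _exit(3);

        p[0] = 2; // use after free: the kernel delivers a signal here
        _exit(0); // not reached if the guard works
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, comparing the return value with SIGSEGV shows whether the stale write was caught. A real implementation additionally records allocation and deallocation stack traces so the report can say where the memory was freed.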

@alesapin
Member

alesapin commented Feb 7, 2023

Stack trace:
https://pastila.nl/?0765695e/1b9c1c4dfe1f10d353aa149d2ca0dd96

#0  0x00007f9c3abcc170 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f9c3abc40a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x000000000bc29b31 in pthread_mutex_lock (arg=0x120888b0 <Memory::GuardedAlloc+56>)
    at ../../ClickHouse/src/Common/ThreadFuzzer.cpp:391
#3  0x00000000119f572d in gwp_asan::GuardedPoolAllocator::disable (this=0x12088878 <Memory::GuardedAlloc>)
    at ../../ClickHouse/contrib/llvm-project/compiler-rt/lib/gwp_asan/guarded_pool_allocator.cpp:114
#4  0x00007f9c3adbfad0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007f9c3ae0df14 in fork () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x0000000011a1b3ed in Poco::Util::ServerApplication::beDaemon (this=<optimized out>)
    at ../../ClickHouse/contrib/poco/Util/src/ServerApplication.cpp:664
#7  0x0000000011a1b271 in Poco::Util::ServerApplication::run (this=0x7ffd39278dc8, argc=2, argv=0x7f9c39e91f20)
    at ../../ClickHouse/contrib/poco/Util/src/ServerApplication.cpp:595
#8  0x000000000bc6fa1f in mainEntryClickHouseServer (argc=2, argv=0x7f9c39e91f20)
    at ../../ClickHouse/programs/server/Server.cpp:195
#9  0x00000000075dbc71 in main (argc_=<optimized out>, argv_=<optimized out>)
    at ../../ClickHouse/programs/main.cpp:481

@alexey-milovidov alexey-milovidov merged commit d9fbf64 into ClickHouse:master Feb 9, 2023
@hanfei1991 hanfei1991 deleted the hanfei/gwp-asan branch February 9, 2023 09:38
@hanfei1991 hanfei1991 restored the hanfei/gwp-asan branch February 9, 2023 09:40
@Enna1

Enna1 commented Feb 9, 2023

Hi! Is there any reason why we choose llvm/gwpasan implementation instead of the implementation in #36826 ?

@hanfei1991
Member Author

Hi! Is there any reason why we choose llvm/gwpasan implementation instead of the implementation in #36826 ?

I don't have a specific reason. I just trust an official implementation more.

@hanfei1991
Member Author

Hi! Is there any reason why we choose llvm/gwpasan implementation instead of the implementation in #36826 ?

Actually, I read your Chinese blog, which is how I learned about LLVM's implementation, and I found it easy to use. Do you have any suggestions on different versions of gwp-asan?

@Enna1

Enna1 commented Feb 10, 2023

Actually, I read your Chinese blog, which is how I learned about LLVM's implementation, and I found it easy to use.

Wow! I never expected anyone to read my Chinese blog. I hope it was helpful.

Hi! Is there any reason why we choose llvm/gwpasan implementation instead of the implementation in #36826 ?

We (the ByteDance dynamic analysis group) are cooperating with our internal ClickHouse team and trying to integrate gwp-asan into our internal version of ClickHouse.
There were some gwp-asan implementations we considered:

I did a search and found #36826, so I backported that version of gwp-asan to our internal version of ClickHouse.
We are currently testing and benchmarking it. Recently I found that ClickHouse introduced llvm/gwp-asan instead of the implementation in #36826, so I asked this question.

Do you have any suggestions on different versions of gwp-asan?

Actually, I do not have a specific suggestion. But I read the implementation in #36826; here are some differences compared with the llvm/gwp-asan implementation:

@hanfei1991
Member Author

A failure of mprotect is not the user's fault. It is mostly caused by frequent mprotect calls exceeding a system limit: https://man7.org/linux/man-pages/man2/mprotect.2.html (but gwp-asan uses only a very small space for its slots, so I don't think that is the cause of this failure). Anyway, whatever the reason, the failure of a syscall should be handled properly. For example, in tcmalloc's implementation, an mprotect failure is captured like this: https://github.com/google/tcmalloc/blob/5034f8cecdbe559bf24e0ae7f7eb7c10b873ac9e/tcmalloc/guarded_page_allocator.cc#L94

We can try to fix it the way tcmalloc does, and ask the LLVM developers why the implementation is like this...
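
As an illustration of what "properly handled" could mean here, the following is a minimal sketch that reports an mprotect failure to the caller instead of aborting. It does not reproduce the tcmalloc or llvm/gwp-asan code; the names tryProtectPage and guardedFreeOrFallback are hypothetical, and a POSIX system is assumed.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>

// Hypothetical helper (assumption, not an API of tcmalloc or gwp-asan):
// try to make a page inaccessible. Returns 0 on success, otherwise the
// errno value set by mprotect (for example ENOMEM or EINVAL).
inline int tryProtectPage(void * page, size_t page_size)
{
    if (mprotect(page, page_size, PROT_NONE) != 0)
        return errno;
    return 0;
}

// Hypothetical guarded-free path: returns true if the guard was installed.
// On false, the caller can fall back to unguarded behaviour for this slot
// (for example, stop sampling) rather than terminating the process.
inline bool guardedFreeOrFallback(void * page, size_t page_size)
{
    return tryProtectPage(page, page_size) == 0;
}
```

With this shape, a failed syscall degrades detection for one slot but does not take the server down, which is the property asked for in the comment above.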

#if USE_GWP_ASAN
if (unlikely(GuardedAlloc.shouldSample()))
{
if constexpr (sizeof...(TAlign) == 1)
Collaborator

It is worth adding some metrics for guarded allocations, to improve introspection.
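
A rough sketch of what such a metric could look like next to the sampling check is shown below. Everything in it is an assumption for illustration: SampleCounter is a deterministic stand-in for GuardedAlloc.shouldSample() (the real sampler is randomized), and GuardedAllocStats and allocateMaybeGuarded are hypothetical names, not ClickHouse ProfileEvents or LLVM APIs.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the sampler: every `rate`-th call returns true.
class SampleCounter
{
public:
    explicit SampleCounter(uint64_t rate_) : rate(rate_) {}

    bool shouldSample()
    {
        // fetch_add returns the previous value, so +1 gives this call's index.
        const uint64_t call = calls.fetch_add(1, std::memory_order_relaxed) + 1;
        return rate != 0 && call % rate == 0;
    }

private:
    const uint64_t rate;
    std::atomic<uint64_t> calls{0};
};

// Hypothetical counters that could be exported as metrics.
struct GuardedAllocStats
{
    std::atomic<uint64_t> total{0};   // all allocations seen
    std::atomic<uint64_t> sampled{0}; // allocations routed to the guarded pool
};

// Returns true if this allocation would go to the guarded pool.
inline bool allocateMaybeGuarded(SampleCounter & sampler, GuardedAllocStats & stats)
{
    stats.total.fetch_add(1, std::memory_order_relaxed);
    if (sampler.shouldSample())
    {
        stats.sampled.fetch_add(1, std::memory_order_relaxed);
        return true;
    }
    return false;
}
```

Exposing both counters makes the effective sampling ratio visible at runtime, which helps when tuning the sampling rate discussed earlier in the thread.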

@azat
Collaborator

azat commented Feb 23, 2023

I would say that the mprotect problem is really serious, and it is better to disable GWP-ASan, at least by default in releases, for now.

Once the VMA limit is reached, you will get mmap failures, std::bad_alloc, ...
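
To make the VMA concern concrete: on Linux, giving neighbouring pages different protections splits one mapping into several, and the mprotect man page documents ENOMEM when the number of mappings would exceed the allowed maximum. The sketch below only observes that splitting by counting lines in /proc/self/maps; it assumes Linux, and the function names countMappings and mappingGrowthAfterAlternatingProtect are hypothetical.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <fstream>
#include <string>

// Counts the memory mappings of the current process by counting lines in
// /proc/self/maps. Returns -1 if the file is unavailable (non-Linux systems).
inline long countMappings()
{
    std::ifstream maps("/proc/self/maps");
    if (!maps)
        return -1;
    long count = 0;
    std::string line;
    while (std::getline(maps, line))
        ++count;
    return count;
}

// Maps `pages` contiguous pages, then makes every other page inaccessible.
// Returns how many mappings were added by the mprotect calls, or -1 if the
// measurement could not be made.
inline long mappingGrowthAfterAlternatingProtect(size_t pages)
{
    const size_t page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    void * mem = mmap(nullptr, pages * page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    char * base = static_cast<char *>(mem);

    const long before = countMappings();
    for (size_t i = 0; i < pages; i += 2)
    {
        // Each call can split an existing mapping into separate ones.
        if (mprotect(base + i * page_size, page_size, PROT_NONE) != 0)
            break;
    }
    const long after = countMappings();

    munmap(mem, pages * page_size);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling mappingGrowthAfterAlternatingProtect(64) on a Linux machine shows the mapping count growing roughly in step with the number of pages, which is why a pool of individually protected pages counts against the per-process mapping limit.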

Labels
pr-improvement Pull request with some product improvements submodule changed At least one submodule changed in this PR.
Development

Successfully merging this pull request may close these issues.

Add a technique similar to GWP-ASan
10 participants