Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BugFix] Fix crash during decoding min/max when type mismatch in parquet file #8849

Merged
merged 3 commits into from Jul 19, 2022
Merged

[BugFix] Fix crash during decoding min/max when type mismatch in parquet file #8849

merged 3 commits into from Jul 19, 2022

Conversation

dirtysalt
Copy link
Contributor

@dirtysalt dirtysalt commented Jul 18, 2022

What type of PR is this:

  • bug
  • feature
  • enhancement
  • refactor
  • others

Which issues of this PR fixes :

Fixes ##8848

Problem Summary(Required) :

It's because we ignore return status right here.

RETURN_IF_ERROR(ColumnConverterFactory::create_converter(field, type, timezone, &converter));

Status::NotSupported can be taken as success of decode_min_max_column or not. Because right down there, following code could also return Staus::NotSupported, but this error is fatal.

ColumnConverterFactory::create_converter(field, type, timezone, &converter));

@imay
Copy link
Contributor

imay commented Jul 18, 2022

@dirtysalt I think you should explain the reason why this PR will fix the crash.

imay
imay previously approved these changes Jul 18, 2022
Copy link
Contributor

@imay imay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@miomiocat
Copy link
Contributor

LGTM

DorianZheng
DorianZheng previously approved these changes Jul 18, 2022
Copy link
Contributor

@DorianZheng DorianZheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dirtysalt dirtysalt dismissed stale reviews from DorianZheng and imay via 3251c87 July 18, 2022 13:01
@dirtysalt
Copy link
Contributor Author

dirtysalt commented Jul 18, 2022

I get a very weird ASN stacktrace only when pipeline mode enabled .

It looks like min_max_conjunct_ctxs has been deallocated twice. And I suspect that the following call makes min_max_conjunct_ctxs held by RuntimeState too, and when RuntimeState is deallocated, min_max_conjunct_ctxs will be deallocated again.

start time: Mon Jul 18 20:54:54 CST 2022
I0000 00:00:00.000000 177517 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 2
=================================================================
==177517==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900087ff80 at pc 0x000007c6edb5 bp 0x7f7a9b3bf580 sp 0x7f7a9b3bf578
READ of size 8 at 0x61900087ff80 thread T508 (pip_executor)
    #0 0x7c6edb4 in starrocks::ExprContext::close(starrocks::RuntimeState*) /home/disk4/zhangyan/StarRocks/be/src/exprs/expr_context.cpp:94
    #1 0x7c6dde7 in starrocks::ExprContext::~ExprContext() /home/disk4/zhangyan/StarRocks/be/src/exprs/expr_context.cpp:49
    #2 0x7c735a7 in starrocks::ObjectPool::add<starrocks::ExprContext>(starrocks::ExprContext*)::{lambda(void*)#1}::operator()(void*) const /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:45
    #3 0x7c735d8 in starrocks::ObjectPool::add<starrocks::ExprContext>(starrocks::ExprContext*)::{lambda(void*)#1}::_FUN(void*) /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:45
    #4 0x55cd402 in starrocks::ObjectPool::clear() /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:52
    #5 0x55cd20b in starrocks::ObjectPool::~ObjectPool() /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:34
    #6 0xa6c558e in std::_Sp_counted_ptr<starrocks::ObjectPool*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:380
    #7 0x50ee19a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:158
    #8 0x50ec0ed in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:733
    #9 0xa6ac9a7 in std::__shared_ptr<starrocks::ObjectPool, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:1183
    #10 0xa6ac9c3 in std::shared_ptr<starrocks::ObjectPool>::~shared_ptr() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr.h:121
    #11 0xa6a57c3 in starrocks::RuntimeState::~RuntimeState() /home/disk4/zhangyan/StarRocks/be/src/runtime/runtime_state.cpp:101
    #12 0x6abc315 in std::default_delete<starrocks::RuntimeState>::operator()(starrocks::RuntimeState*) const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/unique_ptr.h:85
    #13 0x6adf294 in std::_Sp_counted_deleter<starrocks::RuntimeState*, std::default_delete<starrocks::RuntimeState>, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:474
    #14 0x50ee19a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:158
    #15 0x50ec0ed in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:733
    #16 0x6aae6c3 in std::__shared_ptr<starrocks::RuntimeState, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:1183
    #17 0x6aae6df in std::shared_ptr<starrocks::RuntimeState>::~shared_ptr() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr.h:121
    #18 0x6c2cd5a in starrocks::pipeline::GlobalDriverExecutor::_worker_thread() /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/pipeline_driver_executor.cpp:163
    #19 0x6c2b0b5 in operator() /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/pipeline_driver_executor.cpp:56
    #20 0x6c3327f in __invoke_impl<void, starrocks::pipeline::GlobalDriverExecutor::initialize(int)::<lambda()>&> /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:60
    #21 0x6c32975 in __invoke_r<void, starrocks::pipeline::GlobalDriverExecutor::initialize(int)::<lambda()>&> /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:110
    #22 0x6c31d8f in _M_invoke /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:291
    #23 0x51d645f in std::function<void ()>::operator()() const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:622
    #24 0xac5c5b9 in starrocks::FunctionRunnable::run() /home/disk4/zhangyan/StarRocks/be/src/util/threadpool.cpp:44
    #25 0xac59274 in starrocks::ThreadPool::dispatch_thread() /home/disk4/zhangyan/StarRocks/be/src/util/threadpool.cpp:513
    #26 0xac77b33 in void std::__invoke_impl<void, void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&>(std::__invoke_memfun_deref, void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:73
    #27 0xac7748c in std::__invoke_result<void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&>::type std::__invoke<void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&>(void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:95
    #28 0xac7687c in void std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/functional:416
    #29 0xac751b0 in void std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>::operator()<, void>() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/functional:499
    #30 0xac7204f in void std::__invoke_impl<void, std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>&>(std::__invoke_other, std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:60
    #31 0xac6f55f in std::enable_if<is_invocable_r_v<void, std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>&>, void>::type std::__invoke_r<void, std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>&>(std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:110
    #32 0xac6b2bc in std::_Function_handler<void (), std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()> >::_M_invoke(std::_Any_data const&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:291
    #33 0x51d645f in std::function<void ()>::operator()() const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:622
    #34 0xac43d80 in starrocks::Thread::supervise_thread(void*) /home/disk4/zhangyan/StarRocks/be/src/util/thread.cpp:326
    #35 0x7f7b89dd5ea4 in start_thread (/lib64/libpthread.so.0+0x7ea4)
    #36 0x7f7b893f0b0c in clone (/lib64/libc.so.6+0xfeb0c)

0x61900087ff80 is located 0 bytes inside of 992-byte region [0x61900087ff80,0x619000880360)
freed by thread T186 here:
    #0 0x50a5637 in operator delete(void*, unsigned long) ../../../../libsanitizer/asan/asan_new_delete.cpp:172
    #1 0x7f30586 in starrocks::vectorized::VectorizedBinaryPredicate<(starrocks::PrimitiveType)10, starrocks::vectorized::BinaryPredFunc<std::greater_equal<starrocks::Slice> > >::~VectorizedBinaryPredicate() /home/disk4/zhangyan/StarRocks/be/src/exprs/vectorized/binary_predicate.cpp:52
    #2 0x7c81870 in starrocks::ObjectPool::add<starrocks::Expr>(starrocks::Expr*)::{lambda(void*)#1}::operator()(void*) const /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:45
    #3 0x7c81890 in starrocks::ObjectPool::add<starrocks::Expr>(starrocks::Expr*)::{lambda(void*)#1}::_FUN(void*) /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:45
    #4 0x55cd402 in starrocks::ObjectPool::clear() /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:52
    #5 0x55cd20b in starrocks::ObjectPool::~ObjectPool() /home/disk4/zhangyan/StarRocks/be/src/common/object_pool.h:34
    #6 0x55d7c3a in starrocks::connector::HiveDataSource::~HiveDataSource() /home/disk4/zhangyan/StarRocks/be/src/connector/hive_connector.h:39
    #7 0x55d7c71 in starrocks::connector::HiveDataSource::~HiveDataSource() /home/disk4/zhangyan/StarRocks/be/src/connector/hive_connector.h:39
    #8 0x69fdf6a in std::default_delete<starrocks::connector::DataSource>::operator()(starrocks::connector::DataSource*) const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/unique_ptr.h:85
    #9 0x69fc94c in std::unique_ptr<starrocks::connector::DataSource, std::default_delete<starrocks::connector::DataSource> >::~unique_ptr() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/unique_ptr.h:361
    #10 0x6bac5fb in starrocks::pipeline::ConnectorChunkSource::~ConnectorChunkSource() /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/scan/connector_scan_operator.cpp:159
    #11 0x6bb1c87 in void __gnu_cxx::new_allocator<starrocks::pipeline::ConnectorChunkSource>::destroy<starrocks::pipeline::ConnectorChunkSource>(starrocks::pipeline::ConnectorChunkSource*) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/ext/new_allocator.h:156
    #12 0x6bb1c1a in void std::allocator_traits<std::allocator<starrocks::pipeline::ConnectorChunkSource> >::destroy<starrocks::pipeline::ConnectorChunkSource>(std::allocator<starrocks::pipeline::ConnectorChunkSource>&, starrocks::pipeline::ConnectorChunkSource*) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/alloc_traits.h:531
    #13 0x6bb17b6 in std::_Sp_counted_ptr_inplace<starrocks::pipeline::ConnectorChunkSource, std::allocator<starrocks::pipeline::ConnectorChunkSource>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:560
    #14 0x50ee19a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:158
    #15 0x50ec0ed in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:733
    #16 0x6b77b37 in std::__shared_ptr<starrocks::pipeline::ChunkSource, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:1183
    #17 0x6b799f9 in std::__shared_ptr<starrocks::pipeline::ChunkSource, (__gnu_cxx::_Lock_policy)2>::operator=(std::__shared_ptr<starrocks::pipeline::ChunkSource, (__gnu_cxx::_Lock_policy)2>&&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr_base.h:1279
    #18 0x6b7859f in std::shared_ptr<starrocks::pipeline::ChunkSource>::operator=(std::shared_ptr<starrocks::pipeline::ChunkSource>&&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/shared_ptr.h:384
    #19 0x6b7017d in starrocks::pipeline::ScanOperator::_close_chunk_source_unlocked(starrocks::RuntimeState*, int) /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/scan/scan_operator.cpp:235
    #20 0x6b7067f in starrocks::pipeline::ScanOperator::_finish_chunk_source_task(starrocks::RuntimeState*, int, long, long, long) /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/scan/scan_operator.cpp:261
    #21 0x6b71485 in operator() /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/scan/scan_operator.cpp:334
    #22 0x6b75c2b in __invoke_impl<void, starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::<lambda()>&> /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:60
    #23 0x6b75abe in __invoke_r<void, starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::<lambda()>&> /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:110
    #24 0x6b7564f in _M_invoke /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:291
    #25 0x51d645f in std::function<void ()>::operator()() const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:622
    #26 0xa6355ec in starrocks::PriorityThreadPool::work_thread(int) /home/disk4/zhangyan/StarRocks/be/src/util/priority_thread_pool.hpp:180
    #27 0xa6509dc in void std::__invoke_impl<void, void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&>(std::__invoke_memfun_deref, void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:73
    #28 0xa65081c in std::__invoke_result<void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&>::type std::__invoke<void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&>(void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:95
    #29 0xa6507c6 in decltype (__invoke((*this)._M_pmf, (forward<starrocks::PriorityThreadPool*&>)({parm#1}), (forward<int&>)({parm#1}))) std::_Mem_fn_base<void (starrocks::PriorityThreadPool::*)(int), true>::operator()<starrocks::PriorityThreadPool*&, int&>(starrocks::PriorityThreadPool*&, int&) const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/functional:122

previously allocated by thread T186 here:
    #0 0x50a4817 in operator new(unsigned long) ../../../../libsanitizer/asan/asan_new_delete.cpp:99
    #1 0x7f2b4e6 in starrocks::Expr* starrocks::vectorized::BinaryPredicateBuilder::operator()<(starrocks::PrimitiveType)10>(starrocks::TExprNode const&) /home/disk4/zhangyan/StarRocks/be/src/exprs/vectorized/binary_predicate.cpp:121
    #2 0x7f298d9 in starrocks::Expr* starrocks::type_dispatch_predicate<starrocks::Expr*, starrocks::vectorized::BinaryPredicateBuilder, starrocks::TExprNode>(starrocks::PrimitiveType, bool, starrocks::vectorized::BinaryPredicateBuilder, starrocks::TExprNode) /home/disk4/zhangyan/StarRocks/be/src/runtime/primitive_type_infra.h:128
    #3 0x7f27aa8 in starrocks::vectorized::VectorizedBinaryPredicateFactory::from_thrift(starrocks::TExprNode const&) /home/disk4/zhangyan/StarRocks/be/src/exprs/vectorized/binary_predicate.cpp:134
    #4 0x7c785a3 in starrocks::Expr::create_vectorized_expr(starrocks::ObjectPool*, starrocks::TExprNode const&, starrocks::Expr**) /home/disk4/zhangyan/StarRocks/be/src/exprs/expr.cpp:258
    #5 0x7c77899 in starrocks::Expr::create_tree_from_thrift(starrocks::ObjectPool*, std::vector<starrocks::TExprNode, std::allocator<starrocks::TExprNode> > const&, starrocks::Expr*, int*, starrocks::Expr**, starrocks::ExprContext**) /home/disk4/zhangyan/StarRocks/be/src/exprs/expr.cpp:217
    #6 0x7c76bd0 in starrocks::Expr::create_expr_tree(starrocks::ObjectPool*, starrocks::TExpr const&, starrocks::ExprContext**) /home/disk4/zhangyan/StarRocks/be/src/exprs/expr.cpp:187
    #7 0x7c7730f in starrocks::Expr::create_expr_trees(starrocks::ObjectPool*, std::vector<starrocks::TExpr, std::allocator<starrocks::TExpr> > const&, std::vector<starrocks::ExprContext*, std::allocator<starrocks::ExprContext*> >*) /home/disk4/zhangyan/StarRocks/be/src/exprs/expr.cpp:203
    #8 0x55c571e in starrocks::connector::HiveDataSource::_init_conjunct_ctxs(starrocks::RuntimeState*) /home/disk4/zhangyan/StarRocks/be/src/connector/hive_connector.cpp:73
    #9 0x55c50db in starrocks::connector::HiveDataSource::open(starrocks::RuntimeState*) /home/disk4/zhangyan/StarRocks/be/src/connector/hive_connector.cpp:58
    #10 0x6bac97c in starrocks::pipeline::ConnectorChunkSource::_read_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::vectorized::Chunk>*) /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/scan/connector_scan_operator.cpp:178
    #11 0x6b35655 in starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking(unsigned long, starrocks::RuntimeState*) /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/scan/chunk_source.cpp:56
    #12 0x6b712e8 in operator() /home/disk4/zhangyan/StarRocks/be/src/exec/pipeline/scan/scan_operator.cpp:328
    #13 0x6b75c2b in __invoke_impl<void, starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::<lambda()>&> /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:60
    #14 0x6b75abe in __invoke_r<void, starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::<lambda()>&> /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:110
    #15 0x6b7564f in _M_invoke /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:291
    #16 0x51d645f in std::function<void ()>::operator()() const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/std_function.h:622
    #17 0xa6355ec in starrocks::PriorityThreadPool::work_thread(int) /home/disk4/zhangyan/StarRocks/be/src/util/priority_thread_pool.hpp:180
    #18 0xa6509dc in void std::__invoke_impl<void, void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&>(std::__invoke_memfun_deref, void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:73
    #19 0xa65081c in std::__invoke_result<void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&>::type std::__invoke<void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&>(void (starrocks::PriorityThreadPool::* const&)(int), starrocks::PriorityThreadPool*&, int&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:95
    #20 0xa6507c6 in decltype (__invoke((*this)._M_pmf, (forward<starrocks::PriorityThreadPool*&>)({parm#1}), (forward<int&>)({parm#1}))) std::_Mem_fn_base<void (starrocks::PriorityThreadPool::*)(int), true>::operator()<starrocks::PriorityThreadPool*&, int&>(starrocks::PriorityThreadPool*&, int&) const /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/functional:122
    #21 0xa650777 in void std::__invoke_impl<void, std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)>&, starrocks::PriorityThreadPool*&, int&>(std::__invoke_other, std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)>&, starrocks::PriorityThreadPool*&, int&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:60
    #22 0xa6506ce in std::enable_if<is_invocable_r_v<void, std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)>&, starrocks::PriorityThreadPool*&, int&>, void>::type std::__invoke_r<void, std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)>&, starrocks::PriorityThreadPool*&, int&>(std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)>&, starrocks::PriorityThreadPool*&, int&) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/bits/invoke.h:110
    #23 0xa6505ac in void std::_Bind_result<void, std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)> (starrocks::PriorityThreadPool*, int)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/functional:566
    #24 0xa6502b0 in void std::_Bind_result<void, std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)> (starrocks::PriorityThreadPool*, int)>::operator()<>() /home/disk5/sr-deps/toolchain/installed/gcc-10.3.0/include/c++/10.3.0/functional:625
    #25 0xa64f803 in boost::detail::thread_data<std::_Bind_result<void, std::_Mem_fn<void (starrocks::PriorityThreadPool::*)(int)> (starrocks::PriorityThreadPool*, int)> >::run() /home/disk3/sr-deps/thirdparty/installed/include/boost/thread/detail/thread.hpp:120

@dirtysalt
Copy link
Contributor Author

It's because release order of ExprContxt


I find a quite subtle problem in create expr tree from thrift. It's about orders of element in pool. Correct me if I'm wrong

Status Expr::create_tree_from_thrift(ObjectPool* pool, const std::vector<TExprNode>& nodes, Expr* parent, int* node_idx,
                   Expr** root_expr, ExprContext** ctx) {
  int num_children = nodes[*node_idx].num_children;
  Expr* expr = nullptr;
  RETURN_IF_ERROR(create_vectorized_expr(pool, nodes[*node_idx], &expr));
  DCHECK(expr != nullptr);
  if (parent != nullptr) {
    parent->add_child(expr);
  } else {
    *root_expr = expr;
    *ctx = pool->add(new ExprContext(expr));
  }
  for (int i = 0; i < num_children; i++) {
    *node_idx += 1;
    RETURN_IF_ERROR(create_tree_from_thrift(pool, nodes, expr, node_idx, nullptr, nullptr));
   }
  return Status::OK();
}

Suppose we have following expr a >= 2, we

  • create binary predicate >= (GE)
  • then create ExprContext (ctx)
  • then create ColumnRef (a)
  • then create IntLiteral (2)

So the elements in pool are ( GE, ctx, a, 2). And when we call pool->clear, elements are released in reversed order.
-But when ExprContext is released, its dtor is called

  • and in its dtor, Expr::close is called.
  • and in Expr::close, it will traverse its children and call their close method
  • But we should note that, children expr (in our example, they are a & 2) are released already. We are accessing freed memory
ExprContext::~ExprContext() {
  if (_prepared) {
    close(_runtime_state);
  }
}


void ExprContext::close(RuntimeState* state) {
  _root->close(state, this, scope);
}

void Expr::close() {
  for (Expr* child : _children) child->close(); 
}



The fix is easy, to put ExprContext creation at the end of pool

Status Expr::create_tree_from_thrift(ObjectPool* pool, const std::vector<TExprNode>& nodes, Expr* parent, int* node_idx,
                                     Expr** root_expr, ExprContext** ctx) {
    // propagate error case
    if (*node_idx >= nodes.size()) {
        return Status::InternalError("Failed to reconstruct expression tree from thrift.");
    }
    int num_children = nodes[*node_idx].num_children;
    Expr* expr = nullptr;
    RETURN_IF_ERROR(create_vectorized_expr(pool, nodes[*node_idx], &expr));
    DCHECK(expr != nullptr);
    if (parent != nullptr) {
        parent->add_child(expr);
    }
    for (int i = 0; i < num_children; i++) {
        *node_idx += 1;
        RETURN_IF_ERROR(create_tree_from_thrift(pool, nodes, expr, node_idx, nullptr, nullptr));
        // we are expecting a child, but have used all nodes
        // this means we have been given a bad tree and must fail
        if (*node_idx >= nodes.size()) {
            return Status::InternalError("Failed to reconstruct expression tree from thrift.");
        }
    }
    if (parent == nullptr) {
        DCHECK(root_expr != nullptr);
        DCHECK(ctx != nullptr);
        *root_expr = expr;
        *ctx = pool->add(new ExprContext(expr));
    }
    return Status::OK();
}

@imay
Copy link
Contributor

imay commented Jul 18, 2022

LGTM

@wanpengfei-git wanpengfei-git added the Approved Ready to merge label Jul 19, 2022
@wanpengfei-git
Copy link
Collaborator

run starrocks_admit_test

@dirtysalt
Copy link
Contributor Author

run starrocks_admit_test

1 similar comment
@dirtysalt
Copy link
Contributor Author

run starrocks_admit_test

@dirtysalt
Copy link
Contributor Author

pr regression 580 passed

@dirtysalt dirtysalt merged commit e64bb9a into StarRocks:main Jul 19, 2022
@dirtysalt dirtysalt deleted the fix-parquet-type-mismatch-decode-min-max branch July 19, 2022 06:41
hiliuxg pushed a commit to hiliuxg/starrocks that referenced this pull request Jul 21, 2022
[Refactor] rename work group to resource group (StarRocks#8527)

Update release-2.0.md (StarRocks#8809)

[BugFix] fix bug in runtime filter plan (StarRocks#8728)

[Starrocks On ES] support explain for ES table (StarRocks#8202)

[Refactor] Remove CreateTableStmt in old parser (StarRocks#8824)

[SR SQL Planner Summer Camp] Support modify partition in new Parser and Analyzer (StarRocks#8485)

Optimize 'DataUsedCapacity' display (StarRocks#8823)

[SR SQL Planner Summer Camp] Support show partitions in new Parser and Analyzer (StarRocks#8489)

[BugFix] BrokerLoad cann't handle kerberos login with multiple keytab (StarRocks#8820)

UserGroupInformation.loginUserFromKeytab(principal, keytab) will modify the loginUserInfo globally.
If two users concurrently start broker load, the later one will override previous one's loginUserInfo.

Remove unused test case (StarRocks#8832)

[Actions] add documentation label for pr  (StarRocks#8841)

* update label trigger

* add pr documentation label

Co-authored-by: dulong <dulong@starrocks.com>

[Refactor] Remove cluster in ConnectContext (StarRocks#8826)

Update release-2.3.md (StarRocks#8833)

Fix be crash in ASAN mode (StarRocks#8722)

If element of ArrayColumn is empty, we don't need to read data from data_column of ArrayColumn because data is empty.
So we don't add an empty read_range into element_read_ranges, which will cause inconsistency between the begin of element_read_range and element_ordinal of element_iterator. That's why be crash in ASAN mode.

[BugFix] potential dead lock in stream_load_pipe (StarRocks#8647)

_push_front is only called when no_block_read is timeout, we already acquire lock
at that time(condition variable will immediately acquire lock after wait for).
so there is no need to acquire the lock again in _push_front

[Others] Add config for librdkafka debug (StarRocks#8783)

[Enhancement] Assign tablet ranges for one-phase aggregate without partition and colocate (StarRocks#8771)

Fix two unstable FE UT (StarRocks#8758)

Use real GlobalStateMgr, not mock GlobalStateMgr, ensure there is only one GlobalStateMgr in UT, then won't Missing invocations issue.

[Enhancement] Adjust downgrade strategy of chunks_partitioner (StarRocks#8696)

remove dead code mutable_index_* (StarRocks#8857)

[BugFix] set bdb timeout to 1 minutes for UT (StarRocks#8855)

[Enhancement] Optimize logic of materialized view when base table is dropped (StarRocks#8613)

Optimize logic of materialized view when base table is dropped

Fix unstable RollupJobV2Test (StarRocks#8876)

mv task definition view (StarRocks#8856)

select * from information_schema.task_runs;

select * from information_schema.tasks;

definition add insert overwrite

[Refactor] Add getOriginName in Database class (StarRocks#8846)

Add getOriginName and change some Log and Exception to expose origin name other than full name.

Update release-2.3.md (StarRocks#8874)

[Refactor] ShowDynamicPartitionStmt to new Parser and Analyzer (StarRocks#8511)

[Feature] New Parser and Analyzer support ShowProcesslistStmt (StarRocks#8671)

New Parser and Analyzer support ShowProcesslistStmt

Support recover partition in new Parser and Analyzer (StarRocks#8493)

docs add navigation (StarRocks#8864)

[Refactor] Fix UT (StarRocks#8878)

[Refactor] Remove used code (StarRocks#8880)

[Refactor] Support BDBJEJournal with prefix (StarRocks#8672)

[DOC] add validity period during which users can restore a deleted table (StarRocks#8788)

* add validity period during which users can restore a deleted table

* Update DROP TABLE.md

* Update DROP TABLE.md

[SR SQL Planner Summer Camp][Feature] alter table statement support of add list partition (StarRocks#8805)

[BugFix] Fix crash during decoding min/max when type mismatch in parquet file (StarRocks#8849)

[Refactor] Force reorder columns of chunk (StarRocks#8597)

[Refactor] Refactor hdfs scanner expression usage (StarRocks#8884)

[BugFix] Fix runtime filter delivery (StarRocks#8840)

This PR fixes:
1. different wait timeout on scan node(`_scan_wait_timeout_ms`) scan and other nodes(`_wait_timeout_ms`)
2. put global runtime filter into cache(`runtime_filter_cache`) if fragment instance is not allocated yet.
3. And during wait time, nodes on fragment instance will find runtime filter from cache(`runtime_filter_cache`)

[Enhancement] introduce query debug trace for BE (StarRocks#7649)

display data_type in information_schema.columns for (StarRocks#8895)

hll/bitmap/percentile

[Bug Fix]show partitions add absence column for filter columns check (StarRocks#8894)

[Enhance] take the fragment prepare time into account of vruntime (StarRocks#8865)

delete two sql statement (StarRocks#8901)

[Refactor] remove radix_sort (StarRocks#8891)

[BugFix] Fix OlapTableSink close accelerate release resource (StarRocks#8893)

[Bugfix] fix be crash when call DictMappingExpr::evaluate (StarRocks#8858)

Remove dead function of primitive type size (StarRocks#8912)

Revert "[SR SQL Planner Summer Camp][Feature] alter table statement support of add list partition (StarRocks#8805)" (StarRocks#8917)

This reverts commit fb28ebe.

[BugFix] Fix consistency problem (StarRocks#8896)

[Refactor] [Step-1]Change the name from master to leader (StarRocks#8787)

Change the master related code to leader.
Split the task into two steps:
1. change the name from master to leader in the internal code paths.
2. change the name from master to leader in the interface.
The advantage of this is that we only need to rollback the step 2 if If there are compatibility problems.

Remove some old Analyzer codes 4 (StarRocks#8872)

[BugFix] Fix potential dangling pointer of TabletSchema in Segment (StarRocks#8913)

For lake tablet, the raw pointer of TabletSchema in Segment may point
to a memory location that has been deleted, if the cached TabletSchema
has been evicted from the in-memory cache of lake::TabletManager.

Use a std::shared_ptr to manage the ownership of TabletSchema in Segment
to fix the issue.

[Lake] Disable primary key lake table (StarRocks#8924)

[Refactor] Remove admin user related code (StarRocks#8921)

[Refactor] Renaming constant variable in persistent index (StarRocks#8862)

[bug fix]fix TaskManager initialDelay not correct (StarRocks#8633)

when timeunit is not second,delay will not correct.

[Enhancement] ThreadPoolToken releasing the tasks outside of lock (StarRocks#8920)

the `threadpool` is from kudu: https://github.com/apache/kudu/blob/master/src/kudu/util/threadpool.cc

```
    // Clear the queue under the lock, but defer the releasing of the tasks
    // outside the lock, in case there are concurrent threads wanting to access
    // the ThreadPool. The task's destructors may acquire locks, etc, so this
    // also prevents lock inversions.
```

```
F0720 10:02:59.244752 123894 threadpool.cpp:183] Check failed: _entries.empty()
```

if we declare `destructor = default`,compiler will disable the compiler-generated move constructor and move assignment operator.

[BugFix] Support shuffle broker load for non duplicate keys table (StarRocks#8714)

Before we unify the ingestion interface, we solved the multi-replica inconsistency problem of non duplicate key table by adding shuffle service.

[Refactor] move state listener out of GlobalStateMgr, and rename it to state change executor (StarRocks#8747)

[Feature] Support truncate lake table (StarRocks#8886)

use constant to indicate length and add a getter for kv_pair_size. (StarRocks#8885)

[Feature] introduce jemalloc into thirdparty (StarRocks#6238)

[others] Fix unstable RefreshTableStmtTest (StarRocks#8951)

[Enhance] make re2 driver-local to avoid contention (StarRocks#8904)

[BugFix] fix runtime filter cache use: hold shared pointer to avoid invalid pointer. (StarRocks#8944)

Got following exception

```
*** Aborted at 1658293232 (unix time) try "date -d @1658293232" if you are using GNU date ***
PC: @          0x5cc1f60 __dynamic_cast
*** SIGSEGV (@0xfffffffffffffff0) received by PID 32857 (TID 0x7f1df458d700) from PID 18446744073709551600; stack trace: ***
    @          0x41c0692 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f1e2fe6be9b os::Linux::chained_handler()
    @     0x7f1e2fe7090c JVM_handle_linux_signal
    @     0x7f1e2fe63858 signalHandler()
    @     0x7f1e2f34b630 (unknown)
    @          0x5cc1f60 __dynamic_cast
    @          0x29a8031 starrocks::vectorized::OrcChunkReader::_add_runtime_filter()
    @          0x29ba656 starrocks::vectorized::OrcChunkReader::set_conjuncts_and_runtime_filters()
    @          0x1b1e41a starrocks::vectorized::HdfsOrcScanner::do_open()
    @          0x1b17259 starrocks::vectorized::HdfsScanner::open()
    @          0x173ceae starrocks::connector::HiveDataSource::_init_scanner()
    @          0x173f6e0 starrocks::connector::HiveDataSource::open()
    @          0x1c5ef30 starrocks::vectorized::ConnectorScanNode::_scanner_thread()
    @          0x314c348 starrocks::PriorityThreadPool::work_thread()
    @          0x414ccc7 thread_proxy
    @     0x7f1e2f343ea5 start_thread
    @     0x7f1e2e95eb0d __clone
    @                0x0 (unknown)
```

It's because we just use  runtime filter raw pointer. But runtime filter could be released by runtime filte cache, but we probably are still using it. The robust way is to hold share ptr of runtime filter.

----

the reason why we don't need following code is because, if there are multiple fragment instances executed, some of them  finish early and close queyr(clean runtime filter cache), following fragment instance won't get runtime filter from cache.

There is a `clean_thread` in runtime filter cache, and that will clean runtime filter has not retrived used for a long time

```
_exec_env->runtime_filter_cache()->remove(_query_id);
```

New Parser and Analyzer support User Defined Function-statements (StarRocks#8528)

[BugFix] Disable one stage aggregate with one distinct function (StarRocks#8918)

[Refactor] [Step 1/2] Change the name from master to leader (StarRocks#8953)

Change the master related code to leader.
Split the task into two steps:

change the name from master to leader in the internal code paths.
change the name from master to leader in the interface.
The advantage of this is that we only need to rollback the step 2 if If there are compatibility problems.

This is PR is still the step 1 of the task.
The previous work is done by StarRocks#8787

Add move constructor and move assignment for Schema (StarRocks#8937)

Otherwise the `Schema` is not movable because we define the `constructor`;

[Feature] histogram statistics support collect top-n data (StarRocks#8916)

[BugFix] When the status of HeartBeat is not ok, it also need to be synced to follower (StarRocks#8265)

When master the received heartbeat status is not ok, the heartbeat info also need to be synced to follower.
Otherwise, the failed heartbeat information will not be synchronized to the follower.
Since the failed heartbeat info also modifies fe's memory, this.heartbeatRetryTimes++;
if it is not synchronized to the follower, this will cause the master and follower's metadata to be inconsistent.

[Refactor] Remove some unused code in old parser for admin stmt (StarRocks#8963)

1. Some code in old parser is unused, it can be removed.
2. Fix the bug for SetReplicaStatus  in the new parser.

[BugFix] Persist `LoadStatistic` in `EtlStatus` (StarRocks#8689)

The main reason is that LoadJob did not persist LoadStatistic in the process of either replaying or snapshot. Unfortunately, there is no obvious way to add new metadata since we persist the LoadJob class in a hard-coded way. After due consideration, we serialized LoadStatics in JSON for the convenience of adding/deleting fields in the future and hide this JSON in a deprecated map in EtlStatus.

[Bugfix] Fix wrong result when process 'is null' in condition expr in dictionary optimization (StarRocks#8869)

[Feat] Support COM_FILED_LIST in external catalog database (StarRocks#8929)

Optimize materialized view refresh task by passing properties to task (StarRocks#8860)

Optimize materialized view refresh task by passing properties to task through hints

[Lake] Create shard with cache property (StarRocks#8977)

fix explain row count overflow (StarRocks#8980)

[BugFix] Fix error of THRIFT_RPC_ERROR with CN (StarRocks#8773)

[SR SQL Planner Summer Camp]New Parser and Analyzer support AlterUserStmt (StarRocks#8503)

[Enhancement] Support rewrite window function sort partition by in lowcardinality optimization (StarRocks#8957)

relation with StarRocks#6119

[BugFix] Fix LocalTabletsChannel head-use-after-free (StarRocks#8978)

Fix StarRocks#8906

[Enhancement] Remove redundant management on open db (StarRocks#8335)

Remove redundant management on open db in BDBEnvironment.
Close database initiatively in BDBJournal & BDBJournalCursor
In every main loop, Replayer will update names of local databases by calling refresh(), then keep reading increasing log id until it doesn't exist.

[Refactor] Extract common logics to success_once (StarRocks#8952)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Approved Ready to merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants