[Bug][TubeMQ] Core file generated while the C++ consumer is closed #5700

Closed
1 of 2 tasks
chen9t opened this issue Aug 25, 2022 · 3 comments · Fixed by #5707 or #5721

Comments

@chen9t

chen9t commented Aug 25, 2022

What happened

The consumer xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0 still receives a heartbeat callback after its shutdown has finished:

[2022-08-24 11:24:19.203][tid:139827279062784][ShutDown:121][INFO][CONSUMER] ShutDown consumer begin, client=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0
[2022-08-24 11:24:19.204][tid:139827279062784][close2Master:622][INFO][CONSUMER] close2Master begin, clientid=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0
[2022-08-24 11:24:19.204][tid:139824323061504][processRebalanceEvent:590][INFO][CONSUMER] Rebalance found Shutdown notify, existed, client=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0
[2022-08-24 11:24:19.204][tid:139824323061504][processRebalanceEvent:614][INFO][CONSUMER] rebalance event Handler stopped!
[2022-08-24 11:24:19.204][tid:139827279062784][close2Master:637][INFO][CONSUMER] close2Master finished, clientid=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0
[2022-08-24 11:24:19.204][tid:139827279062784][closeAllBrokers:717][INFO][CONSUMER] closeAllBrokers begin, clientid=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0
[2022-08-24 11:24:19.210][tid:139827279062784][closeAllBrokers:722][INFO][CONSUMER] closeAllBrokers end, clientid=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0
[2022-08-24 11:24:19.211][tid:139827279062784][ShutDown:136][INFO][CONSUMER] ShutDown consumer finished, client=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0
[2022-08-24 11:24:23.868][tid:139824655247104][operator():527][WARN][CONSUMER] heartBeat2Master failue to (9.22.19.103:8609) : Request is timeout, client=xxx-xxx-xxx-29729-1661311398803-5-0.1.6-C-0.5.0

The client process then crashed with a segmentation fault:

Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/data/dorisadmin/dorisenv/starrocks/be/lib/starrocks_be'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000004abcd4b in std::_Function_handler<void (tubemq::ErrorCode, tubemq::ResponseContext const&), tubemq::BaseConsumer::heartBeat2Master()::{lambda(tubemq::ErrorCode, tubemq::ResponseContext const&)#1}>::_M_invoke(std::_Any_data const&, tubemq::ErrorCode&&, tubemq::ResponseContext const&) ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-323.tl2.x86_64 libgcc-4.8.5-39.tl2.1.x86_64 zlib-1.2.7-15.el7.x86_64
(gdb) bt
#0  0x0000000004abcd4b in std::_Function_handler<void (tubemq::ErrorCode, tubemq::ResponseContext const&), tubemq::BaseConsumer::heartBeat2Master()::{lambda(tubemq::ErrorCode, tubemq::ResponseContext const&)#1}>::_M_invoke(std::_Any_data const&, tubemq::ErrorCode&&, tubemq::ResponseContext const&) ()
#1  0x0000000004ad0c6f in tubemq::Promise<tubemq::ResponseContext>::callbackAndNotify() ()
#2  0x0000000004ac90cd in tubemq::ClientConnection::requestCallback(unsigned int, tubemq::ErrorCode*, tubemq::Any*) ()
#3  0x0000000004ac93e3 in tubemq::ClientConnection::requestTimeoutHandle(std::error_code const&, std::shared_ptr<tubemq::RequestContext>) ()
#4  0x0000000004ad166e in asio::detail::wait_handler<std::_Bind<void (tubemq::ClientConnection::*(std::shared_ptr<tubemq::ClientConnection>, std::_Placeholder<1>, std::shared_ptr<tubemq::RequestContext>))(std::error_code const&, std::shared_ptr<tubemq::RequestContext>)>, asio::execution::any_executor<asio::execution::context_as_t<asio::execution_context&>, asio::execution::detail::blocking::never_t<0>, asio::execution::prefer_only<asio::execution::detail::blocking::possibly_t<0> >, asio::execution::prefer_only<asio::execution::detail::outstanding_work::tracked_t<0> >, asio::execution::prefer_only<asio::execution::detail::outstanding_work::untracked_t<0> >, asio::execution::prefer_only<asio::execution::detail::relationship::fork_t<0> >, asio::execution::prefer_only<asio::execution::detail::relationship::continuation_t<0> > > >::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) ()
#5  0x0000000004ad7a7e in asio::detail::scheduler::do_run_one(asio::detail::conditionally_enabled_mutex::scoped_lock&, asio::detail::scheduler_thread_info&, std::error_code const&) ()
#6  0x0000000004add459 in asio::detail::scheduler::run(std::error_code&) [clone .isra.0] ()
#7  0x0000000004add67d in tubemq::Executor::StartWorker(std::shared_ptr<asio::io_context>) ()
#8  0x0000000004ae173f in std::thread::_State_impl<std::thread::_Invoker<std::tuple<std::_Bind<void (tubemq::Executor::*(tubemq::Executor*, std::shared_ptr<asio::io_context>))(std::shared_ptr<asio::io_context>)> > > >::_M_run() ()
#9  0x000000000539ede0 in execute_native_thread_routine ()
#10 0x00007ffb519a0ea5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffb50da59fd in clone () from /lib64/libc.so.6

What you expected to happen

No callback should be delivered to a consumer that has been shut down, and the client should not crash.

How to reproduce

Create several consumers, consume messages for a while, then shut them down. Start the same consumers again, consume messages for a while, and shut them down again. Repeat this cycle until the crash occurs.

Environment

No response

InLong version

master

InLong Component

InLong TubeMQ

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

@chen9t chen9t added the type/bug Something is wrong label Aug 25, 2022
@github-actions

Thanks a lot for opening your first issue with us! 🧡 We'll get back to you shortly! ⏳

@gosonzhang
Contributor

@chen9t, thanks!

Let me take a look.

@gosonzhang gosonzhang changed the title [Bug] Inlong - Tubemq consumer get callback after shutdown [Bug][TubeMQ] Core file generated while the C++ consumer is closed Aug 25, 2022
@gosonzhang
Contributor

After ShutDown() of the C++ SDK is called, the SDK releases its resources. If the heartbeat thread is still processing a message at that moment, it touches resources that have already been released, and an invalid memory access occurs.

The probability of hitting this is very small, but it rises when multiple clients run in one process and are continuously created and released.

We'll fix it, thanks @chen9t!
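
For readers following along, one common way to make a late callback harmless is to let it hold only a weak reference to its owner and check that reference before touching any state. The sketch below is a minimal illustration of that general technique, not the change made in #5707 or #5721 and not the TubeMQ API: `DemoConsumer`, `MakeHeartbeatCallback`, and `SimulateLateCallback` are hypothetical names, and the late timeout is simulated on a single thread.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a consumer; this is NOT tubemq::BaseConsumer.
class DemoConsumer : public std::enable_shared_from_this<DemoConsumer> {
 public:
  // Builds a heartbeat-style callback that captures only a weak reference,
  // so the callback cannot keep the consumer alive or use it after it is freed.
  std::function<bool()> MakeHeartbeatCallback() {
    std::weak_ptr<DemoConsumer> weak_self = shared_from_this();
    return [weak_self]() {
      // lock() returns an empty pointer once the consumer has been destroyed.
      std::shared_ptr<DemoConsumer> self = weak_self.lock();
      if (!self) {
        return false;  // Consumer is gone: drop the late response.
      }
      ++self->handled_;  // Safe: self keeps the object alive during the call.
      return true;
    };
  }

  int handled() const { return handled_; }

 private:
  int handled_ = 0;
};

// Simulates one response arriving before shutdown and one arriving after it.
// Returns {result before shutdown, result after shutdown}.
std::vector<bool> SimulateLateCallback() {
  auto consumer = std::make_shared<DemoConsumer>();
  std::function<bool()> pending = consumer->MakeHeartbeatCallback();

  bool before = pending();  // Consumer alive: the callback updates its state.
  assert(consumer->handled() == 1);

  consumer.reset();         // Stand-in for shutdown: the consumer is destroyed.
  bool after = pending();   // Late timeout callback: detected and ignored.
  return {before, after};
}
```

Calling `SimulateLateCallback()` yields `true` for the callback that runs while the consumer exists and `false` for the one that fires afterwards. In a real multi-threaded client the same check helps because `lock()` either fails or holds the object alive for the duration of the callback; cancelling outstanding timers during shutdown would be a complementary measure.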
