Deadlock on calling Aws::ShutdownAPI via s_registryMutex #3282

Open
1 task
Markus87 opened this issue Jan 31, 2025 · 8 comments
Labels
bug This issue is a bug. pending-release This issue will be fixed by an approved PR that hasn't been released yet.

Comments

@Markus87

Describe the bug

It seems the issue does not always occur.

Aws::ShutdownAPI holds s_registryMutex while joining a thread that needs the same mutex to finish:
 	[Waiting on Thread 4440, double-click or press enter to switch to thread]	
 	ntdll.dll!00007ffccf5f0054()	Unknown
 	KERNELBASE.dll!00007ffccb930f33()	Unknown
 	msvcp140.dll!00007ffcafaf255f()	Unknown
 	[Inline Frame] aws-cpp-sdk-core.dll!std::thread::join() Line 133	C++
 	aws-cpp-sdk-core.dll!Aws::Utils::Threading::DefaultExecutor::~DefaultExecutor() Line 79	C++
 	aws-cpp-sdk-core.dll!Aws::Utils::Threading::DefaultExecutor::`vector deleting destructor'(unsigned int)	C++
 	[Inline Frame] aws-cpp-sdk-s3.dll!std::_Ref_count_base::_Decref() Line 1163	C++
 	[Inline Frame] aws-cpp-sdk-s3.dll!std::_Ptr_base<Aws::Utils::Threading::Executor>::_Decref() Line 1380	C++
 	[Inline Frame] aws-cpp-sdk-s3.dll!std::shared_ptr<Aws::Utils::Threading::Executor>::{dtor}() Line 1685	C++
 	[Inline Frame] aws-cpp-sdk-s3.dll!std::shared_ptr<Aws::Utils::Threading::Executor>::reset() Line 1732	C++
 	aws-cpp-sdk-s3.dll!Aws::Client::ClientWithAsyncTemplateMethods<Aws::S3::S3Client>::ShutdownSdkClient(void * pThis, __int64 timeoutMs) Line 121	C++
 	aws-cpp-sdk-core.dll!Aws::Utils::ComponentRegistry::TerminateAllComponents() Line 88	C++
>	aws-cpp-sdk-core.dll!Aws::ShutdownAPI(const Aws::SDKOptions & options) Line 207	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!BitFactory::AWS::Initializer::{dtor}() Line 59	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Destroy_in_place(BitFactory::AWS::Initializer & _Obj) Line 293	C++
 	Bfx.Abstract.AWSEngine.dll!std::_Ref_count_obj2<BitFactory::AWS::Initializer>::_Destroy() Line 2113	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ref_count_base::_Decref() Line 1163	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ptr_base<BitFactory::AWS::Initializer>::_Decref() Line 1380	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::shared_ptr<BitFactory::AWS::Initializer>::{dtor}() Line 1685	C++
 	Bfx.Abstract.AWSEngine.dll!BitFactory::AWS::S3::Impl::`scalar deleting destructor'(unsigned int __flags)	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ref_count_base::_Decref() Line 1163	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ptr_base<BitFactory::AWS::S3::Impl>::_Decref() Line 1380	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::shared_ptr<BitFactory::AWS::S3::Impl>::{dtor}() Line 1685	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Destroy_in_place(BitFactory::AWS::S3 & _Obj) Line 293	C++
 	Bfx.Abstract.AWSEngine.dll!std::_Ref_count_obj2<BitFactory::AWS::S3>::_Destroy() Line 2113	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ref_count_base::_Decref() Line 1163	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ptr_base<BitFactory::AWS::S3>::_Decref() Line 1380	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::shared_ptr<BitFactory::AWS::S3>::{dtor}() Line 1685	C++
 	Bfx.Abstract.AWSEngine.dll!BitFactory::AWS::VersionChannel::Impl::`scalar deleting destructor'(unsigned int __flags)	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ref_count_base::_Decref() Line 1163	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::_Ptr_base<BitFactory::AWS::EC2::Impl>::_Decref() Line 1380	C++
 	[Inline Frame] Bfx.Abstract.AWSEngine.dll!std::shared_ptr<BitFactory::AWS::EC2::Impl>::{dtor}() Line 1685	C++
 	Bfx.Abstract.AWSEngine.dll!BitFactory::AWS::EC2::~EC2()	C++
 	Bfx.Alex.Server.Updater.Model.dll!`anonymous namespace'::Updater::DoPrepareUpdate() Line 174	C++
 	Bfx.Alex.Server.Updater.Model.dll!`anonymous namespace'::Updater::Check() Line 188	C++
 	[Inline Frame] Bfx.Alex.Server.Updater.Model.dll!BitFactory::UpdateEXE::Check() Line 244	C++
 	[Inline Frame] Bfx.Alex.Server.Updater.Model.dll!`anonymous-namespace'::Daemon::()::__l5::<lambda_1>::operator()() Line 38	C++
 	[Inline Frame] Bfx.Alex.Server.Updater.Model.dll!std::invoke(`anonymous-namespace'::Daemon::()::__l5::<lambda_1> &) Line 1695	C++
 	Bfx.Alex.Server.Updater.Model.dll!std::_Func_impl_no_alloc<``anonymous namespace'::Daemon::operator()'::`5'::<lambda_1>,void>::_Do_call() Line 878	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Func_class<void>::operator()() Line 920	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::CoWrappedSimpleFunction(const std::function<void __cdecl(void)> & f) Line 81	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::CoWrappedSimpleFunction$_InitCoro$2() Line 82	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::CoWrappedSimpleFunction(const std::function<void __cdecl(void)> & f) [Ramp]	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::invoke(BitFactory::Continuation<std::nullptr_t>(*)(const std::function<void __cdecl(void)> &) &) Line 1705	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Invoker_ret<std::_Unforced>::_Call(BitFactory::Continuation<std::nullptr_t>(*)(const std::function<void __cdecl(void)> &) &) Line 2086	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Call_binder(std::_Invoker_ret<std::_Unforced>) Line 2096	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Binder<std::_Unforced,BitFactory::Continuation<std::nullptr_t> (__cdecl*)(std::function<void __cdecl(void)> const &),std::function<void __cdecl(void)> const &>::operator()() Line 2150	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::invoke(std::_Binder<std::_Unforced,BitFactory::Continuation<std::nullptr_t> (__cdecl*)(std::function<void __cdecl(void)> const &),std::function<void __cdecl(void)> const &> &) Line 1695	C++
 	Bfx.Abstract.Engine.dll!std::_Func_impl_no_alloc<std::_Binder<std::_Unforced,BitFactory::Continuation<std::nullptr_t> (__cdecl*)(std::function<void __cdecl(void)> const &),std::function<void __cdecl(void)> const &>,BitFactory::Continuation<std::nullptr_t>>::_Do_call() Line 876	C++
 	Bfx.Abstract.Engine.dll!std::_Func_class<BitFactory::Continuation<std::nullptr_t>>::operator()() Line 920	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleAlwaysCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) Line 130	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleAlwaysCo$_InitCoro$2() Line 145	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleAlwaysCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) [Ramp]	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::WatchAlwaysUnknownCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) Line 155	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::WatchAlwaysUnknownCo$_InitCoro$2() Line 161	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::WatchAlwaysUnknownCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) [Ramp]	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) Line 176	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo$_InitCoro$2() Line 177	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) [Ramp]	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f, std::function<void __cdecl(basic_super_string<char>)> log) Line 180	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo$_InitCoro$2() Line 184	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f, std::function<void __cdecl(basic_super_string<char>)> log) [Ramp]	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimple(const std::function<void __cdecl(void)> & f, std::function<void __cdecl(basic_super_string<char>)> log) Line 123	C++
 	Bfx.Alex.Server.Updater.Model.dll!`anonymous namespace'::Daemon::DoWatch(const std::function<void __cdecl(void)> & f, const basic_super_string<char> & what) Line 30	C++
 	Bfx.Alex.Server.Updater.Model.dll!`anonymous namespace'::Daemon::operator()(BitFactory::IInterrupted & interrupted) Line 38	C++
 	[Inline Frame] Bfx.Alex.Server.Updater.Model.dll!std::invoke(`anonymous-namespace'::Daemon &) Line 1705	C++
 	Bfx.Alex.Server.Updater.Model.dll!std::_Func_impl_no_alloc<`anonymous namespace'::Daemon,BitFactory::Secs,BitFactory::IInterrupted &>::_Do_call(BitFactory::IInterrupted & <_Args_0>) Line 876	C++
 	[Inline Frame] Bfx.Alex.Server.DaemonImpl.Model.dll!std::_Func_class<BitFactory::Secs,BitFactory::IInterrupted &>::operator()(BitFactory::IInterrupted &) Line 920	C++
 	Bfx.Alex.Server.DaemonImpl.Model.dll!BitFactory::Alex::ServerDaemonClientLoop::operator()(BitFactory::IInterrupted & interrupted) Line 65	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!std::_Func_class<void,BitFactory::IInterrupted &>::operator()(BitFactory::IInterrupted &) Line 920	C++
 	Bfx.Alex.Server.Daemon.Model.dll!`anonymous-namespace'::DaemonProcesses::GetServerDaemonLoop::__l2::<lambda_1>::()::__l2::<lambda_1>::operator()() Line 145	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Func_class<void>::operator()() Line 920	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::CoWrappedSimpleFunction(const std::function<void __cdecl(void)> & f) Line 81	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::CoWrappedSimpleFunction$_InitCoro$2() Line 82	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::CoWrappedSimpleFunction(const std::function<void __cdecl(void)> & f) [Ramp]	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::invoke(BitFactory::Continuation<std::nullptr_t>(*)(const std::function<void __cdecl(void)> &) &) Line 1705	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Invoker_ret<std::_Unforced>::_Call(BitFactory::Continuation<std::nullptr_t>(*)(const std::function<void __cdecl(void)> &) &) Line 2086	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Call_binder(std::_Invoker_ret<std::_Unforced>) Line 2096	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::_Binder<std::_Unforced,BitFactory::Continuation<std::nullptr_t> (__cdecl*)(std::function<void __cdecl(void)> const &),std::function<void __cdecl(void)> const &>::operator()() Line 2150	C++
 	[Inline Frame] Bfx.Abstract.Engine.dll!std::invoke(std::_Binder<std::_Unforced,BitFactory::Continuation<std::nullptr_t> (__cdecl*)(std::function<void __cdecl(void)> const &),std::function<void __cdecl(void)> const &> &) Line 1695	C++
 	Bfx.Abstract.Engine.dll!std::_Func_impl_no_alloc<std::_Binder<std::_Unforced,BitFactory::Continuation<std::nullptr_t> (__cdecl*)(std::function<void __cdecl(void)> const &),std::function<void __cdecl(void)> const &>,BitFactory::Continuation<std::nullptr_t>>::_Do_call() Line 876	C++
 	Bfx.Abstract.Engine.dll!std::_Func_class<BitFactory::Continuation<std::nullptr_t>>::operator()() Line 920	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleAlwaysCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) Line 130	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleAlwaysCo$_InitCoro$2() Line 145	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleAlwaysCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) [Ramp]	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::WatchAlwaysUnknownCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) Line 155	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::WatchAlwaysUnknownCo$_InitCoro$2() Line 161	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::WatchAlwaysUnknownCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) [Ramp]	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) Line 176	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo$_InitCoro$2() Line 177	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimpleCo(const std::function<BitFactory::Continuation<std::nullptr_t> __cdecl(void)> & f) [Ramp]	C++
 	Bfx.Abstract.Engine.dll!BitFactory::WatchSimple(const std::function<void __cdecl(void)> & f) Line 119	C++
 	Bfx.Abstract.Engine.dll!`anonymous namespace'::RunWatchedImpl(std::function<void __cdecl(void)> run) Line 240	C++
 	Bfx.Abstract.Engine.dll!BitFactory::RunWatchedWithWin32Errors(std::function<void __cdecl(void)> run) Line 259	C++
 	[Inline Frame] Bfx.Alex.Server.Updater.Model.dll!std::invoke(void(*)(std::function<void __cdecl(void)>) &) Line 1705	C++
 	Bfx.Alex.Server.Updater.Model.dll!std::_Func_impl_no_alloc<void (__cdecl*)(std::function<void __cdecl(void)>),void,std::function<void __cdecl(void)>>::_Do_call(std::function<void __cdecl(void)> && <_Args_0>) Line 878	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!std::_Func_class<void,std::function<void __cdecl(void)>>::operator()(std::function<void __cdecl(void)>) Line 920	C++
 	Bfx.Alex.Server.Daemon.Model.dll!`anonymous-namespace'::DaemonProcesses::GetServerDaemonLoop::__l2::<lambda_1>::operator()() Line 139	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!std::_Func_class<void>::operator()() Line 920	C++
 	Bfx.Alex.Server.Daemon.Model.dll!std::_Packaged_state<void __cdecl(void)>::_Call_immediate() Line 584	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!std::_Func_class<void>::operator()() Line 920	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!Concurrency::details::_MakeVoidToUnitFunc::__l2::<lambda_1>::operator()() Line 2363	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!std::invoke(Concurrency::details::_MakeVoidToUnitFunc::__l2::<lambda_1> &) Line 1695	C++
 	Bfx.Alex.Server.Daemon.Model.dll!std::_Func_impl_no_alloc<`Concurrency::details::_MakeVoidToUnitFunc'::`2'::<lambda_1>,unsigned char>::_Do_call() Line 876	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!std::_Func_class<unsigned char>::operator()() Line 920	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!Concurrency::task<unsigned char>::_InitialTaskHandle<void,`std::_Task_async_state<void>::_Task_async_state<void><std::_Fake_no_copy_callable_adapter<std::function<void __cdecl(void)>>>'::`2'::<lambda_1>,Concurrency::details::_TypeSelectorNoAsync>::_LogWorkItemAndInvokeUserLambda(std::function<unsigned char __cdecl(void)>) Line 3528	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!Concurrency::task<unsigned char>::_InitialTaskHandle<void,`std::_Task_async_state<void>::_Task_async_state<void><std::_Fake_no_copy_callable_adapter<std::function<void __cdecl(void)>>>'::`2'::<lambda_1>,Concurrency::details::_TypeSelectorNoAsync>::_Init(Concurrency::details::_TypeSelectorNoAsync) Line 3548	C++
 	[Inline Frame] Bfx.Alex.Server.Daemon.Model.dll!Concurrency::task<unsigned char>::_InitialTaskHandle<void,`std::_Task_async_state<void>::_Task_async_state<void><std::_Fake_no_copy_callable_adapter<std::function<void __cdecl(void)>>>'::`2'::<lambda_1>,Concurrency::details::_TypeSelectorNoAsync>::_Perform() Line 3533	C++
 	Bfx.Alex.Server.Daemon.Model.dll!Concurrency::details::_PPLTaskHandle<unsigned char,Concurrency::task<unsigned char>::_InitialTaskHandle<void,`std::_Task_async_state<void>::_Task_async_state<void><std::_Fake_no_copy_callable_adapter<std::function<void __cdecl(void)>>>'::`2'::<lambda_1>,Concurrency::details::_TypeSelectorNoAsync>,Concurrency::details::_TaskProcHandle>::invoke() Line 1475	C++
 	Bfx.Alex.Server.Daemon.Model.dll!Concurrency::details::_TaskProcHandle::_RunChoreBridge(void * _Parameter) Line 171	C++
 	Bfx.Alex.Server.Daemon.Model.dll!Concurrency::details::_DefaultPPLTaskScheduler::_PPLTaskChore::_Callback(void * _Args) Line 57	C++
 	msvcp140.dll!00007ffcafaf2b09()	Unknown
 	ntdll.dll!00007ffccf5bbff0()	Unknown
 	ntdll.dll!00007ffccf566964()	Unknown
 	kernel32.dll!00007ffccf057ac4()	Unknown
 	ntdll.dll!00007ffccf5aa8c1()	Unknown
Thread with ~AmazonWebServiceRequest() waiting for s_registryMutex
 	[Waiting on a lock, load symbols for ntdll.dll to show thread lock information]	
 	ntdll.dll!00007ffccf5f38e4()	Unknown
 	ntdll.dll!00007ffccf566121()	Unknown
 	msvcp140.dll!00007ffcafaf2860()	Unknown
 	[Inline Frame] aws-cpp-sdk-core.dll!std::_Mutex_base::lock() Line 52	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::unique_lock<std::mutex>::{ctor}(std::mutex &) Line 144	C++
 	aws-cpp-sdk-core.dll!Aws::Utils::ComponentRegistry::DeRegisterComponent(void * pClient) Line 61	C++
 	[Inline Frame] aws-cpp-sdk-s3.dll!Aws::Client::ClientWithAsyncTemplateMethods<Aws::S3::S3Client>::{dtor}() Line 81	C++
>	aws-cpp-sdk-s3.dll!Aws::S3::S3Client::~S3Client() Line 330	C++
 	Bfx.Abstract.AWSEngine.dll!Aws::S3::S3Client::`scalar deleting destructor'(unsigned int)	C++
 	[Inline Frame] aws-cpp-sdk-transfer.dll!std::_Ref_count_base::_Decref() Line 1163	C++
 	[Inline Frame] aws-cpp-sdk-transfer.dll!std::_Ptr_base<Aws::S3::S3Client>::_Decref() Line 1380	C++
 	[Inline Frame] aws-cpp-sdk-transfer.dll!std::shared_ptr<Aws::S3::S3Client>::{dtor}() Line 1685	C++
 	aws-cpp-sdk-transfer.dll!Aws::Transfer::TransferManagerConfiguration::~TransferManagerConfiguration()	C++
 	aws-cpp-sdk-transfer.dll!Aws::Transfer::TransferManager::~TransferManager() Line 133	C++
 	[Inline Frame] aws-cpp-sdk-transfer.dll!std::_Ref_count_base::_Decref() Line 1163	C++
 	[Inline Frame] aws-cpp-sdk-transfer.dll!std::_Ptr_base<Aws::Transfer::TransferManager>::_Decref() Line 1380	C++
 	[Inline Frame] aws-cpp-sdk-transfer.dll!std::shared_ptr<Aws::Transfer::TransferManager>::{dtor}() Line 1685	C++
 	aws-cpp-sdk-transfer.dll!<lambda_58b58022bff5f2f5ded7c57345182668>::~<lambda_58b58022bff5f2f5ded7c57345182668>()	C++
 	aws-cpp-sdk-transfer.dll!std::_Func_impl_no_alloc<<lambda_58b58022bff5f2f5ded7c57345182668>,void,Aws::Http::HttpRequest const *,__int64>::_Delete_this(bool _Dealloc) Line 896	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::_Func_class<void,Aws::Http::HttpRequest const *,Aws::Http::HttpResponse *,__int64>::_Tidy() Line 998	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::_Func_class<void,Aws::Http::HttpRequest const *,Aws::Http::HttpResponse *,__int64>::{dtor}() Line 924	C++
 	aws-cpp-sdk-core.dll!Aws::AmazonWebServiceRequest::~AmazonWebServiceRequest() Line 47	C++
 	aws-cpp-sdk-s3.dll!std::_Func_impl_no_alloc<std::_Binder<std::_Unforced,<lambda_43c9a830a83e86f58140a91021b47507>>,void>::_Delete_this(bool _Dealloc) Line 896	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::_Func_class<void>::_Tidy() Line 998	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::_Func_class<void>::{dtor}() Line 924	C++
 	aws-cpp-sdk-core.dll!std::_Func_impl_no_alloc<std::_Binder<std::_Unforced,<lambda_ee704f056ba6e0a44d5d1f274a320335>,std::function<void __cdecl(void)>>,void>::_Delete_this(bool _Dealloc) Line 895	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::_Func_class<void>::_Tidy() Line 998	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::_Func_class<void>::{dtor}() Line 924	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::default_delete<std::tuple<std::function<void __cdecl(void)>>>::operator()(std::tuple<std::function<void __cdecl(void)>> *) Line 3302	C++
 	[Inline Frame] aws-cpp-sdk-core.dll!std::unique_ptr<std::tuple<std::function<void __cdecl(void)>>,std::default_delete<std::tuple<std::function<void __cdecl(void)>>>>::{dtor}() Line 3412	C++
 	aws-cpp-sdk-core.dll!std::thread::_Invoke<std::tuple<std::function<void __cdecl(void)>>,0>(void * _RawVals) Line 62	C++
 	ucrtbase.dll!00007ffccbbb268a()	Unknown
 	kernel32.dll!00007ffccf057ac4()	Unknown
 	ntdll.dll!00007ffccf5aa8c1()	Unknown
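
For readers who want the pattern in isolation: the two stacks above show one thread joining a worker while holding a mutex, and the worker blocked on that same mutex. A minimal sketch of one common way to avoid this, using only the standard library, is below. `Registry`, `Spawn`, `Deregister`, `TerminateAll`, and `RunShutdownScenario` are hypothetical names for illustration; this is not the SDK's ComponentRegistry and not necessarily how the SDK addresses the problem.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a component registry (illustration only).
class Registry {
public:
    // TerminateAll() must have run before destruction, so no thread is joinable.
    ~Registry() { assert(workers_.empty()); }

    // Starts a worker whose body may call Deregister() before it exits.
    void Spawn(std::function<void(Registry&)> body) {
        std::lock_guard<std::mutex> lock(mutex_);
        workers_.emplace_back([this, body] { body(*this); });
    }

    // Called from worker threads; takes the same mutex as TerminateAll().
    void Deregister() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++deregistered_;
    }

    // Moves the workers out under the lock, then joins with the lock released,
    // so a worker blocked in Deregister() can still make progress.
    void TerminateAll() {
        std::vector<std::thread> pending;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(workers_);
        }
        for (auto& t : pending) {
            t.join();
        }
    }

    int Deregistered() {
        std::lock_guard<std::mutex> lock(mutex_);
        return deregistered_;
    }

private:
    std::mutex mutex_;
    std::vector<std::thread> workers_;
    int deregistered_ = 0;
};

// Runs the shutdown scenario and returns how many workers deregistered.
int RunShutdownScenario(int workerCount) {
    Registry registry;
    for (int i = 0; i < workerCount; ++i) {
        registry.Spawn([](Registry& r) { r.Deregister(); });
    }
    registry.TerminateAll();
    return registry.Deregistered();
}
```

If `TerminateAll()` joined inside the locked scope instead, any worker still on its way into `Deregister()` would block forever, which matches the shape of the traces above.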

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

Aws::ShutdownAPI should complete without deadlocking.

Current Behavior

Aws::ShutdownAPI never completes.

Reproduction Steps

No example to reproduce available.

Possible Solution

No response

Additional Information/Context

No response

AWS CPP SDK version used

1.11.428

Compiler and Version used

msvc 19.42.34435 x64

Operating System and version

Windows Server 2019 / 1809 / 17763.6532

@Markus87 Markus87 added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 31, 2025
@sbiscigl
Contributor

sbiscigl commented Jan 31, 2025

Do you have reproduction code? How frequently do you see the issue? What version of the SDK are you using, and how do you install it?

@sbiscigl sbiscigl added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Jan 31, 2025
@Markus87
Author

Sadly I can't reproduce it (yet).
We recently updated the SDK via vcpkg, and these were the first two cases in the wild.
This version will become our stable release in February, so the issue is likely to occur more often then.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Feb 1, 2025
@sbiscigl
Contributor

sbiscigl commented Feb 4, 2025

So I think the issue can be summarized by this error log (which I assume you have in your logs):

AWS_LOGSTREAM_FATAL(AwsServiceClientT::GetAllocationTag(), "Service client "
    << AwsServiceClientT::GetServiceName() << " is shutting down while async tasks are present.");

which essentially means yes, the SDK is still processing async requests while being shut down. This "FATAL" log would be an exception if we had exceptions enabled. The solution would be for your client code to call the ShutdownSdkClient method on the underlying S3 client with a non-negative timeout, to wait for processing to complete before destroying the client.

@sbiscigl sbiscigl added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Feb 4, 2025
@Markus87
Author

Markus87 commented Feb 4, 2025

Thanks for the feedback.
I am afraid we did not have logging enabled for the aws sdk.
The last operation was a download and since the downloaded file was where it was supposed to be and valid and complete, I assumed there was nothing pending.

So if I have understood you correctly, even if the S3Client does call ShutdownSdkClient in its destructor I still have to call it with a timeout to avoid this deadlock situation?

The documentation says the default is -1, which means requestTimeoutMs is used, and that is set to 3000.
In the core dump I have of the deadlocked process, m_operationsProcessed of the S3Client is 0, so this should indicate no active operations?

If there is anything I can check in the dump for you please just ask.

@sbiscigl sbiscigl removed the needs-triage This issue or PR still needs to be triaged. label Feb 4, 2025
@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 10 days. label Feb 5, 2025
@SergeyRyabinin
Contributor

Hi @Markus87 ,

Thank you a lot for reporting this issue!
This is an interesting case.

Could you please provide high-level code showing how you use TransferManager? Especially for the part where

the downloaded file was where it was supposed to be and valid and complete

My guess is that you Submit a DownloadFile task but don't wait for its completion. There is also a design issue in the existing implementation of TransferManager: TransferTask keeps the TransferManager pointer alive on the worker thread, resulting in a cyclic reference of
Executor->TransferTask->TransferManager->S3Client->Executor.
When S3Client is destroyed at the end of the TransferTask, it attempts to destroy the Executor as well, but the Executor is waiting for the worker thread to complete (which waits for the Executor destruction to complete, which waits for...).
Removing this quirky design decision would be a build-breaking change, which is why it is still there, but we need to either revisit it or provide a fix for this particular case.
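
To make the cycle concrete, here is a minimal standard-library sketch of an executor whose last owner turns out to be its own worker thread. `OneShotExecutor`, `Submit`, and `LastOwnerOnWorkerThreadCompletes` are hypothetical names, and the detach-on-self guard in the destructor is only one possible way to handle the case; it is an assumption for illustration, not a description of the SDK's DefaultExecutor or of any specific fix.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical executor that owns a single worker thread (illustration only).
class OneShotExecutor {
public:
    ~OneShotExecutor() {
        if (!worker_.joinable()) {
            return;
        }
        if (worker_.get_id() == std::this_thread::get_id()) {
            // The destructor is running on the worker itself. A self-join cannot
            // succeed (std::thread::join is specified to fail with
            // resource_deadlock_would_occur in that case), so detach instead.
            worker_.detach();
        } else {
            worker_.join();
        }
    }

    void Submit(std::function<void()> task) {
        assert(!worker_.joinable());  // one-shot: only a single task is supported
        worker_ = std::thread(std::move(task));
    }

private:
    std::thread worker_;
};

// The task keeps the executor alive and ends up as its last owner.
bool LastOwnerOnWorkerThreadCompletes() {
    std::atomic<bool> released{false};
    std::atomic<bool> done{false};

    auto executor = std::make_shared<OneShotExecutor>();
    executor->Submit([self = executor, &released, &done]() mutable {
        while (!released.load()) {
            std::this_thread::yield();  // wait until the caller dropped its reference
        }
        self.reset();  // last owner: ~OneShotExecutor runs on this worker thread
        done.store(true);
    });

    executor.reset();      // caller gives up ownership first
    released.store(true);  // now let the worker drop the final reference

    while (!done.load()) {
        std::this_thread::yield();
    }
    return true;
}
```

Without the thread-id check, the destructor would try to join the thread it is running on, which is the self-referential wait described above.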

I'd also suggest calling transferManagerSp->WaitUntilAllFinished(); before you release the shared_ptr of the TransferManager. This way the TransferManager, S3Client, and Executor will outlive the worker thread and avoid the cyclic dependency that may cause this deadlock.

Best regards,
Sergey

@Markus87
Author

Markus87 commented Feb 6, 2025

Hello @SergeyRyabinin and thanks for your response.

The code in question looks like this:

class AWS::S3
{
	//...
	std::shared_ptr< Aws::S3::S3Client > _;

	void Download(...)
	{
		auto executor = Aws::MakeShared<Aws::Utils::Threading::DefaultExecutor>( ALLOCATION_TAG );
		Aws::Transfer::TransferManagerConfiguration config( executor.get() );
		config.s3Client = _;
		auto transfer = Aws::Transfer::TransferManager::Create( config );
		auto transferHandle = transfer->DownloadFile( ... );
		transferHandle->WaitUntilFinished();
		if( transferHandle->GetStatus() != Aws::Transfer::TransferStatus::COMPLETED )
			Throw( "Aws::Transfer::DownloadFile", transferHandle->GetLastError() );	
	}
	//...	
};

class AWS::VersionChannel
{
	//...
	std::shared_ptr< AWS::S3 > myS3;

	void Download(...)
	{
		myS3->Download(...);
	}
	//...
};

AWS::VersionChannel GetVersionChannel()
{
	return AWS::VersionChannel(...);
}

void MayDeadlockAfterDownload()
{
	//...
	GetVersionChannel().Download(...);
	//...
}

To avoid confusion, the shared_ptr could (should) be a unique_ptr; it is never actually shared anywhere else.

If I follow your suggestion, then we need to replace WaitUntilFinished with WaitUntilAllFinished?

@SergeyRyabinin SergeyRyabinin added the pending-release This issue will be fixed by an approved PR that hasn't been released yet. label Feb 7, 2025
@SergeyRyabinin
Contributor

Hi @Markus87 ,
thank you for the reply and providing the high-level code example.
I think the code flow confirms my initial idea. There is a potential race condition, because

transferHandle->WaitUntilFinished();

returns on successful file transfer. However, the transfer task may still be active and finishing on a separate thread at the moment your Download function returns, which destroys the Executor/TransferManager and leaves the async transfer task as the only owner of itself.

WaitUntilAllFinished should minimize the risk of a race condition in your case.
I have also drafted a PR to handle such a case in our default executor class on destruction: #3288. We will merge it soon after additional internal tests.
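
The difference between the two waits can be modeled with a small standard-library sketch. Everything here is a simplified, hypothetical model (`MiniTransfer` and its members are invented for illustration); it only mirrors the idea that a per-handle wait can return while the task thread still has work left, and it says nothing about how the SDK implements either call.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical model: the handle is marked finished before the task thread
// has run its remaining cleanup.
struct MiniTransfer {
    std::mutex m;
    std::condition_variable cv;
    bool handleFinished = false;  // what a per-handle wait observes
    bool taskExited = false;      // what a wait-for-all observes
    int cleanupSteps = 0;

    void Run() {
        {
            std::lock_guard<std::mutex> lock(m);
            handleFinished = true;  // the file is complete from the caller's view
        }
        cv.notify_all();
        {
            // Work that still runs after the handle reports completion.
            std::lock_guard<std::mutex> lock(m);
            ++cleanupSteps;
            taskExited = true;
        }
        cv.notify_all();
    }

    void WaitUntilFinished() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return handleFinished; });
    }

    void WaitUntilAllFinished() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return taskExited; });
    }
};

int CleanupStepsSeenAfterWaitAll() {
    MiniTransfer transfer;
    std::thread worker([&transfer] { transfer.Run(); });

    transfer.WaitUntilFinished();     // may return while cleanup is still pending
    transfer.WaitUntilAllFinished();  // returns only after cleanup has run

    int steps = 0;
    {
        std::lock_guard<std::mutex> lock(transfer.m);
        assert(transfer.taskExited);
        steps = transfer.cleanupSteps;
    }
    worker.join();
    return steps;
}
```

In this model, releasing the owning objects right after `WaitUntilFinished()` would race with the cleanup step, while doing so after `WaitUntilAllFinished()` would not.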

Best regards,
Sergey

@Markus87
Author

@SergeyRyabinin Thank you, we will try that.
