
Application using log4cplus hangs on exit #102

Closed
serlin74 opened this issue Aug 4, 2015 · 23 comments

serlin74 commented Aug 4, 2015

With a relatively recent log4cplus version (commit from 04122015) I have encountered an unexpected hang in my application on exit. After two days of digging, I found that the ThreadPool destructor is called twice:

Breakpoint 6, ~ThreadPool (this=0x6816d0, __in_chrg=<optimized out>)
    at threadpool/ThreadPool.h:178
178            std::unique_lock<std::mutex> lock(queue_mutex);
(gdb) bt
#0  ~ThreadPool (this=0x6816d0, __in_chrg=<optimized out>)
    at threadpool/ThreadPool.h:178
#1  operator() (this=<optimized out>, __ptr=0x6816d0)
    at /usr/include/c++/5/bits/unique_ptr.h:76
#2  ~unique_ptr (this=0x6815a8, __in_chrg=<optimized out>)
    at /usr/include/c++/5/bits/unique_ptr.h:236
#3  ~DefaultContext (this=0x6812b0, __in_chrg=<optimized out>)
    at src/global-init.cxx:127
#4  log4cplus::(anonymous namespace)::destroy_default_context::~destroy_default_context (
    this=0x7ffff71562d6 <log4cplus::(anonymous namespace)::destroy_default_context_>, __in_chrg=<optimized out>) at src/global-init.cxx:162
#5  0x00007ffff512a21f in __cxa_finalize (d=0x7ffff7151710)
    at cxa_finalize.c:55
#6  0x00007ffff6f05873 in __do_global_dtors_aux ()
   from /usr/lib64/liblog4cplus-2.0.so.3
#7  0x00007fffffffe010 in ?? ()
#8  0x00007ffff7dea31f in _dl_fini () at dl-fini.c:253
(gdb) c
Continuing.
[Thread 0x7fffeeb44700 (LWP 6151) exited]
[Thread 0x7fffef345700 (LWP 6150) exited]

As you can see, all worker threads are gone. And here is the second call:

Program received signal SIGINT, Interrupt.
__lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
135    2:    movl    %edx, %eax
(gdb) bt
#0  __lll_lock_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007ffff5a40640 in pthread_cond_broadcast@@GLIBC_2.3.2 ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_broadcast.S:136
#2  0x00007ffff576abb9 in __gthread_cond_broadcast (__cond=<optimized out>)
    at /usr/src/debug/gcc-5.1.1-20150612/obj-x86_64-alt-linux/x86_64-alt-linux/libstdc++-v3/include/x86_64-alt-linux/bits/gthr-default.h:852
#3  std::condition_variable::notify_all (this=<optimized out>)
    at ../../../../../libstdc++-v3/src/c++11/condition_variable.cc:73
#4  0x00007ffff6f1c8f3 in ~ThreadPool (this=0x6826d0, 
    __in_chrg=<optimized out>) at threadpool/ThreadPool.h:180
#5  operator() (this=<optimized out>, __ptr=0x6826d0)
    at /usr/include/c++/5/bits/unique_ptr.h:76
#6  reset (__p=0x6826d0, this=<optimized out>)
    at /usr/include/c++/5/bits/unique_ptr.h:344
#7  log4cplus::shutdownThreadPool () at src/global-init.cxx:282
#8  0x00007ffff6f2116b in log4cplus::Hierarchy::shutdown (
    this=this@entry=0x682358) at src/hierarchy.cxx:244
#9  0x00007ffff6f21357 in log4cplus::Hierarchy::~Hierarchy (this=0x682358, 
    __in_chrg=<optimized out>) at src/hierarchy.cxx:81
#10 0x00007ffff6f1cfa5 in ~DefaultContext (this=0x6822b0, 
    __in_chrg=<optimized out>) at src/global-init.cxx:127
#11 log4cplus::(anonymous namespace)::destroy_default_context::~destroy_default_context (
    this=<optimized out>, __in_chrg=<optimized out>)
    at src/global-init.cxx:162
#12 0x00007ffff512a21f in __cxa_finalize (d=0x7ffff7151710)
    at cxa_finalize.c:55
#13 0x00007ffff6f05873 in __do_global_dtors_aux ()
   from /usr/lib64/liblog4cplus-2.0.so.3
#14 0x00007fffffffdff0 in ?? ()
#15 0x00007ffff7dea31f in _dl_fini () at dl-fini.c:253

As you can see, log4cplus::Hierarchy::shutdown tries to destroy an already destroyed ThreadPool instance. This happens because of the member order in struct DefaultContext (from global-init.cxx):

struct DefaultContext
{
    log4cplus::thread::Mutex console_mutex;
    helpers::LogLog loglog;
    LogLevelManager log_level_manager;
    helpers::Time TTCCLayout_time_base;
    NDC ndc;
    MDC mdc;
    Hierarchy hierarchy;
    spi::AppenderFactoryRegistry appender_factory_registry;
    spi::LayoutFactoryRegistry layout_factory_registry;
    spi::FilterFactoryRegistry filter_factory_registry;
    spi::LocaleFactoryRegistry locale_factory_registry;
#if ! defined (LOG4CPLUS_SINGLE_THREADED)
    std::unique_ptr<progschj::ThreadPool> thread_pool {new progschj::ThreadPool};
#endif
};

The thread_pool member is destroyed before the hierarchy member. As a result, we are in trouble. Several solutions are possible:

  1. change member order
  2. remove explicit ThreadPool shutdown from Hierarchy
@wilx wilx self-assigned this Aug 4, 2015
@wilx wilx added the bug label Aug 4, 2015
wilx commented Aug 4, 2015

This is weird. I cannot see how ThreadPool::~ThreadPool() could be called twice on the same DefaultContext::thread_pool instance. If the Hierarchy::shutdown() function is called, then it shuts down the thread pool and sets thread_pool to nullptr by invoking .reset(). There is IMHO no way it could be destroyed twice.

The only scenario where it could possibly happen is when your code is logging from static/global variables' destructors or similar. In that situation it might happen that the DefaultContext is destroyed once and then reincarnated again, because some static/global variable's destructor tries to do some logging. In that situation you should get a warning on the console like:

Re-initializing default context after it has already been destroyed.
The memory will be leaked.

You say:

thread_pool member is destructed before hierarchy member.[...]

This is by design. First threads that might be using the hierarchy are shut down and only then the hierarchy itself is destroyed.

  1. change member order
  2. remove explicit ThreadPool shutdown from Hierarchy

I do not think this is a correct solution, since it is unclear what the actual cause is.

Please make sure that you are using the Initializer, e.g., like the thread_test does. And make sure that you are not logging after your application exits main(), e.g., from static/global variables' destructors.
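For reference, a minimal sketch of the Initializer usage recommended above, assuming the log4cplus 2.x `log4cplus/initializer.h` header (a sketch, not a definitive example; adjust to your build):

```cpp
#include <log4cplus/initializer.h>
#include <log4cplus/logger.h>
#include <log4cplus/loggingmacros.h>

int main()
{
    // RAII guard: constructs the default context and tears it down
    // (including the thread pool) when main() returns, before static
    // destructors of other translation units get a chance to run.
    log4cplus::Initializer initializer;

    log4cplus::Logger root = log4cplus::Logger::getRoot();
    LOG4CPLUS_INFO(root, LOG4CPLUS_TEXT("logging inside Initializer's lifetime"));
    return 0;
}
```

The point of the guard is ordering: log4cplus is shut down while main() is still on the stack, rather than during static destruction where the thread pool may already be gone.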

serlin74 commented Aug 4, 2015

This is weird. I cannot see how could ThreadPool::~ThreadPool() be called twice on the same DefaultContext::thread_pool instance. If the Hierarchy::shutdown() function is called then it shuts down the thread pool and sets thread_pool to nullptr by invoking .reset(). There is IMHO no way it could be destroyed twice.

My understanding of the problem is the following: the thread_pool destructor is invoked before Hierarchy::shutdown() because it's the last member of the DefaultContext structure and therefore its destructor runs first.

I definitely don't use logging from static/global destructors. I'll check what thread_test does and make additional check of my application.

serlin74 commented Aug 4, 2015

Below is an excerpt from my code (works fine with log4cplus 1.x.x) to demonstrate the problem:

#include <log4cplus/logger.h>
#include <log4cplus/loggingmacros.h>
#include <log4cplus/layout.h>          // for PatternLayout
#include <log4cplus/consoleappender.h>

/**
 * Initialize logging framework.
 */
static void
initLogging()
{
    using namespace log4cplus;

    // Initialize log4cplus framework.
    initialize();

    LogLevelManager& log_man = getLogLevelManager();

    Logger root_logger = Logger::getRoot();

    // Manage log level.
    root_logger.setLogLevel(DEBUG_LOG_LEVEL);

    // Configure log output layout.
    tstring pattern = LOG4CPLUS_TEXT("%D{%Y-%m-%d %H:%M:%S.%q} [%t] %-5p %c - %m%n");

    // Manage logging to console.
    SharedAppenderPtr console_ptr(new ConsoleAppender("test"));

    console_ptr->setLayout(std::auto_ptr<Layout>(new PatternLayout(pattern)));
    root_logger.addAppender(console_ptr);

    LOG4CPLUS_INFO(root_logger, LOG4CPLUS_TEXT("Logger initialized"));
}

/**
 *
 */
int main(int     argc, char*   argv[])
{
    initLogging();

    LOG4CPLUS_INFO(log4cplus::Logger::getRoot(), LOG4CPLUS_TEXT("test exited"));

    return (0);
}

Makefile:

test1: test.cpp
     gcc -o test1 -std=gnu++11 -g test.cpp -lpthread -llog4cplus -lstdc++

wilx commented Aug 4, 2015

My understanding of the problem is the following: thread_pool destructor is invoked before Hierarchy::shutdown() because it's the last member of DefaultContext structure and therefore it's called first.

Hmm, this seems like a plausible scenario.

My initial thought, when I was writing the previous post, was that I wanted to shut down the thread pool so that a possible concurrent logging thread would error out on thread_pool == nullptr. But now that I have looked at the code, I can see that enqueueAsyncDoAppend() does not even check that thread_pool is non-null.

Checking git blame brings up revision a87ae37:

MSVC: Avoid deadlock on exit during ThreadPool destruction.

This deadlock is caused by a deadlock in Visual Studio 2015's C++11 threading
and synchronization facilities implementation. The facilities are not usable
after main() exits, i.e., during the time the ThreadPool was being destroyed.

So, the thread pool shutdown there has a purpose. Now I wonder why it works on Windows with MSVC but not on Linux with GCC.

I could remove the line from Hierarchy::shutdown() but then I would have to document stronger requirements for MSVC, that either Hierarchy::shutdown() has to be called before main() exits or log4cplus::Initializer has to be used to do the same.


Checking GCC 4.8's unique_ptr<> implementation on Linux shows this for destructor:

      // Destructor.
      ~unique_ptr() noexcept
      {
        auto& __ptr = std::get<0>(_M_t);
        if (__ptr != nullptr)
          get_deleter()(__ptr);
        __ptr = pointer();
      }

And this for reset() (used by shutdownThreadPool()):

      void
      reset(pointer __p = pointer()) noexcept
      {
        using std::swap;
        swap(std::get<0>(_M_t), __p);
        if (__p != pointer())
          get_deleter()(__p);
      }

IOW, even though the reset() call from shutdownThreadPool() is technically UB (accessing an already destroyed instance of unique_ptr<>), there should be no effect, certainly not the one that you see, IMHO. The unique_ptr<> destructor sets its internal pointer to nullptr, and reset() checks for it.


TL;DR, I still do not get it. :)

wilx commented Aug 4, 2015

Well, UB is never a good thing, so let's try with this now:

diff --git a/src/hierarchy.cxx b/src/hierarchy.cxx
index dc0f7fd..a9efd25 100644
--- a/src/hierarchy.cxx
+++ b/src/hierarchy.cxx
@@ -235,13 +235,11 @@ Hierarchy::getLoggerFactory()

 // from global-init.cxx
 void waitUntilEmptyThreadPoolQueue ();
-void shutdownThreadPool ();

 void
 Hierarchy::shutdown()
 {
     waitUntilEmptyThreadPoolQueue ();
-    shutdownThreadPool ();

     LoggerList loggers = getCurrentLoggers();

Does this fix your problem?

serlin74 commented Aug 5, 2015

Yes, I forgot to mention that I'm using log4cplus on Linux, gcc version 5.1.1.

Your fix works fine for me.

wilx added a commit to wilx/log4cplus that referenced this issue Aug 5, 2015
Do not tear down thread pool on `Hierarchy::shutdown()`.
@rwightman

Irrespective of whether removing the call to shutdownThreadPool was a fix of some sort, I believe serlin74's original suggestion, that the order of the members of the DefaultContext struct needs to be changed, is correct.

Building master on Visual Studio 2015 64-bit in Debug, most of the test applications crash on exit with an invalid memory access while trying to wait on an already destroyed thread pool during destruction of the static DefaultContext instance. The hierarchy member's destructor is currently called after the thread pool is destroyed, and it calls into the thread pool instance through ~Hierarchy() -> shutdown() -> waitUntilEmptyThreadPoolQueue().

To address this problem and prevent future issues, the thread pool member should be moved above any other member that uses it or is likely to use it in the future, especially during destruction.

Please see http://stackoverflow.com/questions/2254263/order-of-member-constructor-and-destructor-calls

wilx commented Oct 13, 2015

The hierarchy member destructor is currently called after the thread pool is destroyed, which calls into the thread pool instance through ~Hierarchy() -> shutdown() -> waitUntilEmptyThreadPoolQueue().

Is this not fixed by 13027c3?

@rwightman

No, that change actually makes this issue more pronounced, as waitUntilEmptyThreadPoolQueue now enters the 'if (dc->thread_pool)' block after the thread_pool is destroyed during DefaultContext destruction (due to thread_pool's order relative to hierarchy).

Whereas with the code prior to that change, thread_pool (its unique_ptr) would have been null the second time through if it had been destroyed via the reset call.

You should see the current problem pretty quickly with any debug build on a platform that marks freed memory. It's unlikely to show in a release build.

@rwightman

Also, FYI, regarding the original problem: there were (are?) incorrect implementations of unique_ptr on multiple compilers that could cause a double delete.

wilx commented Oct 13, 2015

@rwightman If you are using Visual Studio, then try to change your application so that it conforms to the requirements described at README.md. Does it help?

wilx commented Oct 13, 2015

Also, FYI, re the original problem. There were (are?) incorrect implementations of unique_ptr on multiple compilers that can cause a double delete.

The problem is not with unique_ptr but with log4cplus code that reached the same pointer twice, once when it was already destroyed, I think.

@rwightman

I read the README.md; the crash occurs on exit in Debug for most of the included test apps. It is not related to anything I'm doing in my own code.

wilx commented Oct 13, 2015

@rwightman OK, I will have to look again. I thought it was working well after the fixes.

wilx commented Oct 14, 2015

@rwightman I have not reproduced the exit issue that you mentioned. Unfortunately, I have reproduced a different issue on start-up. I have pushed two changes to fix it: first, wilx@f34a58f, to fix an issue with initialization using APC callbacks when log4cplus is used as a static library; second, d7aa378, to fix the tests to use log4cplus::Initializer.

Does this solve all of your issues?

wilx commented Nov 9, 2015

@rwightman Ping.

@0x61726b 0x61726b mentioned this issue Feb 8, 2016
wilx commented Mar 16, 2016

@rwightman I have recently fixed another at-exit crash: #155. Does it by any chance also resolve your issue?

wilx commented Apr 5, 2017

I am closing this issue as fixed. If you think otherwise, please reopen this issue or file a new one.

@wilx wilx closed this as completed Apr 5, 2017
@federicozanco

Hi, same problem here. VS2015, WinForms x64 application (C#), C++ DLL that references log4cplus, Debug build, crashes (access violation) on exit. The workaround was to explicitly call Logger::shutdown and comment out waitUntilEmptyThreadPoolQueue in Hierarchy::shutdown() (blindly; I don't know if it leads to any memory leaks or other issues). Regards.

    msvcp140d.dll!00007fff7f3c1a53()	Unknown
msvcp140d.dll!00007fff7f3c2075()	Unknown
log4cplusD.dll!std::_Mtx_lockX(_Mtx_internal_imp_t * _Mtx) Line 70	C++
log4cplusD.dll!std::_Mutex_base::lock() Line 44	C++
log4cplusD.dll!std::unique_lock<std::mutex>::unique_lock<std::mutex>(std::mutex & _Mtx) Line 274	C++
log4cplusD.dll!progschj::ThreadPool::wait_until_empty() Line 192	C++
log4cplusD.dll!log4cplus::waitUntilEmptyThreadPoolQueue() Line 298	C++
log4cplusD.dll!log4cplus::Hierarchy::shutdown() Line 244	C++
log4cplusD.dll!log4cplus::Hierarchy::~Hierarchy() Line 81	C++
[External Code]	
log4cplusD.dll!log4cplus::`anonymous namespace'::destroy_default_context::~destroy_default_context() Line 166	C++
[External Code]	

@wilx wilx reopened this Jun 29, 2017
wilx commented Jun 29, 2017

@federicozanco commented on 29. 6. 2017 13:08 SELČ:

Hi, same problem here. VS2015, WinForm x64 application (C#), C++ dll that references log4cplus, Debug, crashes (Access violation) on exit, . The workaround was to call explicitly Logger::shutdown and comment waitUntilEmptyThreadPoolQueue in Hierarchy::shutdown() (blindly, I don't know if it leads to any memory leak or other issues). Regards.


Could you please test the solution proposed in the following comment thread? da36fd8#commitcomment-22810107

@federicozanco

After a quick test it seems to work. I didn't try with the Initializer, though.

struct DefaultContext
{
    log4cplus::thread::Mutex console_mutex;
    helpers::LogLog loglog;
    LogLevelManager log_level_manager;
    helpers::Time TTCCLayout_time_base;
    NDC ndc;
    MDC mdc;
    spi::AppenderFactoryRegistry appender_factory_registry;
    spi::LayoutFactoryRegistry layout_factory_registry;
    spi::FilterFactoryRegistry filter_factory_registry;
    spi::LocaleFactoryRegistry locale_factory_registry;
#if ! defined (LOG4CPLUS_SINGLE_THREADED)
    std::unique_ptr<progschj::ThreadPool> thread_pool {new progschj::ThreadPool};
#endif
    Hierarchy hierarchy;
};

wilx commented Jul 1, 2017

I have pushed dedbec4 to the 2.0.x branch to implement this. I really hope this finally fixes the issue. Please reopen this issue if it is still a problem.

@wilx wilx closed this as completed Jul 1, 2017
jg-at-home commented Sep 19, 2018

I am getting the same error on 2.0.x

If I convert all my projects to static libraries + exes, however, the issue goes away. This isn't ideal, of course.

ETA: Visual Studio 2017, C++14/17
