Fixed #202 Conflict with other modules using libapr #222

Closed
wants to merge 1 commit into
from

3 participants

@chaizhenhua
Google PageSpeed member

No description provided.

@jeffkaufman
Google PageSpeed member

Could you give some steps to reproduce the problem, so I can verify that this fixes it?

@chaizhenhua
Google PageSpeed member

complie with this nginx module (https://github.com/chaizhenhua/ngx_apr.git) and run some test cases. then nginx -s quit, without this patch nginx will crash see comments in #202

@oschaaf oschaaf commented on the diff Apr 4, 2013
src/ngx_rewrite_driver_factory.cc
@@ -345,4 +345,13 @@ void NgxRewriteDriverFactory::InitStats(Statistics* statistics) {
statistics->AddVariable(kShutdownCount);
}
+void NgxRewriteDriverFactory::Initialize() {
+ apr_initialize();
+ RewriteDriverFactory::Initialize();
+}
+void NgxRewriteDriverFactory::Terminate() {
+ RewriteDriverFactory::Terminate();
+ apr_terminate();
@oschaaf
Google PageSpeed member
oschaaf added a line comment Apr 4, 2013

I'm no apr expert, but the docs at http://apr.apache.org/docs/apr/0.9/group__apr__library.html mention this:

The APR developers suggest using atexit to ensure this is called.

I remember that apr_initialize() and apr_terminate() perform reference counting, which makes me think that we should call terminate as late as possible, to prevent forwarding this problem to other modules?

@chaizhenhua
Google PageSpeed member
chaizhenhua added a line comment Apr 4, 2013

I got the defination:

APR_DECLARE_NONSTD(void) apr_terminate(void)
{
    initialized--;
    if (initialized) {
        return;
    }
    apr_pool_terminate();

}

reference counting is used.
so if apr_initialize and apr_terminate are paired and no race condition, the terminate sequence is unimportent.

i tried to find apr_initialize in pagespeed but failed, may be the crash is caused by no paired apr_initialize and apr_terminate.

in my opinion, module can call apr_terminate at any time after finish using it although there was a suggestation. atexit() will break the encapsulation.

are you agree?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@oschaaf
Google PageSpeed member

@chaizhenhua
Yes, I agree.

@oschaaf
Google PageSpeed member

@chaizhenhua
Testing with this patch, I now get segfaults when shutting down in more situations than without it.

  • with your minimal case, without hitting the serf fetcher
  • with your minimal case, with hitting the serf fetcher
  • with modsecurity, without hitting the serf fetcher
  • with modsecurity, with hitting the serf fetcher

All with different backtraces.
Thinking more about it, I still think that for ngx_pagespeed, we do need to hook apr_terminate as late as possible for two reasons:

  • When terminating our NgxRewriteDriverFactory, there is no guarantee that there are no more serf fetches running as far as I can tell.
  • When another module behaves like we do regarding to apr initialization/termination, postponing the apr_terminate call to as late as possible decreases the chance of us causing a crash in modules after us.
@chaizhenhua
Google PageSpeed member

@oschaaf i forgotten to tell you that the ngx_apr has the same bug as modsecurity. now i have fixed it. could you please put the backtraces and use new ngx_apr to test again?

I think we can ignore modsecurity, but only pass the tiny test module.

I cant reproduce, i just run all test cases then quit, there is no crash.

did SerfFetchers only run in RewriteDriverFactory lifetime?

I think we should find the reason why crashed in this way. pagespeed can determine when to terminate apr, while other module is out of our control. so we'd better solve it without modify other module.

@oschaaf
Google PageSpeed member

@chaizhenhua
I can confirm the problems are fixed with the latest version of ngx_apr.

@jeffkaufman
It's possible that serf fetchers sometimes continue after the factory is shut down right? If they take too long to cancel? If that is true, I think calling apr_terminate() from NgxRewriteDriverFactory::Terminate() may still cause trouble. Your thoughts?

@jeffkaufman
Google PageSpeed member

Fixed by #601

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment