trunk-tracking-shutdown: improve shutdown code #897
- Add a test to make sure all logged output looks sane.
- Stop our nginx test instances after we are done testing.
- Add a test that shuts down under high load (needs ab).
- Reduce the number of keepalive requests in the keepalive tests to speed up test runs.
- Add notes/questions which need to be sorted out.
…lean" This reverts commit 013ca06. Another approach is needed, this was a bad idea
- Valgrind-clean run for the cache manager/loader processes.
- Undo tracking file descriptors with valgrind, as it does not seem possible to suppress the messages, and nginx leaves some descriptors to the OS.
- Disallow [crit] messages in error.log, next to [alert].
- Add a test to make sure all logged output looks sane by whitelisting current errors/warnings.
- Stop our nginx test instances after we are done testing.
- Add tests for shutting down and reloading configuration under high load (depends on ab).
- Reduce the number of keepalive requests in the keepalive tests to speed up test runs.
- Fix exiting with open file descriptors, and fix cleanup in nginx's cache manager/loader processes.
- Attempt to finish queued-up NgxBaseFetches/requests on shutdown/reload.
- Under valgrind the blocking rewrite started failing after adding a test for reloading configuration under high load. I've added it to the expected failures for valgrind; looking into this is up next.
I ran the tests for a while, and discovered that the fix for cleaning up when exiting from the cache manager/loader processes needs more work. I'm closing this pull, and will re-open when I have fixed that. Sorry! |
@jeffkaufman Still working on this, but I think I'm very nearly done. Could you evaluate these PSOL changes? They are required for this patch to work.

Index: pagespeed/system/serf_url_async_fetcher.cc
===================================================================
--- pagespeed/system/serf_url_async_fetcher.cc (revision 4537)
+++ pagespeed/system/serf_url_async_fetcher.cc (working copy)
@@ -144,6 +144,7 @@
   }
   if (pool_ != NULL) {
     apr_pool_destroy(pool_);
+    pool_ = NULL;
   }
 }
@@ -1173,8 +1174,13 @@
   if (threaded_fetcher_ != NULL) {
     delete threaded_fetcher_;
   }
-  delete mutex_;
-  apr_pool_destroy(pool_);  // also calls apr_allocator_destroy on the allocator
+  if (mutex_ != NULL) {
+    delete mutex_;
+  }
+  if (pool_ != NULL) {
+    apr_pool_destroy(pool_);  // also calls apr_allocator_destroy on the allocator
+    pool_ = NULL;
+  }
 }

 void SerfUrlAsyncFetcher::ShutDown() {
Index: pagespeed/system/system_caches.cc
===================================================================
--- pagespeed/system/system_caches.cc (revision 4537)
+++ pagespeed/system/system_caches.cc (working copy)
@@ -76,7 +76,6 @@
 }

 void SystemCaches::ShutDown(MessageHandler* message_handler) {
-  DCHECK(!was_shut_down_);
   if (was_shut_down_) {
     return;
   } |
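The pattern the diff above applies (check for NULL, release, then reset the pointer to NULL) is what makes a teardown path safe to run more than once. A minimal sketch of that idea follows, assuming a hypothetical `FakePool` in place of an APR pool and a hypothetical `Fetcher` owner; neither name comes from PSOL or ngx_pagespeed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a pooled resource such as an APR pool.
// It only counts how many instances are alive.
struct FakePool {
  static int live;
  FakePool() { ++live; }
  ~FakePool() {
    assert(live > 0);  // a double release would trip this
    --live;
  }
};
int FakePool::live = 0;

// Hypothetical owner whose ShutDown() may be called explicitly and
// is called again from the destructor.
class Fetcher {
 public:
  Fetcher() : pool_(new FakePool) {}
  ~Fetcher() { ShutDown(); }

  void ShutDown() {
    if (pool_ != NULL) {
      delete pool_;
      pool_ = NULL;  // mark as released so a repeated call is a no-op
    }
  }

  bool released() const { return pool_ == NULL; }

 private:
  Fetcher(const Fetcher&);             // not copyable: owns a raw pointer
  Fetcher& operator=(const Fetcher&);  // not assignable
  FakePool* pool_;
};
```

Without the reset to NULL, the second `ShutDown()` call (here, the one made by the destructor) would release the same pool twice.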
Setting
Same here, except that now you don't destroy
Why do we need to be able to shut down |
@jeffkaufman As for removing the DCHECK in |
@jeffkaufman Actually, the explicit shutdown is no longer needed, as we'll be waiting until there are no more active base fetches. So disregard that too; no PSOL changes are necessary. |
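As a rough illustration of "waiting until there are no more active base fetches", the idea can be sketched as a counter that an event loop polls before exiting. `ShutdownGate` and all of its methods are hypothetical names, not taken from ngx_pagespeed, and refusing new fetches once shutdown has been requested is an assumption of this sketch rather than something stated in the thread.

```cpp
#include <cassert>

// Hypothetical helper: tracks in-flight fetches so that a worker exits
// only after every fetch has finished. Single-threaded by design, meant
// to be polled from an event loop.
class ShutdownGate {
 public:
  ShutdownGate() : active_(0), shutdown_requested_(false) {}

  // Registers a new fetch. Returns false once shutdown was requested
  // (an assumption of this sketch: no new work after that point).
  bool FetchStarted() {
    if (shutdown_requested_) return false;
    ++active_;
    return true;
  }

  // Marks one previously started fetch as finished.
  void FetchDone() {
    assert(active_ > 0);
    --active_;
  }

  void RequestShutdown() { shutdown_requested_ = true; }

  // True only when shutdown was requested and nothing is in flight.
  bool ReadyToExit() const { return shutdown_requested_ && active_ == 0; }

 private:
  int active_;
  bool shutdown_requested_;
};
```

With a gate like this, the teardown code runs exactly once, after `ReadyToExit()` becomes true, so no separate early shutdown call is needed.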
Sounds good! Let me know when you've got something you'd like me to review. |