cleanallruv hangs shutdown if not all replicas online #1548
Comments
Comment from lkrispen (@elkris) at 2015-07-09 19:49:08 I don't think that the missing shutdown check before sleeping is a big deal; it will be checked at every iteration in the while() conditions, so there is only a small window you miss, and after the sleep the loop will be terminated. Although an extra check before sleeping doesn't hurt. The problem was the missing stop_ruv_cleaning() calls, and that's ok now. But can the stop_ruv_cleaning() in multimaster_stop be removed? We could stop the plugin without shutdown (in theory). |
Comment from mreynolds (@mreynolds389) at 2015-07-09 19:57:32 Replying to [comment:2 elkris]:
I was just being overly cautious :-)
Correct.
I'm not sure we need the plugin running for cleanallruv to finish, but I can add it back (it doesn't hurt). New patch in the works... |
Comment from mreynolds (@mreynolds389) at 2015-07-09 20:00:16 revision |
Comment from mreynolds (@mreynolds389) at 2015-07-09 20:01:04 New patch attached... |
Comment from rmeggins (@richm) at 2015-07-09 20:18:18 The problem is if slapd is shutting down while it is waiting on a condvar. There needs to be a way that something can detect shutdown and do a notifycondvar to wake up those waits immediately upon shutdown. |
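The wake-on-shutdown idea described above can be sketched with plain POSIX threads. This is only an illustration of the pattern, not the 389-ds-base code: `notify_lock`, `notify_cvar`, `shutting_down`, `signal_shutdown()` and `wait_interval_or_shutdown()` are all hypothetical names invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-ins; these are not the 389-ds-base symbols. */
static pthread_mutex_t notify_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t notify_cvar = PTHREAD_COND_INITIALIZER;
static bool shutting_down = false;

/* Shutdown path: set the flag under the lock, then wake every waiter. */
void signal_shutdown(void)
{
    pthread_mutex_lock(&notify_lock);
    shutting_down = true;
    pthread_cond_broadcast(&notify_cvar);
    pthread_mutex_unlock(&notify_lock);
}

/*
 * Wait up to `seconds` for a shutdown notification.
 * Returns true if shutdown was requested, false if the interval elapsed.
 */
bool wait_interval_or_shutdown(int seconds)
{
    struct timespec deadline;
    bool result;

    assert(seconds >= 0);
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += seconds;

    pthread_mutex_lock(&notify_lock);
    /* The flag is tested before every wait, so a shutdown that was
     * signalled before we got here is never slept through. */
    while (!shutting_down) {
        int rc = pthread_cond_timedwait(&notify_cvar, &notify_lock, &deadline);
        if (rc != 0) {
            break; /* timed out (or failed): stop waiting */
        }
    }
    result = shutting_down;
    pthread_mutex_unlock(&notify_lock);
    return result;
}
```

A waiter that would otherwise sleep for the full interval calls `wait_interval_or_shutdown()` instead and exits its loop when the call returns true.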
Comment from lkrispen (@elkris) at 2015-07-09 20:25:07 Replying to [comment:6 richm]:
yes, but Mark's fix does it now. Shutdown was hanging in replica_cleanall_ruv_destructor() and this now calls stop_ruv_cleaning() |
Comment from rmeggins (@richm) at 2015-07-09 20:36:52 Replying to [comment:7 elkris]:
ok |
Comment from mreynolds (@mreynolds389) at 2015-07-09 22:00:21 fdf4681..d6269f2 master -> master 41dff5b..0bb881a 389-ds-base-1.3.4 -> 389-ds-base-1.3.4 |
Comment from nhosoi (@nhosoi) at 2015-07-10 05:46:17 Ticket has been cloned to Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1241723 |
Comment from mreynolds (@mreynolds389) at 2015-09-18 21:32:23 This fix introduced a regression. When the server was stopped during a clean task, the task would think everyone was cleaned (when in fact they were not). Need to properly detect the shutdown at the end of the task. |
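The regression described above is a general hazard: a polling loop that can exit either because the work finished or because shutdown was requested must not treat both exits as success. A toy model of that distinction follows; `run_clean_task()`, `task_status` and the way replicas and shutdown are simulated are assumptions made up for this sketch and do not reflect the actual fix.

```c
#include <assert.h>

typedef enum { TASK_CLEANED, TASK_ABORTED_BY_SHUTDOWN } task_status;

/*
 * Toy model (hypothetical): `remaining` replicas still need cleaning and
 * each poll cleans one. Shutdown is requested once `shutdown_after` polls
 * have run; a negative value means shutdown is never requested.
 */
task_status run_clean_task(int remaining, int shutdown_after)
{
    int polls = 0;

    assert(remaining >= 0);
    while (remaining > 0) {
        if (shutdown_after >= 0 && polls >= shutdown_after) {
            break; /* shutdown requested: stop polling */
        }
        remaining--; /* pretend one replica confirmed it is clean */
        polls++;
    }

    /* Leaving the loop does not by itself mean success: report based on
     * what was actually cleaned, not on the loop having ended. */
    return remaining == 0 ? TASK_CLEANED : TASK_ABORTED_BY_SHUTDOWN;
}
```

In this model, a task interrupted with replicas still outstanding reports `TASK_ABORTED_BY_SHUTDOWN` rather than claiming everything was cleaned.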
Comment from mreynolds (@mreynolds389) at 2015-09-18 22:00:12 Fix regression with server shutdown |
Comment from mreynolds (@mreynolds389) at 2015-09-18 23:27:49 a8130ab..c41d36d master -> master 8cd4f45..d9f03f5 389-ds-base-1.3.4 -> 389-ds-base-1.3.4 |
Comment from mreynolds (@mreynolds389) at 2017-02-11 23:08:48 Metadata Update from @mreynolds389:
|
Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/48217
There are race conditions in some of the cleanallruv code where we can go to sleep without checking if the server is shutting down. Like when checking if replicas are online:
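The snippet that originally followed is not preserved here. As an illustration only, the general remedy for this kind of race is to test the shutdown flag immediately before each sleep and to sleep in short slices, so a shutdown request is noticed quickly. The names below (`shutdown_requested`, `request_shutdown()`, `interruptible_sleep()`) and the 50 ms slice are hypothetical and not taken from the cleanallruv code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical shutdown flag; the real server has its own mechanism. */
static atomic_bool shutdown_requested = false;

void request_shutdown(void)
{
    atomic_store(&shutdown_requested, true);
}

/*
 * Sleep for up to `total_ms` milliseconds in short slices, checking the
 * shutdown flag before every slice.
 * Returns true if shutdown was requested, false if the full time elapsed.
 */
bool interruptible_sleep(long total_ms)
{
    const long slice_ms = 50; /* assumed granularity for the example */

    assert(total_ms >= 0);
    while (total_ms > 0) {
        if (atomic_load(&shutdown_requested)) {
            return true; /* never go to sleep once shutdown is pending */
        }
        long ms = total_ms < slice_ms ? total_ms : slice_ms;
        struct timespec ts = { .tv_sec = 0, .tv_nsec = ms * 1000000L };
        nanosleep(&ts, NULL);
        total_ms -= ms;
    }
    return atomic_load(&shutdown_requested);
}
```

A loop that checks whether replicas are online could call `interruptible_sleep()` between attempts and abandon the check as soon as it returns true.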