deadlock in replica_write_ruv #344
Comments
Comment from rmeggins (@richm) at 2012-08-14 19:56:22 set default ticket origin to Community |
Comment from nkinder (@nkinder) at 2012-08-28 04:14:22 Added initial screened field value. |
Comment from nhosoi (@nhosoi) at 2012-10-04 05:17:52 Replying to [ticket:344 richm]:
In the process of making the plugins betxn aware, the SERIALLOCK is being moved into dblayer_txn_begin, and the lock is held regardless of the OP_FLAG_REPL_FIXUP flag, as Rich suggested. So this issue will be solved together with the fix for ticket 351. What would be the best scenario to verify the bug? For a week I ran a fairly heavy stress test involving add, modify, and delete operations against a server that contains the 351 patch. The replication topology consists of 4 masters, 2 hubs, and 4 read-only replicas. Is that good enough to say this bug is solved? |
Comment from rmeggins (@richm) at 2012-10-04 05:36:43 Replying to [comment:5 nhosoi]:
Yes. |
Comment from nhosoi (@nhosoi) at 2012-10-06 07:27:52 Mark as duplicate of 351. |
Comment from nhosoi (@nhosoi) at 2017-02-11 22:53:14 Metadata Update from @nhosoi:
|
Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/344
replica_write_ruv() does the modify with the OP_FLAG_REPL_FIXUP flag.
replica_create_ruv_tombstone() does too, and so does replica_replace_ruv_tombstone(). The OP_FLAG_REPL_FIXUP flag causes the database not to be locked.
If the event queue fires replica_write_ruv() at just the wrong moment, it will conflict with the same RUV update from replica_replace_ruv_tombstone() or, less likely, from replica_create_ruv_tombstone().
I think the solution is to always take the database SERIALLOCK. Since inst->inst_db_mutex is now a PRMonitor instead of a plain mutex, the same thread can already re-enter it, which is what the OP_FLAG_REPL_FIXUP flag was originally meant to permit: letting the urp database plugins modify entries. Alternatively, change the urp be pre/post op plugins to be betxn pre/post op plugins.
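To illustrate why a re-entrant lock makes the "skip the lock" flag unnecessary, here is a minimal sketch. It is not 389-ds-base code: it assumes a POSIX recursive mutex as a stand-in for the PRMonitor, and fake_instance, fixup_modify, outer_operation, and other_thread_can_lock are hypothetical names invented for this example. The nested modify takes the lock unconditionally and does not self-deadlock, while a different thread is still kept out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a backend instance; not the real struct. */
typedef struct {
    pthread_mutex_t db_lock; /* recursive mutex playing the role of the PRMonitor */
    int ruv_writes;          /* counts simulated RUV updates */
} fake_instance;

/* Initialise the instance with a recursive (re-entrant) lock. */
int fake_instance_init(fake_instance *inst)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(inst != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) {
        rc = pthread_mutex_init(&inst->db_lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    inst->ruv_writes = 0;
    return rc;
}

/* Simulated fixup modify: always locks, even when called while nested. */
int fixup_modify(fake_instance *inst)
{
    int rc = pthread_mutex_lock(&inst->db_lock);
    if (rc != 0) {
        return rc;
    }
    inst->ruv_writes++;
    return pthread_mutex_unlock(&inst->db_lock);
}

/* Simulated outer operation that holds the lock and triggers a fixup. */
int outer_operation(fake_instance *inst)
{
    int rc, unlock_rc;

    rc = pthread_mutex_lock(&inst->db_lock);
    if (rc != 0) {
        return rc;
    }
    rc = fixup_modify(inst); /* same thread re-enters the lock */
    unlock_rc = pthread_mutex_unlock(&inst->db_lock);
    return rc != 0 ? rc : unlock_rc;
}

typedef struct {
    fake_instance *inst;
    bool acquired;
} probe_arg;

/* Runs in a second thread: tries the lock without blocking. */
static void *probe_thread(void *p)
{
    probe_arg *arg = p;
    if (pthread_mutex_trylock(&arg->inst->db_lock) == 0) {
        arg->acquired = true;
        pthread_mutex_unlock(&arg->inst->db_lock);
    }
    return NULL;
}

/* Reports whether a different thread could take the lock right now. */
bool other_thread_can_lock(fake_instance *inst)
{
    probe_arg arg = { inst, false };
    pthread_t tid;

    if (pthread_create(&tid, NULL, probe_thread, &arg) != 0) {
        return false;
    }
    pthread_join(tid, NULL);
    return arg.acquired;
}
```

Calling outer_operation() on an initialised instance performs one simulated RUV write without deadlocking, and other_thread_can_lock() returns false while the calling thread holds db_lock. With a non-recursive mutex, the nested lock in fixup_modify() would block forever, which is the situation a flag like OP_FLAG_REPL_FIXUP works around by not locking at all.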