PR - Ticket 49967 - entry cache corruption after failed MODRDN #3048
Comments
Comment from mreynolds (@mreynolds389) at 2018-10-24 23:26:30 This solves another customer's crash. ACK! |
Comment from tbordaz (@tbordaz) at 2018-10-25 11:30:30 rebased onto ab4af68 |
Comment from tbordaz (@tbordaz) at 2018-10-25 11:31:13 Pull-Request has been merged by tbordaz |
Comment from mreynolds (@mreynolds389) at 2018-12-05 15:59:11 @tbordaz so your fix cleans up the DN cache, but it does not perform the same process for the entry cache. In the customer case I have, it's the entry cache, not the dn cache, that causes the crash. I wonder if we also have to do something like: struct backentry *be = cache_find_id(&inst->inst_cache, ec->ep_id); Thoughts? |
Comment from tbordaz (@tbordaz) at 2018-12-05 16:16:25 @mreynolds389, IMHO I think that upon failure, the destination entry in the entrycache is removed a few lines below (comment "remove the new entry from the cache if the op failed..."). The entry should be 'cache_is_in_cache' because it is flagged ENTRY_STATE_CREATING at that time. The difficulty here is that the number of possible paths is so large that without a testcase we can not be sure. |
Comment from mreynolds (@mreynolds389) at 2018-12-05 16:22:30
I guess I was wondering if the call below that you mentioned is NOT releasing the correct entry (cache_is_in_cache returning "no"). Perhaps it should be retrieved again (cache_find_id) before we check if it's in the cache? Sorry, still looking through the code, so maybe this is not possible, but since it could happen with the dn cache, then why not the entry cache? :) |
Comment from lkrispen (@elkris) at 2018-12-05 16:33:57 but for the ec it is done just after this, if (ec && inst) .... I think we need to look into possible issues in ldbm_delete |
Comment from lkrispen (@elkris) at 2018-12-05 16:39:46 In ldbm_back_delete we have remove_e_from_cache to trigger a removal from the ec in some conditions; what about the dn cache in that case? Mark, would this be a candidate for your debug logging? |
Comment from mreynolds (@mreynolds389) at 2018-12-05 16:44:58
But the crash in delete seems to happen because of a botched modrdn cache entry. When we call id2entry_add in ldbm_delete, it's finding the "previous DN" entry in the cache (which was freed but not removed from the table). Well, that was the original crash - not sure how it's crashing now with Thierry's fix. The new debug patch will tell us more... |
Comment from lkrispen (@elkris) at 2018-12-05 16:48:09 but I didn't see a MODRDN for the entry in the latest data set :-( |
Comment from mreynolds (@mreynolds389) at 2018-12-05 16:55:50
Unfortunately it looks like that does not come into play until after the crash. It's outside of the txn at that point. But it doesn't hurt to add some logging to see what DN it is... |
Comment from mreynolds (@mreynolds389) at 2018-12-05 16:56:58
Hmmm, well I guess we need to see what the new debug patch tells us... |
Patch |
Cloned from Pagure Pull-Request: https://pagure.io/389-ds-base/pull-request/49989
Bug Description:
During a MODRDN the DN cache is updated to replace
source DN with the target DN (modrdn_rename_entry_update_indexes)
If later a failure occurs (for example if BETXN_POSTOP fails) and
the txn is aborted, the target DN (for the specific entryID) remains
in the DN cache.
Fix Description:
In case of failure of the operation, the entry (looked up by its entryID)
needs to be cleared from the DN cache.
Resolves: #3026
Reviewed by: ?
Platforms tested: F27
Flag Day: no
Doc impact: no