-
Notifications
You must be signed in to change notification settings - Fork 103
Issue 6136 - failure in freeipa tests #6137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
vlv CI test is hanging on lmdb standalone instance is looping withing dbmdb_rm_db_file, |
@long-entryrdn was not open by dbmdb_open_all_files this leaded to failure when trying to open it in a read operation because at dblayer level, it is not possible to open write txn within a read txn - and although it is possible at lmdb level, the new file will not be visible within the read txn but we may need to access it. So the open failed, and entryrdn attrinfo should then be released before returning an error to avoid keeping entryrdn busy. That is what trigger the hang when removing a backend. Added some conditionnal debug code to understand why the server hang. Also added a missing dblayer_release_index_file in vlvIndex_checkforindex that may be the reason while there is a hang when removing vlv on bdb.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the minor issues, looks good!
| inst->inst_parent_dir_name, | ||
| inst->inst_dir_name); | ||
| struct ldbminfo *li = inst->inst_li; | ||
| char *fname = slapi_ch_smprintf("%s/.import_%s", li->li_directory, inst->inst_name); | ||
| slapi_log_err(SLAPI_LOG_DEBUG, "bdb__import_file_name", "DBG: fname=%s\n", fname); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not related exactly to this PR, but it still caught my eye. Do you know of the double underscore (bdb__) is intentional in this function name or just a typo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no name conflict so IMHO it is a typo, I will change it.
| log.info(f'Adding {users_num} users') | ||
| for i in range(0, users_num): | ||
| uid = STARTING_UID_INDEX + i | ||
| add_user(inst, users, uid) | ||
|
|
||
|
|
||
| def create_vlv_search_and_index(inst, hyphen='', basedn=DEFAULT_SUFFIX): | ||
| def create_vlv_search_and_index(inst, basedn=DEFAULT_SUFFIX, bename='userRoot', scope=2, prefix="vlv"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For better readability, I think we should use ldap.LDAP_SCOPE_SUBTREE for the scope here, instead of magic number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will use ldap.SCOPE_SUBTREE instead of 2
| # Configure vlv | ||
| vlv_search, vlv_index = create_vlv_search_and_index( | ||
| inst, basedn=suffix, | ||
| bename=bename, scope=1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here regarding the scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, will use ldap.SCOPE_ONELEVEL instead of 1
|
I reviewed the C files and it more or less makes sense to me. Re the CI test, it must be the attribute info db layer count that is not being decremented, surely... Ack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
few questions
| @@ -456,6 +456,7 @@ dbmdb_open_all_files(dbmdb_ctx_t *ctx, backend *be) | |||
| TST(add_dbi(&octx, be, special_names[i], ctxflags)); | |||
| sn_dbis[i] = octx.dbi; | |||
| } | |||
| TST(add_dbi(&octx, be, LDBM_LONG_ENTRYRDN_STR, ctxflags)); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why having not added LDBM_LONG_ENTRYRDN_STR in special_names ?
| entry_address addr; | ||
| addr.sdn = p->vlv_base; | ||
| addr.uniqueid = NULL; | ||
| e = find_entry(pb, be, &addr, txn, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the base entry does not exist at that time, it will stay vlv_initialized=0. That looks fine to me. Does it require documentation ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting point. IMHO the right thing to do would be to check the uninitialized vlv when adding an entry ...
| @@ -776,6 +776,7 @@ vlvIndex_checkforindex(struct vlvIndex *p, backend *be) | |||
| /* In lmdb case, always open the dbi */ | |||
| if (li->li_flags & LI_LMDB_IMPL) { | |||
| (void) dblayer_get_index_file(be, p->vlv_attrinfo, &db, 0) ; | |||
| dblayer_release_index_file(be, p->vlv_attrinfo, db); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks open/close. What is it for ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dblayer_release_index_file decrease a ref count to tell that the index is no more used.
On bdb it could lead to db instance closure but that is not the case with lmdb (db instances are not closed until closing the whole db environment (i.e at shutdown time) )
So the open/release partern just create/open the database instance and keep it open (but not busy).
Having it open is important because it cannot be open when needed by a search operation:
because there is an already open read txn and there is no way we can open a new file within that txn:
We can open the file with a new write txn, but in that case the previously created read txn has no visibility of the new file.
The big rule to remember with LMDB is is that ALL DATABASE INSTANCES MUST BE OPEN when bringing a backend on-line. The only case we could open/create db instance is when changing the config.
* Issue 6136 - failure in freeipa tests Several issue detected when adding a CI test that mimic one of freeipa nightly test : bdb - offline import fail when trying to create the guardian file because instance is not yet fully initialized and the generated path is wrong - fixed by using the directory from ldbminfo and the instance names that are defined. mdb - vlv index are not generated because for one level scoped vlv, the entryid is not properly set. should use vlv_grok_new_import_entry to reset the vlv filter when the base entry is added (as it is done in bdb). also added a function to mark the vlv_grok_new_import_entry as uninitialized before the import mdb- crash while trying to import an entry without parent (i.e a suffix entry) that does not belong to the backend fixed by avoiding the null pointer exception in that case Issue: #6136 Reviewed by: @droideck, @jchapma (Thanks!) * Fix vlv CI test deadlock @long-entryrdn was not open by dbmdb_open_all_files this leaded to failure when trying to open it in a read operation because at dblayer level, it is not possible to open write txn within a read txn - and although it is possible at lmdb level, the new file will not be visible within the read txn but we may need to access it. So the open failed, and entryrdn attrinfo should then be released before returning an error to avoid keeping entryrdn busy. That is what trigger the hang when removing a backend. Added some conditionnal debug code to understand why the server hang. Also added a missing dblayer_release_index_file in vlvIndex_checkforindex that may be the reason while there is a hang when removing vlv on bdb. * Issue 6136 - failure in freeipa tests - Fix review comments
* Issue 6136 - failure in freeipa tests Several issue detected when adding a CI test that mimic one of freeipa nightly test : bdb - offline import fail when trying to create the guardian file because instance is not yet fully initialized and the generated path is wrong - fixed by using the directory from ldbminfo and the instance names that are defined. mdb - vlv index are not generated because for one level scoped vlv, the entryid is not properly set. should use vlv_grok_new_import_entry to reset the vlv filter when the base entry is added (as it is done in bdb). also added a function to mark the vlv_grok_new_import_entry as uninitialized before the import mdb- crash while trying to import an entry without parent (i.e a suffix entry) that does not belong to the backend fixed by avoiding the null pointer exception in that case Issue: #6136 Reviewed by: @droideck, @jchapma (Thanks!) * Fix vlv CI test deadlock @long-entryrdn was not open by dbmdb_open_all_files this leaded to failure when trying to open it in a read operation because at dblayer level, it is not possible to open write txn within a read txn - and although it is possible at lmdb level, the new file will not be visible within the read txn but we may need to access it. So the open failed, and entryrdn attrinfo should then be released before returning an error to avoid keeping entryrdn busy. That is what trigger the hang when removing a backend. Added some conditionnal debug code to understand why the server hang. Also added a missing dblayer_release_index_file in vlvIndex_checkforindex that may be the reason while there is a hang when removing vlv on bdb. * Issue 6136 - failure in freeipa tests - Fix review comments (cherry picked from commit cc3a864)
Several issue detected when adding a CI test that mimic one of freeipa nightly test :
bdb - offline import fail when trying to create the guardian file because instance is not yet fully initialized and the generated path is wrong - fixed by using the directory from ldbminfo and the instance names that are defined.
mdb - vlv index are not generated because for one level scoped vlv, the entryid is not properly set.
should use vlv_grok_new_import_entry to reset the vlv filter when the base entry is added (as it is done in bdb).
also added a function to mark the vlv_grok_new_import_entry as uninitialized before the import
mdb- crash while trying to import an entry without parent (i.e a suffix entry) that does not belong to the backend
fixed by avoiding the null pointer exception in that case
Issue: #6136
Reviewed by: @droideck, @jchapma (Thanks!)