Skip to content

Issue 6136 - failure in freeipa tests #6137

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 5, 2024
Merged

Conversation

progier389
Copy link
Contributor

@progier389 progier389 commented Mar 28, 2024

Several issue detected when adding a CI test that mimic one of freeipa nightly test :

bdb - offline import fail when trying to create the guardian file because instance is not yet fully initialized and the generated path is wrong - fixed by using the directory from ldbminfo and the instance names that are defined.

mdb - vlv index are not generated because for one level scoped vlv, the entryid is not properly set.
should use vlv_grok_new_import_entry to reset the vlv filter when the base entry is added (as it is done in bdb).
also added a function to mark the vlv_grok_new_import_entry as uninitialized before the import

mdb- crash while trying to import an entry without parent (i.e a suffix entry) that does not belong to the backend
fixed by avoiding the null pointer exception in that case

Issue: #6136

Reviewed by: @droideck, @jchapma (Thanks!)

@progier389 progier389 linked an issue Mar 28, 2024 that may be closed by this pull request
@progier389 progier389 self-assigned this Mar 28, 2024
@progier389 progier389 added this to the 3.0.0 milestone Mar 28, 2024
@progier389 progier389 added the work in progress Work in Progress - can be reviewed, but not ready for merge. label Mar 28, 2024
@progier389
Copy link
Contributor Author

vlv CI test is hanging on lmdb standalone instance is looping withing dbmdb_rm_db_file,
looks like there is a problem with ai_dblayer_count (The surprising thing is that I do not hit it when running on a non root PREFIXed build)

@long-entryrdn was not open by dbmdb_open_all_files
this leaded to failure when trying to open it in a read operation
because at dblayer level, it is not possible to open write txn
within a read txn - and although it is possible at lmdb level,
the new file will not be visible within the read txn
but we may need to access it.
So the open failed, and entryrdn attrinfo should then be released
before returning an error to avoid keeping entryrdn busy.
That is what trigger the hang when removing a backend.

Added some conditionnal debug code to understand why the server hang.

Also added a missing dblayer_release_index_file in vlvIndex_checkforindex
that may be the reason while there is a hang when removing vlv on bdb.
@progier389 progier389 removed the work in progress Work in Progress - can be reviewed, but not ready for merge. label Apr 2, 2024
Copy link
Member

@droideck droideck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the minor issues, looks good!

inst->inst_parent_dir_name,
inst->inst_dir_name);
struct ldbminfo *li = inst->inst_li;
char *fname = slapi_ch_smprintf("%s/.import_%s", li->li_directory, inst->inst_name);
slapi_log_err(SLAPI_LOG_DEBUG, "bdb__import_file_name", "DBG: fname=%s\n", fname);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not related exactly to this PR, but it still caught my eye. Do you know of the double underscore (bdb__) is intentional in this function name or just a typo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no name conflict so IMHO it is a typo, I will change it.

log.info(f'Adding {users_num} users')
for i in range(0, users_num):
uid = STARTING_UID_INDEX + i
add_user(inst, users, uid)


def create_vlv_search_and_index(inst, hyphen='', basedn=DEFAULT_SUFFIX):
def create_vlv_search_and_index(inst, basedn=DEFAULT_SUFFIX, bename='userRoot', scope=2, prefix="vlv"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For better readability, I think we should use ldap.LDAP_SCOPE_SUBTREE for the scope here, instead of magic number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will use ldap.SCOPE_SUBTREE instead of 2

# Configure vlv
vlv_search, vlv_index = create_vlv_search_and_index(
inst, basedn=suffix,
bename=bename, scope=1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here regarding the scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, will use ldap.SCOPE_ONELEVEL instead of 1

@jchapma
Copy link
Contributor

jchapma commented Apr 4, 2024

I reviewed the C files and it more or less makes sense to me. Re the CI test, it must be the attribute info db layer count that is not being decremented, surely...

Ack

@progier389 progier389 merged commit cc3a864 into 389ds:main Apr 5, 2024
Copy link
Contributor

@tbordaz tbordaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
few questions

@@ -456,6 +456,7 @@ dbmdb_open_all_files(dbmdb_ctx_t *ctx, backend *be)
TST(add_dbi(&octx, be, special_names[i], ctxflags));
sn_dbis[i] = octx.dbi;
}
TST(add_dbi(&octx, be, LDBM_LONG_ENTRYRDN_STR, ctxflags));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why having not added LDBM_LONG_ENTRYRDN_STR in special_names ?

entry_address addr;
addr.sdn = p->vlv_base;
addr.uniqueid = NULL;
e = find_entry(pb, be, &addr, txn, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the base entry does not exist at that time, it will stay vlv_initialized=0. That looks fine to me. Does it require documentation ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting point. IMHO the right thing to do would be to check the uninitialized vlv when adding an entry ...

@@ -776,6 +776,7 @@ vlvIndex_checkforindex(struct vlvIndex *p, backend *be)
/* In lmdb case, always open the dbi */
if (li->li_flags & LI_LMDB_IMPL) {
(void) dblayer_get_index_file(be, p->vlv_attrinfo, &db, 0) ;
dblayer_release_index_file(be, p->vlv_attrinfo, db);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks open/close. What is it for ?

Copy link
Contributor Author

@progier389 progier389 Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dblayer_release_index_file decrease a ref count to tell that the index is no more used.
On bdb it could lead to db instance closure but that is not the case with lmdb (db instances are not closed until closing the whole db environment (i.e at shutdown time) )
So the open/release partern just create/open the database instance and keep it open (but not busy).
Having it open is important because it cannot be open when needed by a search operation:
because there is an already open read txn and there is no way we can open a new file within that txn:
We can open the file with a new write txn, but in that case the previously created read txn has no visibility of the new file.
The big rule to remember with LMDB is is that ALL DATABASE INSTANCES MUST BE OPEN when bringing a backend on-line. The only case we could open/create db instance is when changing the config.

jchapma pushed a commit that referenced this pull request Apr 10, 2024
* Issue 6136 - failure in freeipa tests
Several issue detected when adding a CI test that mimic one of freeipa nightly test :

bdb - offline import fail when trying to create the guardian file because instance is not yet fully initialized and the generated path is wrong - fixed by using the directory from ldbminfo and the instance names that are defined.

mdb - vlv index are not generated because for one level scoped vlv, the entryid is not properly set.
should use vlv_grok_new_import_entry to reset the vlv filter when the base entry is added (as it is done in bdb).
also added a function to mark the vlv_grok_new_import_entry as uninitialized before the import

mdb- crash while trying to import an entry without parent (i.e a suffix entry) that does not belong to the backend
fixed by avoiding the null pointer exception in that case

Issue: #6136

Reviewed by: @droideck, @jchapma (Thanks!)

* Fix vlv CI test deadlock

@long-entryrdn was not open by dbmdb_open_all_files
this leaded to failure when trying to open it in a read operation
because at dblayer level, it is not possible to open write txn
within a read txn - and although it is possible at lmdb level,
the new file will not be visible within the read txn
but we may need to access it.
So the open failed, and entryrdn attrinfo should then be released
before returning an error to avoid keeping entryrdn busy.
That is what trigger the hang when removing a backend.

Added some conditionnal debug code to understand why the server hang.

Also added a missing dblayer_release_index_file in vlvIndex_checkforindex
that may be the reason while there is a hang when removing vlv on bdb.

* Issue 6136 - failure in freeipa tests - Fix review comments
progier389 added a commit that referenced this pull request May 30, 2024
* Issue 6136 - failure in freeipa tests
Several issue detected when adding a CI test that mimic one of freeipa nightly test :

bdb - offline import fail when trying to create the guardian file because instance is not yet fully initialized and the generated path is wrong - fixed by using the directory from ldbminfo and the instance names that are defined.

mdb - vlv index are not generated because for one level scoped vlv, the entryid is not properly set.
should use vlv_grok_new_import_entry to reset the vlv filter when the base entry is added (as it is done in bdb).
also added a function to mark the vlv_grok_new_import_entry as uninitialized before the import

mdb- crash while trying to import an entry without parent (i.e a suffix entry) that does not belong to the backend
fixed by avoiding the null pointer exception in that case

Issue: #6136

Reviewed by: @droideck, @jchapma (Thanks!)

* Fix vlv CI test deadlock

@long-entryrdn was not open by dbmdb_open_all_files
this leaded to failure when trying to open it in a read operation
because at dblayer level, it is not possible to open write txn
within a read txn - and although it is possible at lmdb level,
the new file will not be visible within the read txn
but we may need to access it.
So the open failed, and entryrdn attrinfo should then be released
before returning an error to avoid keeping entryrdn busy.
That is what trigger the hang when removing a backend.

Added some conditionnal debug code to understand why the server hang.

Also added a missing dblayer_release_index_file in vlvIndex_checkforindex
that may be the reason while there is a hang when removing vlv on bdb.

* Issue 6136 - failure in freeipa tests - Fix review comments

(cherry picked from commit cc3a864)
@progier389 progier389 deleted the i6136 branch May 20, 2025 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

failure in freeipa test
4 participants