Issue 6057 - vlv search may result wrong result with lmdb #6091

progier389 · 2024-02-13T14:42:00Z

Several issues related to vlv indexes and import/bulk import:

1. vlv sub database was not opened when the backend was started.
2. vlv index was not cleaned by import/bulk import.
3. vlv index was not rebuilt by import/bulk import.
4. vlv index was not rebuilt by an explicit vlv reindex.
5. vlv index was not rebuilt by an explicit vlv reindex if the vlv name contains a hyphen.
6. vlv index was not rebuilt if the basedn is not the suffix.

In fact, all these issues had the same cause: the backend vlv search list is empty after the server gets restarted.

Solution:
[For 1, 2 and 3] Fix the test_vlv_cache_subdb_names test to ensure that vlv indexes are properly cleaned
and recreated by a bulk import.
Initialize the vlv search list, if it is not yet initialized, when starting an instance (just before opening
all the sub databases associated with the backend) rather than doing it before restarting the instance after the import.
[For 4] Add a new member for vlv in the import context and handle it properly.
[For 5] Convert the vlv name to a dbname and store it in a separate list, then compare the dbname when checking whether the vlv is reindexed.
[For 6] Rebuild the proper entry dn (in case of reindex) to be able to evaluate the vlv scope.
To rebuild the dn, I used the entry_info data (stored in a temporary database) that contains the rdn, nrdn
and ancestor IDs (used to rebuild the entryrdn index) and that now also stores the dn, which is simply
propagated by adding the entry rdn to the parent entry dn.
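
Illustration for item 5: a minimal sketch of why comparing database names rather than raw vlv names matters when the name contains a hyphen. The conversion rule used here (keep only alphanumeric characters, lowercase them, prepend `vlv#`) is an assumption made for the example, and `vlv_name_to_dbname` and `dbname_in_list` are hypothetical helpers, not functions from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: derive a database name from a vlv index name.
 * ASSUMPTION: only alphanumeric characters are kept (lowercased) and the
 * result is prefixed with "vlv#". The caller frees the returned string. */
static char *
vlv_name_to_dbname(const char *vlv_name)
{
    static const char prefix[] = "vlv#";
    const size_t plen = sizeof(prefix) - 1;
    char *dbname = NULL;
    char *out = NULL;

    assert(vlv_name != NULL);
    /* The result can never be longer than prefix + name + '\0'. */
    dbname = malloc(plen + strlen(vlv_name) + 1);
    if (dbname == NULL) {
        return NULL;
    }
    memcpy(dbname, prefix, plen);
    out = dbname + plen;
    for (const char *p = vlv_name; *p; p++) {
        if (isalnum((unsigned char)*p)) {
            *out++ = (char)tolower((unsigned char)*p);
        }
    }
    *out = '\0';
    return dbname;
}

/* Hypothetical helper: is dbname present in a NULL-terminated list? */
static int
dbname_in_list(const char *dbname, char *const *list)
{
    if (dbname == NULL || list == NULL) {
        return 0;
    }
    for (size_t i = 0; list[i] != NULL; i++) {
        if (strcmp(dbname, list[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Under that assumption, a requested name such as `my-vlv` never matches the stored index name unless both sides are first converted to the same dbname form.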

Issue: #6057

Reviewed by: @tbordaz , @droideck (Thanks!)

progier389 · 2024-02-16T19:29:43Z

A bit tedious, but this time the freeipa tests pass ...

progier389 · 2024-02-19T13:43:28Z

But unfortunately the CI tests show regressions in both LMDB and BDB 😞
On bdb it is likely an issue in the CI tests, as only lmdb code is impacted by this fix.

progier389 · 2024-02-20T11:15:13Z

Fixed the LMDB regression: entries were missing because the ancestorid index was corrupted. (I forgot to realign a part of the code that was also using the entry_info record.)
About BDB, one of the failures is due to an unrelated regression that impacts the main branch
(some weird messages about paged search timeout are logged, and the presence of these error messages
causes failures in some tests).
The other one is due to #6029
(solved by skipping test_vlv_reindex and test_vlv_cache_subdb_names when using bdb).

progier389 · 2024-02-21T10:47:31Z

This time the CI tests look good (FYI: the import tests fail because of Issue #6103)

jchapma · 2024-02-21T13:41:30Z

> This time the CI tests look good (FYI: the import tests fail because of Issue #6103)

Fixed

tbordaz

The overall fix is difficult to review without a deep dive into vlv indexes. I only looked at cosmetic and local logic issues, without understanding the complete picture.
LGTM

tbordaz · 2024-02-22T18:07:44Z

ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c

@@ -3111,7 +3155,8 @@ process_vlv_index(backentry *ep, ImportWorkerInfo *info)
        struct vlvIndex *vlv_index = ps->vlv_index;
        Slapi_PBlock *pb = slapi_pblock_new();
        slapi_pblock_set(pb, SLAPI_BACKEND, be);
-        if (vlv_index && (ctx->indexAttrs==NULL || attr_in_list(vlv_index->vlv_name, ctx->indexAttrs))) {
+        if (vlv_index && vlv_index->vlv_attrinfo &&
+            is_reindexed_attr(vlv_index->vlv_attrinfo->ai_type , ctx, ctx->indexVlvs)) {


Any reason for relying on vlv_index->vlv_attrinfo->ai_type rather than vlv_index->vlv_name?

Yes (BTW: I fell into that pitfall too! I first used vlvName, then got into trouble) 😉
That is because in truncate_index_dbi (which deletes the vlv indexes that are reindexed) the vlv_index is not available, only the ai.

tbordaz · 2024-02-22T18:34:32Z

ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c

+            memcpy(dn, rdn, rdnlen);
+            if (pinfo[INFO_IDX_DN_LEN]) {
+                dn[rdnlen] = ',';
+                memcpy(dn+rdnlen+1, INFO_DN(pinfo), pinfo[INFO_IDX_DN_LEN]);


The code is quite difficult to understand, so sorry in advance if my question is confusing.
A few lines above you computed 'dnlen' with +1, and I understood it was for '\0'. Here we are adding ',' between 'rdn' and '<rest_of_dn>'. I am wondering whether that added ',' was already accounted for in dnlen.

Both the space for ',' and the final \0 are accounted for in dnlen:
dnlen = rdnlen + 1 + pinfo[4]; /* dn len (including final \0) */
In other words: strlen(rdn) + 1 (for the comma) + parent dn len (including the \0)

BTW there is a minor glitch: it should be INFO_IDX_DN_LEN instead of 4:
dnlen = rdnlen + 1 + pinfo[INFO_IDX_DN_LEN];
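
The length accounting above can be sketched in isolation. This is a minimal illustration, not the code from the patch: `build_dn` is a hypothetical helper, `parent_dn_size` plays the role of `pinfo[INFO_IDX_DN_LEN]` (the parent dn length including its final `\0`), and the handling of an entry without a parent dn is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "rdn,parent_dn" in a freshly allocated buffer.
 * parent_dn_size is the parent dn length INCLUDING its final '\0';
 * 0 means there is no parent dn (ASSUMPTION: the dn is then just the rdn).
 * *dnlen_out receives the dn length including the final '\0'. */
static char *
build_dn(const char *rdn, const char *parent_dn, size_t parent_dn_size, size_t *dnlen_out)
{
    size_t rdnlen = 0;
    size_t dnlen = 0;
    char *dn = NULL;

    assert(rdn != NULL && dnlen_out != NULL);
    rdnlen = strlen(rdn);
    if (parent_dn_size) {
        /* rdn + 1 (for the comma) + parent dn (its '\0' is already counted) */
        dnlen = rdnlen + 1 + parent_dn_size;
    } else {
        /* rdn + 1 (for the final '\0') */
        dnlen = rdnlen + 1;
    }
    dn = malloc(dnlen);
    if (dn == NULL) {
        return NULL;
    }
    memcpy(dn, rdn, rdnlen);
    if (parent_dn_size) {
        dn[rdnlen] = ',';
        /* Copies the parent's final '\0' as well, so dn ends up terminated. */
        memcpy(dn + rdnlen + 1, parent_dn, parent_dn_size);
    } else {
        dn[rdnlen] = '\0';
    }
    *dnlen_out = dnlen;
    return dn;
}
```

Because the stored parent length already includes the terminator, the single `+ 1` pays for the comma and no extra byte is needed for the final `\0`.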

tbordaz · 2024-02-22T18:45:59Z

ldap/servers/slapd/back-ldbm/db-mdb/mdb_import.c

-    }
-    return 0;
-}
-
 /* vlv_getindices callback that truncate vlv index (in reindex case) */
 static int
 truncate_index_dbi(struct attrinfo *ai, ImportCtx_t *ctx)


The function truncate_index_dbi looks specific to vlv. Would it make sense to rename it so that it has 'vlv' in its name?

droideck

I see that dirsrvtests/tests/suites/import/import_test.py::test_entry_with_escaped_characters_fails_to_import_and_index fails (but it doesn't fail on main). Could it be related?

progier389 · 2024-02-22T19:33:58Z

FYI: About the test_entry_with_escaped_characters_fails_to_import_and_index test: a look at the error log shows that it is #6103, which James fixed yesterday, but I have not yet rebased the branch on that fix. ==> The problem should be fixed by the merge.

progier389 · 2024-02-22T20:22:02Z

Fixed the minor glitch and rebased

droideck

Some minor comments and questions, but besides that - looks good! Ack

droideck · 2024-02-22T21:38:49Z

dirsrvtests/tests/suites/vlv/regression_test.py

-    log.info(f'Adding {users_num} users')
-    for i in range(0, users_num):
-        uid = STARTING_UID_INDEX + i
+def add_an_user(inst, users, uid):


Minor nitpick - the correct grammar would be add_a_user. But add_user would probably be an even better function name here.

droideck · 2024-02-22T21:53:52Z

ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c

+            dnlen = rdnlen + 1 + pinfo[INFO_IDX_DN_LEN];  /* dn len (including final \0) */
+        }
+
+        len = rdnlen + strlen(nrdn) + 2 + dnlen + (INFO_IDX_ANCESTORS + 1 + pinfo[INFO_IDX_NB_ANCESTORS]) * sizeof(ID);


It's a very minor thing, but why not calculate strlen(nrdn) once at the beginning, the same way as it's done for rdnlen? :)

I agree: it is cleaner. I added rdnlen because I needed it to compute the dn size, but I did not need nrdnlen for the dn size computation, so I just did not think about it ...

droideck · 2024-02-22T21:58:57Z

ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c

+#define INFO_NRDN(info)         ((char*)(&((ID*)(info))[INFO_IDX_ANCESTORS+((ID*)(info))[INFO_IDX_NB_ANCESTORS]]))
+#define INFO_RDN(info)          (INFO_NRDN(info)+((ID*)(info))[INFO_IDX_NRDN_LEN])
+#define INFO_DN(info)           (INFO_RDN(info)+((ID*)(info))[INFO_IDX_RDN_LEN])
+#define INFO_RECORD_LEN(info)   ((INFO_DN(info)-(char*)(info))+(info)[INFO_IDX_DN_LEN])


What is INFO_RECORD_LEN? I don't see it used anywhere (or mentioned in the comment above).

It looks like it's just the length of the record data, but it would be nice to describe it.
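
To make the record layout implied by these macros concrete, here is a small self-contained sketch. The four accessor macros are copied from the hunk above; everything else is an assumption for illustration: the `ID` typedef, the header slot values (only `INFO_IDX_DN_LEN == 4` is attested in this thread), the `INFO_IDX_ENTRY_ID` name, and the `make_info` helper. It assumes that each stored string length includes the final `\0`, which is what the macro arithmetic suggests.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int ID; /* ASSUMPTION: stand-in for the server's ID type */

/* ASSUMPTION: header slot numbering; only INFO_IDX_DN_LEN == 4 is attested. */
enum {
    INFO_IDX_ENTRY_ID = 0,     /* hypothetical name */
    INFO_IDX_NB_ANCESTORS = 1,
    INFO_IDX_NRDN_LEN = 2,
    INFO_IDX_RDN_LEN = 3,
    INFO_IDX_DN_LEN = 4,
    INFO_IDX_ANCESTORS = 5     /* first ancestor ID; strings follow the IDs */
};

/* Accessor macros as shown in the diff hunk. */
#define INFO_NRDN(info)         ((char*)(&((ID*)(info))[INFO_IDX_ANCESTORS+((ID*)(info))[INFO_IDX_NB_ANCESTORS]]))
#define INFO_RDN(info)          (INFO_NRDN(info)+((ID*)(info))[INFO_IDX_NRDN_LEN])
#define INFO_DN(info)           (INFO_RDN(info)+((ID*)(info))[INFO_IDX_RDN_LEN])
#define INFO_RECORD_LEN(info)   ((INFO_DN(info)-(char*)(info))+(info)[INFO_IDX_DN_LEN])

/* Hypothetical builder: header IDs, ancestor IDs, then nrdn, rdn and dn,
 * each stored with its final '\0'. The caller frees the record. */
static ID *
make_info(ID id, const ID *ancestors, ID nb_ancestors,
          const char *nrdn, const char *rdn, const char *dn)
{
    size_t nrdnlen = 0, rdnlen = 0, dnlen = 0, len = 0;
    ID *info = NULL;

    assert(nrdn != NULL && rdn != NULL && dn != NULL);
    nrdnlen = strlen(nrdn) + 1;
    rdnlen = strlen(rdn) + 1;
    dnlen = strlen(dn) + 1;
    len = (INFO_IDX_ANCESTORS + nb_ancestors) * sizeof(ID) + nrdnlen + rdnlen + dnlen;
    info = calloc(1, len);
    if (info == NULL) {
        return NULL;
    }
    info[INFO_IDX_ENTRY_ID] = id;
    info[INFO_IDX_NB_ANCESTORS] = nb_ancestors;
    info[INFO_IDX_NRDN_LEN] = (ID)nrdnlen;
    info[INFO_IDX_RDN_LEN] = (ID)rdnlen;
    info[INFO_IDX_DN_LEN] = (ID)dnlen;
    if (nb_ancestors) {
        memcpy(&info[INFO_IDX_ANCESTORS], ancestors, nb_ancestors * sizeof(ID));
    }
    /* The length slots are set, so the macros now resolve to valid offsets. */
    memcpy(INFO_NRDN(info), nrdn, nrdnlen);
    memcpy(INFO_RDN(info), rdn, rdnlen);
    memcpy(INFO_DN(info), dn, dnlen);
    return info;
}
```

In this layout `INFO_RECORD_LEN` is the offset of the dn string plus its stored length, that is, the total size in bytes of the record.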

droideck · 2024-02-22T22:06:12Z

ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c

@@ -3111,7 +3155,8 @@ process_vlv_index(backentry *ep, ImportWorkerInfo *info)
        struct vlvIndex *vlv_index = ps->vlv_index;
        Slapi_PBlock *pb = slapi_pblock_new();
        slapi_pblock_set(pb, SLAPI_BACKEND, be);
-        if (vlv_index && (ctx->indexAttrs==NULL || attr_in_list(vlv_index->vlv_name, ctx->indexAttrs))) {
+        if (vlv_index && vlv_index->vlv_attrinfo &&
+            is_reindexed_attr(vlv_index->vlv_attrinfo->ai_type , ctx, ctx->indexVlvs)) {


It's not related to this change, but it would be nice to fix/clarify. In this function (just a few lines below) there's an error message: slapi_log_err(SLAPI_LOG_ERR, "process_regular_index", "index_addordel_values_ext_sv failed.\n");.

I think it could be a typo, as it has the wrong function name.

Indeed! looks like a cut and paste issue! 😉

droideck · 2024-02-22T22:13:46Z

ldap/servers/slapd/back-ldbm/db-mdb/mdb_import_threads.c

-                slapi_task_log_notice(job->task, "%s: Indexing attribute: %s", job->inst->inst_name, mii->name);
+            if (ii->ai->ai_indexmask == INDEX_VLV) {
+                if (job->task) {
+                    slapi_task_log_notice(job->task, "%s: Indexing VLV: %s", job->inst->inst_name, mii->name);


Out of curiosity, where is mii freed? I see one if -

if (a->offset_dbi) { *(MdbIndexInfo_t**)(((char*)ctx)+a->offset_dbi) = mii; }

But if a->offset_dbi is not set - what then?

mii is stored in an avl hashtable so I think it is freed by: avl_free(ctx->indexes, (IFP) free_ii);

But it's put there only in the if (a->offset_dbi) { case, right?

Anyway, I am probably missing something, as I'm not sure I fully understand what really happens here with ii, ctx->job and how we build the index list in general. I guess I need to dive deeper here :)

No: it is always stored in ctx->indexes map (last line of the function is: avl_insert(&ctx->indexes, mii, cmp_mii, NULL); )

@droideck

Issue 6057 - vlv search may result wrong result with lmdb - Fix 2

Issue i6057 - Fix2 - Fix review comment

Previous fix is failing after a restart because of a chicken and egg issue related to vlv_init and backend initialization: vlv_init requires that the backend get initialized to be able to generate the vlvSearch struct.
Because of deadlocks, and to be able to roll back the database instance open transaction, I found it easier to avoid using vlv_getindices if vlv is not initialized, and rather perform a search on cn=config to build a list of all possible vlv index filenames (ignoring the configuration errors) and use that list to open the database files for vlv indices and their cache.

Also fixed some minor issues:
@droideck minor remarks done about #6091 after the merge
a typo while logging info about the database environment parameters

Issue: #6057

Reviewed by: @tbordaz, @droideck , @mreynolds389 (Thanks!)

progier389 self-assigned this Feb 13, 2024

progier389 added this to the 3.0.0 milestone Feb 13, 2024

progier389 linked an issue Feb 13, 2024 that may be closed by this pull request

vlv search may result wrong result with lmdb #6057

Open

progier389 added the "work in progress" label (Work in Progress - can be reviewed, but not ready for merge.) Feb 13, 2024

progier389 force-pushed the i6057 branch from 8123bd2 to 0a29361 February 16, 2024 19:27

progier389 removed the "work in progress" label Feb 16, 2024

progier389 added the "work in progress" label Feb 19, 2024

progier389 force-pushed the i6057 branch from 0a29361 to a42141f February 20, 2024 10:40

progier389 force-pushed the i6057 branch from a42141f to a665b74 February 20, 2024 14:54

progier389 removed the "work in progress" label Feb 21, 2024

tbordaz approved these changes Feb 22, 2024

droideck reviewed Feb 22, 2024

progier389 added 4 commits February 22, 2024 21:20

Issue 6057 - vlv search may result wrong result with lmdb

67c4142

Issue 6057 - Reindex fails to rebuild VLV

0c159db

Issue 6057 - Reindex fails to rebuild VLV if basedn is not the suffix

1256c73

Issue 6057 - Fix a minor glitch

b531fd4

progier389 force-pushed the i6057 branch from 17b39cd to b531fd4 Compare February 22, 2024 20:20

progier389 merged commit a380deb into 389ds:main Feb 22, 2024
193 of 195 checks passed

droideck approved these changes Feb 22, 2024

progier389 mentioned this pull request Mar 12, 2024

Issue 6057 - vlv search may result wrong result with lmdb - Fix 2 #6121

Merged
