RFE - Monitor the current DB locks ( nsslapd-db-current-locks ) #4623

Closed
droideck opened this issue Feb 15, 2021 · 3 comments · Fixed by #4762
Assignees: droideck
Labels: In JIRA (ticket is in JIRA), priority_high (need urgent fix / highly valuable / easy to fix)
Milestone: 1.4.3

Comments

@droideck
Member

Is your feature request related to a problem? Please describe.
DB locks get exhausted because of unindexed internal searches (under a transaction). Indexing the attributes used by those searches is the way to prevent exhaustion.

Describe the solution you'd like
To prevent DB lock exhaustion and help administrators, possible solutions would be:

  • If DB locks get exhausted during a txn, it leads to a DB panic, and the later recovery can fail. That leads to a full reinit of the instance where the DB locks got exhausted. The server should monitor the DB locks and trigger a server shutdown (similar to disk full) if the DB locks are close to being exhausted. Because of the performance impact, the monitoring should be limited to unindexed (allid(candidate)) internal searches under a txn, and should run periodically (after every 1000 evaluated candidates). The unindexed state should be flagged in ldbm_back_search, the transaction can be tested with pblock (SLAPI_TXN), the monitoring should be done in iterate, and an internal op is identified by the operation flag OP_FLAG_INTERNAL.

  • To help with indexing the appropriate attributes, an unindexed internal search (under a txn) should log a warning with the search filter.

  • A config parameter should toggle monitoring/shutdown. By default it should be enabled.

  • Monitoring returns a value that may not be exact. The threshold that triggers the shutdown should take into account that the value is imperfect.
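
The periodic check proposed above could be sketched roughly as follows. This is an illustrative Python sketch only, not the server's code (389-ds-base is written in C): `iterate_candidates`, `LockStats`, `LockExhaustionRisk` and the `read_stats` callback are hypothetical names, and the boolean arguments stand in for the unindexed flag, the SLAPI_TXN test and OP_FLAG_INTERNAL.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

CHECK_EVERY = 1000  # candidates evaluated between two lock checks (from the proposal)


class LockExhaustionRisk(RuntimeError):
    """Raised when lock usage reaches the safety threshold (hypothetical)."""


@dataclass
class LockStats:
    current: int  # locks currently in use; may be slightly stale
    maximum: int  # configured maximum number of locks


def iterate_candidates(
    candidates: Iterable[int],
    read_stats: Callable[[], LockStats],
    *,
    unindexed: bool,
    in_txn: bool,
    internal: bool,
    threshold_pct: int = 90,
) -> Iterator[int]:
    """Yield candidates, checking lock usage every CHECK_EVERY candidates.

    The check only runs for unindexed internal searches under a transaction,
    which keeps the cost away from ordinary searches.
    """
    monitor = unindexed and in_txn and internal
    for count, candidate in enumerate(candidates, start=1):
        if monitor and count % CHECK_EVERY == 0:
            stats = read_stats()
            # The threshold sits below 100% because the reading is not exact.
            if stats.current * 100 >= threshold_pct * stats.maximum:
                raise LockExhaustionRisk(
                    f"{stats.current}/{stats.maximum} DB locks in use"
                )
        yield candidate
```

In this sketch the caller decides what to do with `LockExhaustionRisk` (abort the operation, or start a shutdown as the proposal suggests); the sketch itself only detects the condition.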

@droideck droideck added the needs triage The issue will be triaged during scrum label Feb 15, 2021
@mreynolds389 mreynolds389 removed the needs triage The issue will be triaged during scrum label Feb 18, 2021
@mreynolds389 mreynolds389 added this to the 1.4.3 milestone Feb 18, 2021
@tbordaz tbordaz added priority_high need urgent fix / highly valuable / easy to fix In JIRA ticket is in JIRA labels Mar 25, 2021
@droideck droideck self-assigned this Apr 9, 2021
@droideck
Member Author

droideck commented Apr 9, 2021

droideck added a commit that referenced this issue May 20, 2021
* Issue 4623 - RFE - Monitor the current DB locks

Description: DB locks get exhausted because of unindexed internal searches
(under a transaction). Indexing those searches is the way to prevent exhaustion.
If DB locks get exhausted during a txn, it leads to a DB panic, and the later
recovery can fail. That leads to a full reinit of the instance where the DB locks
got exhausted.

Add three attributes to global BDB config: "nsslapd-db-locks-monitoring-enabled",
 "nsslapd-db-locks-monitoring-threshold" and "nsslapd-db-locks-monitoring-pause".
By default, nsslapd-db-locks-monitoring-enabled is turned on, nsslapd-db-locks-monitoring-threshold is set to 90% and nsslapd-db-locks-monitoring-pause is 500ms.

When the number of current locks reaches the threshold (90% of the maximum
number of locks by default), returning the next candidate fails until the
maximum number of locks is increased or current locks are released.
The monitoring thread runs at a configurable interval (500ms by default).

Add the settings to the UI and CLI tools.

Fixes: #4623

Reviewed by: @Firstyear, @tbordaz, @jchapma, @mreynolds389 (Thank you!!)
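
The behavior described in this commit message (a monitoring thread that wakes up at the pause interval, and candidate iteration that fails while the threshold is exceeded) could be modeled roughly as below. This is a hedged Python sketch for illustration, not the C implementation in the BDB backend; `LockMonitor`, `next_candidate` and the `read_usage` callback are hypothetical names, and only the defaults (90% and 500ms) come from the commit message.

```python
import threading
from typing import Callable, Iterator, Tuple


class LockMonitor:
    """Hypothetical model of a background DB lock monitor."""

    def __init__(
        self,
        read_usage: Callable[[], Tuple[int, int]],  # returns (current, maximum)
        threshold_pct: int = 90,  # default of nsslapd-db-locks-monitoring-threshold
        pause_ms: int = 500,      # default of nsslapd-db-locks-monitoring-pause
    ) -> None:
        self._read_usage = read_usage
        self._threshold_pct = threshold_pct
        self._pause_s = pause_ms / 1000.0
        self._over = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def check_once(self) -> None:
        """Sample lock usage and update the threshold flag."""
        current, maximum = self._read_usage()
        if current * 100 >= self._threshold_pct * maximum:
            self._over.set()
        else:
            self._over.clear()

    def _run(self) -> None:
        while not self._stop.is_set():
            self.check_once()
            self._stop.wait(self._pause_s)  # sleep, but wake early on stop()

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    @property
    def threshold_reached(self) -> bool:
        return self._over.is_set()


def next_candidate(candidates: Iterator[int], monitor: LockMonitor) -> int:
    """Return the next candidate, or fail while the threshold is exceeded."""
    if monitor.threshold_reached:
        raise RuntimeError("DB lock threshold reached; next candidate refused")
    return next(candidates)
```

The search path here only reads a flag, so the cost of sampling lock usage stays in the monitoring thread; once usage drops below the threshold or the maximum is raised, the next sample clears the flag and iteration can continue.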
droideck added a commit that referenced this issue May 26, 2021
droideck added a commit that referenced this issue May 26, 2021
@droideck
Member Author

e05afab..a69c215 389-ds-base-1.4.3 -> 389-ds-base-1.4.3
50606d8..bba519c 389-ds-base-1.4.4 -> 389-ds-base-1.4.4

@droideck
Member Author

Related issue: #4803

bsimonova added a commit to bsimonova/389-ds-base that referenced this issue Aug 3, 2021
…locks )

Description:
Added additional tests for DB locks monitoring to check if invalid
values are correctly rejected for nsslapd-db-locks and
nsslapd-db-locks-monitoring-threshold.

Relates: 389ds#4623

Reviewed by: droideck (Thanks!)
bsimonova added a commit that referenced this issue Aug 4, 2021