RFE: improve the performance of evaluation of filter component when tested against a large valueset (like group members) #6172

tbordaz · 2024-05-15T09:14:28Z

Is your feature request related to a problem? Please describe.
Before returning an entry (to a SRCH), the server checks that the SRCH filter matches the entry. If a filter component tests a large valueset in the candidate entry, checking the match is expensive.
A typical case is a component like '(uniquemember=foo)' when the candidate entry is a group containing thousands of 'uniquemember' values.

Steps to reproduce the behavior:

Create 10000 groups and 10000 users (including a user 'foo').
Make every group contain all users.
Run a search with the filter "(&(objectclass=groupofuniquenames)(uniqueMember=foo))".
The response time is long because of cache priming.
Run the same search again with the filter "(&(objectclass=groupofuniquenames)(uniqueMember=foo))"; the response time is better but still slow.

Once the cache is primed, the second search should be much faster.
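
The data set described in the steps above could be produced with a small LDIF generator. This is only a sketch: the suffix `dc=example,dc=com`, the `ou=people` and `ou=groups` containers, the entry names, and the function `build_ldif` are all assumptions made for illustration, not part of the original report. It assumes the suffix and both containers already exist on the server.

```python
def build_ldif(n_users, n_groups, suffix="dc=example,dc=com"):
    """Return LDIF text for n_users users and n_groups groups.

    Every group lists every user as a uniqueMember. All DNs and the
    suffix are hypothetical values chosen for this sketch.
    """
    user_dns = [f"uid=user{i},ou=people,{suffix}" for i in range(n_users)]
    if user_dns:
        # The report mentions a user 'foo'; here it replaces the first user.
        user_dns[0] = f"uid=foo,ou=people,{suffix}"

    blocks = []
    for dn in user_dns:
        uid = dn.split(",", 1)[0].split("=", 1)[1]
        blocks.append("\n".join([
            f"dn: {dn}",
            "objectClass: top",
            "objectClass: person",
            "objectClass: organizationalPerson",
            "objectClass: inetOrgPerson",
            f"uid: {uid}",
            f"cn: {uid}",
            f"sn: {uid}",
        ]))

    member_lines = [f"uniqueMember: {dn}" for dn in user_dns]
    for g in range(n_groups):
        lines = [
            f"dn: cn=group{g},ou=groups,{suffix}",
            "objectClass: top",
            "objectClass: groupOfUniqueNames",
            f"cn: group{g}",
        ]
        lines.extend(member_lines)
        blocks.append("\n".join(lines))

    # LDIF entries are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Small sizes for a quick check; the report uses 10000 x 10000.
    print(build_ldif(3, 2))
```

At the sizes in the report the output holds on the order of 10^8 `uniqueMember` lines, so writing it to a file in chunks rather than building one string would probably be needed. With this layout the filter value would be the full DN of 'foo' rather than the shorthand `foo`.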

Describe the solution you'd like
Add an option to use the sorted valueset when checking the match (ava test).


Additional context
I initially thought that the performance issue was related to the filter bypass on indexed searches (#6030), but I doubt that the bypass is always possible, so I prefer to make the match test itself faster.

…ponent when tested against a large valueset (like group members)

Bug description:
Before returning an entry (to a SRCH), the server checks that the entry matches the SRCH filter. If an equality filter component tests the value (ava) against a large valueset (like uniquemember values), the check takes a long time because of the large number of values and the required normalization of those values. This can be improved by taking advantage of the sorted valueset. Sorted valuesets were created to speed up updates of large valuesets (groups) but are not used in the SRCH path.

Fix description:
If the config param 'nsslapd-filter-match-use-sorted-vs = on', the server uses slapi_valueset_find (which tries to use the sorted valueset) rather than plugin_call_syntax_filter_ava.
In both cases (sorted valueset and plugin_call_syntax_filter_ava), the ava and the values are normalized. With the sorted valueset, the values were already normalized when their index was inserted into the sorted array, and the comparison is then done on normalized values. In plugin_call_syntax_filter_ava, all values in the valuearray (of the valueset) are normalized before comparison.
This optimization should likely be dropped for extended search.

relates: 389ds#6172

Reviewed by:

…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
Before returning an entry (to a SRCH), the server checks that the entry matches the SRCH filter. If an equality filter component tests the value (ava) against a large valueset (like uniquemember values), the check takes a long time because of the large number of values and the required normalization of those values. This can be improved by taking advantage of the sorted valueset. Sorted valuesets were created to speed up updates of large valuesets (groups), but at that time they were not used in the SRCH path.

Fix description:
For an LDAP_FILTER_EQUALITY component, the server can benefit from the sorted valuearray. To limit the risk of regression, the sorted valuearray is used only for DN syntax attributes, since it was designed for that type of attribute. With those two limitations, there is no need for a toggle, and the call to plugin_call_syntax_filter_ava can be replaced by a call to slapi_valueset_find.
In both cases (sorted valueset and plugin_call_syntax_filter_ava), the ava and the values are normalized. With the sorted valueset, the values were already normalized when their index was inserted into the sorted array, and the comparison is then done on normalized values. In plugin_call_syntax_filter_ava, all values in the valuearray (of the valueset) are normalized before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
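
The difference between the two match paths described in the fix can be illustrated with a toy model. This is a sketch only, not the server code: `normalize_dn`, `match_linear`, and `SortedValueSet` are hypothetical names, the normalization is a deliberately simplified stand-in for real DN normalization, and the sketch stores normalized keys directly where the commit message describes a sorted array of indexes into the value array.

```python
import bisect


def normalize_dn(dn):
    """Toy DN normalization (assumption): lowercase, trim around ',' and '='.

    Real DN normalization handles escaping and much more.
    """
    parts = []
    for rdn in dn.split(","):
        attr, _, value = rdn.partition("=")
        parts.append(f"{attr.strip().lower()}={value.strip().lower()}")
    return ",".join(parts)


def match_linear(values, assertion):
    """Linear path: normalize stored values one by one until a match."""
    target = normalize_dn(assertion)
    return any(normalize_dn(v) == target for v in values)


class SortedValueSet:
    """Sorted path: values are normalized once, at insertion time."""

    def __init__(self):
        self._sorted = []  # normalized values, kept in sorted order

    def add(self, value):
        key = normalize_dn(value)
        idx = bisect.bisect_left(self._sorted, key)
        # Skip duplicates, as a multi-valued attribute would.
        if idx == len(self._sorted) or self._sorted[idx] != key:
            self._sorted.insert(idx, key)

    def find(self, assertion):
        """Binary search: O(log n) comparisons, one normalization."""
        key = normalize_dn(assertion)
        idx = bisect.bisect_left(self._sorted, key)
        return idx < len(self._sorted) and self._sorted[idx] == key


if __name__ == "__main__":
    members = [f"uid=user{i}, ou=People, dc=example,dc=com" for i in range(10000)]
    vs = SortedValueSet()
    for m in members:
        vs.add(m)
    probe = "UID=user9999,ou=people,DC=example,DC=com"
    print(match_linear(members, probe), vs.find(probe))
```

In this model the linear path repeats the normalization work for every filter test, while the sorted path pays it once when values are added. That is the cost the fix avoids for equality components on DN syntax attributes.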

tbordaz · 2024-05-22T09:35:42Z

884deb6..904dc99 main
4d10f39..d7e2d86 389-ds-base-3.0
ad6e23c..c09c0f2 389-ds-base-2.5
f08f008..45e14d6 389-ds-base-2.4
04254a0..90156ca 389-ds-base-2.3
9366edd..ab44858 389-ds-base-2.2
e78e793..f46e3b1 389-ds-base-2.1
635fdda..048e67f 389-ds-base-2.0
ab2e232..27643ce 389-ds-base-1.4.4
ce4a554..adc92a9 389-ds-base-1.4.3

tbordaz added the performance, work in progress, and priority_high labels May 15, 2024

tbordaz self-assigned this May 15, 2024

tbordaz mentioned this issue May 15, 2024

Issue 6172 - RFE: improve the performance of evaluation of filter com… #6173

Merged

tbordaz added this to the 1.4.3 milestone May 22, 2024

tbordaz closed this as completed May 27, 2024

vashirov mentioned this issue May 31, 2024

Test failure: test_match_large_valueset #6192

Closed
