Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 6172 - RFE: improve the performance of evaluation of filter com… #6173

Merged
merged 1 commit into from
May 22, 2024

Conversation

tbordaz
Copy link
Contributor

@tbordaz tbordaz commented May 15, 2024

…ponent when tested against a large valueset (like group members)

Bug description:
Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter. If a filter component (equality) is testing the value (ava) against a large valueset (like uniquemember values), it takes a long time because of large number of value and required normalization of the values. This can be improved taking benefit of sorted valueset. Those sorted valueset were created to improve updates of large valueset (groups) but not used in SRCH path.

Fix description:
if config param 'nsslapd-filter-match-use-sorted-vs = on' then it uses slapi_valueset_find (that tries to use sorted valueset) rather than plugin_call_syntax_filter_ava.
In both case sorted valueset and plugin_call_syntax_filter_ava, ava and values are normalized.
In sorted valueset, the values have been normalized to insert the index in the sorted array and then comparison is done on normalized values. In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized before comparison.

Likely this optimization should be dropped for extended search

relates: #6172

Reviewed by:

Copy link
Contributor

@progier389 progier389 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of using directly the valueset is good.
but it can only be done if ftype == LDAP_FILTER_EQUALITY
and in this case I do not think that using yet another config parameter is really needed (in 389ds, equality is implement by comparing normalized values anyway)
That said, IMHO we should also write some CI tests for that one:

  • to check the scenario you describe
  • and also a scenario storing data on a case ignore string attribute then searching for uppercase then for lowercase (To double check that the filter values are properly normalized)

Copy link
Contributor

@progier389 progier389 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good except a minor point:
I would change the "else" comment to something easier to understand, for example:
When sorted valuearray optimization cannot be used lets filter the value according to its syntax"

Copy link
Member

@droideck droideck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@tbordaz
Copy link
Contributor Author

tbordaz commented May 16, 2024

@progier389 , @droideck , Thanks for the review. I will update the patch accordingly in addition with a testcase I am working on.

…ponent when tested against a large valueset (like group members)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: 389ds#6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
@tbordaz
Copy link
Contributor Author

tbordaz commented May 17, 2024

I added a testcase to the patch

@tbordaz tbordaz merged commit 904dc99 into 389ds:main May 22, 2024
185 of 195 checks passed
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
tbordaz added a commit that referenced this pull request May 27, 2024
…ponent when tested against a large valueset (like group members) (#6173)

Bug description:
	Before returning an entry (to a SRCH) the server checks that the entry matches the SRCH filter.
	If a filter component (equality) is testing the value (ava) against a
	large valueset (like uniquemember values), it takes a long time because
	of the large number of values and required normalization of the values.
	This can be improved taking benefit of sorted valueset. Those sorted
	valueset were created to improve updates of large valueset (groups) but
	at that time not implemented in SRCH path.

Fix description:
	In case of LDAP_FILTER_EQUALITY component, the server can get
	benefit of the sorted valuearray.
	To limit the risk of regression, we use the sorted valuearray
	only for the DN syntax attribute. Indeed the sorted valuearray was
	designed for those type of attribute.
	With those two limitations, there is no need of a toggle and
	the call to plugin_call_syntax_filter_ava can be replaced by
	a call to slapi_valueset_find.
	In both cases, sorted valueset and plugin_call_syntax_filter_ava, ava and
	values are normalized.
	In sorted valueset, the values have been normalized to insert the index
	in the sorted array and then comparison is done on normalized values.
	In plugin_call_syntax_filter_ava, all values in valuearray (of valueset) are normalized
	before comparison.

relates: #6172

Reviewed by: Pierre Rogier, Simon Pichugin (Big Thanks !!!)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants