
Raise idlistscanlimit by default #2435

Closed
389-ds-bot opened this issue Sep 13, 2020 · 19 comments · Fixed by #5639
Labels
priority_high need urgent fix / highly valuable / easy to fix
Milestone

Comments

@389-ds-bot

Cloned from Pagure issue: https://pagure.io/389-ds-base/issue/49376


Issue Description

With https://pagure.io/389-ds-base/issue/49372 we try to avoid hitting large IDLs by fixing the filters.

There are some cases left where we may access large IDLs, and in those cases we probably want to return an IDL anyway.

One case is if we hit the idlistscanlimit in an OR query. In this case we return ALLIDS to the parent filter, which likely in turn raises this back to its parent. This means that we do a very expensive id2entry scan. Had we returned an IDL, even one with say 100,000 items, that would still be better than doing a full id2entry scan on a db with 200,000 entries.
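A minimal Python sketch of this failure mode (ALLIDS, load_idl and the toy index are illustrative names, not 389-ds-base internals): one term exceeding the scan limit collapses the whole OR to ALLIDS, even when keeping the large IDL would be far cheaper than a full scan.

```python
# Toy model of the OR case: hitting idlistscanlimit on one term
# degrades the whole union to ALLIDS (i.e. a full id2entry scan).
# All names here are assumptions for illustration, not the real API.

ALLIDS = None  # sentinel meaning "every entry in the database"

def load_idl(index, key, scan_limit):
    ids = index.get(key, set())
    return ALLIDS if len(ids) > scan_limit else ids

def or_candidates(index, keys, scan_limit):
    result = set()
    for key in keys:
        ids = load_idl(index, key, scan_limit)
        if ids is ALLIDS:
            return ALLIDS  # one saturated term poisons the whole OR
        result |= ids
    return result

index = {
    "dept=eng": set(range(100_000)),          # large but indexed
    "dept=hr": set(range(100_000, 100_500)),  # small
}
# Low limit: the big term saturates, so a 200,000-entry DB is scanned in full.
assert or_candidates(index, list(index), 4000) is ALLIDS
# Raised limit: a 100,500-id IDL, still roughly half the full-scan cost.
assert len(or_candidates(index, list(index), 10**9)) == 100_500
```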

Another case is broad AND queries. If we do a subtree search for objectClass=group under, say, ou=Groups, and this returns ALLIDS, we have just wasted a lot of time again falling back to a full db search when we could have loaded the IDL and intersected it with parentid.

I think that we should examine the current idlistscanlimit and either raise it, or consider removing it altogether.

@389-ds-bot 389-ds-bot added this to the 1.4 backlog milestone Sep 13, 2020
@389-ds-bot

389-ds-bot commented Sep 13, 2020

Comment from firstyear (@Firstyear) at 2017-09-11 04:07:06

Here is an example:

idlistscanlimit: 4000 (default)
[11/Sep/2017:12:03:33.524447131 +1000] conn=2 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(&(objectClass=inetOrgPerson)(objectClass=person)(objectClass=posixAccount)(uid>=test0053715))" attrs="uid"
[11/Sep/2017:12:03:54.236090743 +1000] conn=2 op=1 RESULT err=0 tag=101 nentries=46286 etime=20.1288196523 notes=A

idlistscanlimit: 99999999999
[11/Sep/2017:12:01:34.797114055 +1000] conn=4 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(&(objectClass=inetOrgPerson)(objectClass=person)(objectClass=posixAccount)(uid>=test0053715))" attrs="uid"
[11/Sep/2017:12:01:45.866876775 +1000] conn=4 op=1 RESULT err=0 tag=101 nentries=46286 etime=11.0069939325

So despite being over 40,000 entries, using the indexes is still faster.
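Plain arithmetic on the two logged etime values above makes the difference concrete:

```python
# etime values copied from the access-log excerpts above.
etime_limited = 20.1288196523  # idlistscanlimit: 4000, notes=A (unindexed)
etime_raised = 11.0069939325   # idlistscanlimit: 99999999999, fully indexed

speedup = etime_limited / etime_raised
assert 1.82 < speedup < 1.84   # roughly 1.8x faster using the indexes
```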

In fact, it looks like the idlistscanlimit is an attempt to optimise the "(&(uid=x)(objectClass=y))" case to improve this search, but the recent changes to filter_optimise in #2431 solve this much more completely.

I think we should remove the idlistscanlimit, as no other database (e.g. PostgreSQL) has such an odd concept of bypassing indexes. We can gain more from improving filter optimisation than from trying to hand-tune these numbers for specific cases.

@389-ds-bot

Comment from firstyear (@Firstyear) at 2017-09-11 04:07:07

Metadata Update from @Firstyear:

  • Custom field component adjusted to None
  • Custom field origin adjusted to None
  • Custom field reviewstatus adjusted to None
  • Custom field type adjusted to None
  • Custom field version adjusted to None

@389-ds-bot

Comment from firstyear (@Firstyear) at 2017-09-11 04:24:28

Following this, this would actually be a case to raise the filter test threshold, because that would allow better shortcutting of genuinely small sets that want to avoid large indexes. I.e. if we set the test threshold to say 32, but the set is larger than that, it's probably better to actually use the index. We could do some testing of the numbers here to be sure, but 32-64 could be good guesses.
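A sketch of the shortcut being described (FILTER_TEST_THRESHOLD and the toy index are assumed names for illustration): once the running candidate set is at or below the threshold, stop consulting further indexes and filter-test those entries directly.

```python
# Hypothetical model of the "filter test threshold" shortcut, not the
# actual 389-ds-base implementation.
FILTER_TEST_THRESHOLD = 32  # low end of the 32-64 range suggested above

def and_candidates(index, keys):
    result = None  # None = no candidates gathered yet
    for key in keys:
        ids = index.get(key, set())
        result = ids if result is None else result & ids
        if len(result) <= FILTER_TEST_THRESHOLD:
            break  # small enough: cheaper to filter-test these directly
    return result

index = {
    "uid=william": {7},
    "objectclass=person": set(range(10_000)),
}
# The 1-entry uid IDL trips the threshold immediately, so the large
# objectClass index is never loaded.
assert and_candidates(index, ["uid=william", "objectclass=person"]) == {7}
```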

@389-ds-bot

Comment from firstyear (@Firstyear) at 2017-09-11 04:46:26

Another point is that, to make filter optimisation easier, perhaps we should change the filter structure from a linked list to something easier to sort / re-arrange?

@389-ds-bot

389-ds-bot commented Sep 13, 2020

Comment from firstyear (@Firstyear) at 2017-09-12 01:55:51

#811

@389-ds-bot

Comment from firstyear (@Firstyear) at 2017-09-12 01:57:20

Metadata Update from @Firstyear:

  • Issue assigned to Firstyear

@389-ds-bot

Comment from mreynolds (@mreynolds389) at 2017-09-14 17:35:25

Metadata Update from @mreynolds389:

  • Issue set to the milestone: 1.4 backlog

@389-ds-bot

389-ds-bot commented Sep 13, 2020

Comment from firstyear (@Firstyear) at 2017-10-03 06:18:10

0001-Ticket-49376-raise-idscanlimit.patch

Relies on #2431

@389-ds-bot

Comment from firstyear (@Firstyear) at 2017-10-03 06:18:12

Metadata Update from @Firstyear:

  • Custom field reviewstatus adjusted to review (was: None)

@389-ds-bot

Comment from lkrispen (@elkris) at 2017-10-04 11:02:22

I don't think we should change the default.

There are some scenarios where it might be better to always return an idl. It is always a trade-off as to which is more costly: handling and managing a very large idlist (which also requires a lot of database accesses) or handling all entries (which might already be in the entry cache).
The scanlimit is configurable and GSS knows how to suggest proper settings.
If we change index lookup and idl handling (in other tickets) we need to document the effects it might have on the use of scan limits.
If we are really confident that always returning an idl would be the right thing, then we should remove the scan limit altogether.

In general I think for changes like this we should have a set of reliable performance benchmarks first.

@389-ds-bot

Comment from firstyear (@Firstyear) at 2017-10-05 00:56:47

@elkris I have been performing a number of tests with the filter optimisation patch enabled, and in all cases the idlistscanlimit of 999999999 exceeded the performance of the current default. I will prepare a set of numbers for you.

As much as I love and respect GSS, they do NOT know how to suggest a proper setting here. This is a deeply confusing setting that relies on intimate knowledge of our internal search procedure that most users don't have.

I agree that we should remove the scan limit completely, but I would like to do this first to give us a clean backout strategy if it does fail on us.

@389-ds-bot

Comment from lkrispen (@elkris) at 2017-11-17 09:39:17

waiting for the set of numbers

@389-ds-bot

Comment from firstyear (@Firstyear) at 2017-11-20 10:31:21

Yep! I need to make the numbers for you! I've been quite busy as you might know, and I want to work on this with @vashirov to build our performance test framework. If you are going to do something, you may as well do it correctly.

@mreynolds389

I don't have any official numbers, but I also have not seen a performance hit when using a very large value for idscanlimit. I do recall that with DSEE it was worse, but in 389 I did not see the same behaviour. I'm not sure about LMDB, but increasing this with bdb has more pros than cons.

@mreynolds389 mreynolds389 added the priority_high need urgent fix / highly valuable / easy to fix label Feb 9, 2022
@mreynolds389 mreynolds389 modified the milestones: 1.4 backlog, 2.x backlog Feb 9, 2022
@Firstyear

@mreynolds389 This change is predicated on query optimisation existing, and I ran into waaayyyy too many issues with VLV to get it to "stick" in the past.

@Firstyear

@mreynolds389 @progier389 Can we review this again? We have the optimiser code now, and the IDL scan limit does nothing for us but actively harm performance of large queries. We should increase this value, but better yet, we should remove it entirely.

@mreynolds389

> @mreynolds389 @progier389 Can we review this again? We have the optimiser code now, and the IDL scan limit does nothing for us but actively harm performance of large queries. We should increase this value, but better yet, we should remove it entirely.

I would love to drop it entirely, but I know others on the team are not so sure. Need to prove that it won't hurt anything...

@progier389

My gut feeling is that @Firstyear is right and that the limit is obsolete (probably since the idl_new index format was added), but we have an easy way to mitigate the risks by doing the change in two phases:

  1. Set the default value to INT_MAX but keep the current code for now.
    In practice this disables the limit but allows easy reversal in case of trouble.
  2. And if nobody complains about it, we can remove the code entirely a few years later.

@Firstyear

Sounds like a reasonable plan to me :)

@progier389 It's not because of the idl_new format, but the query optimiser.

The summary is: if we have "(&(objectClass=person)(uid=william))" then we want to hit the "filter test threshold" as fast as possible to minimise work.

The idlscanlimit does this by loading the idl for objectClass=person, discarding it if it is too long, then "hoping" that whatever comes next is more precise.

The optimiser does this by swapping the terms to "(&(uid=william)(objectClass=person))" meaning the idl for uid=william is loaded, is seen to be length 1, and the shortcut out is taken and filter test applied.

The idlscanlimit has a single pathological bad case, which is where you are scanning large sets, for example "(&(objectClass=person)(country=australia))". In the idlscanlimit case, if both idls are > scanlimit, this becomes fully unindexed and we filter test the whole DB. By raising the idlscanlimit we avoid this: we still end up loading two large idls, but that is a fraction of the cost of a full table scan. Since the optimiser now does the work to ensure queries are in "shortcut likely" arrangements, the idlscanlimit now just causes a degradation, not an improvement.
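The pathological AND case can be sketched the same way (hypothetical names; sorting terms by estimated IDL length stands in for the real optimiser): with a low limit both terms saturate and the query goes fully unindexed, while a raised limit intersects the two large IDLs down to a small candidate set.

```python
# Toy model of the pathological AND case, not the real implementation.
ALLIDS = None  # sentinel: full id2entry scan required

def load_idl(index, key, scan_limit):
    ids = index.get(key, set())
    return ALLIDS if len(ids) > scan_limit else ids

def optimised_and(index, keys, scan_limit):
    # Stand-in for the query optimiser: shortest estimated IDL first,
    # so the "shortcut likely" arrangement is tried first.
    keys = sorted(keys, key=lambda k: len(index.get(k, ())))
    result = ALLIDS
    for key in keys:
        ids = load_idl(index, key, scan_limit)
        if ids is ALLIDS:
            continue  # a saturated term contributes no narrowing
        result = ids if result is ALLIDS else result & ids
    return result

index = {
    "objectclass=person": set(range(100_000)),
    "country=australia": set(range(60_000, 160_000)),
}
# Both IDLs exceed a 4000 limit: the whole AND becomes fully unindexed.
assert optimised_and(index, list(index), 4000) is ALLIDS
# Raised limit: two large IDLs intersect to 40,000 candidates, far
# cheaper than filter-testing a 200,000-entry database.
assert len(optimised_and(index, list(index), 10**9)) == 40_000
```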

Firstyear added a commit to Firstyear/389-ds-base that referenced this issue Feb 8, 2023
Bug Description: The IDL scan limit existed as a naive attempt
at query optimisation to force longer IDLs to be skipped in favour
of short IDLs. However, since we now have a true query optimiser
that can handle this correctly, the IDL scan limit is redundant and
not functional. The only condition the IDL scan limit can now impact
is making queries slower by rejecting the use of longer IDLs when
shortcut conditions are not possible.

Fix Description: Raise the IDL Scan Limit to INT_MAX to observe
the results. In the future we can remove the code entirely.

fixes: 389ds#2435

Author: William Brown <william@blackhats.net.au>

Review by: ???
Firstyear added a commit that referenced this issue Feb 9, 2023
mreynolds389 pushed a commit that referenced this issue Feb 10, 2023