Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework NetmaskTree for better CPU and memory efficiency. #8355

Merged
merged 31 commits into from
Feb 12, 2020

Conversation

stephanbosch
Copy link
Contributor

@stephanbosch stephanbosch commented Sep 29, 2019

Short description

The existing implementation of class NetmaskTree was inefficient because:

  • it created a new tree node for each bit in the Netmask
  • used a redundant std::set to provide iterator support.

These changes mend both these problems, yielding an overall decrease in CPU usage and memory requirements.

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)

@stephanbosch
Copy link
Contributor Author

oh, seems like I didn't actually compile the recursor. need to fix a few things.

@stephanbosch stephanbosch force-pushed the netmask-tree branch 3 times, most recently from dc73b7b to 7e133b3 Compare September 29, 2019 22:19
@lgtm-com
Copy link

lgtm-com bot commented Sep 30, 2019

This pull request introduces 1 alert when merging 7e133b3 into ef9ed3b - view on LGTM.com

new alerts:

  • 1 for Inconsistent definition of copy constructor and assignment ('Rule of Two')

@stephanbosch
Copy link
Contributor Author

stephanbosch commented Sep 30, 2019

For some reason, GitHub doesn't list the commits in the proper sequence. Not practical for review.

Proper order from last to first commit:

3d12b855d iputils.hh: NetmaskTree: Drop the internal std::set.
7061cda48 iputils.hh: NetmaskTree: Add iterator class and use it for begin() and end() methods.
e1c261ca4 iputils.hh: NetmaskTree: Keep track of the left-most node in the tree.
94085eed0 iputils.hh: NetmaskTree: Copy the tree using tree traversal.
23992be69 iputils.hh: NetmaskTree::TreeNode: Implement tree traversal methods.
85fe4f504 iputils.hh: NetmaskTree: Make tree cleanup mandatory.
462366721 iputils.hh: NetmaskTree: Reduce the number of tree nodes.
a503e85a5 iputils.hh: NetmaskTree: Initialize TreeNode::node immediately.
ed2e08993 iputils.hh: NetmaskTree: Use for loops instead of while loops.
e8783ead0 iputils.hh: NetmaskTree: Restructure the tree with separate branches for for IPv4 and IPv6
51ab542b4 iputils.hh: NetmaskTree: Rename field "root" to "d_root"
39599b68b iputils.hh: NetmaskTree: Use ComboAddress::getBit() and Netmask::getBit()
28319d29c iputils.hh: Netmask: Add masked copy constructor.
422b693b7 iputils.hh: Netmask: Prevent the construction of a 128-bit IPv4 netmask.
25d2498a5 iputils.hh: Netmask: Add getBit()
5d3168541 iputils.hh: Netmask: Add getAddressBits()
ab4073c00 iputils.hh: ComboAddress: Add getBit()
af340ccc9 iputils.hh: ComboAddress: Add getBits()
61a508dc7 test-iputils_hh.cc: Add tests for NetmaskTree copy, swap and iterator operations.
0313b0e1a test-iputils_hh.cc: Test 0.0.0.0 address at serveral network bit ranges.
9c405db9a test-iputils_hh.cc: Better verify the NetmaskTree container size during tests.
90b0422cc Reformat test-iputils_hh.cc.
ca717c62f Reformat iputils.hh

@stephanbosch
Copy link
Contributor Author

Fixing the LGTM alert.

@stephanbosch stephanbosch force-pushed the netmask-tree branch 3 times, most recently from 37eb1fc to 017bf46 Compare September 30, 2019 08:36
@stephanbosch
Copy link
Contributor Author

OK, worked around the GitHub commit order bug by rebasing and rewording every commit.

Copy link
Member

@rgacogne rgacogne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, a few comments but nothing bad.
A very quick test shows a very nice decrease in memory usage and insertion/lookup time, with a very small increase of the deletion time (even when compared to the existing "cleaning up" case) that probably doesn't matter to us.
I'll do more tests later.

pdns/iputils.hh Show resolved Hide resolved
pdns/iputils.hh Outdated Show resolved Hide resolved
pdns/iputils.hh Show resolved Hide resolved
pdns/iputils.hh Show resolved Hide resolved
pdns/iputils.hh Show resolved Hide resolved
pdns/iputils.hh Outdated Show resolved Hide resolved
pdns/iputils.hh Outdated Show resolved Hide resolved
pdns/iputils.hh Outdated Show resolved Hide resolved
pdns/iputils.hh Outdated Show resolved Hide resolved
@stephanbosch
Copy link
Contributor Author

OK, I'll address these soonish. I also have a few other small things I'd like to change still.

@stephanbosch
Copy link
Contributor Author

I addressed most comments and changed a few things:

  • The original pr added a Netmask(other, bits) copy constructor that could used to make a copy with equal or less network bits. That method is called getSuper() now.
  • Netmasks are now stored normalized in the tree (bits outside the netmask are forced to zero). So, the pair->first values returned by the insert() and lookup() are also always normalized.
  • I added a new commit that makes the iterator API use a proper reference type (not a pointer). We can do this now that there is no std::set anymore that dictates the iterator type. This makes the iterator more standard, less cumbersome and particularly less confusing. This commit removes a question comment that is illustrative of the problem this solves.
  • Made the key (first) value in the node_type const, so that the application cannot mess with it. That could have corrupted the tree.

Copy link
Member

@rgacogne rgacogne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Just a nit but amending the commits makes it hard for me to spot what actually changed since my initial review :-)

pdns/recursor_cache.cc Outdated Show resolved Hide resolved
pdns/iputils.hh Show resolved Hide resolved
pdns/iputils.hh Show resolved Hide resolved
@stephanbosch
Copy link
Contributor Author

Thanks! Just a nit but amending the commits makes it hard for me to spot what actually changed since my initial review :-)

Yeah, that is a bit easier in GitLab it seems...

@pieterlexis pieterlexis mentioned this pull request Nov 14, 2019
7 tasks
@Habbie
Copy link
Member

Habbie commented Nov 28, 2019

This branch has conflicts that must be resolved

@Habbie Habbie added this to the auth-4.3.0-alpha2 milestone Dec 6, 2019
@omoerbeek
Copy link
Member

Loving this PR and would even love it more if it got finished :-) @stephanbosch do you think you'll have some time to spend to make it ready for merge?

@stephanbosch
Copy link
Contributor Author

Rebased and CI seems to like it still.

…for for IPv4 and IPv6

This simplifies the code considerably.
Makes using `continue' easier in later commit.
Before, it created a tree node for every network bit in the netmask. Now, it
only creates a tree node when necessary (only for values and branches).
This makes address bits below the network mask all zero, which is consistent
with the tree's behavior.

This change addresses one sensitivity to this behavioral change in the recursor
cache.
Potentially leaving branches full of unassigned nodes unnecessarily complicates
tree algorithms. Disabling tree cleanup was not used anywhere, except for a unit
test. Note that, after this change, individual branch nodes can still be
unassigned, but not the whole branch. So, when e.g. the left sub-branch of a
node exists, algorithms can rely on the fact that there is at least one assigned
node in there.
Needed to provide a begin() iterator in constant time.
pdns/iputils.hh Outdated Show resolved Hide resolved
pdns/iputils.hh Outdated Show resolved Hide resolved
Copy link
Member

@rgacogne rgacogne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, many thanks! Just one nit but we can fix it later.

…an a pointer.

It no longer needs to be allocated separately.
…ce rather than a pointer.

This is possible now that the iterator type is no longer dictated by the
internal std::set. This changes the NetmaskTree::iterator API, but it makes it
more standard, less cumbersome, and less confusing.
This prevents changing the key used by the tree, which would otherwise provide
an opportunity to corrupt the tree.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants