Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addrman: improve performance by using more suitable containers #18722

Merged
merged 1 commit into from Jun 12, 2021

Conversation

vasild
Copy link
Contributor

@vasild vasild commented Apr 21, 2020

CAddrMan uses std::map internally even though it does not require
that the map's elements are sorted. std::map's access time is
O(log(map size)). std::unordered_map is more suitable as it has a
O(1) access time.

This patch lowers the execution times of CAddrMan's methods as follows
(as per src/bench/addrman.cpp):

AddrMan::Add(): -3.5%
AddrMan::GetAddr(): -76%
AddrMan::Good(): -0.38%
AddrMan::Select(): -45%

@vasild
Copy link
Contributor Author

vasild commented Apr 21, 2020

Full output from the benchmark (./src/bench/bench_bitcoin -filter="AddrMan.*"):

std::map

Benchmark evals iterations total min max median
AddrManAdd 5 5 4.45807 0.176704 0.179945 0.178443
AddrManGetAddr 5 500 4.09847 0.00163663 0.00164301 0.0016392
AddrManGood 5 2 4.66502 0.460193 0.474046 0.467368
AddrManSelect 5 1000000 2.07299 4.12892e-07 4.15564e-07 4.147e-07

std::unordered_map

Benchmark evals iterations total min max median
AddrManAdd 5 5 4.30032 0.169753 0.1738 0.172047
AddrManGetAddr 5 500 0.967398 0.000384406 0.00038996 0.000387176
AddrManGood 5 2 4.64742 0.460599 0.471216 0.464143
AddrManSelect 5 1000000 1.14085 2.23899e-07 2.36962e-07 2.26767e-07
raw output
map$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 4.45807, 0.176704, 0.179945, 0.178443
AddrManGetAddr, 5, 500, 4.09847, 0.00163663, 0.00164301, 0.0016392
AddrManGood, 5, 2, 4.66502, 0.460193, 0.474046, 0.467368
AddrManSelect, 5, 1000000, 2.07299, 4.12892e-07, 4.15564e-07, 4.147e-07

unordered_map$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 4.30032, 0.169753, 0.1738, 0.172047
AddrManGetAddr, 5, 500, 0.967398, 0.000384406, 0.00038996, 0.000387176
AddrManGood, 5, 2, 4.64742, 0.460599, 0.471216, 0.464143
AddrManSelect, 5, 1000000, 1.14085, 2.23899e-07, 2.36962e-07, 2.26767e-07

@fanquake fanquake added the P2P label Apr 21, 2020
@vasild vasild changed the title addrman: use unordered_map instead of map addrman: improve performance by using more suitable containers Apr 21, 2020
src/addrman.h Outdated Show resolved Hide resolved
src/addrman.h Show resolved Hide resolved
@MarcoFalke
Copy link
Member

MarcoFalke commented Apr 22, 2020

Would it be possible to submit the benchmark upfront as a separate pull, so that https://codespeed.bitcoinperf.com/ can start running it?

@vasild
Copy link
Contributor Author

vasild commented Apr 24, 2020

Done: #18754

@vasild
Copy link
Contributor Author

vasild commented Apr 27, 2020

Removed the first commit (which introduced the benchmark), now that it is already merged into master and rebased.

src/netaddress.h Outdated Show resolved Hide resolved
src/addrman.h Outdated Show resolved Hide resolved
@vasild vasild force-pushed the addrman_map branch 2 times, most recently from 2225590 to 7cc3172 Compare May 5, 2020
@DrahtBot
Copy link
Contributor

DrahtBot commented May 20, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this issue Jun 9, 2020
`CAddrMan` uses `std::map` internally even though it does not require
that the map's elements are sorted. `std::map`'s access time is
`O(log(map size))`. `std::unordered_map` is more suitable as it has a
`O(1)` access time.

This patch lowers the execution times of `CAddrMan`'s methods as follows
(as per `src/bench/addrman.cpp`):

```
AddrMan::Add(): -3.5%
AddrMan::GetAddr(): -76%
AddrMan::Good(): -0.38%
AddrMan::Select(): -45%
```

Github-Pull: bitcoin#18722
Rebased-From: 7cc3172
@sipa
Copy link
Member

sipa commented Jun 10, 2020

utACK 7cc3172

@hebasto
Copy link
Member

hebasto commented Jun 12, 2020

Concept ACK.

Copy link
Member

@hebasto hebasto left a comment

Benchmarking results:

$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 1.15889, 0.0458257, 0.0469068, 0.0464069
AddrManGetAddr, 5, 500, 1.94467, 0.000772949, 0.000789915, 0.000775272
AddrManGood, 5, 2, 5.69597, 0.563135, 0.57652, 0.570197
AddrManSelect, 5, 1000000, 1.17994, 2.32975e-07, 2.37942e-07, 2.36478e-07
  • master + this PR
$ ./src/bench/bench_bitcoin -filter="AddrMan.*"
# Benchmark, evals, iterations, total, min, max, median
AddrManAdd, 5, 5, 0.972042, 0.0383488, 0.0396571, 0.0386554
AddrManGetAddr, 5, 500, 0.649199, 0.000256967, 0.000262957, 0.00025942
AddrManGood, 5, 2, 5.35751, 0.531294, 0.540723, 0.534681
AddrManSelect, 5, 1000000, 0.657766, 1.29313e-07, 1.36428e-07, 1.30873e-07

Nice :)

Copy link
Member

@hebasto hebasto left a comment

nit: If #includes in addrman.h are already touched, mind moving

#include <fs.h>
#include <hash.h>
#include <streams.h>

into the first group?

src/addrman.cpp Outdated Show resolved Hide resolved
src/addrman.cpp Outdated Show resolved Hide resolved
src/netaddress.h Outdated Show resolved Hide resolved
@vasild
Copy link
Contributor Author

vasild commented Jun 15, 2020

nit: If #includes in addrman.h are already touched, mind moving...

Good catch, they should be moved! Given that there is already one ACK, I will move the headers if I alter this PR or as a followup PR if this gets merged as is now.

Copy link
Member

@hebasto hebasto left a comment

ACK 7cc3172

@vasild feel free to ignore my nits :)

src/netaddress.h Outdated
CSipHasher hasher(salt_k0, salt_k1);
hasher.Write(a.ip, sizeof(a.ip));
return (size_t)hasher.Finalize();
Copy link
Member

@hebasto hebasto Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: All calls could be chained:

Suggested change
CSipHasher hasher(salt_k0, salt_k1);
hasher.Write(a.ip, sizeof(a.ip));
return (size_t)hasher.Finalize();
return CSipHasher(salt_k0, salt_k1).Write(a.ip, sizeof(a.ip)).Finalize();

Copy link
Contributor Author

@vasild vasild Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually find the 3-line variant easier to read (that could be subjective). The 3-line variant is more manageable wrt setting a breakpoint or reading a backtrace (e.g. if you only see it crashed on that multi-chained-line it wouldn't be clear whether it crashed in the constructor, in the Write() method or in the Finalize() method).

I would prefer to leave it as is.

src/netaddress.h Outdated Show resolved Hide resolved
@vasild
Copy link
Contributor Author

vasild commented Jun 25, 2020

Prefixed salt_k0 and salt_k1 with m_ to abide to the naming convention. diff.

@vasild
Copy link
Contributor Author

vasild commented May 21, 2021

6f0c9f5390...6a1a28f2ad: rebased due to conflicts, plus add m_net to the CNetAddr hashing.

src/netaddress.h Outdated
size_t operator()(const CNetAddr& a) const noexcept
{
CSipHasher hasher(m_salt_k0, m_salt_k1);
hasher.Write(reinterpret_cast<const uint8_t*>(&a.m_net), sizeof(a.m_net));
Copy link
Member

@sipa sipa May 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Micronit: this reinterpret cast is a bit ugly. You could avoid it by just doing hasher.Write(a.m_net);; it'll interpret a.m_net as a uint64_t and use the optimized integer hasher.

Copy link
Contributor Author

@vasild vasild May 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I hope the implicit conversion from int to uint64_t will not upset some compiler.

Copy link
Member

@jonatack jonatack May 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ugliness could maybe be considered a virtue; it's explicit that this is an "on your head be it!" type of cast and highlights that the code could ideally be improved. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#es49-if-you-must-use-a-cast-use-a-named-cast

Copy link
Contributor Author

@vasild vasild May 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we can do without a cast since there is Write() method that takes an integer argument (modulo int -> uint64 conversion which is ok because we never pass negative values).

@sipa
Copy link
Member

sipa commented May 21, 2021

utACK 6a1a28f

@vasild
Copy link
Contributor Author

vasild commented May 24, 2021

6a1a28f2ad...c22e98b30d: address review suggestion

Copy link
Member

@jonatack jonatack left a comment

ACK c22e98b

Copy link
Member

@hebasto hebasto left a comment

ACK c22e98b

src/addrman.h Outdated Show resolved Hide resolved
@vasild
Copy link
Contributor Author

vasild commented May 28, 2021

c22e98b30d...1f4ee9c905: rebase due to conflicts and address review suggestion

Copy link
Member

@jonatack jonatack left a comment

ACK 1f4ee9c

src/addrman.h Outdated Show resolved Hide resolved
`CAddrMan` uses `std::map` internally even though it does not require
that the map's elements are sorted. `std::map`'s access time is
`O(log(map size))`. `std::unordered_map` is more suitable as it has a
`O(1)` access time.

This patch lowers the execution times of `CAddrMan`'s methods as follows
(as per `src/bench/addrman.cpp`):

```
AddrMan::Add(): -3.5%
AddrMan::GetAddr(): -76%
AddrMan::Good(): -0.38%
AddrMan::Select(): -45%
```
@vasild
Copy link
Contributor Author

vasild commented May 28, 2021

1f4ee9c905...a92485b2c2: address review suggestion

@jonatack
Copy link
Member

jonatack commented May 28, 2021

ACK a92485b

Copy link
Member

@hebasto hebasto left a comment

re-ACK a92485b, only suggested changes and rebased since my previous review.

@achow101
Copy link
Member

achow101 commented Jun 11, 2021

ACK a92485b

@fanquake fanquake merged commit 1a369f0 into bitcoin:master Jun 12, 2021
15 checks passed
@vasild vasild deleted the addrman_map branch Jun 13, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this issue Jun 13, 2021
gwillen pushed a commit to ElementsProject/elements that referenced this issue Jun 1, 2022
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet