Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addrman: Fix new table bucketing during unserialization #20557

Merged
merged 7 commits into from
Feb 9, 2021

Conversation

jnewbery
Copy link
Contributor

@jnewbery jnewbery commented Dec 3, 2020

This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

@jnewbery
Copy link
Contributor Author

jnewbery commented Dec 3, 2020

The code currently rebuckets entries (losing the additional bucket positions) if the asmap checksum changes. I'm not sure if that's necessary and don't see any downside to keeping the same bucketing if restarting with a new asmap. If others agree, I can update this PR to not do that rebucketing.

@jnewbery
Copy link
Contributor Author

jnewbery commented Dec 3, 2020

We should also add a regression test for this. I think a good test would be to add a round trip serialize/unserialize to the addrman fuzz test, and compare the addrman before and after that round trip. There's some internal state that isn't reconstructed from serialize/unserialize (eg the vRandom ordering), but a comparator function that checked number of entries and bucket->entry pairs should work.

If anyone wants to add that fuzz test or other tests, I'm happy to add them to this PR.

@laanwj laanwj added the P2P label Dec 3, 2020
src/addrman.h Outdated Show resolved Hide resolved
@jnewbery jnewbery added the Bug label Dec 3, 2020
@jnewbery jnewbery force-pushed the 2020-12-fix-addrman-bucketing branch from 595e9e5 to 0fff708 Compare December 3, 2020 12:59
@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 3, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@naumenkogs
Copy link
Member

Concept ACK.

The code currently rebuckets entries (losing the additional bucket positions) if the asmap checksum changes. I'm not sure if that's necessary and don't see any downside to keeping the same bucketing if restarting with a new asmap. If others agree, I can update this PR to not do that rebucketing.

That would mean that asmap is applied to only newly learned addresses, which is a downside (assuming asmap is an improvement). On the other hand, losing the existing 8 buckets is unfortunate.
I'm fine either way.

We could do the 8 existing + 1 asmap bucket if we really want... but that also has some consequences of potentially overriding existing buckets etc.

Also, we should keep in mind that non-rebucketing on a new asmap may allow existing records to appear in 16 buckets now instead of 8 (probably all existing records on first asmap appearance). Which is... also not particularly terrible but should be considered?

@jnewbery
Copy link
Contributor Author

jnewbery commented Dec 4, 2020

That would mean that asmap is applied to only newly learned addresses, which is a downside (assuming asmap is an improvement)

Currently the bucket positions for existing new addresses are lost entirely when changing asmap, so I don't understand how this is a downside. The choice is between throwing away those bucket positions or not.

Also, we should keep in mind that non-rebucketing on a new asmap may allow existing records to appear in 16 buckets now instead of 8 (probably all existing records on first asmap appearance). Which is... also not particularly terrible but should be considered?

No. We won't add an entry to more than 8 buckets:

bitcoin/src/addrman.cpp

Lines 294 to 296 in 257cf05

// do not update if the max reference count is reached
if (pinfo->nRefCount == ADDRMAN_NEW_BUCKETS_PER_ADDRESS)
return false;
.

@naumenkogs
Copy link
Member

naumenkogs commented Dec 4, 2020

Currently the bucket positions for existing new addresses are lost entirely when changing asmap, so I don't understand how this is a downside. The choice is between throwing away those bucket positions or not.

I expressed my agreement with you in the following sentence: "On the other hand, losing the existing 8 buckets is unfortunate.". Still, ideally, we would apply a new asmap to existing addrs without losing the data. Unfortunately, since we don't store addr source, we have to make this choice between the two. And I agree that of these 2 choices, the one you're suggesting is probably better.

No. We won't add an entry to more than 8 buckets:

Good to know! That was just my wrong assumption without looking deep into code :)

src/addrman.h Outdated Show resolved Hide resolved
src/addrman.h Outdated Show resolved Hide resolved
@jnewbery jnewbery force-pushed the 2020-12-fix-addrman-bucketing branch from 0fff708 to dbdf63c Compare December 10, 2020 11:01
@jnewbery
Copy link
Contributor Author

Thanks for the review @naumenkogs. I've addressed all your comments.

@laanwj
Copy link
Member

laanwj commented Dec 10, 2020

Concept ACK, and the code change looks overall good to me.

#20557 (comment)

Agree that a test would be useful, to make it easier to see that this fixes the mentioned problems and prevent future regression.

@naumenkogs
Copy link
Member

utACK dbdf63c

Copy link
Member

@glozow glozow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code review ACK dbdf63c

Can see that it fixes the bug from ec45646, previously looped over each entry, retrieving the first bucket we recorded in entryToBucket, and now creates a list of <bucket, index> pairs and loops over all of those instead. Some clarifying questions because I'm super unfamiliar with this code and stared at it for a long time.

src/addrman.h Outdated Show resolved Hide resolved
src/addrman.h Show resolved Hide resolved
@jnewbery jnewbery force-pushed the 2020-12-fix-addrman-bucketing branch from dbdf63c to 36f3d9d Compare January 18, 2021 13:23
An addrman entry can appear in up to 8 new table buckets. We store this
entry->bucket indexing during shutdown so that on restart we can restore
the entries to their correct buckets.

Commit ec45646 broke the
deserialization code so that each entry could only be put in up to one
new bucket. Fix that.
Version implies that higher numbers take precendence. This is really a
checksum, to check whether the provided asmap is the same as the one
used when the peers.dat file was serialized.

Also update the comments to explain where/why this is used.
Only rebucket if the asmap checksum has changed, not if the file format
has changed but no asmap is provided.

Also, don't try to add an entry to another bucket if it already appears
in ADDRMAN_NEW_BUCKETS_PER_ADDRESS buckets.
@jnewbery
Copy link
Contributor Author

@jnewbery jnewbery force-pushed the 2020-12-fix-addrman-bucketing branch from 36f3d9d to ac3547e Compare January 18, 2021 14:34
@jnewbery
Copy link
Contributor Author

Rebased on master

@naumenkogs
Copy link
Member

ACK ac3547e

src/addrman.h Outdated Show resolved Hide resolved
src/addrman.h Show resolved Hide resolved
src/addrman.h Outdated Show resolved Hide resolved
@vasild
Copy link
Contributor

vasild commented Jan 26, 2021

I think a good test would be to add a round trip serialize/unserialize to the addrman fuzz test, and compare the addrman before and after that round trip. There's some internal state that isn't reconstructed from serialize/unserialize (eg the vRandom ordering), but a comparator function that checked number of entries and bucket->entry pairs should work.

Created such test at: jnewbery#18.

It does not compare bucket->entry though because the ids (the keys into mapInfo) could be different after ser/unser.

@jnewbery
Copy link
Contributor Author

Thanks for the review @vasild. I've resolved your review comments.

I haven't taken your fuzz test yet since it seems like there's a bit more work to make sure that it's getting reliable coverage.

@jnewbery jnewbery force-pushed the 2020-12-fix-addrman-bucketing branch from 9a0760d to 195a12d Compare January 29, 2021 12:39
Thanks to Vasil Dimov <vd@FreeBSD.org> for these suggestions
Thanks to Vasil Dimov <vd@FreeBSD.org> for these suggestions
@jnewbery jnewbery force-pushed the 2020-12-fix-addrman-bucketing branch from 195a12d to 4676a4f Compare January 29, 2021 12:40
@vasild
Copy link
Contributor

vasild commented Jan 29, 2021

ACK 4676a4f

@glozow
Copy link
Member

glozow commented Jan 29, 2021

re-ACK 4676a4f, changes were a rename, comments, and removing repeat-logging.

@naumenkogs
Copy link
Member

ACK 4676a4f

@jnewbery
Copy link
Contributor Author

jnewbery commented Feb 3, 2021

@MarcoFalke or @laanwj - this PR now has 3 ACKs. If one of you re-reviewed, then it might be ready for merge.

Copy link
Contributor

@dhruv dhruv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 4676a4f

A couple nits and questions below.

src/addrman.h Show resolved Hide resolved
src/addrman.h Show resolved Hide resolved
src/addrman.h Show resolved Hide resolved
src/addrman.h Outdated Show resolved Hide resolved
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

src/addrman.h Outdated
Comment on lines 528 to 530
for (auto bucket_entry : bucket_entries) {
int bucket{bucket_entry.first};
const int n{bucket_entry.second};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "[addrman] Fix new table bucketing during unserialization" (b4c5fda)

Could avoid some verbosity here with

-        for (auto bucket_entry : bucket_entries) {
-            int bucket{bucket_entry.first};
-            const int n{bucket_entry.second};
+        for (auto [bucket, n] : bucket_entries) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ structured bindings. Good suggestion. I'll do in a follow-up.

std::map<int, int> entryToBucket; // Represents which entry belonged to which bucket when serializing
// An entry may appear in up to ADDRMAN_NEW_BUCKETS_PER_ADDRESS buckets,
// so we store all bucket-entry_index pairs to iterate through later.
std::vector<std::pair<int, int>> bucket_entries;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "[addrman] Fix new table bucketing during unserialization" (b4c5fda)

Seems like having a data structure here is superfluous if it's just going to be filled up then immediately iterated over and discarded. Maybe it would make sense to remove this vector later.

Copy link
Contributor

@vasild vasild Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that vector could be avoided only if we could seek into the stream. Then we could change the current:

read new buckets
read asmap checksum
process the new buckets that were read, depending on the asmap checksum

to

seek forward and read the asmap checksum (last 32 bytes)
seek back and read+process the new bucketing data, depending on the asmap checksum

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only know how we're going to use the vector after we've deserialized the asmap_checksum field, so I don't think we can get rid of it.

src/addrman.h Outdated
@@ -509,7 +511,7 @@ friend class CAddrManTest;
int nIndex = 0;
s >> nIndex;
if (nIndex >= 0 && nIndex < nNew) {
entryToBucket[nIndex] = bucket;
bucket_entries.emplace_back(bucket, nIndex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "[addrman] Fix new table bucketing during unserialization" (b4c5fda)

It would be good to have some test coverage for this. A functional test might be hard but I would think a c++ unit test should be straightforward. Past bugs predict future bugs so a test here could steer coverage in a useful direction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test is in the baking at jnewbery#18.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree. @vasild has a test here: jnewbery#18 which can be added after this PR.

src/addrman.h Show resolved Hide resolved
src/addrman.h Outdated
int nUBucketPos = info.GetBucketPosition(nKey, true, bucket);
if (format >= Format::V2_ASMAP && nUBuckets == ADDRMAN_NEW_BUCKET_COUNT && vvNew[bucket][nUBucketPos] == -1 &&
info.nRefCount < ADDRMAN_NEW_BUCKETS_PER_ADDRESS && serialized_asmap_checksum == supplied_asmap_checksum) {
if (restore_bucketing && vvNew[bucket][nUBucketPos] == -1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "[addrman] Don't rebucket new table entries unnecessarily" (a5c9b04)

Just some notes to make sure I'm following this correctly.

  • The restore_bucketing change here is aesthetic and the only actual change this line is dropping the format >= Format::V2_ASMAP condition.
  • The effect of dropping format condition is this will now keep existing bucket positions instead of rebucketing when an old format is loaded and there is no asmap. ("Only rebucket if" in commit description means "Use existing bucket positions if")
  • In the case where an old format is loaded an there is an asmap, this logic isn't triggered because the serialized_asmap_checksum == supplied_asmap_checksum condition is false.
  • Maybe could be a unit test checking old format number + no asmap uses existing bucket positions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The restore_bucketing change here is aesthetic and the only actual change this line is dropping the format >= Format::V2_ASMAP condition.

Right, this condition doesn't change during loop iterations, so it's clearer to move it out of the loop, mark it const, and only do a conditional check inside the loop on what can change over iterations.

The effect of dropping format condition is this will now keep existing bucket positions instead of rebucketing when an old format is loaded and there is no asmap. ("Only rebucket if" in commit description means "Use existing bucket positions if")

That's right. If the file format is old (no asmap), and we don't have an asmap, then we gain nothing by dropping new bucket positions.

In the case where an old format is loaded an there is an asmap, this logic isn't triggered because the serialized_asmap_checksum == supplied_asmap_checksum condition is false.

Right.

Maybe could be a unit test checking old format number + no asmap uses existing bucket positions

Perhaps. It's a bit difficult to test because we can't easily serialize addrman in the old format.

@jnewbery
Copy link
Contributor Author

jnewbery commented Feb 9, 2021

Thanks @dhruv and @ryanofsky for the reviews. I think this is ready for merge now, and the remaining review comments can be done in follow-ups:

@laanwj
Copy link
Member

laanwj commented Feb 9, 2021

Code review ACK 4676a4f

@laanwj laanwj merged commit b847f49 into bitcoin:master Feb 9, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Feb 9, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 27, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 28, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 29, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
@jnewbery jnewbery deleted the 2020-12-fix-addrman-bucketing branch July 1, 2021 17:21
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 15, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 16, 2021
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
gades pushed a commit to cosanta/cosanta-core that referenced this pull request May 8, 2022
…ization

4676a4f [addrman] Don't repeat "Bucketing method was updated" log multiple times (John Newbery)
4362923 [addrman] Improve serialization comments (John Newbery)
ac3547e [addrman] Improve variable naming/code style of touched code. (John Newbery)
a5c9b04 [addrman] Don't rebucket new table entries unnecessarily (John Newbery)
8062d92 [addrman] Rename asmap version to asmap checksum (John Newbery)
009b8e0 [addrman] Improve variable naming/code style of touched code. (John Newbery)
b4c5fda [addrman] Fix new table bucketing during unserialization (John Newbery)

Pull request description:

  This fixes three issues in addrman unserialization.

  1. An addrman entry can appear in up to 8 new table buckets. We store this entry->bucket indexing during shutdown so that on restart we can restore the entries to their correct buckets. Commit ec45646 broke the deserialization code so that each entry could only be put in up to one new bucket.

  2. Unserialization may result in an entry appearing in a 9th bucket. If the entry already appears in 8 buckets don't try to place it in another bucket.

  3. We unnecessarily rebucket when reading a peers.dat with file version 1. Don't do that.

ACKs for top commit:
  vasild:
    ACK 4676a4f
  glozow:
    re-ACK bitcoin@4676a4f, changes were a rename, comments, and removing repeat-logging.
  naumenkogs:
    ACK 4676a4f
  laanwj:
    Code review ACK 4676a4f
  dhruv:
    ACK 4676a4f
  ryanofsky:
    Code review ACK 4676a4f. I'm not previously familiar with this code but all the changes here do make sense and seem like improvements. Left some notes and comments, but they aren't important so feel to ignore.

Tree-SHA512: b228984f6dec5910be23c3740ae20258da33bcf66ceb7edb10e5a53163450f743bab349e47f09808b7e8d40f27143119ec3e0981d7e678aa494d8559a1c99c23
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants