Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addrman serialize: nNew was wrong, oh ow #22450

Closed
maflcko opened this issue Jul 15, 2021 · 11 comments · Fixed by #22455
Closed

addrman serialize: nNew was wrong, oh ow #22450

maflcko opened this issue Jul 15, 2021 · 11 comments · Fixed by #22455
Milestone

Comments

@maflcko
Copy link
Member

maflcko commented Jul 15, 2021

Made observable by the fuzz tests in commit e0a2b39

Crash in

assert(nIds != nNew); // this means nNew was wrong, oh ow

OSS-Fuzz reproducer: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=36208

Can be reproduced locally in a few CPU hours via FUZZ=addrman_deserialize ./src/test/fuzz/fuzz -use_value_profile=1 (using the qa-assets seed as starting point).

@maflcko maflcko added this to the 22.0 milestone Jul 15, 2021
@jonatack
Copy link
Contributor

Another win for fuzzing, oh wow.

@fanquake
Copy link
Member

cc @vasild

@vasild
Copy link
Contributor

vasild commented Jul 15, 2021

Fuzzer rulez!

@jnewbery
Copy link
Contributor

The repositioning/rebucketing approach of ResetI2PPorts() introduced in #22112 seems very brittle. I think it would make much more sense just to remove any I2P addresses which don't have port 0, and then perhaps add them back through the public (and well-tested) Add() interface.

I don't know the exact bug that leads to this internal addrman inconsistency, but this loop seems suspect to me:

    for (int bucket = 0; bucket < ADDRMAN_NEW_BUCKET_COUNT; ++bucket) {
        for (int i = 0; i < ADDRMAN_BUCKET_SIZE; ++i) {
            const auto id = vvNew[bucket][i];
            if (id == -1) {
                continue;
            }
            auto it = mapInfo.find(id);
            if (it == mapInfo.end()) {
                return;
            }
            auto& addr_info = it->second;
            if (!addr_info.IsI2P() || addr_info.GetPort() == I2P_SAM31_PORT) {
                continue;
            }

            auto addr_info_newport = addr_info;
            // The below changes addr_info_newport.GetKey(), which is used in finding a
            // bucket and a position within that bucket. So a re-bucketing may be necessary.
            addr_info_newport.port = I2P_SAM31_PORT;

            // Reposition entries of vvNew within the same bucket because we don't know the source
            // address which led to the decision to store the entry in vvNew[bucket] so we can't
            // re-evaluate that decision, but even if we could, CAddrInfo::GetNewBucket() does not
            // use CAddrInfo::GetKey() so it would end up in the same bucket as before the port
            // change.
            const auto i_target = addr_info_newport.GetBucketPosition(nKey, true, bucket);

            if (i_target == i) { // No need to re-position.
                addr_info = addr_info_newport;
                continue;
            }

            // Reposition from i to i_target, removing the entry from i_target (if any).
            ClearNew(bucket, i_target);
            vvNew[bucket][i_target] = id;
            vvNew[bucket][i] = -1;
            addr_info = addr_info_newport;
        }
    }

If the same I2P address appears in multiple new buckets, then the first time this loop finds that address, it'll update the port and reposition it in that bucket. Any subsequent times that the address is encountered, it will already have had its port updated so the inner loop will continue due to addr_info.GetPort() == I2P_SAM31_PORT, but it'll be in the wrong position in its bucket. If we later try to promote the entry from new to tried, then this loop in MakeTried() will fail to find the entry in those buckets:

    // remove the entry from all new buckets
    for (int bucket = 0; bucket < ADDRMAN_NEW_BUCKET_COUNT; bucket++) {
        int pos = info.GetBucketPosition(nKey, true, bucket);
        if (vvNew[bucket][pos] == nId) {
            vvNew[bucket][pos] = -1;
            info.nRefCount--;
        }
    }
    nNew--;

I'm not sure whether this is the exact cause of this assert, but it seems like it could at least lead to inconsistencies. That's always a risk when reaching into the internal data structures of a complex class like addrman.

@maflcko
Copy link
Member Author

maflcko commented Jul 15, 2021

More crashes:

$ FUZZ=addrman_deserialize ./src/test/fuzz/fuzz ./crash-ef7d4a4f940b8fdb4999f6a92969260cf8e1405f.log 
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 2455641637
INFO: Loaded 1 modules   (228950 inline 8-bit counters): 228950 [0x55bb030060f8, 0x55bb0303df4e), 
INFO: Loaded 1 PC tables (228950 PCs): 228950 [0x55bb0303df50,0x55bb033bc4b0), 
./src/test/fuzz/fuzz: Running 1 inputs 1 time(s) each.
Running: ./crash-ef7d4a4f940b8fdb4999f6a92969260cf8e1405f.log
fuzz: addrman.cpp:149: void CAddrMan::SwapRandom(unsigned int, unsigned int): Assertion `nRndPos1 < vRandom.size() && nRndPos2 < vRandom.size()' failed.
==412116== ERROR: libFuzzer: deadly signal
...
    #9 0x55bb026c65f9 in CAddrMan::SwapRandom(unsigned int, unsigned int) addrman.cpp:149:5
    #10 0x55bb026c7758 in CAddrMan::Delete(int) addrman.cpp:173:5
    #11 0x55bb026d1c0b in CAddrMan::ClearNew(int, int) addrman.cpp:192:13
    #12 0x55bb026d1c0b in CAddrMan::ResetI2PPorts() addrman.cpp:819:17
    #13 0x55bb025d122a in void CAddrMan::Unserialize<CDataStream>(CDataStream&) ./addrman.h:455:9
    #14 0x55bb025b068d in void Unserialize<CDataStream, CAddrMan&>(CDataStream&, CAddrMan&) ./serialize.h:676:7
    #15 0x55bb025b068d in CDataStream& CDataStream::operator>><CAddrMan&>(CAddrMan&) ./streams.h:428:9
    #16 0x55bb025b068d in void (anonymous namespace)::DeserializeFromFuzzingInput<CAddrMan>(Span<unsigned char const>, CAddrMan&, std::optional<int>, int) test/fuzz/deserialize.cpp:87:12
    #17 0x55bb025b068d in addrman_deserialize_fuzz_target(Span<unsigned char const>) test/fuzz/deserialize.cpp:201:1
...

crash-ef7d4a4f940b8fdb4999f6a92969260cf8e1405f.log

@vasild
Copy link
Contributor

vasild commented Jul 15, 2021

This is what the fuzzed addrman data produces at the end of Unserialize(), before RemoveInvalid() and before ResetI2PPorts():

nNew=-1, nTried=5
id=3, addr_info=[::]:0, in tried=1
id=2, addr_info=4ucsbzifedsqkihfauqokbja4ucsbzifedsqkihfauqokbja4ucq.b32.i2p:8192, in tried=1
id=1, addr_info=[::]:229, in tried=1
id=0, addr_info=4ucsbzifedsqkihfauqokbja4ucsbzifedsqkihfauqokbja4ucq.b32.i2p:256, in tried=1
id=-1, addr_info=[::]:8421, in tried=1
vRandom=-1,0,1,2,3,

After RemoveInvalid():

nNew=-1, nTried=3
id=2, addr_info=4ucsbzifedsqkihfauqokbja4ucsbzifedsqkihfauqokbja4ucq.b32.i2p:8192, in tried=1
id=0, addr_info=4ucsbzifedsqkihfauqokbja4ucsbzifedsqkihfauqokbja4ucq.b32.i2p:256, in tried=1
id=-1, addr_info=[::]:8421, in tried=1
vRandom=-1,0,2,

After ResetI2PPorts():

nNew=0, nTried=2
id=2, addr_info=4ucsbzifedsqkihfauqokbja4ucsbzifedsqkihfauqokbja4ucq.b32.i2p:0, in tried=0
id=0, addr_info=4ucsbzifedsqkihfauqokbja4ucsbzifedsqkihfauqokbja4ucq.b32.i2p:0, in tried=1
id=-1, addr_info=[::]:8421, in tried=1
vRandom=-1,0,2,

Later Serialize() chokes on this because there is at least one entry in mapInfo[] with nRefCount > 0 and nNew is 0. The problem originated from the unserialized negative nNew. ResetI2PPorts() correctly moved one entry from "tried" to "new" and incremented nNew (from -1 to 0).

TODO:

  • Why RemoveInvalid() did not remove id=-1, addr_info=[::]:8421, in tried=1?
  • Resolve the problem @jnewbery highlighted above.
  • See what is the other crash reported by @MarcoFalke above.

@jnewbery

remove any I2P addresses which don't have port 0, and then perhaps add them back through the public (and well-tested) Add() interface

I considered this approach and ditched it for some reason, but I do not remember why :/ It could have been that deleting an entry has to be done again by fiddling directly with the internal structures and was about as complicated as the current code. Will look at this again.

@maflcko
Copy link
Member Author

maflcko commented Jul 15, 2021

Why RemoveInvalid() did not remove id=-1, addr_info=[::]:8421, in tried=1?

Maybe related: #22362

@vasild
Copy link
Contributor

vasild commented Jul 16, 2021

Just for reference, attaching the fuzzer input to reproduce the problem that is reported in the OP/description of this issue.

clusterfuzz-testcase-minimized-addrman_deserialize-5090634410098688.log

vasild added a commit to vasild/bitcoin that referenced this issue Jul 16, 2021
In `CAddrMan::ResetI2PPorts()`, if an I2P address is found in two or
more "new" buckets, our first encounter of it would change the port to
0 and re-position it within that "new" bucket. The `CAddrInfo` object is
shared between all occurrences of an address in all "new" buckets. So
subsequent encounters of that address will see the `CAddrInfo` already
with port 0 and will skip re-positioning.

To fix that, check and re-position if necessary even for I2P entries
with port 0.

Fixes bitcoin#22450 (comment)
@vasild
Copy link
Contributor

vasild commented Jul 16, 2021

Fix pending at #22455

@jonatack
Copy link
Contributor

Reproduced on my old laptop with the addrman_deserialize fuzzer

#530590	REDUCE cov: 2083 ft: 11547 corp: 872/18Mb lim: 1048575 exec/s: 80 rss: 683Mb L: 282/957238 MS: 2 EraseBytes-InsertRepeatedBytes-
fuzz: ./addrman.h:261: void CAddrMan::Serialize(Stream &) const [Stream = CDataStream]: Assertion `nIds != nNew' failed.
==4549== ERROR: libFuzzer: deadly signal
    #0 0x55b32a31ddb1 in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xea8db1)
    #1 0x55b32a263ce8  (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdeece8)
    #2 0x55b32a247ad3  (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdd2ad3)
    #3 0x7f24ff51d13f  (/lib/x86_64-linux-gnu/libpthread.so.0+0x1413f)
    #4 0x7f24ff189ce0 in __libc_signal_restore_set signal/../sysdeps/unix/sysv/linux/internal-signals.h:86:3
    #5 0x7f24ff189ce0 in raise signal/../sysdeps/unix/sysv/linux/raise.c:48:3
    #6 0x7f24ff173536 in abort stdlib/abort.c:79:7
    #7 0x7f24ff17340e in __assert_fail_base assert/assert.c:92:3
    #8 0x7f24ff182661 in __assert_fail assert/assert.c:101:3
    #9 0x55b32a377569 in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xf02569)
    #10 0x55b32a3562df in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xee12df)
    #11 0x55b32a427581 in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xfb2581)
    #12 0x55b32a427103 in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xfb2103)
    #13 0x55b32a34bc45 in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xed6c45)
    #14 0x55b32a34bb0d in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xed6b0d)
    #15 0x55b32a34b8a8 in __GNU_EH_FRAME_HDR (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xed68a8)
    #16 0x55b32b1b6cad in _end (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0x1d41cad)
    #17 0x55b32b1b6958 in _end (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0x1d41958)
    #18 0x55b32a249363  (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdd4363)
    #19 0x55b32a24886a  (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdd386a)
    #20 0x55b32a24a567  (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdd5567)
    #21 0x55b32a24b1a5  (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdd61a5)
    #22 0x55b32a23a487 in typeinfo name for boost::system::detail::system_error_category (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdc5487)
    #23 0x55b32a2644a2  (/home/jon/projects/bitcoin/bitcoin/src/test/fuzz/fuzz+0xdef4a2)
    #24 0x7f24ff174d09 in __libc_start_main csu/../csu/libc-start.c:308:16
    #25 0x55b32a2110a9 in secp256k1_fe_negate /home/jon/projects/bitcoin/bitcoin/src/secp256k1/./src/field_5x52_impl.h:383:48
    #26 0x55b32a2110a9 in secp256k1_ecmult_strauss_wnaf /home/jon/projects/bitcoin/bitcoin/src/secp256k1/./src/ecmult_impl.h:547:13
    #27 0x55b32a2110a9 in secp256k1_ecmult /home/jon/projects/bitcoin/bitcoin/src/secp256k1/./src/ecmult_impl.h:574:5

NOTE: libFuzzer has rudimentary signal handlers.
      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
SUMMARY: libFuzzer: deadly signal
MS: 2 PersAutoDict-CrossOver- DE: "+\x00"-; base unit: fbd17f1aa0fcc4e97f0bcafb8e19837f75cde233
artifact_prefix='./'; Test unit written to ./crash-8dbb7526c0e5673271d4ef81bb819ae81f2bb000

@maflcko
Copy link
Member Author

maflcko commented Jul 19, 2021

Wrote a fuzz test in #22493 to reproduce on "old" master (without commit e0a2b39)

maflcko pushed a commit that referenced this issue Jul 19, 2021
…g unserialization

816f29e addrman: detect on-disk corrupted nNew and nTried during unserialization (Vasil Dimov)

Pull request description:

  Negative `nNew` or `nTried` are not possible during normal operation.
  So, if we read such values during unserialize, report addrman
  corruption.

  Fixes #22450

ACKs for top commit:
  MarcoFalke:
    cr ACK 816f29e
  jonatack:
    ACK 816f29e
  lsilva01:
    Code Review ACK 816f29e.  This change provides a more accurate description of the error.

Tree-SHA512: 01bdd72d2d86a0ef770a319fee995fd1e147b24a8db84ddb8cd121688e7f94fed73fddc0084758e7183c4f8d08e971f0b1b224f5adb10928a5aa4dbbc8709d74
hebasto pushed a commit to hebasto/gui-qml that referenced this issue Jul 19, 2021
Negative `nNew` or `nTried` are not possible during normal operation.
So, if we read such values during unserialize, report addrman
corruption.

Fixes bitcoin/bitcoin#22450
jarolrod pushed a commit to jarolrod/gui-qml that referenced this issue Jul 19, 2021
Negative `nNew` or `nTried` are not possible during normal operation.
So, if we read such values during unserialize, report addrman
corruption.

Fixes bitcoin/bitcoin#22450
josibake pushed a commit to josibake/bitcoin that referenced this issue Jul 21, 2021
Negative `nNew` or `nTried` are not possible during normal operation.
So, if we read such values during unserialize, report addrman
corruption.

Fixes bitcoin#22450
sidhujag pushed a commit to syscoin/syscoin that referenced this issue Jul 23, 2021
…d during unserialization

816f29e addrman: detect on-disk corrupted nNew and nTried during unserialization (Vasil Dimov)

Pull request description:

  Negative `nNew` or `nTried` are not possible during normal operation.
  So, if we read such values during unserialize, report addrman
  corruption.

  Fixes bitcoin#22450

ACKs for top commit:
  MarcoFalke:
    cr ACK 816f29e
  jonatack:
    ACK 816f29e
  lsilva01:
    Code Review ACK bitcoin@816f29e.  This change provides a more accurate description of the error.

Tree-SHA512: 01bdd72d2d86a0ef770a319fee995fd1e147b24a8db84ddb8cd121688e7f94fed73fddc0084758e7183c4f8d08e971f0b1b224f5adb10928a5aa4dbbc8709d74
janus pushed a commit to BitgesellOfficial/bitgesell that referenced this issue Nov 5, 2021
Negative `nNew` or `nTried` are not possible during normal operation.
So, if we read such values during unserialize, report addrman
corruption.

Fixes bitcoin/bitcoin#22450
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this issue Mar 5, 2022
…d during unserialization

816f29e addrman: detect on-disk corrupted nNew and nTried during unserialization (Vasil Dimov)

Pull request description:

  Negative `nNew` or `nTried` are not possible during normal operation.
  So, if we read such values during unserialize, report addrman
  corruption.

  Fixes bitcoin#22450

ACKs for top commit:
  MarcoFalke:
    cr ACK 816f29e
  jonatack:
    ACK 816f29e
  lsilva01:
    Code Review ACK bitcoin@816f29e.  This change provides a more accurate description of the error.

Tree-SHA512: 01bdd72d2d86a0ef770a319fee995fd1e147b24a8db84ddb8cd121688e7f94fed73fddc0084758e7183c4f8d08e971f0b1b224f5adb10928a5aa4dbbc8709d74
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this issue Mar 5, 2022
…d during unserialization

816f29e addrman: detect on-disk corrupted nNew and nTried during unserialization (Vasil Dimov)

Pull request description:

  Negative `nNew` or `nTried` are not possible during normal operation.
  So, if we read such values during unserialize, report addrman
  corruption.

  Fixes bitcoin#22450

ACKs for top commit:
  MarcoFalke:
    cr ACK 816f29e
  jonatack:
    ACK 816f29e
  lsilva01:
    Code Review ACK bitcoin@816f29e.  This change provides a more accurate description of the error.

Tree-SHA512: 01bdd72d2d86a0ef770a319fee995fd1e147b24a8db84ddb8cd121688e7f94fed73fddc0084758e7183c4f8d08e971f0b1b224f5adb10928a5aa4dbbc8709d74
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this issue Mar 5, 2022
…d during unserialization

816f29e addrman: detect on-disk corrupted nNew and nTried during unserialization (Vasil Dimov)

Pull request description:

  Negative `nNew` or `nTried` are not possible during normal operation.
  So, if we read such values during unserialize, report addrman
  corruption.

  Fixes bitcoin#22450

ACKs for top commit:
  MarcoFalke:
    cr ACK 816f29e
  jonatack:
    ACK 816f29e
  lsilva01:
    Code Review ACK bitcoin@816f29e.  This change provides a more accurate description of the error.

Tree-SHA512: 01bdd72d2d86a0ef770a319fee995fd1e147b24a8db84ddb8cd121688e7f94fed73fddc0084758e7183c4f8d08e971f0b1b224f5adb10928a5aa4dbbc8709d74
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this issue Mar 7, 2022
…d during unserialization

816f29e addrman: detect on-disk corrupted nNew and nTried during unserialization (Vasil Dimov)

Pull request description:

  Negative `nNew` or `nTried` are not possible during normal operation.
  So, if we read such values during unserialize, report addrman
  corruption.

  Fixes bitcoin#22450

ACKs for top commit:
  MarcoFalke:
    cr ACK 816f29e
  jonatack:
    ACK 816f29e
  lsilva01:
    Code Review ACK bitcoin@816f29e.  This change provides a more accurate description of the error.

Tree-SHA512: 01bdd72d2d86a0ef770a319fee995fd1e147b24a8db84ddb8cd121688e7f94fed73fddc0084758e7183c4f8d08e971f0b1b224f5adb10928a5aa4dbbc8709d74
gades pushed a commit to cosanta/cosanta-core that referenced this issue May 8, 2022
…d during unserialization

816f29e addrman: detect on-disk corrupted nNew and nTried during unserialization (Vasil Dimov)

Pull request description:

  Negative `nNew` or `nTried` are not possible during normal operation.
  So, if we read such values during unserialize, report addrman
  corruption.

  Fixes bitcoin#22450

ACKs for top commit:
  MarcoFalke:
    cr ACK 816f29e
  jonatack:
    ACK 816f29e
  lsilva01:
    Code Review ACK bitcoin@816f29e.  This change provides a more accurate description of the error.

Tree-SHA512: 01bdd72d2d86a0ef770a319fee995fd1e147b24a8db84ddb8cd121688e7f94fed73fddc0084758e7183c4f8d08e971f0b1b224f5adb10928a5aa4dbbc8709d74
div72 pushed a commit to div72/Gridcoin-Research that referenced this issue May 18, 2022
Negative `nNew` or `nTried` are not possible during normal operation.
So, if we read such values during unserialize, report addrman
corruption.

Fixes bitcoin/bitcoin#22450
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants