Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix creation of "std::string"s with \0s #20000

Merged
merged 1 commit into from
Nov 20, 2020

Conversation

vasild
Copy link
Contributor

@vasild vasild commented Sep 23, 2020

A string literal "abc" contains a terminating \0, so that is 4
bytes. There is no need to write "abc\0" unless two terminating
\0s are necessary.

std::string objects do not internally contain a terminating \0, so
std::string("abc") creates a string with size 3 and is the same as
std::string("abc", 3).

In "\01" the 01 part is interpreted as one number (1) and that is
the same as "\1" which is a string like {1, 0} whereas "\0z" is a
string like {0, 'z', 0}. To create a string like {0, '1', 0} one
must use "\0" "1".

Adjust the tests accordingly.

@eriknylund
Copy link
Contributor

eriknylund commented Sep 23, 2020

Concept ACK. I browsed over these decode tests when extending the Base32 encoding tests with support for no padding for #19845 and I found them hard to comprehend. I think this is an improvement.

@vasild
Copy link
Contributor Author

vasild commented Sep 23, 2020

This is actually mode widespread. I checked all tests and fixed some others too.

@eriknylund, this is how stumbled on that too! :)

Copy link
Contributor

@practicalswift practicalswift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK b7f1cc6 - patch looks correct

Thanks for cleaning up these tests! :)

BOOST_CHECK_EQUAL(failure, true);
(void)DecodeBase32(std::string("AWSX3VPP", 8), &failure);
BOOST_CHECK_EQUAL(failure, false);
(void)DecodeBase32(std::string("invalid", 8), &failure);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails before and after, indicating that the failure has nothing to do with NUL, no?

Copy link
Contributor

@eriknylund eriknylund Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I also don't fully understand what any of these tests are trying to achieve? Wouldn't something like

diff --git a/src/test/base32_tests.cpp b/src/test/base32_tests.cpp
index 26ed78e2c2..4a5af0fd4d 100644
--- a/src/test/base32_tests.cpp
+++ b/src/test/base32_tests.cpp
@@ -23,13 +23,13 @@ BOOST_AUTO_TEST_CASE(base32_testvectors)
 
     // Decoding strings with embedded NUL characters should fail
     bool failure;
-    (void)DecodeBase32(std::string("invalid", 8), &failure);
-    BOOST_CHECK(failure);
-    (void)DecodeBase32(std::string("AWSX3VPP"), &failure);
+    (void)DecodeBase32(std::string("my======", 8), &failure);
     BOOST_CHECK(!failure);
-    (void)DecodeBase32(std::string("AWSX3VPP\0invalid", 16), &failure);
+    (void)DecodeBase32(std::string("my======\0", 9), &failure);
+    BOOST_CHECK(failure);
+    (void)DecodeBase32(std::string("\0my======", 9), &failure);
     BOOST_CHECK(failure);
-    (void)DecodeBase32(std::string("AWSX3VPPinvalid", 16), &failure);
+    (void)DecodeBase32(std::string("my=\0=====", 9), &failure);
     BOOST_CHECK(failure);
 }

do more according to what the comment on L24 implies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no?

no :-)

Before this PR DecodeBase32(std::string("invalid", 7), &failure) fails because the supplied input has invalid length (7) and is not padded to a proper length (8) with =.

With this PR the same call fails due to the ValidAsCString() check early in DecodeBase32() (this check passes before this PR). The trailing \0 upsets the ValidAsCString() check which I think was the intent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, with this PR it is not "the same call", but DecodeBase32(std::string("invalid", 8), &failure).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the failure reason is not returned, it might be good to use a valid Base32 string, which properly decodes and then make it an invalid cstr, which fails. Otherwise it is harder to guess which failure reason was the intent from just reading the test code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be good to use a valid Base32 string, which properly decodes and then make it an invalid cstr, which fails

The tests below already do that with AWSX3VPP (length 8, valid) and AWSX3VPP\0invalid (length 16, invalid due to embedded \0).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my understanding AWSX3VPP is not a valid base32 encoded string, but perhaps I'm missing something obvious?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? It has proper length (8) and no embedded \0s. DecodeBase32() decodes is successfully.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I see your point and why I misunderstood the test in this case. To me the value AWXS3VPP looks like a magic string chosen on purpose and not on random, and I expected it to actually decode into something more human readable, like the other test values in the file on L14-15.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments to make it clearer.

BOOST_CHECK_EQUAL(failure, true);
(void)DecodeBase32(std::string("AWSX3VPP", 8), &failure);
BOOST_CHECK_EQUAL(failure, false);
(void)DecodeBase32(std::string("invalid", 8), &failure);
Copy link
Contributor

@eriknylund eriknylund Sep 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I also don't fully understand what any of these tests are trying to achieve? Wouldn't something like

diff --git a/src/test/base32_tests.cpp b/src/test/base32_tests.cpp
index 26ed78e2c2..4a5af0fd4d 100644
--- a/src/test/base32_tests.cpp
+++ b/src/test/base32_tests.cpp
@@ -23,13 +23,13 @@ BOOST_AUTO_TEST_CASE(base32_testvectors)
 
     // Decoding strings with embedded NUL characters should fail
     bool failure;
-    (void)DecodeBase32(std::string("invalid", 8), &failure);
-    BOOST_CHECK(failure);
-    (void)DecodeBase32(std::string("AWSX3VPP"), &failure);
+    (void)DecodeBase32(std::string("my======", 8), &failure);
     BOOST_CHECK(!failure);
-    (void)DecodeBase32(std::string("AWSX3VPP\0invalid", 16), &failure);
+    (void)DecodeBase32(std::string("my======\0", 9), &failure);
+    BOOST_CHECK(failure);
+    (void)DecodeBase32(std::string("\0my======", 9), &failure);
     BOOST_CHECK(failure);
-    (void)DecodeBase32(std::string("AWSX3VPPinvalid", 16), &failure);
+    (void)DecodeBase32(std::string("my=\0=====", 9), &failure);
     BOOST_CHECK(failure);
 }

do more according to what the comment on L24 implies?

src/test/base32_tests.cpp Outdated Show resolved Hide resolved
@eriknylund
Copy link
Contributor

ACK b7f1cc6 I've reviewed the changes and I think they are an improvement. I would consider renaming the issue to better reflect that changes that are not only base32 related.

@vasild vasild changed the title test: fix DecodeBase32 tests test: fix string handling with embedded \0s Oct 7, 2020
@vasild vasild changed the title test: fix string handling with embedded \0s test: fix creation of "std::string"s with \0s Oct 7, 2020
Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK b7f1cc6 🗄️

A string literal `"abc"` contains a terminating `\0`, so that is 4
bytes. There is no need to write `"abc\0"` unless two terminating
`\0`s are necessary.

`std::string` objects do not internally contain a terminating `\0`, so
`std::string("abc")` creates a string with size 3 and is the same as
`std::string("abc", 3)`.

In `"\01"` the `01` part is interpreted as one number (1) and that is
the same as `"\1"` which is a string like `{1, 0}` whereas `"\0z"` is a
string like `{0, 'z', 0}`. To create a string like `{0, '1', 0}` one
must use `"\0" "1"`.

Adjust the tests accordingly.
@vasild
Copy link
Contributor Author

vasild commented Oct 31, 2020

Changed to use s-suffix strings. Tested with --enable-c++17.

@laanwj
Copy link
Member

laanwj commented Nov 19, 2020

ACK ecc6cf1

This very much clarifies the code and works locally for me. I don't know what the travis and CI failures are about.

@practicalswift
Copy link
Contributor

ACK ecc6cf1 modulo happily green CI

Needs rebase to pull in C++17 switch?

@vasild vasild closed this Nov 19, 2020
@vasild vasild reopened this Nov 19, 2020
@laanwj laanwj merged commit 2878167 into bitcoin:master Nov 20, 2020
@vasild vasild deleted the fix_base32_tests branch November 20, 2020 08:54
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 20, 2020
ecc6cf1 test: fix creation of std::string objects with \0s (Vasil Dimov)

Pull request description:

  A string literal `"abc"` contains a terminating `\0`, so that is 4
  bytes. There is no need to write `"abc\0"` unless two terminating
  `\0`s are necessary.

  `std::string` objects do not internally contain a terminating `\0`, so
  `std::string("abc")` creates a string with size 3 and is the same as
  `std::string("abc", 3)`.

  In `"\01"` the `01` part is interpreted as one number (1) and that is
  the same as `"\1"` which is a string like `{1, 0}` whereas `"\0z"` is a
  string like `{0, 'z', 0}`. To create a string like `{0, '1', 0}` one
  must use `"\0" "1"`.

  Adjust the tests accordingly.

ACKs for top commit:
  laanwj:
    ACK ecc6cf1
  practicalswift:
    ACK ecc6cf1 modulo happily green CI

Tree-SHA512: 5eb489e8533a4199a9324b92f7280041552379731ebf7dfee169f70d5458e20e29b36f8bfaee6f201f48ab2b9d1d0fc4bdf8d6e4c58d6102f399cfbea54a219e
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request Aug 11, 2021
ecde04a [Consensus] Bump Active Protocol version to 70923 for v5.3 (random-zebra)
b63e4f5 Consensus: Add v5.3 enforcement height for testnet. (furszy)
f44be94 Only relay IPv4, IPv6, Tor addresses (Pieter Wuille)
015298c fix: tor: Call event_base_loopbreak from the event's callback (furszy)
34ff7a8 Consensus: Add mnb ADDRv2 guard. (furszy)
b4515dc GUI: Present v3 onion addresses properly in MNs list. (furszy)
337d43d tests: don't export in6addr_loopback (Vasil Dimov)
2cde8e0 GUI: Do not show the tor v3 onion address in the topbar. (furszy)
0b5f406 Doc: update tor.md with latest upstream information. (furszy)
89df7f2 addrman: ensure old versions don't parse peers.dat (Vasil Dimov)
bb90c5c test: add getnetworkinfo network name regression tests (Jon Atack)
d8e01b5 rpc: update GetNetworksInfo() to not return unsupported networks (Jon Atack)
57fc7b0 net: update GetNetworkName() with all enum Network cases (Jon Atack)
647d60b tests: Modify rpc_bind to conform to bitcoin#14532 behaviour. (Carl Dong)
d4d6729 Allow running rpc_bind.py --nonloopback test without IPv6 (Kristaps Kaupe)
4a034d8 test: Add rpc_bind test to default-run tests (Wladimir J. van der Laan)
61a08af [tests] bind functional test nodes to 127.0.0.1  Prevents OSX firewall (Sjors Provoost)
6a4f1e0 test: Add basic addr relay test (furszy)
78aa61c net: Make addr relay mockable (furszy)
ba954ca Send and require SENDADDRV2 before VERACK (Pieter Wuille)
61c2ed4 Bump net protocol version + don't send 'sendaddrv2' to pre-70923 software (furszy)
ccd508a tor: make a TORv3 hidden service instead of TORv2 (Vasil Dimov)
6da9a14 net: advertise support for ADDRv2 via new message (furszy)
e58d5d0 Migrate to test_large_inv() to Misbehaving logging. (furszy)
d496b64 [QA] fix mininode CAddress ser/deser (Jonas Schnelli)
cec9567 net: CAddress & CAddrMan: (un)serialize as ADDRv2 Change the serialization of `CAddrMan` to serialize its addresses in ADDRv2/BIP155 format by default. Introduce a new `CAddrMan` format version (3). (furszy)
b8c1dda streams update: get rid of nType and nVersion. (furszy)
3eaa273 Support bypassing range check in ReadCompactSize (Pieter Wuille)
a237ba4 net: recognize TORv3/I2P/CJDNS networks (Vasil Dimov)
8e50853 util: make EncodeBase32 consume Spans (Sebastian Falbesoner)
1f67e30 net: CNetAddr: add support to (un)serialize as ADDRv2 (Vasil Dimov)
2455420 test: move HasReason so it can be reused (furszy)
d41adb4 util: move HasPrefix() so it can be reused (Vasil Dimov)
f6f86af Unroll Keccak-f implementation (Pieter Wuille)
45222e6 Implement keccak-f[1600] and SHA3-256 (Pieter Wuille)
08ad06d net: change CNetAddr::ip to have flexible size (furszy)
3337219 net: improve encapsulation of CNetAddr. (furszy)
910d5c4 test: Do not instantiate CAddrDB for static call (Hennadii Stepanov)
6b607ef Drop IsLimited in favor of IsReachable (Ben Woosley)
a40711b IsReachable is the inverse of IsLimited (DRY). Includes unit tests (marcaiaf)
8839828 net: don't accept non-left-contiguous netmasks (Vasil Dimov)
5d7f864 rpcbind: Warn about exposing RPC to untrusted networks (Luke Dashjr)
2a6abd8 CNetAddr: Add IsBindAny method to check for INADDR_ANY (Luke Dashjr)
4fdfa45 net: Always default rpcbind to localhost, never "all interfaces" (Luke Dashjr)
31064a8 net: Minor accumulated cleanups (furszy)
9f9c871 tests: Avoid using C-style NUL-terminated strings as arguments (practicalswift)
f6c52a3 tests: Add tests to make sure lookup methods fail on std::string parameters with embedded NUL characters (practicalswift)
a751b9b net: Avoid using C-style NUL-terminated strings as arguments in the netbase interface (furszy)
f30869d test: add IsRFC2544 tests (Mark Tyneway)
ed5abe1 Net: Proper CService deserialization + GetIn6Addr return false if addr isn't an IPv6 addr (furszy)
86d73fb net: save the network type explicitly in CNetAddr (Vasil Dimov)
ad57dfc net: document `enum Network` (Vasil Dimov)
cb160de netaddress: Update CNetAddr for ORCHIDv2 (Carl Dong)
c3c04e4 net: Better misbehaving logging (furszy)
3660487 net: Use C++11 member initialization in protocol (Marco)
082baa3 refactor: Drop unused CBufferedFile::Seek() (Hennadii Stepanov)
e2d776a util: CBufferedFile fixes (Larry Ruane)
6921f42 streams: backport OverrideStream class (furszy)

Pull request description:

  Conjunction of a large number of back ports, updates and refactorings that made with the final goal of implementing v3 Onion addresses support (BIP155 https://github.com/bitcoin/bips/blob/master/bip-0155.mediawiki) before the tor v2 addresses EOL, scheduled, by the Tor project, for (1) July 15th: v2 addr support removal from the code base, and (2) October 15th: v2 addr network disable, where **every peer in our network running under Tor will loose the connection and drop the network**.

  As BIP155 describes, this is introducing a new P2P message to gossip longer node addresses over the P2P network. This is required to support new-generation Onion addresses, I2P, and potentially other networks that have longer endpoint addresses than fit in the 128 bits of the current addr message.

  In order to achieve the end goal, had to:
  1.  Create Span class and push it up to latest Bitcoin implementation.
  2.  Update the whole serialization framework and every object using it up to latest Bitcoin implementation (3-4 years ahead of what we currently are in master).
  3.  Update the address manager implementing ASN-based bucketing of the network nodes.
  4.  Update and refactor the netAddress and address manager tests to latest Bitcoin implementation (4 years ahead of what we currently are in master).
  5.  Several util string, vector, encodings, parsing, hashing backports and more..

  Important note:
  This PR it is not meant to be merged as a standalone PR, will decouple smaller ones moving on. Adding on each sub-PR its own description isolated from this big monster.

  Second note:
  This is still a **work-in-progress**, not ready for testing yet. I'm probably missing to mention few PRs that have already adapted to our sources. Just making it public so can decouple the changes, we can start merging them and i can continue working a bit more confortable (rebase a +170 commits separate branch is not fun..).

  ### List of back ported and adapted PRs:

  Span and Serialization:
  ----------------
  *  bitcoin#12886.
  *  bitcoin#12916.
  *  bitcoin#13558.
  *  bitcoin#13697. (Only Span's commit 29943a9)
  *  bitcoin#17850.
  *  bitcoin#17896.
  *  bitcoin#12752.
  *  bitcoin#16577.
  *  bitcoin#16670. (without faebf62)
  *  bitcoin#17957.
  *  bitcoin#18021.
  *  bitcoin#18087.
  *  bitcoin#18112 (only from 353f376 that we don't support).
  *  bitcoin#18167.
  *  bitcoin#18317.
  *  bitcoin#18591 (only Span's commit 0fbde48)
  *  bitcoin#18468.
  *  bitcoin#19020.
  *  bitcoin#19032.
  *  bitcoin#19367.
  *  bitcoin#19387.

  Net, NetAddress and AddrMan:
  ----------------

  *  bitcoin#7932.
  *  bitcoin#10756.
  *  bitcoin#10765.
  *  bitcoin#12218.
  *  bitcoin#12855.
  *  bitcoin#13532.
  *  bitcoin#13575.
  *  bitcoin#13815.
  *  bitcoin#14532.
  *  bitcoin#15051.
  *  bitcoin#15138.
  *  bitcoin#15689.
  *  bitcoin#16702.
  *  bitcoin#17243.
  *  bitcoin#17345.
  *  bitcoin#17754.
  *  bitcoin#17758.
  *  bitcoin#17812.
  *  bitcoin#18023.
  *  bitcoin#18454.
  *  bitcoin#18512.
  *  bitcoin#19314.
  *  bitcoin#19687

  Keys and Addresses encoding:
  ----------------
  * bitcoin#11372.
  * bitcoin#17511.
  * bitcoin#17721.

  Util:
  ----------------
  * bitcoin#9140.
  * bitcoin#16577.
  * bitcoin#16889.
  * bitcoin#19593.

  Bench:
  ----------------
  * bitcoin#16299.

  BIP155:
  ----------------
  *  bitcoin#19351.
  *  bitcoin#19360.
  *  bitcoin#19534.
  *  bitcoin#19628.
  *  bitcoin#19841.
  *  bitcoin#19845.
  *  bitcoin#19954.
  *  bitcoin#19991 (pending).
  *  bitcoin#19845.
  *  bitcoin#20000 (pending).
  *  bitcoin#20120.
  *  bitcoin#20284.
  *  bitcoin#20564.
  *  bitcoin#21157 (pending).
  *  bitcoin#21564 (pending).
  *  Fully removed v2 onion addr support.
  *  Add hardcoded seeds.
  *  Add release-notes, changes to files.md and every needed documentation.

  I'm currently working on the PRs marked as "pending", this isn't over, but I'm pretty pretty close :). What a long road..

ACKs for top commit:
  random-zebra:
    utACK ecde04a
  Fuzzbawls:
    ACK ecde04a

Tree-SHA512: 82c95fbda76fce63f96d8a9af7fa9a89cb1e1b302b7891e27118a6103af0be23606bf202c7332fa61908205e6b6351764e2ec23d753f1e2484028f57c2e8b51a
deadalnix pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 11, 2022
Summary:
A string literal `"abc"` contains a terminating `\0`, so that is 4
bytes. There is no need to write `"abc\0"` unless two terminating
`\0`s are necessary.

`std::string` objects do not internally contain a terminating `\0`, so
`std::string("abc")` creates a string with size 3 and is the same as
`std::string("abc", 3)`.

In `"\01"` the `01` part is interpreted as one number (1) and that is
the same as `"\1"` which is a string like `{1, 0}` whereas `"\0z"` is a
string like `{0, 'z', 0}`. To create a string like `{0, '1', 0}` one
must use `"\0" "1"`.

Adjust the tests accordingly.

This is a backport of [[bitcoin/bitcoin#20000 | core#20000]]

Test Plan: `ninja check`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10808
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants