Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Keccak and SHA3_256 #19841

Merged
merged 3 commits into from Sep 10, 2020
Merged

Implement Keccak and SHA3_256 #19841

merged 3 commits into from Sep 10, 2020

Conversation

sipa
Copy link
Member

@sipa sipa commented Aug 30, 2020

Add a simple (and initially unoptimized) Keccak/SHA3 implementation based on https://github.com/mjosaarinen/tiny_sha3/blob/master/sha3.c, as one will be needed for TORv3 support (the conversion from BIP155 encoding to .onion notation uses a SHA3-based checksum). In follow-up commits, a benchmark is added, and the Keccakf function is unrolled for a (for me) 4.9x speedup.

Test vectors are taken from https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/secure-hashing#sha3vsha3vss.

@sipa
Copy link
Member Author

sipa commented Aug 30, 2020

Ping @vasild.

@jonatack
Copy link
Contributor

Nice!

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 31, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@fanquake
Copy link
Member

Concept ACK on this approach to adding a Keccak/SHA3 implementation.

@naumenkogs
Copy link
Member

Concept ACK as a requirement for Torv3.

@practicalswift
Copy link
Contributor

Concept ACK

Nice simple implementation: will fuzz it to see if I'm able to reject that it is as robust as it looks :)

@laanwj
Copy link
Member

laanwj commented Aug 31, 2020

This is a nice compact implementation of KeccaK/SHA3 (even after the unrolling).
Code review ACK 2263ab1

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 2263ab1

I tested this with #19845, changing the code to use this SHA3_256 implementation and it works (produces the same checksum in the TORv3 address):

// CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]

hasher.Write(Span<const uint8_t>{reinterpret_cast<const uint8_t*>(".onion checksum"), 15});
hasher.Write(m_addr);
hasher.Write(Span<const uint8_t>{reinterpret_cast<const uint8_t*>("\x03"), 1});

I also tested this implementation against the same vectors as TestSHA256, just like in #19845 and the results are the same (as the SHA3_256 from that PR, not as the SHA256!).

The last two tests:

    TestSHA3_256(std::string(1000000, 'a'),
                 "5c8875ae474a3634ba4fd55ec85bffd661f32aca75c6d699d0cdcb6c115891c1");
    TestSHA3_256(test1, "7d06279100b45d54a479e973aa01ea0cceafb656c45238557813c9c39ceb6403");

take 2.8s VS 0.4s for SHA256 (in non optimized build).

Verdict: likely it works correctly and there may be room for performance improvement, but this is not relevant for the TORv3 code.

Ideally this would get merged first and then #19845 adapted to use it (and ditch the implementation imported in that PR).

@sipa
Copy link
Member Author

sipa commented Aug 31, 2020

@vasild FWIW, you should be able to use hasher.Write(MakeUCharSpan(".onion checksum")).

Regarding speed: the SHA256 implementations we have take advantage of SSE4/AVX2/SHA-NI instructions when available. I didn't use any of those for SHA3 as it seems unnecessary for its current use case. In an optimized build, but with HW accelerations for SHA256 disabled, the SHA3 code here is very close to SHA256:

ns/byte byte/s err% total benchmark
1.69 590,913,173.58 0.1% 0.02 SHA1
4.12 242,749,791.96 0.0% 0.05 SHA256
4.11 243,277,749.23 0.9% 0.04 SHA3_256_1M
2.90 344,486,900.54 0.1% 0.03 SHA512

On the same system (which has SHA-NI), with hardware accelerations enabled it is of course:

ns/byte byte/s err% total benchmark
0.63 1,586,480,646.52 0.0% 0.01 SHA256

@vasild
Copy link
Contributor

vasild commented Aug 31, 2020

Right, all is good. And SHA3-256 is supposed to be slower than SHA256.

@laanwj
Copy link
Member

laanwj commented Sep 1, 2020

Verdict: likely it works correctly and there may be room for performance improvement, but this is not relevant for the TORv3 code.

As it is not relevant to our use case, I don't think this is worth it. I wouldn't like to see the added complication of assembly implementations and special intrinsics just for computing an address checksum (it would be different if it was for packet checksums). Brevity (and of course correctness) of the implementation is the important thing here.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK
It seems that currently the header <array> is unnecessarily included both in the implementation and the unit test source files.
// EDIT: Same for <stdint.h> and <stdlib.h> in sha3.h.

@laanwj
Copy link
Member

laanwj commented Sep 3, 2020

@theStack Agree about array, it doesn't seem to be used at all. Wouldnt' stdint.h be necessary for sized integer types such as uint32_t? (yes, it's probably included indirectly somehow, but we like to be explicit about dependencies)

@theStack
Copy link
Contributor

theStack commented Sep 3, 2020

@theStack Agree about array, it doesn't seem to be used at all. Wouldnt' stdint.h be necessary for sized integer types such as uint32_t? (yes, it's probably included indirectly somehow, but we like to be explicit about dependencies)

@laanwj: Oh indeed, you are right. For some reason I forgot that sized integer types and size_t are not integral part of the language but defined in headers... 🤦‍♂️ so both <stdint.h> are <stdlib.h> are needed indeed (though one could probably use the C++ counterparts <cstdint> and <cstdlib> instead, but that'd be more of a general discussion not related only to this PR).

@sipa
Copy link
Member Author

sipa commented Sep 3, 2020

I added <array> to get access to std::begin and std::end.

@laanwj
Copy link
Member

laanwj commented Sep 4, 2020

I forgot that sized integer types and size_t are not integral part of the language but defined in headers... man_facepalming

Haah. I try to forget that sometimes …

I added to get access to std::begin and std::end.

Oh, of course!

@theStack
Copy link
Contributor

theStack commented Sep 5, 2020

I added <array> to get access to std::begin and std::end.

Ah, I wasn't aware those are declared in <array>. Sorry for the noise.

For reviewing this PR, I thought it would be nice to validate this implementation against a standard one with random input data (in addition to only the test vectors). I decided for python's hashlib which has sha3_256 available since Version 3.6. Thanks to pybind11 it's super-easy to create Python bindings for C++ code. With less than 20 LOC binding code, SHA3_256 can be used in Python in a natural way:

import sha3_sipa


def sha3_256_sipa(input):
    return sha3_sipa.SHA3_256().Write(input).Finalize()

The whole mini-project with Makefile and the comparison test with random input data is on my branch https://github.com/theStack/bitcoin/tree/test-sha3_256-against-python3-implementation
If anyone wants to try out, the following instructions should work on any recent Ubuntu with Python3.6+:

$ sudo apt-get install python3-dev
$ sudo pip3 install pybind11
$ cd sha3_test
$ make
$ ./sha3_test.py

Running this already for a few hours, with millions of random inputs (random length up to 1 MB) and as expected, all of them passed. Will as next step code-review the PR tomorrow.

@elichai
Copy link
Contributor

elichai commented Sep 6, 2020

I like how compact this is :)

@sipa
Copy link
Member Author

sipa commented Sep 7, 2020

Improved the tests and comments a bit.

@theStack Nice, that's pretty neat!

@elichai It was much more compact before unrolling, but I couldn't resist a 4.9x speedup with such a simple change (even though it really doesn't matter here...).

@practicalswift You're of course welcome to do any testing you like, but I don't think fuzzing for robustness is all that useful here. The KeccakF function has completely fixed code path and memory accesses, so it either fails for everything or for nothing. The only input-dependent behavior is in SHA3_256::Write, but even there the only thing that matters is the length of the inputs, which is well-covered by the included tests. It'd be slightly more useful to test that hashing the same data, chopped up in differently sized chunks, gives the same result.

Copy link
Contributor

@vasild vasild left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK ab654c7

It'd be slightly more useful to test that hashing the same data, chopped up in differently sized chunks, gives the same result.

I tested that as I ran the new code through the same tests as the existent SHA256 which uses TestVector.

@elichai
Copy link
Contributor

elichai commented Sep 7, 2020

@elichai It was much more compact before unrolling, but I couldn't resist a 4.9x speedup with such a simple change (even though it really doesn't matter here...).

haha, you could instead add #pragma unroll 5/24/25 :P

@sipa
Copy link
Member Author

sipa commented Sep 7, 2020

haha, you could instead add #pragma unroll 5/24/25 :P

I tried that. Numbers I get:

ns/byte byte/s err% total benchmark
28.88 34,630,178.09 1.0% 0.33 3f01ddb (before unrolling)
5.88 170,120,654.67 0.8% 0.07 ab654c7 (after unrolling)
5.60 178,667,398.49 0.4% 0.06 3f01ddb, but with pragmas to unroll everything in one round

So that means it's (on GCC 9.3.0, on x86_64 Linux) possible to get slightly better performance with forced unrolling, but also a lot more fragile (pragmas may work different across compilers and without the pragmas the code is a lot worse).

I prefer keeping the code as-is. It's still very readable (I think), is platform neutral, and has close to ideal performance.

@elichai
Copy link
Contributor

elichai commented Sep 8, 2020

I prefer keeping the code as-is. It's still very readable (I think), is platform neutral, and has close to ideal performance.

Yeah it probably won't work on MSVC

@practicalswift
Copy link
Contributor

ACK ab654c7 -- patch looks correct and no sanitizer complaints when doing some basic fuzz testing of the added code (remember: don't trust: fuzz!) :)

@laanwj
Copy link
Member

laanwj commented Sep 10, 2020

Thanks @theStack. I have run your cross-verification for a while here as well, no errors came up.

I would prefer to keep the code as-is, with an explicit unroll. Even though a pragma would make the code shorter, it's non standard C++ functionality.

re-ACK ab654c7

@laanwj laanwj merged commit a47e596 into bitcoin:master Sep 10, 2020
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 11, 2020
ab654c7 Unroll Keccak-f implementation (Pieter Wuille)
3f01ddb Add SHA3 benchmark (Pieter Wuille)
2ac8bf9 Implement keccak-f[1600] and SHA3-256 (Pieter Wuille)

Pull request description:

  Add a simple (and initially unoptimized) Keccak/SHA3 implementation based on https://github.com/mjosaarinen/tiny_sha3/blob/master/sha3.c, as one will be needed for TORv3 support (the conversion from BIP155 encoding to .onion notation uses a SHA3-based checksum). In follow-up commits, a benchmark is added, and the Keccakf function is unrolled for a (for me) 4.9x speedup.

  Test vectors are taken from https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/secure-hashing#sha3vsha3vss.

ACKs for top commit:
  practicalswift:
    ACK ab654c7 -- patch looks correct and no sanitizer complaints when doing some basic fuzz testing of the added code (remember: **don't trust: fuzz!**) :)
  laanwj:
    re-ACK ab654c7
  vasild:
    ACK ab654c7

Tree-SHA512: 8a91b18c46e8fb178b7ff82046cff626180362337e515b92fbbd771876e795da2ed4e3995eb4849773040287f6e687237f469a90474ac53f521fc12e0f5031d9
deadalnix pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Feb 9, 2021
Summary:
```
Add a simple (and initially unoptimized) Keccak/SHA3 implementation
based on https://github.com/mjosaarinen/tiny_sha3/blob/master/sha3.c, as
one will be needed for TORv3 support (the conversion from BIP155
encoding to .onion notation uses a SHA3-based checksum). In follow-up
commits, a benchmark is added, and the Keccakf function is unrolled for
a (for me) 4.9x speedup.

Test vectors are taken from
https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/secure-hashing#sha3vsha3vss.
```

Backport of core [[bitcoin/bitcoin#19841 | PR19841]].

Test Plan:
  ninja check
  ninja bench-bitcoin

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D9185
kwvg added a commit to kwvg/dash that referenced this pull request May 18, 2021
kwvg added a commit to kwvg/dash that referenced this pull request May 19, 2021
kwvg added a commit to kwvg/dash that referenced this pull request May 20, 2021
kwvg added a commit to kwvg/dash that referenced this pull request May 20, 2021
kwvg added a commit to kwvg/dash that referenced this pull request May 20, 2021
kwvg added a commit to kwvg/dash that referenced this pull request May 20, 2021
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request Aug 11, 2021
ecde04a [Consensus] Bump Active Protocol version to 70923 for v5.3 (random-zebra)
b63e4f5 Consensus: Add v5.3 enforcement height for testnet. (furszy)
f44be94 Only relay IPv4, IPv6, Tor addresses (Pieter Wuille)
015298c fix: tor: Call event_base_loopbreak from the event's callback (furszy)
34ff7a8 Consensus: Add mnb ADDRv2 guard. (furszy)
b4515dc GUI: Present v3 onion addresses properly in MNs list. (furszy)
337d43d tests: don't export in6addr_loopback (Vasil Dimov)
2cde8e0 GUI: Do not show the tor v3 onion address in the topbar. (furszy)
0b5f406 Doc: update tor.md with latest upstream information. (furszy)
89df7f2 addrman: ensure old versions don't parse peers.dat (Vasil Dimov)
bb90c5c test: add getnetworkinfo network name regression tests (Jon Atack)
d8e01b5 rpc: update GetNetworksInfo() to not return unsupported networks (Jon Atack)
57fc7b0 net: update GetNetworkName() with all enum Network cases (Jon Atack)
647d60b tests: Modify rpc_bind to conform to bitcoin#14532 behaviour. (Carl Dong)
d4d6729 Allow running rpc_bind.py --nonloopback test without IPv6 (Kristaps Kaupe)
4a034d8 test: Add rpc_bind test to default-run tests (Wladimir J. van der Laan)
61a08af [tests] bind functional test nodes to 127.0.0.1  Prevents OSX firewall (Sjors Provoost)
6a4f1e0 test: Add basic addr relay test (furszy)
78aa61c net: Make addr relay mockable (furszy)
ba954ca Send and require SENDADDRV2 before VERACK (Pieter Wuille)
61c2ed4 Bump net protocol version + don't send 'sendaddrv2' to pre-70923 software (furszy)
ccd508a tor: make a TORv3 hidden service instead of TORv2 (Vasil Dimov)
6da9a14 net: advertise support for ADDRv2 via new message (furszy)
e58d5d0 Migrate to test_large_inv() to Misbehaving logging. (furszy)
d496b64 [QA] fix mininode CAddress ser/deser (Jonas Schnelli)
cec9567 net: CAddress & CAddrMan: (un)serialize as ADDRv2 Change the serialization of `CAddrMan` to serialize its addresses in ADDRv2/BIP155 format by default. Introduce a new `CAddrMan` format version (3). (furszy)
b8c1dda streams update: get rid of nType and nVersion. (furszy)
3eaa273 Support bypassing range check in ReadCompactSize (Pieter Wuille)
a237ba4 net: recognize TORv3/I2P/CJDNS networks (Vasil Dimov)
8e50853 util: make EncodeBase32 consume Spans (Sebastian Falbesoner)
1f67e30 net: CNetAddr: add support to (un)serialize as ADDRv2 (Vasil Dimov)
2455420 test: move HasReason so it can be reused (furszy)
d41adb4 util: move HasPrefix() so it can be reused (Vasil Dimov)
f6f86af Unroll Keccak-f implementation (Pieter Wuille)
45222e6 Implement keccak-f[1600] and SHA3-256 (Pieter Wuille)
08ad06d net: change CNetAddr::ip to have flexible size (furszy)
3337219 net: improve encapsulation of CNetAddr. (furszy)
910d5c4 test: Do not instantiate CAddrDB for static call (Hennadii Stepanov)
6b607ef Drop IsLimited in favor of IsReachable (Ben Woosley)
a40711b IsReachable is the inverse of IsLimited (DRY). Includes unit tests (marcaiaf)
8839828 net: don't accept non-left-contiguous netmasks (Vasil Dimov)
5d7f864 rpcbind: Warn about exposing RPC to untrusted networks (Luke Dashjr)
2a6abd8 CNetAddr: Add IsBindAny method to check for INADDR_ANY (Luke Dashjr)
4fdfa45 net: Always default rpcbind to localhost, never "all interfaces" (Luke Dashjr)
31064a8 net: Minor accumulated cleanups (furszy)
9f9c871 tests: Avoid using C-style NUL-terminated strings as arguments (practicalswift)
f6c52a3 tests: Add tests to make sure lookup methods fail on std::string parameters with embedded NUL characters (practicalswift)
a751b9b net: Avoid using C-style NUL-terminated strings as arguments in the netbase interface (furszy)
f30869d test: add IsRFC2544 tests (Mark Tyneway)
ed5abe1 Net: Proper CService deserialization + GetIn6Addr return false if addr isn't an IPv6 addr (furszy)
86d73fb net: save the network type explicitly in CNetAddr (Vasil Dimov)
ad57dfc net: document `enum Network` (Vasil Dimov)
cb160de netaddress: Update CNetAddr for ORCHIDv2 (Carl Dong)
c3c04e4 net: Better misbehaving logging (furszy)
3660487 net: Use C++11 member initialization in protocol (Marco)
082baa3 refactor: Drop unused CBufferedFile::Seek() (Hennadii Stepanov)
e2d776a util: CBufferedFile fixes (Larry Ruane)
6921f42 streams: backport OverrideStream class (furszy)

Pull request description:

  Conjunction of a large number of back ports, updates and refactorings that made with the final goal of implementing v3 Onion addresses support (BIP155 https://github.com/bitcoin/bips/blob/master/bip-0155.mediawiki) before the tor v2 addresses EOL, scheduled, by the Tor project, for (1) July 15th: v2 addr support removal from the code base, and (2) October 15th: v2 addr network disable, where **every peer in our network running under Tor will loose the connection and drop the network**.

  As BIP155 describes, this is introducing a new P2P message to gossip longer node addresses over the P2P network. This is required to support new-generation Onion addresses, I2P, and potentially other networks that have longer endpoint addresses than fit in the 128 bits of the current addr message.

  In order to achieve the end goal, had to:
  1.  Create Span class and push it up to latest Bitcoin implementation.
  2.  Update the whole serialization framework and every object using it up to latest Bitcoin implementation (3-4 years ahead of what we currently are in master).
  3.  Update the address manager implementing ASN-based bucketing of the network nodes.
  4.  Update and refactor the netAddress and address manager tests to latest Bitcoin implementation (4 years ahead of what we currently are in master).
  5.  Several util string, vector, encodings, parsing, hashing backports and more..

  Important note:
  This PR it is not meant to be merged as a standalone PR, will decouple smaller ones moving on. Adding on each sub-PR its own description isolated from this big monster.

  Second note:
  This is still a **work-in-progress**, not ready for testing yet. I'm probably missing to mention few PRs that have already adapted to our sources. Just making it public so can decouple the changes, we can start merging them and i can continue working a bit more confortable (rebase a +170 commits separate branch is not fun..).

  ### List of back ported and adapted PRs:

  Span and Serialization:
  ----------------
  *  bitcoin#12886.
  *  bitcoin#12916.
  *  bitcoin#13558.
  *  bitcoin#13697. (Only Span's commit 29943a9)
  *  bitcoin#17850.
  *  bitcoin#17896.
  *  bitcoin#12752.
  *  bitcoin#16577.
  *  bitcoin#16670. (without faebf62)
  *  bitcoin#17957.
  *  bitcoin#18021.
  *  bitcoin#18087.
  *  bitcoin#18112 (only from 353f376 that we don't support).
  *  bitcoin#18167.
  *  bitcoin#18317.
  *  bitcoin#18591 (only Span's commit 0fbde48)
  *  bitcoin#18468.
  *  bitcoin#19020.
  *  bitcoin#19032.
  *  bitcoin#19367.
  *  bitcoin#19387.

  Net, NetAddress and AddrMan:
  ----------------

  *  bitcoin#7932.
  *  bitcoin#10756.
  *  bitcoin#10765.
  *  bitcoin#12218.
  *  bitcoin#12855.
  *  bitcoin#13532.
  *  bitcoin#13575.
  *  bitcoin#13815.
  *  bitcoin#14532.
  *  bitcoin#15051.
  *  bitcoin#15138.
  *  bitcoin#15689.
  *  bitcoin#16702.
  *  bitcoin#17243.
  *  bitcoin#17345.
  *  bitcoin#17754.
  *  bitcoin#17758.
  *  bitcoin#17812.
  *  bitcoin#18023.
  *  bitcoin#18454.
  *  bitcoin#18512.
  *  bitcoin#19314.
  *  bitcoin#19687

  Keys and Addresses encoding:
  ----------------
  * bitcoin#11372.
  * bitcoin#17511.
  * bitcoin#17721.

  Util:
  ----------------
  * bitcoin#9140.
  * bitcoin#16577.
  * bitcoin#16889.
  * bitcoin#19593.

  Bench:
  ----------------
  * bitcoin#16299.

  BIP155:
  ----------------
  *  bitcoin#19351.
  *  bitcoin#19360.
  *  bitcoin#19534.
  *  bitcoin#19628.
  *  bitcoin#19841.
  *  bitcoin#19845.
  *  bitcoin#19954.
  *  bitcoin#19991 (pending).
  *  bitcoin#19845.
  *  bitcoin#20000 (pending).
  *  bitcoin#20120.
  *  bitcoin#20284.
  *  bitcoin#20564.
  *  bitcoin#21157 (pending).
  *  bitcoin#21564 (pending).
  *  Fully removed v2 onion addr support.
  *  Add hardcoded seeds.
  *  Add release-notes, changes to files.md and every needed documentation.

  I'm currently working on the PRs marked as "pending", this isn't over, but I'm pretty pretty close :). What a long road..

ACKs for top commit:
  random-zebra:
    utACK ecde04a
  Fuzzbawls:
    ACK ecde04a

Tree-SHA512: 82c95fbda76fce63f96d8a9af7fa9a89cb1e1b302b7891e27118a6103af0be23606bf202c7332fa61908205e6b6351764e2ec23d753f1e2484028f57c2e8b51a
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet