New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use SipHash-2-4 for various non-cryptographic hashes #8020

Merged
merged 4 commits into from May 18, 2016

Conversation

Projects
None yet
8 participants
@sipa
Member

sipa commented May 6, 2016

Use SipHash-2-4 for:

  • CCoinsViewCache hashmap (instead of a custom Lookup3/XOR scheme)
  • CTxMempool::mapTx txid index (converting from an ordered map)
  • Address relay peer selection (instead of SHA256)

Computing a hash for a txid using this takes around 52 CPU cycles in benchmarks (the Lookup3/XOR based mechanism used for CCoinsViewCache took 31 cycles). I think that's negligible still, while using a much more standard construct designed for such purposes.

@gmaxwell

This comment has been minimized.

Show comment
Hide comment
@gmaxwell

gmaxwell May 6, 2016

Member

Concept ACK.

Member

gmaxwell commented May 6, 2016

Concept ACK.

@pstratem

This comment has been minimized.

Show comment
Hide comment
@pstratem

pstratem May 6, 2016

Contributor

Concept ACK
ad00e3af31df0655c1cdc95b8d29dcc1ec413e5b

Contributor

pstratem commented May 6, 2016

Concept ACK
ad00e3af31df0655c1cdc95b8d29dcc1ec413e5b

@kazcw

View changes

Show outdated Hide outdated src/main.cpp Outdated
@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj May 7, 2016

Member

Concept ACK
Python took the same step for 3.4 to convert their hash algorithms to SipHash, and they have similar motivations of mitigating collision attacks, so we're in good company: https://www.python.org/dev/peps/pep-0456/

Member

laanwj commented May 7, 2016

Concept ACK
Python took the same step for 3.4 to convert their hash algorithms to SipHash, and they have similar motivations of mitigating collision attacks, so we're in good company: https://www.python.org/dev/peps/pep-0456/

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa May 8, 2016

Member

Here is a comment about using SipHash-1-3 instead of SipHash-2-4: rust-lang/rust#29754 (comment)

It would be approximately twice as fast for hashing txids, and the distinguisher mentioned in that comment is not relevant for data that is a multiple of 8 bytes (which is the case for us), as 8 bytes of padding are added in that case. Opinions?

Member

sipa commented May 8, 2016

Here is a comment about using SipHash-1-3 instead of SipHash-2-4: rust-lang/rust#29754 (comment)

It would be approximately twice as fast for hashing txids, and the distinguisher mentioned in that comment is not relevant for data that is a multiple of 8 bytes (which is the case for us), as 8 bytes of padding are added in that case. Opinions?

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj May 9, 2016

Member

Siphash 2-4 seems to be the defacto 'safe' choice. Siphash 1-3 happens to be secure for our use case right now, but that sounds a tad brittle: as if a small change in our use case, or a small advancement in cryptoanalysis of the function, could mean that the known weakness does suddenly affect us?

Not sure how much hashing the txids is a bottleneck in the whole scheme of things around CCoinsCache, it's only 32 bytes after all. Conservatively I'd say only consider 'downgrading' if performance is seriously affected by going from 29 cycles to 49 cycles.

On the other hand it's pretty nice if using a hash with better security properties could be faster than the Lookup3/XOR scheme, so I see the attraction in that.

Member

laanwj commented May 9, 2016

Siphash 2-4 seems to be the defacto 'safe' choice. Siphash 1-3 happens to be secure for our use case right now, but that sounds a tad brittle: as if a small change in our use case, or a small advancement in cryptoanalysis of the function, could mean that the known weakness does suddenly affect us?

Not sure how much hashing the txids is a bottleneck in the whole scheme of things around CCoinsCache, it's only 32 bytes after all. Conservatively I'd say only consider 'downgrading' if performance is seriously affected by going from 29 cycles to 49 cycles.

On the other hand it's pretty nice if using a hash with better security properties could be faster than the Lookup3/XOR scheme, so I see the attraction in that.

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa May 9, 2016

Member

Did some better benchmarks:

  • Current Lookup3-based approach: 31 cycles
  • SipHash-2-4: 52 cycles
  • SipHash-1-3: 32 cycles
Member

sipa commented May 9, 2016

Did some better benchmarks:

  • Current Lookup3-based approach: 31 cycles
  • SipHash-2-4: 52 cycles
  • SipHash-1-3: 32 cycles
@dcousens

This comment has been minimized.

Show comment
Hide comment
@dcousens

dcousens May 11, 2016

Contributor

Ha, I thought SipHash was something home grown by @sipa... TIL.
LGTM, for the record, do we have a comparison with cycle counts of the other hash functions? (SHA1 say?)

Contributor

dcousens commented May 11, 2016

Ha, I thought SipHash was something home grown by @sipa... TIL.
LGTM, for the record, do we have a comparison with cycle counts of the other hash functions? (SHA1 say?)

@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa May 11, 2016

Member

@dcousens Based on numbers I get from #8039:

  • SHA1: 577 cycles (1 block, 64 bytes)
  • SHA256: 1504 cycles (1 block, 64 bytes)
  • SHA512: 1988 cycles (1 block, 128 bytes)
  • RIPEMD160: 751 cycles (1 block, 64 bytes)
Member

sipa commented May 11, 2016

@dcousens Based on numbers I get from #8039:

  • SHA1: 577 cycles (1 block, 64 bytes)
  • SHA256: 1504 cycles (1 block, 64 bytes)
  • SHA512: 1988 cycles (1 block, 128 bytes)
  • RIPEMD160: 751 cycles (1 block, 64 bytes)
@dcousens

This comment has been minimized.

Show comment
Hide comment
@dcousens

dcousens May 11, 2016

Contributor

Thanks @sipa 👍

Contributor

dcousens commented May 11, 2016

Thanks @sipa 👍

@TheBlueMatt

View changes

Show outdated Hide outdated src/hash.cpp Outdated
@sipa

This comment has been minimized.

Show comment
Hide comment
@sipa

sipa May 16, 2016

Member

Going to stick to SipHash-2-4 for now. We can always switch to something else later.

Ready for merge, I think.

Member

sipa commented May 16, 2016

Going to stick to SipHash-2-4 for now. We can always switch to something else later.

Ready for merge, I think.

@pstratem

This comment has been minimized.

Show comment
Hide comment
@pstratem

pstratem May 17, 2016

Contributor

utACK 658c6481616f12a3466d1ddeb23eb91788ab2e66

Contributor

pstratem commented May 17, 2016

utACK 658c6481616f12a3466d1ddeb23eb91788ab2e66

@gmaxwell

This comment has been minimized.

Show comment
Hide comment
@gmaxwell

gmaxwell May 17, 2016

Member

ACK

Member

gmaxwell commented May 17, 2016

ACK

@TheBlueMatt

View changes

Show outdated Hide outdated src/uint256.h Outdated
@TheBlueMatt

View changes

Show outdated Hide outdated src/hash.cpp Outdated
@@ -295,7 +297,7 @@ struct CCoinsCacheEntry
CCoinsCacheEntry() : coins(), flags(0) {}
};
typedef boost::unordered_map<uint256, CCoinsCacheEntry, CCoinsKeyHasher> CCoinsMap;
typedef boost::unordered_map<uint256, CCoinsCacheEntry, SaltedTxidHasher> CCoinsMap;

This comment has been minimized.

@TheBlueMatt

TheBlueMatt May 17, 2016

Contributor

Does std/boost::unordered_map guarantee that this will not create SaltedTxidHashers willy-nilly, using different salts?

@TheBlueMatt

TheBlueMatt May 17, 2016

Contributor

Does std/boost::unordered_map guarantee that this will not create SaltedTxidHashers willy-nilly, using different salts?

This comment has been minimized.

@sipa

sipa May 17, 2016

Member

Yes, a single hasher is constructed at map creation time.

@sipa

sipa May 17, 2016

Member

Yes, a single hasher is constructed at map creation time.

sipa added some commits May 6, 2016

Use SipHash-2-4 for CCoinsCache index
This is ~1.7x slower than the Lookup3-of-Xor-with-salt construct we were
using before, but it is a primitive designed for exactly this.
@TheBlueMatt

This comment has been minimized.

Show comment
Hide comment
@TheBlueMatt

TheBlueMatt May 17, 2016

Contributor

ACK a68ec21

Contributor

TheBlueMatt commented May 17, 2016

ACK a68ec21

@TheBlueMatt TheBlueMatt referenced this pull request May 18, 2016

Merged

Compact Blocks #8068

@laanwj laanwj merged commit a68ec21 into bitcoin:master May 18, 2016

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

laanwj added a commit that referenced this pull request May 18, 2016

Merge #8020: Use SipHash-2-4 for various non-cryptographic hashes
a68ec21 Use SipHash-2-4 for address relay selection (Pieter Wuille)
8cc9cfe Switch CTxMempool::mapTx to use a hash index for txids (Pieter Wuille)
382c871 Use SipHash-2-4 for CCoinsCache index (Pieter Wuille)
0b1295b Add SipHash-2-4 primitives to hash (Pieter Wuille)
@rebroad

This comment has been minimized.

Show comment
Hide comment
@rebroad

rebroad Aug 9, 2016

Contributor

I'm trying to work out what problem these siphashes fix. What is the risk that this solves please?

Contributor

rebroad commented Aug 9, 2016

I'm trying to work out what problem these siphashes fix. What is the risk that this solves please?

@laanwj

This comment has been minimized.

Show comment
Hide comment
@laanwj

laanwj Aug 13, 2016

Member

It's fast and cryptographically solid - which reduces DoS risk by attackers remotely predicting the behavior of the hash function.

As I quoted above already, pretty much same rationale as for Python: https://www.python.org/dev/peps/pep-0456/

Member

laanwj commented Aug 13, 2016

It's fast and cryptographically solid - which reduces DoS risk by attackers remotely predicting the behavior of the hash function.

As I quoted above already, pretty much same rationale as for Python: https://www.python.org/dev/peps/pep-0456/

codablock added a commit to codablock/dash that referenced this pull request Sep 16, 2017

Merge bitcoin#8020: Use SipHash-2-4 for various non-cryptographic hashes
a68ec21 Use SipHash-2-4 for address relay selection (Pieter Wuille)
8cc9cfe Switch CTxMempool::mapTx to use a hash index for txids (Pieter Wuille)
382c871 Use SipHash-2-4 for CCoinsCache index (Pieter Wuille)
0b1295b Add SipHash-2-4 primitives to hash (Pieter Wuille)

codablock added a commit to codablock/dash that referenced this pull request Sep 19, 2017

Merge bitcoin#8020: Use SipHash-2-4 for various non-cryptographic hashes
a68ec21 Use SipHash-2-4 for address relay selection (Pieter Wuille)
8cc9cfe Switch CTxMempool::mapTx to use a hash index for txids (Pieter Wuille)
382c871 Use SipHash-2-4 for CCoinsCache index (Pieter Wuille)
0b1295b Add SipHash-2-4 primitives to hash (Pieter Wuille)

codablock added a commit to codablock/dash that referenced this pull request Sep 27, 2017

Merge bitcoin#8020: Use SipHash-2-4 for various non-cryptographic hashes
a68ec21 Use SipHash-2-4 for address relay selection (Pieter Wuille)
8cc9cfe Switch CTxMempool::mapTx to use a hash index for txids (Pieter Wuille)
382c871 Use SipHash-2-4 for CCoinsCache index (Pieter Wuille)
0b1295b Add SipHash-2-4 primitives to hash (Pieter Wuille)

codablock added a commit to codablock/dash that referenced this pull request Dec 21, 2017

Merge bitcoin#8020: Use SipHash-2-4 for various non-cryptographic hashes
a68ec21 Use SipHash-2-4 for address relay selection (Pieter Wuille)
8cc9cfe Switch CTxMempool::mapTx to use a hash index for txids (Pieter Wuille)
382c871 Use SipHash-2-4 for CCoinsCache index (Pieter Wuille)
0b1295b Add SipHash-2-4 primitives to hash (Pieter Wuille)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment