Switch to libsecp256k1-based ECDSA validation #6954

Merged
merged 1 commit into from Nov 16, 2015

Projects

None yet
@sipa
Member
sipa commented Nov 5, 2015

This updates the libsecp256k1 subtree to upstream master, and switches ECDSA validation to it.

Practical upshot:

  • Signature validation is anywhere between 2.5 and 5.5 times faster.
  • Consensus code no longer depends on OpenSSL or its signature parser.
  • Removes linking with OpenSSL from libconsensus.

Libsecp256k1 itself has not had a stable release, but we're very close to that. This PR is effectively a preview, with the intention of switching to the released version before the Bitcoin Core 0.12 release.

The past months libsecp256k1 has undergone very extensive testing and validation, though some of that work is still under review. This includes:

  • Very high branch coverage (which required generating a trillion random test vectors and strip them down to the 32 which collectively give the highest coverage for the scalar arithmetic code)
  • Formal proofs for the group laws using SAGE, and human-verifiable proof annotations for the field arithmetic logic.
  • Test code that generates random valid DER signatures, fuzzes them, and compares the result of parsing them internally and using OpenSSL - which turned up a set of (to me) previously unknown behaviour in OpenSSL's parser.
  • The ability to compile in a test mode which switches to a slightly different curve (one constant change), resulting in a group of only 139 points instead of around 2256, allowing exhaustive testing of nearly all code.

The above things are planned to be finished before final release, as well as some API changes - though probably none that affect Bitcoin's usage.

Thanks to everyone who contributed so far (including but not limited to @gmaxwell, @apoelstra, @peterdettman, @theuni, @luke-jr, ...).

@gmaxwell gmaxwell added this to the 0.12.0 milestone Nov 5, 2015
@dcousens
Contributor
dcousens commented Nov 5, 2015

concept ACK

@laanwj
Member
laanwj commented Nov 5, 2015

ACK

@MarcoFalke
Member

Concept ACK

$ ./contrib/devtools/git-subtree-check.sh src/secp256k1
src/secp256k1 in HEAD was last updated to upstream commit 2bfb82b10edf0f0b0e366a12f94c8b21a914159d (tree 7a74d5de4d518bc5165feeeb75edb0ac4e8c5bea)
src/secp256k1 in HEAD currently refers to tree 7a74d5de4d518bc5165feeeb75edb0ac4e8c5bea
GOOD
@fanquake
Contributor
fanquake commented Nov 6, 2015

Concept ACK. Will start testing shortly.

@jonasschnelli
Member

Concept ACK.

Awesome work! Many thanks to all contributors, especially @sipa and @gmaxwell.
Started testing now.

@paveljanik
Contributor

In the build process, you run gen_context:

gcc -I./ -g -O2 -Wall -Wextra -Wno-unused-function -c src/gen_context.c -o gen_context.o
gcc gen_context.o -o gen_context
./gen_context

Is this compatible with our gitian/crossbuild processes?

@jonasschnelli
Member

@paveljanik: Yes. It is. I also tested a gitian build: https://bitcoin.jonasschnelli.ch/pulls/6954/

@jgarzik
Member
jgarzik commented Nov 6, 2015

concept ACK

@jonasschnelli
Member

Just synced a fresh node with this branch with -debug=bench. It took ~9h.
Standard parameters (i didn't change dbcache).
Debug log is here: https://bitcoin.jonasschnelli.ch/secp/debug.log (>300MB!).

System: Quad Core Intel(R) Xeon(R) CPU E31245 @ 3.30GHz, 16GB Ram, 7200rpm disk.

@CodeShark
Contributor

Concept ACK

@gmaxwell
Member
gmaxwell commented Nov 6, 2015

ACK. I provided some nits, I consider them super minor and not blockers.

Also tested IBD w/ no checkpoints, reindex, reindex in valgrind, signmessage/verify message. Could not test derive, because we don't actually expose it.

@jonasschnelli
Member

Unrepresentative non-scientific trivial IBD comparison with standard parameters:
System Ressources see comment above. 1000MBit internet connection (server data center).

Current Master:

2015-11-06 16:54:46 Bitcoin version v0.11.99.0-4ee149a (2015-11-05 23:39:48 +0100)
--- snip ---
2015-11-07 04:53:00 UpdateTip: new best=00000000000000000b48452a4825f7a171b7dcc4a4e63db523a39b9b33536d6f  height=382323

Time for IBD: ~11h59'

This PR

2015-11-06 07:40:49 Bitcoin version v0.11.99.0-332a5b0 (2015-11-06 08:30:43 +0100)
--- snip ---
015-11-06 16:30:17 UpdateTip: new best=00000000000000000b48452a4825f7a171b7dcc4a4e63db523a39b9b33536d6f  height=382323

Time for IBD: ~8h50`

Will do the same now with -dbcache=3000.

@gmaxwell
Member
gmaxwell commented Nov 8, 2015

Will do the same now with -dbcache=3000.

FWIW, you need a coinscache around 5500 MB to actually have the UTXO in ram. In that configuration with this patch and checkpoints disabled I performed an IBD off the network in about three and a half hours:

2015-11-06 01:25:11 Bitcoin version v0.11.99.0-66c5214 (2015-11-05 21:19:41 +0100)
...
2015-11-06 05:04:32 UpdateTip: new best=000000000000000003c9781a9ef9c3e1b7b5a644f26c0eb448513366fd31782f  height=382265  log2_work=83.564787  tx=91230928  date=2015-11-06 05:03:19 progress=0.999999  cache=5486.7MiB(7411884tx)
@jonasschnelli
Member

-dbache=3000 benchmark:

current master:
2015-11-07 19:21:43 Bitcoin version v0.11.99.0-4ee149a (2015-11-05 23:39:48 +0100)
2015-11-08 03:58:51 UpdateTip: new best=000000000000000009641f7a3734cfb9fe861c20a1aa77e8d1d6780913728bfe height=382556
Total: ~7h37'

This PR
2015-11-08 12:06:24 Bitcoin version v0.11.99.0-332a5b0 (2015-11-06 08:30:43 +0100)
2015-11-08 15:54:21 UpdateTip: new best=000000000000000009641f7a3734cfb9fe861c20a1aa77e8d1d6780913728bfe height=382556
Total: ~3h48'

@jonasschnelli
Member

-dbcache=6000 with this PR:

2015-11-08 19:17:23 Bitcoin version v0.11.99.0-332a5b0 (2015-11-06 08:30:43 +0100)
2015-11-08 22:39:21 UpdateTip: new best=000000000000000009641f7a3734cfb9fe861c20a1aa77e8d1d6780913728bfe  height=382556 

Total: 3h22

@jonasschnelli
Member

Just encountered a possible issue:

The node running this PR is stuck on height 382748.

  {
    "height": 382761,
    "hash": "00000000000000000a017d90eaf3bcffdc106a4606ab7e94880a506b1b7ca8a4",
    "branchlen": 13,
    "status": "headers-only"
  }, 
  {
    "height": 382748,
    "hash": "00000000000000001043ada919c3851e17876deca67acc19f365fab4a79bd59d",
    "branchlen": 0,
    "status": "active"
  }, 
  {
    "height": 382649,
    "hash": "000000000000000011981b8d79967f6d2c04e2a008d3365b404752898eb784d5",
    "branchlen": 1,
    "status": "headers-only"
  }

Therefore the mempool is getting relatively big:

{
  "size": 21944,
  "bytes": 136785777,
  "usage": 299990208,
  "maxmempool": 300000000,
  "mempoolminfee": 0.00002013
}

Node running current master (openssl verification) on the same machine (same ip, different port) reports:

~$ ./src/bitcoin-cli <snipp> getchaintips
[
  {
    "height": 382760,
    "hash": "000000000000000003bd88e340376361db3422c8df01b0dacfd4b291da71b7fe",
    "branchlen": 0,
    "status": "active"
  }
]

last 1000 lines of the debug log: https://bitcoin.jonasschnelli.ch/secp/stuck_debug.log

@morcos
Contributor
morcos commented Nov 9, 2015

I'm very in favor of this being merged, however I can't say I have reviewed this code or done specific testing. I have however been using it over the last several months and never encountered a problem.

@petertodd
Member

Re: libsecp256k1 testing, awesome work!

Is there a document anywhere describing that process a bit more formally? It'd be good to have that to point people too. (particularly if you have enough time to make it repeatable by others)

@jonasschnelli
Member

F.Y.I:
The problem above (#6954 (comment)) could be cured – as expected – with a bitcoind restart.
I have bootstrapped another fresh mainnet node with this PR and -dbcache=6000 and it didn't happen again.
Anyways, the problem is/was very likely unrelated to this PR.

@jonasschnelli jonasschnelli commented on an outdated diff Nov 10, 2015
src/init.cpp
@@ -967,6 +967,7 @@ bool AppInit2(boost::thread_group& threadGroup, CScheduler& scheduler)
// ********************************************************* Step 4: application initialization: dir lock, daemonize, pidfile, debug log
// Initialize elliptic curve code
+ ECC_Verify_Start();
@jonasschnelli
jonasschnelli Nov 10, 2015 Member

nit: it would be nice if we could print a log entry similar to L1005 about the usage of libsecp256k1 for verification (maybe with a version number [in future]). It might be helpful when analyzing a debug.log coming out of self-compiled bitcoin-core versions.

@jonasschnelli
Member

Tested & Code-Review ACK.

Did serval IBD's and running nodes with this PR since 3 days without problems (expect the one that is not related to this PR). Attached my OSX leaks detector during IBD and -verifyblocks=288 phase (can only collect the BDB leaks).

+1 for next step to fully remove the openssl dependency by replacing RandAddSeed() and co. with something like #5885 (wallet AES could also be replaced by #5949). The openssl dependency then would only be required when building the QT only BIP70 implementation.

@laanwj
Member
laanwj commented Nov 10, 2015

Reindex up to block 380912 w/ dbcache=8000 took 3.2 hours with this patch, 8.0 hours without.

@jtimon
Contributor
jtimon commented Nov 10, 2015

Concept ACK. I've been waiting for this.

@gmaxwell
Member

This needs more people to review the integration code before it can be merged. (Enough benchmarks.)

@sipa
Member
sipa commented Nov 11, 2015

Rebased on top of #6983 and nits addressed.

@jonasschnelli
Member

The DIFF is huge because of the squashes libsecp256k1 git subtree. The integration code (first couple of files in the diff) can be reviewed with a fair amount of effort. One shouldn't be discourage because of the large diff.

@sipa
Member
sipa commented Nov 11, 2015

@jonasschnelli Or just look at the individual commits.

@jtimon jtimon referenced this pull request Nov 13, 2015
Merged

Update libsecp256k1 #6983

@sipa
Member
sipa commented Nov 13, 2015

Rebased.

@laanwj laanwj and 1 other commented on an outdated diff Nov 13, 2015
src/pubkey.h
@@ -205,4 +210,12 @@ struct CExtPubKey {
bool Derive(CExtPubKey& out, unsigned int nChild) const;
};
+/** Users of this module must hold an ECCVerifyHandle. */
@laanwj
laanwj Nov 13, 2015 Member

Users have to hold a ECCVerifyHandle. Maybe we should make them pass it in to make sure of that? :-)

@sipa
sipa Nov 13, 2015 Member

I've considered that, but it would be a mess to pass it down all the way from libconsensus' main function through all validation and script code to pubkey.cpp...

@laanwj laanwj commented on the diff Nov 13, 2015
src/ecwrapper.cpp
-
-bool CECKey::Recover(const uint256 &hash, const unsigned char *p64, int rec)
-{
- if (rec<0 || rec>=3)
- return false;
- ECDSA_SIG *sig = ECDSA_SIG_new();
- BN_bin2bn(&p64[0], 32, sig->r);
- BN_bin2bn(&p64[32], 32, sig->s);
- bool ret = ECDSA_SIG_recover_key_GFp(pkey, sig, (unsigned char*)&hash, sizeof(hash), rec, 0) == 1;
- ECDSA_SIG_free(sig);
- return ret;
-}
-
-bool CECKey::TweakPublic(const unsigned char vchTweak[32]) {
- bool ret = true;
- BN_CTX *ctx = BN_CTX_new();
@laanwj
laanwj Nov 13, 2015 Member

Happy to see this code in red.

@laanwj laanwj commented on an outdated diff Nov 13, 2015
src/init.cpp
@@ -978,6 +979,7 @@ bool AppInit2(boost::thread_group& threadGroup, CScheduler& scheduler)
// Initialize elliptic curve code
ECC_Start();
+ verify_handle.reset(new ECCVerifyHandle());
@laanwj
laanwj Nov 13, 2015 Member

Nice to see this has an explicit lifecycle, instead of the OpenSSL global init object

@laanwj laanwj and 2 others commented on an outdated diff Nov 13, 2015
src/pubkey.cpp
-#include "ecwrapper.h"
+namespace
+{
+secp256k1_context* secp256k1_context_verify = NULL;
@laanwj
laanwj Nov 13, 2015 Member

Does this need synchronization?

@sipa
sipa Nov 13, 2015 Member

Only the secp256k1_context_randomize (not used in this module) and secp256k1_context_destroy (obviously) functions need exclusive access, so no, as long as all uses happen after the context is guaranteed to be constructed.

@laanwj
laanwj Nov 13, 2015 Member

Ok. But verify_refcount is not updated atomically, so multiple threads creating/destroying ECCVerifyHandles at the same time can cause problems. Maybe that's not supposed to happen though.

@sipa
sipa Nov 13, 2015 Member

That's indeed ugly, but it isn't supposed to happen: all but one instance of ECCVerifyHandle are global (and global initializers are guaranteed to not run in parallel), and the one other is in init.cpp, before any parallel code runs. I've considered putting a lock around it, but it would make libconsensus depend on boost again (until c++11).

@sipa
sipa Nov 13, 2015 Member

Perhaps that can use a comment to explain that constraint.

@laanwj
laanwj Nov 13, 2015 Member

Yes let's just document it

@jtimon
jtimon Nov 14, 2015 Contributor

Yes, please don't put synchronization in libbitcoinconsensus, at least until we decide for either C++11 or C for the long term language trend for libconsensus. Last time I checked in IRC, it seems I wasn't alone in preferring a libconsensus that eventually is only C over one that is eventually C++11. Although that decision shouldn't affect this PR or even libconsensus' encapsulation completion, I insist that it would be good very good to know this asap.

@laanwj
Member
laanwj commented Nov 13, 2015

Code review ACK

@laanwj laanwj and 1 other commented on an outdated diff Nov 13, 2015
src/pubkey.cpp
+ slen--;
+ spos++;
+ }
+ /* Copy S value */
+ if (slen > 32) {
+ overflow = 1;
+ } else {
+ memcpy(tmpsig + 64 - slen, input + spos, slen);
+ }
+
+ if (!overflow) {
+ overflow = !secp256k1_ecdsa_signature_parse_compact(ctx, sig, tmpsig);
+ }
+ if (overflow) {
+ memset(tmpsig, 0, 64);
+ secp256k1_ecdsa_signature_parse_compact(ctx, sig, tmpsig);
@laanwj
laanwj Nov 13, 2015 Member

Looks like another "Hack to initialize sig with a correctly-parsed but invalid signature"? We already did this in the beginning (not that it can hurt to do it again)

@sipa
sipa Nov 13, 2015 Member

The other call to secp256k1_ecdsa_signature_parse_compact above may have overwritten the result in sig, but still have returned overflow.

Note that this function is (at this point) a literal copy from the src/secp256k1/contrib directory, where it is well-tested. It's however copied as the code in contrib carries no real guarantees of providing the exact same behaviour in the long run, and this is consensus critical.

@laanwj
laanwj Nov 14, 2015 Member

Ouch. That one is subtle. I'd suggest adding a comment so that other people don't have the same dumb thought as me.

@gmaxwell
Member

This pull could write the release notes too, see comments for benchmark numbers. :)

@gmaxwell
Member

I can also confirm this makes Bitcoin work on unmodified RHEL7.

@jtimon jtimon and 1 other commented on an outdated diff Nov 15, 2015
src/init.cpp
@@ -154,6 +154,7 @@ class CCoinsViewErrorCatcher : public CCoinsViewBacked
static CCoinsViewDB *pcoinsdbview = NULL;
static CCoinsViewErrorCatcher *pcoinscatcher = NULL;
+static boost::scoped_ptr<ECCVerifyHandle> verify_handle;
@jtimon
jtimon Nov 15, 2015 Contributor

bike-shedding: s/verify_handle/globalVerifyHandle or something of the short to make its scope clearer.

@sipa
sipa Nov 15, 2015 Member

Done.

@jtimon jtimon commented on the diff Nov 15, 2015
src/pubkey.h
@@ -205,4 +210,13 @@ struct CExtPubKey {
bool Derive(CExtPubKey& out, unsigned int nChild) const;
};
+/** Users of this module must hold an ECCVerifyHandle. The constructor and
+ * destructor of these are not allowed to run in parallel, though. */
@jtimon
jtimon Nov 15, 2015 Contributor

@laanwj wrote:

Users have to hold a ECCVerifyHandle. Maybe we should make them pass it in to make sure of that? :-)

I agree. It may look weird because it doesn't appear to be used.
But if secp256k1_context_verify and verify_refcount were static attributes of ECCVerifyHandle, we would see much clearly which functions/methods need to take the ECCVerifyHandle as parameter.
I suggest doing both changes.

@jtimon
jtimon Nov 15, 2015 Contributor

In fact, can't the consensus code that needs it simply take the secp256k1_context_verify explicitly?

@sipa
sipa via email Nov 15, 2015 Member
@sipa
sipa via email Nov 15, 2015 Member
@sipa
sipa Nov 15, 2015 Member

Oh, static attributes! I missed that... that's possible.

@sipa
sipa Nov 15, 2015 Member

Done. Only for the refcount though, as the context variable is needed outside of the class (for obvious reasons).

@jtimon
jtimon Nov 15, 2015 Contributor

Yes, I guess passing secp256k1_context_verify explicitly explicitly would be too disruptive. Could we document somewhere that it would be ideal but we just don't want to do it right now? Maybe an example doing it "the right way" (a static getter or something) would be useful too.

@sipa
sipa via email Nov 15, 2015 Member
@jtimon
Contributor
jtimon commented Nov 15, 2015

I cannot give an utACK because I'm not trying to understand everything that is done here.
But I reviewed the parts is easy for me to understand and I would utACK those parts (modulo my "put secp256k1_context_verify inside ECCVerifyHandle" nit).

@sipa
Member
sipa commented Nov 15, 2015

Addressed some nits.

@laanwj
Member
laanwj commented Nov 16, 2015

Bikeshedding over different (but semantically equivalent) syntax can happen later. I'd like to move forward with this, to make sure it ends up in 0.12 and to save @sipa the work of rebasing this every time.

Further testing and review will probably only happen anyway when this is merged.

@laanwj laanwj merged commit 6e18268 into bitcoin:master Nov 16, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jtimon
Contributor
jtimon commented Nov 16, 2015

Sure, my non-addressed suggestion was too disruptive to make this happen fast and it's not a priority.
I'm glad that this got merged like this, but I'm also glad that my suggestion doesn't seem crazy either.

@Aquentus

Is there proper and academic documentation for these proposed changes and is there a BIP?

This isn't solely a code change, in my opinion. New algorithms are being proposed towards some of the most fundamental aspects of bitcoin - signature validation. As we all know, things can go terribly wrong in this space and unfortunately it is not even knowable what can go wrong.

Certain experts have expressed concerns about seemingly impossible claims that 700% improvements can be achieved by these "invented" mathematical algorithms. Further concerns have been raised towards suggestions that this is more tested than standard practice when standard practice requires some 20 years of testing.

I am not, of course, suggesting that the proposed changes are conceptually deficient, but I would not be comfortable with these changes in the absence of proper documentation and wider review from experts who specialise in this area because it is far too easy to hide backdoors in this space or less maliciously to unknowably fatally weaken the algorithm and its security.

@luke-jr
Member
luke-jr commented Dec 24, 2015

@Aquentus While I agree with your concerns, I am not sure this needs a BIP beyond BIP 66. The review of libsecp256k1 (which included comparison of results with the current consensus-protocol OpenSSL verification) turned up serious consensus-failure bugs in OpenSSL which BIP 66 solved. When review of the new code has fixed de facto problems in the old code, I think it demonstrates a clear higher standard of testing for the new code. "20 years of testing" is non-existent insofar as consensus safety is concerned - the field of decentralised consensus systems is only 6 years old, after all.

@sipa
Member
sipa commented Dec 24, 2015

I disagree with a need for a BIP. None of the consensus rules are intended
to change.

I do agree that better visibility of the algorithms used is useful.

One thing that is planned before final release is an explanation of the
test procedures and formal verification mechanisms used.

A small summary of the optimizations used in signature verification, and are not used by OpenSSL:

  • A special representation for field elements (numbers modulo 2^256 - 2^32
  • 933), using 5 52-bit limbs (rather than the typical 4 64-bit limbs),
    inspired by an Ed25519 implementation. This allows performing multiple
    additions and constant multiplications in a row without carry or
    normalization.
  • Multiplication algorithm for field elements that computes the
    intermediate high 256 bits in parallel with the low 256 bits (by Peter
    Dettman).
  • x86_64 assembly implementation for scalar and field multiplications
    (disabled in Bitcoin's build).
  • Inversion-free comparison for ECDSA verification (by Greg Maxwell).
  • Fast inversion and square root ladders (by Peter Dettman).
  • Very large table of precomputed multiples of the secp256k1 curve
    generator point (~1 megabyte), resulting in very low overhead for computing
    aP + bG compared to just aP.
  • Effective affine EC multiplication, which allows using affine group
    addition formulae with a single correction at the end for all additions
    (possible due to secp256k1 having a=0 in the group equation, by Peter
    Dettman).
  • Bypassing the exensive point-in-group check needed by ECDSA, as for
    secp256k1 every point on the curve belongs to the group (it has cofactor 1).
  • No heap memory overhead anywhere during tight loops, as all memory is
    allocated on the stack, preconputed, or constant.
  • No indirect calls or function pointers, as libsecp256k1 only supports one
    curve, so no dispatch/lookups for implementations are needed.

For all of the above, we have:

  • High test coverage (the only missing parts are cryptographically
    improbable to reach ones).
  • Hand verifiable proof of correctness for the field multiplication
    algorithm (provably no overflow, and provably equivalent to the correct
    response assuming no overflow).
  • Computer verified proof of correctness for group addition formulae
    (computer algebra system can prove polynomial equivalence between
    implementation and mathematical expressions for the result).
  • Special compilable mode that changes a constant to end up with a very
    small group, and exhaustive tests that all assumptions remain true (in
    progress, by Andrew Poelstra).
  • Test cases for the scalar code that were extracted from a set of 1
    trillion randomly generated tests which give very high
    coverage, and work in progress to algebraically derive cases that trigger
    the (nearly) unreachable remaining ones.

Furthermore:

  • An old build mode which used OpenSSL's bignum code for integer operations helped discover a bug in OpenSSL square code (CVE-2014-3570).
  • During tested we discovered an platform inconsistency in OpenSSL's signature parsing code (fixed by BIP66).
  • Our ECDSA signature fuzzer pointed out several types of previously unknown invalid signature types that OpenSSL accepted (discovered after BIP66 took effect).
@daira daira referenced this pull request in zcash/zcash Feb 14, 2016
Closed

eliminate openssl as a dependency of libzerocash #485

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment