Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable secp256k1-zkp and make it default #1534

Closed
prusnak opened this issue Mar 18, 2021 · 8 comments · Fixed by #1949
Closed

Re-enable secp256k1-zkp and make it default #1534

prusnak opened this issue Mar 18, 2021 · 8 comments · Fixed by #1949
Assignees

Comments

@prusnak
Copy link
Member

prusnak commented Mar 18, 2021

secp256k1(-zkp) performs better for signature verification and contains schnorr algo

@prusnak prusnak added this to the 21.05 milestone Mar 18, 2021
@prusnak
Copy link
Member Author

prusnak commented Mar 18, 2021

I did some experiments in the secp256k1-zkp branch

Currently, trezor.crypto.curves.secp256k1_zkp is not a drop-in replacement for trezor.crypto.curves.secp256k1, because we need to use the context manager.

So the following

secp256k1.foo()

turns into this:

with secp256k1_zkp.Context() as secp256k1:
    secp256k1.foo()

However, we might want to make this a drop-in replacement (so it looks and behaves like a module with static methods) and hide the context creation behind the scenes. There are (at least) two ways how to do this:

  • allocate the context before uPy interpreter runs and keep it resident (outside of uPy heap)
  • allocate the context within uPy on first use and keep it resident (inside of uPy heap)
  • other ... ?

@onvej-sl
Copy link
Contributor

TLDR

  • Signing, private key to public key derivation and ecdh computation are resistant to simple power analysis (and timing attacks). If we use context randomization (secp256k1_context_randomize), signing and private key to public key derivation should be also resistant to differential power analysis. Nevertheless, the randomization makes the operations twice as long. It seems to me that ecdh computation is prone to differential power analysis. All these operation should be resistant to both simple power analysis and differential power analysis in trezor-crypto
  • See the tables below for the choice of constants ECMULT_WINDOW_SIZE and ECMULT_GEN_PREC_BITS.

Suggestion

  • It doesn't make sense to recreate the context before every signing and verification, since

    • creating the context takes longer than signing or verification and
    • the context is modified only by secp256k1_context_randomize (or secp256k1_ecmult_gen_blind) which serves as protection against differential power analysis and the only part that is modified is initial and blind in secp256k1_ecmult_gen_context.
  • Instead of that I suggest having a global context that is created once. If we want to have a differential power analysis protection, the context should be randomized with secp256k1_context_randomize before every call of secp256k1_ecdsa_sig_sign and secp256k1_ec_pubkey_create. If we want these operations to be thread-safe we may create a "shallow copy" of the context with memcpy(ctx_clone, ctx, sizeof(secp256k1_context)) before the randomization. It should be OK since initial and blind are stored in the context as values, whereas all the big tables are stored as pointers.

  • We can also make use of secp256k1_ec_pubkey_create, which is faster than ecdsa_get_public_key{33,65} (and scalar_multiply). If we want to use it even in bip32.c, the context cannot be stored on the micropython heap.

Report

Constants

ECMULT_WINDOW_SIZE

The constant is an integer from 2 to 24, inclusive. The bigger the constant is

  • the faster is secp256k1_ecmult (and secp256k1_ecdsa_sig_verify)
  • the bigger is the size of secp256k1_ecmult_context (and secp256k1_context)
  • the slower is secp256k1_ecmult_context_build (and secp256k1_context_preallocated_create)
ECMULT_WINDOW_SIZE secp256k1_ecdsa_sig_verify secp256k1_ecmult_context_build secp256k1_ecmult_context
2 16.1 ms 5.9 ms 128 B
4 14.3 ms 6.4 ms 512 B
6 13.5 ms 8.3 ms 2 kB
8 12.9 ms 16.2 ms 8 kB
10 12.6 ms 47 ms 32 kB

ECMULT_GEN_PREC_BITS

The constant is 2, 4 or 8. The bigger the constant is

  • the faster is secp256k1_ecmult_gen (and secp256k1_ecdsa_sig_sign, secp256k1_ec_pubkey_create)
  • the bigger is the size of secp256k1_ecmult_gen_context (and secp256k1_context)
  • the slower is secp256k1_ecmult_gen_context_build (and secp256k1_context_preallocated_create), provided the context is not precomputed in compile-time using the flag USE_ECMULT_STATIC_PRECOMPUTATION.
ECMULT_GEN_PREC_BITS secp256k1_ecdsa_sig_sign secp256k1_ec_pubkey_create secp256k1_ecmult_gen_context
2 23.4 ms 14.4 ms 32 kB
4 11.7 ms 7.2 ms 64 kB
8 5.9 ms 3.6 ms 512 kB

Only the times for ECMULT_GEN_PREC_BITS = 4 in the table were measured, other times are estimated.

WINDOW_A

The constant has a fixed value of 5 which is optimal in terms of speed of functions secp256k1_ecmult (and secp256k1_ecdsa_sig_verify) and secp256k1_ecmult_const (and secp256k1_ecdh). The bigger the constant is the more memory on stack these functions needs.

WINDOW_A secp256k1_ecdsa_sig_verify secp256k1_ecdh
2 1440 B 14.7 ms 168 B 23.7 ms
3 1772 B 13.6 ms 336 B 15.1 ms
4 2436 B 13.1 ms 672 B 12.7 ms
5 3764 B 12.9 ms 1344 B 11.8 ms
6 6420 B 13.2 ms 2688 B 12.2 ms

The times were measured for ECMULT_GEN_PREC_BITS = 4 and ECMULT_WINDOW_SIZE = 8, bytes are a lower estimate.

Contexts

secp256k1_context

The context consists of secp256k1_ecmult_context and secp256k1_ecmult_gen.

secp256k1_ecmult_gen_context

The context is used by secp256k1_ecmult_gen (and secp256k1_ecdsa_sig_sign, secp256k1_ec_pubkey_create).
The context consists of:

  • A read-only two-dimensional table.
    • It consists of 2^ECMULT_GEN_PREC_BITS rows and 256 / ECMULT_GEN_PREC_BITS columns.
    • Its total size is sizeof(secp256k1_ge_storage) * 2^ECMULT_GEN_PREC_BITS * 256 / (ECMULT_GEN_PREC_BITS) = 16384 * 2^ECMULT_GEN_PREC_BITS / ECMULT_GEN_PREC_BITS.
    • If the flag USE_ECMULT_STATIC_PRECOMPUTATION, the table is precomputed in compile-tile. This works only for ECMULT_GEN_PREC_BITS = 8.
    • It consists of points with unknown discrete logarithms. This should prevent some attack I don't quite understand.
    • It is a kind of duplication of the table in secp256k1_table.h.
  • A point initial and a scalar blind such that initial = blind * G.
    • The values are used in secp256k1_ecmult_gen (and secp256k1_ecdsa_sig_sign, secp256k1_ec_pubkey_create) as a protection against differential power analysis.
    • The values should be refreshed by secp256k1_context_randomize (or secp256k1_ecmult_gen_blind) before every call of secp256k1_ecmult_gen (that is before every call of secp256k1_ecdsa_sig_sign and secp256k1_ec_pubkey_create), which makes the operation twice as long.
    • This kind of blinding is not implemented in trezor-crypto (which uses randomized coordinates instead of that).

secp256k1_ecmult_context

The context is used in secp256k1_ecdsa_sig_verify.
It consists of two read-only tables:

  • Each table consists of 2^(ECMULT_WINDOW_SIZE - 2) elements.
  • The total size of the tables is sizeof(secp256k1_ge_storage) * 2 * 2^(ECMULT_WINDOW_SIZE - 2) = 32 * 2^ECMULT_WINDOW_SIZE.
  • The tables cannot be precomputed in compile-time. On the other hand, there is no reason to rebuild every time secp256k1_ecdsa_sig_verify is called.

Functions

secp256k1_ecmult_gen

Computes lambda b : b * G, where G is the generator and b is a scalar.

  • It is used by secp256k1_ecdsa_sig_sign and secp256k1_ec_pubkey_create.
  • It is constant time.
  • It computes b * G as (b - initial) * G + blind, where initial and blind are generated so that blind = initial * G. This is a protection against differential power analysis.
  • It uses ECMULT_WINDOW_SIZE-ary representation of b, that is b is represented by digits b_i such that
    • b = sum(b_i * 2^(i * ECMULT_WINDOW_SIZE))
    • 0 <= b_i < 2^ECMULT_WINDOW_SIZE
  • It uses a precomputed table that is contained in secp256k1_ecmult_gen_context. The table consists of points with unknown discrete logarithms. This should prevent a kind of attack I don't quite understand.
  • A similar method is used by scalar_multiply in trezor-crypto, which differs in these points:
    • It uses the constant ECMULT_WINDOW_SIZE = 4.
    • b is represented by digits b_i such that
      • -2^(WINDOW_A - 1) < b_i < 2^(WINDOW_A - 1)
      • b_i % 2 == 1
    • It uses a table that is half in size but it consists of points with known discrete logarithm.

secp256k1_ecmult

Computes lambda a, b, P : a * P + b * G, where G is the generator, a and b are scalars and P is a point.

  • It is used by secp256k1_ecdsa_sig_verify.
  • It uses the fast endomorphism (see bellow).
  • It uses Shamir's trick.
  • It uses sliding-window method.
  • It uses ECMULT_WINDOW_SIZE-ary non-adjacent form of a, that is a is represented by digits a_i such that
    • a = sum(a_i * 2^i)
    • -2^(WINDOW_A - 1) < a_i < 2^(WINDOW_A - 1)
    • a_i % 2 = 1 or a_i = 0
  • It uses WINDOW_A-ary non-adjacent form of b
  • It uses a precomputed table that is contained in secp256k1_ecmult_context.
  • It uses around 2^(WINDOW_A - 2) * (2 * sizeof(secp256k1_ge) + sizeof(secp256k1_gej) + 40) + 260 * sizeof(int) + 68 = 83 * 2^WINDOW_A + 1108 bytes on the stack.
  • It is non constant time, which should be all right since the method is used only in verification which usually does not work with secret information.

secp256k1_ecmult_const

Computes lambda a, P : a * P, where a is a scalar and P is a point.

  • It is used by secp256k1_ecdh.
  • It uses the fast endomorphism.
  • It is constant time.
  • It uses window method.
  • It uses a kind of WINDOW_A - 1-ary form of a, to be precise, a is represented by digits a_i such that
    • a = sum(a_i * 2^((WINDOW_A - 1)* i))
    • -2^(WINDOW_A - 1) < a_i < 2^(WINDOW_A - 1)
    • a_i % 2 == 1
  • It uses around 2^(WINDOW_A - 2) * sizeof(secp256k1_ge) * 2 = 42 * 2^WINDOW_A bytes on the stack.
  • The same method with the same constant (WINDOW_A - 1 = 4) is used by point_multiply in trezor-crypto.
  • It seems to me that it doesn't use any kind of randomization as a protection against differential power analysis.

Fast endomorphism

All the curves secp{192,224,256}k1 were generated so that there is an effectively computable endomorphism that can be utilized to speed up point multiplication. The method appeared in https://www.iacr.org/archive/crypto2001/21390189.pdf for the first time.

The endomorphism phi for secp256k1 is defined as

phi : (x, y) -> (beta * x mod p, y)

where

beta = 0x7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee

and p is the prime the coordinates are reduced with. If (x, y) lies on the curve, the point phi((x, y)) lies on the curve as well and this surprising formula holds true:

phi(P) = lambda * P

where

lambda = 0x5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72

Suppose we want to compute k * P, the algorithm that makes use of the endomorphism is the following:

  • Find k1 and k2 such that k = (k1 + k2 * lambda) mod n, where n is the order of the curve, and k1 and k2 are about half in length compared to k.
  • Compute k1 * P + k2 * phi(P) using Shamir's trick. If double-and-add method is used, the speed-up is about 40 %.

Other facts

@prusnak
Copy link
Member Author

prusnak commented Apr 12, 2021

@onvej-sl WOW, great work Ondrej! Thank you for the very thorough analysis!

@alex-jerechinsky alex-jerechinsky modified the milestones: 21.05, 21.06 May 11, 2021
@tsusanka tsusanka modified the milestones: 21.06, 21.07 May 20, 2021
@alex-jerechinsky alex-jerechinsky added this to 🎯 To do in FW · July 14 🌊 via automation May 24, 2021
@tsusanka tsusanka mentioned this issue Jun 8, 2021
4 tasks
@prusnak
Copy link
Member Author

prusnak commented Jun 12, 2021

Just rebased secp256k1-zkp on top of the current master

@tsusanka tsusanka moved this from 🎯 To do to 🔥 Priority in FW · July 14 🌊 Jun 15, 2021
@alex-jerechinsky alex-jerechinsky moved this from 🔥 Priority to 🏃‍♀️ In progress in FW · July 14 🌊 Jun 16, 2021
@alex-jerechinsky alex-jerechinsky removed this from 🏃‍♀️ In progress in FW · July 14 🌊 Jul 21, 2021
@alex-jerechinsky alex-jerechinsky added this to 🎯 To do in FW · September 8 🍂 via automation Jul 21, 2021
@alex-jerechinsky alex-jerechinsky moved this from 🎯 To do to 🔥 Priority in FW · September 8 🍂 Jul 21, 2021
@alex-jerechinsky alex-jerechinsky removed this from 🔥 Priority in FW · September 8 🍂 Aug 19, 2021
@alex-jerechinsky alex-jerechinsky added this to 🎯 To do in FW · October 13 🍊 via automation Aug 19, 2021
@alex-jerechinsky alex-jerechinsky moved this from 📥 Inbox to 📽 Product in Firmware · Backlog 🗂 Oct 5, 2021
@tsusanka tsusanka removed this from the 21.07 milestone Oct 7, 2021
@tsusanka tsusanka removed this from 🎯 To do in FW · October 13 🍊 Oct 7, 2021
@tsusanka tsusanka added this to 🎯 To do in FW · November 10 🥫 via automation Oct 7, 2021
@tsusanka tsusanka moved this from 🎯 To do to 🏃‍♀️ In progress in FW · November 10 🥫 Oct 7, 2021
@tsusanka tsusanka removed this from 📽 Product in Firmware · Backlog 🗂 Oct 7, 2021
@alex-jerechinsky alex-jerechinsky removed this from 🏃‍♀️ In progress in FW · November 10 🥫 Oct 22, 2021
@alex-jerechinsky alex-jerechinsky added this to 🎯 To do in Oct 7 - Oct 21 ❄️ via automation Oct 22, 2021
@alex-jerechinsky alex-jerechinsky moved this from 🎯 To do to 🏃‍♀️ In progress in Oct 7 - Oct 21 ❄️ Oct 22, 2021
@alex-jerechinsky alex-jerechinsky removed this from 🏃‍♀️ In progress in Oct 7 - Oct 21 ❄️ Nov 18, 2021
@onvej-sl
Copy link
Contributor

Implemented in #1897 and #1678.

@invd
Copy link
Contributor

invd commented Nov 21, 2021

I've started integrating the new mainlined zkp functionality into the T1 fuzzer and noticed some API issues. I'm not sure if this closed ticket is the right place to bring it up, so feel free to move it or respond to it elsewhere. (#1864 feels like a different scope, though.)

  • zkp_context.h doesn't expose bool zkp_context_is_initialized(void), so other code can't check the initialization status of the internal static secp256k1_context *context; variable
  • On regular builds without -DNDEBUG, the assert(context == NULL); in int zkp_context_init() prevents re-initializing non-null contexts
  • This presents practical issues since void zkp_context_destroy() only clears the context, but does not set context = NULL; again

The most straightforward approach is to expose the initialization check and actually null the context pointer on destroy, but I haven't looked into this very deeply and may be missing some logic or side effects.

@onvej-sl
Copy link
Contributor

I'm not sure if this closed ticket is the right place to bring it up, so feel free to move it or respond to it elsewhere

Next time please open a new issue. It is easier to track it.

@invd
Copy link
Contributor

invd commented Nov 27, 2021

Next time please open a new issue.

Will do, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Backlog 🗂
❌ Closed
Development

Successfully merging a pull request may close this issue.

5 participants