Implement blinding for scalar multiplication #11361

Closed

@dfaranha dfaranha commented Mar 18, 2020

We are a group of security researchers and cryptographers from academia and industry, and would like to continue the report of a security vulnerability in OpenSSL's implementation of binary/prime field ECDSA signatures. We initiated contact by e-mail in December 2019 and decided to open a pull request publicly to collaborate further with a fix.

  • Affected versions: 1.0.2u and 1.1.0l (current stable releases except 1.1.1 branch)
  • Affected curve parameters:
    • sect163r1, sect283r1/k1, sect409k1, sect571r1 (i.e. binary curves with group order slightly below a power of two) for 1.0.2u and 1.1.0l
    • secp192r1/k1, secp224r1, secp256r1/k1, secp384r1, secp521r1 (i.e. prime curves with group order slightly below a power of two) for 1.0.2u

Severity: full key exposure via cache timing attack

Executive summary

We discovered non-constant-time implementations of Montgomery ladder scalar multiplication in the aforementioned releases, which enable an attacker to learn 1 bit of the secret nonce with high precision using the FLUSH+RELOAD cache-timing attack technique [1]. Even such a small leakage of nonces leads to key-recovery attacks given sufficiently many ECDSA signatures, thanks to our optimized version of Bleichenbacher's technique [2,3].

A full description of the attack can be found in [4] below.

[1] Y. Yarom, K. Falkner. "FLUSH+RELOAD: a High Resolution, Low Noise, L3 Cache Side-Channel Attack". USENIX Security 2014.

[2] D. Bleichenbacher. "On the generation of one-time keys in DL signature schemes". Presentation at IEEE P1363 working group meeting. 2000.

[3] A. Takahashi, M. Tibouchi, M. Abe. "New Bleichenbacher records: fault attacks on qDSA signatures". TCHES 2018(3), pp. 331–371, 2018.

[4] D. F. Aranha, F. R. Novaes, A. Takahashi, M. Tibouchi, Y. Yarom. "LadderLeak: Breaking ECDSA With Less Than One Bit Of Nonce Leakage". Cryptology ePrint Archive: Report 2020/615, available at https://eprint.iacr.org/2020/615
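To see why nonce secrecy matters at all, the following minimal sketch works through the ECDSA signing equation s = k^-1 (h + r d) mod n on toy numbers. It assumes a tiny prime `TOY_N` in place of the real group order and treats `r` as an arbitrary nonzero value rather than the x-coordinate of kG; all `toy_*` names are hypothetical and not part of OpenSSL. It only shows the extreme case where the whole nonce is known. With a single leaked bit no individual signature reveals the key, which is why the attack combines many signatures through Bleichenbacher's technique.

```c
#include <stdint.h>

/* Toy parameter (assumption): a small prime standing in for the group order n. */
#define TOY_N 101u

/* Modular exponentiation, used for inversion via Fermat's little theorem. */
static uint32_t toy_pow(uint32_t base, uint32_t exp, uint32_t n)
{
    uint64_t acc = 1, b = base % n;

    while (exp > 0) {
        if (exp & 1)
            acc = (acc * b) % n;
        b = (b * b) % n;
        exp >>= 1;
    }
    return (uint32_t)acc;
}

/* Inverse modulo a prime n; a must be nonzero mod n. */
static uint32_t toy_inv(uint32_t a, uint32_t n)
{
    return toy_pow(a, n - 2, n);
}

/* Second signature component: s = k^-1 * (h + r*d) mod n. */
uint32_t toy_sign_s(uint32_t k, uint32_t h, uint32_t r, uint32_t d)
{
    uint64_t t = (h + (uint64_t)r * d) % TOY_N;

    return (uint32_t)((toy_inv(k, TOY_N) * t) % TOY_N);
}

/* With the nonce k known, the key follows: d = (s*k - h) * r^-1 mod n. */
uint32_t toy_recover_key(uint32_t k, uint32_t h, uint32_t r, uint32_t s)
{
    uint64_t sk = ((uint64_t)s * k) % TOY_N;
    uint64_t num = (sk + TOY_N - (h % TOY_N)) % TOY_N;

    return (uint32_t)((num * toy_inv(r, TOY_N)) % TOY_N);
}
```

Calling `toy_recover_key(k, h, r, toy_sign_s(k, h, r, d))` returns `d` for any nonzero `k` and `r` modulo `TOY_N`.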

Overview of the vulnerability

The attack starts with the detection of the second topmost bit using a cache-timing attack and then follows the Bleichenbacher methodology. Although the vulnerabilities are similar, we split the discussion into the binary and prime curve cases.

  • For the binary curve case, the Montgomery ladder is implemented in function ec_GF2m_montgomery_point_multiply() within file ec2_mult.c using López-Dahab coordinates. The function computes the scalar multiplication kP for a fixed-length scalar k and input point P = (x,y). The ladder starts by initializing two points (X1,Z1) = P = (x, 1) and (X2,Z2) = 2P = (x^4 + b, x^2). The first loop iteration follows a conditional swap that exchanges these two points based on the value of the second topmost key bit. The first function called within the first iteration is gf2m_Madd(), which starts by multiplying by the value Z1. However, since finite field arithmetic is not implemented in constant time for binary fields, there is a timing difference between multiplying by 1 and by x^2: modular reduction is only needed in the second case. In particular, a modular reduction is computed when Z1 is x^2 after the conditional swap, which happens when the second topmost bit is 1 and the swap actually exchanged the two sets of values. Although the timing difference is very small, it can be amplified by running a FLUSH+RELOAD attack that measures how long the first multiplication takes while multiple background threads penalize the modular reduction code by evicting it from the cache. We observed that the timing difference can be amplified to more than 100,000 cycles on multiple processors, which yields a detection success probability above 95% when FLUSH+RELOAD is used.

  • For the prime curve case, the analysis is a little more involved. OpenSSL implements the Montgomery ladder using optimized formulas for elliptic curve arithmetic in the Weierstrass model. The algorithm is implemented in function ec_mul_consttime() which, despite its name, does not run in constant time from a cache perspective. The ladder again starts by initializing two accumulators r = P (in affine coordinates) and s = 2P (in projective coordinates). The first loop iteration is non-trivial and computes a point addition and a point doubling after a conditional swap. Depending on the key bit, the conditional swap takes effect or not, and in either case only one of the two points is stored in projective coordinates. Both the point addition and point doubling functions have optimizations in place for mixed addition, and our detection targets the point doubling implemented in function ec_GFp_simple_dbl(). When the input point of the doubling function is in affine coordinates, a field multiplication is replaced by a faster call to BN_copy(). This happens when the two accumulators are not swapped in the ladder, meaning that point r in affine coordinates is doubled and the second topmost bit is 0. The timing difference is again very small, but can be amplified to at least 15,000 cycles using performance degradation threads that evict the BN_copy() code from the cache. Our detection code implements the FLUSH+RELOAD technique and correctly determines the second topmost bit with around 99% probability of success.
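The ladder structure shared by both cases can be modelled on a toy group. The sketch below is an illustration under loud assumptions, not OpenSSL code: "points" are integers modulo n under addition (so kP is just k*P mod n), and a `z_is_one` flag stands in for "this accumulator still has a trivial Z coordinate". All `toy_*` names are hypothetical. It shows that whether the first doubling sees the trivial accumulator depends only on the second topmost bit of the scalar, which is the one-bit signal the cache attack picks up.

```c
#include <stdint.h>

/* Toy accumulator: the value is an integer mod n; z_is_one marks whether
 * the accumulator is still in its trivial (Z = 1) representation. */
typedef struct {
    uint64_t v;
    uint64_t z_is_one;
} toy_acc;

/* Branch-free conditional swap: exchanges a and b when bit == 1. */
static void toy_cswap(toy_acc *a, toy_acc *b, uint64_t bit)
{
    uint64_t mask = (uint64_t)0 - bit;   /* all ones if bit == 1 */
    uint64_t t;

    t = (a->v ^ b->v) & mask;
    a->v ^= t;
    b->v ^= t;
    t = (a->z_is_one ^ b->z_is_one) & mask;
    a->z_is_one ^= t;
    b->z_is_one ^= t;
}

/* Computes k*P mod n with a Montgomery ladder over a fixed-length scalar
 * whose top bit (position nbits-1) is set; n is assumed small enough that
 * 2*n fits in 64 bits. *first_dbl_affine reports whether the very first
 * doubling operated on the accumulator with Z = 1. */
uint64_t toy_ladder(uint64_t k, unsigned nbits, uint64_t P, uint64_t n,
                    int *first_dbl_affine)
{
    toy_acc r, s;
    int i;

    r.v = P % n;                 /* r = P, trivial Z      */
    r.z_is_one = 1;
    s.v = (2 * (P % n)) % n;     /* s = 2P, non-trivial Z */
    s.z_is_one = 0;
    *first_dbl_affine = 0;

    for (i = (int)nbits - 2; i >= 0; i--) {
        uint64_t bit = (k >> i) & 1;

        toy_cswap(&r, &s, bit);
        if (i == (int)nbits - 2)
            *first_dbl_affine = (int)r.z_is_one;
        s.v = (r.v + s.v) % n;   /* addition step */
        r.v = (2 * r.v) % n;     /* doubling step */
        r.z_is_one = 0;
        s.z_is_one = 0;
        toy_cswap(&r, &s, bit);
    }
    return r.v;
}
```

In this model the flag comes back as 1 exactly when the second topmost bit is 0, matching the prime curve description (affine r is doubled), and as 0 when that bit is 1, matching the binary curve description (Z1 becomes x^2 after the swap).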

Validation of the attack


We have conducted an experiment that recovers the signing key of a sect163r1 ECDSA key pair given about 2^26 signatures generated by OpenSSL, with relatively modest computational resources (around 3000 CPU hours and 720GB on a high-performance workstation). Even fewer signatures would suffice with a slightly bigger computation.

Our attack code also generalizes to other larger parameters in theory, although the required number of signatures and time complexity are orders of magnitude larger. We're currently executing a practical experiment of our attack against secp192r1.

Impact of the vulnerability

The vulnerability impacts private keys for ECDSA signatures instantiated with the affected curves. The most likely attack scenario targets a server's private key, with the attacker able to execute code on the same machine.

How to fix

A possible fix amounts to implementing coordinate randomization to balance the two possibilities for the key bit in the first loop iteration of the Montgomery ladder. This way, the Z coordinates of both accumulator points will be non-trivial and the multiplication latency will be similar, with a tiny performance penalty.

This pull request implements such a countermeasure for the binary case in version 1.0.2, but we are happy to contribute additional patches for prime curves and version 1.1.0 if necessary.
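The idea behind coordinate randomization can be sketched on a toy field. The example below assumes a small prime `TOY_P` as the field modulus and an x-only projective point (X, Z) representing x = X/Z; the `toy_*` names are hypothetical, and the blinding factor is passed in by the caller, whereas the actual patch draws it at random and rejects zero. Multiplying both coordinates by the same nonzero lambda leaves the represented point unchanged while making Z non-trivial.

```c
#include <stdint.h>

/* Toy parameter (assumption): small prime standing in for the field modulus. */
#define TOY_P 251u

/* Projective x-only point: represents the affine value x = X / Z. */
typedef struct {
    uint32_t X;
    uint32_t Z;
} toy_point;

/* Rescale both coordinates by a nonzero lambda. The represented affine
 * value X / Z is unchanged, but Z is no longer the trivial value 1. */
toy_point toy_blind(toy_point p, uint32_t lambda)
{
    toy_point out;

    out.X = (uint32_t)(((uint64_t)p.X * lambda) % TOY_P);
    out.Z = (uint32_t)(((uint64_t)p.Z * lambda) % TOY_P);
    return out;
}

/* Equality of X1/Z1 and X2/Z2 by cross-multiplication, avoiding inversion. */
int toy_same_point(toy_point a, toy_point b)
{
    uint32_t lhs = (uint32_t)(((uint64_t)a.X * b.Z) % TOY_P);
    uint32_t rhs = (uint32_t)(((uint64_t)b.X * a.Z) % TOY_P);

    return lhs == rhs;
}
```

Blinding the point (42, 1) with lambda = 97, for instance, gives a pair with Z = 97 that `toy_same_point` still reports as the same point, so both ladder accumulators enter the first iteration with a full-size Z.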

Contact information

  • Diego F. Aranha @dfaranha (Aarhus University)
  • Akira Takahashi @akiratk0355 (Aarhus University)
  • Mehdi Tibouchi @mti (NTT Corporation)
  • Yuval Yarom @javali7 (University of Adelaide)

@openssl-machine openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Mar 18, 2020
@mattcaswell
Member

Thanks for your submission!

@romen - can you take a look at this?

Note, that we will need a CLA from all authors of the patch. See https://www.openssl.org/policies/cla.html

The other point to note is that 1.1.0 and 1.0.2 are both out of public support and therefore we're not pushing any fixes to the public branches for those releases. There are some people on extended support for 1.0.2, so we can potentially make this fix available to them.

@cpereida
Contributor

Nice work @dfaranha and team!

Out of curiosity, is there any paper with more details publicly available somewhere?

@dfaranha
Author

Thanks! A paper will be written when our ongoing computation finishes. I also added a patch to fix the prime curve case while preserving the code shared with binary curves.

@romen romen self-assigned this Mar 24, 2020
@romen romen added the branch: 1.0.2 Merge to OpenSSL_1_0_2-stable branch label Mar 24, 2020
@romen
Member

romen commented Mar 24, 2020

Note, that we will need a CLA from all authors of the patch. See https://www.openssl.org/policies/cla.html

@dfaranha is there any update on the CLA matter from all code authors?

@dfaranha
Author

Collecting the last one, hopefully today!

@mattcaswell
Member

Close/reopen to kick CLA bot

@mattcaswell mattcaswell reopened this Mar 25, 2020
@openssl-machine openssl-machine removed the hold: cla required The contributor needs to submit a license agreement label Mar 25, 2020
@mattcaswell
Member

Ping @romen - all CLA issues are now resolved

Member

@romen romen left a comment

Thanks @dfaranha and all the other authors for this contribution, and sorry for the delay in my review: the initial part was done quite early, but then I got delayed entering the rabbit hole of how binary curves end up using ec_mul_consttime().

I have a few comments that I would like to address together.

crypto/ec/ec_mult.c (outdated, resolved)
crypto/ec/ec2_mult.c (outdated, resolved)
crypto/ec/ec_mult.c (outdated, resolved)
crypto/ec/ec_mult.c (outdated, resolved)
Member

@romen romen left a comment

LGTM, thanks @dfaranha for all the work!

@romen
Member

romen commented Apr 6, 2020

@paulidale does it make sense for 1.0.2 to have to deal with what would happen if an entropy source is not available or faulty?
Do we need to add special handling for such case in this patch? (I don't think we even contemplate such a case in 1.0.2 in other places, so maybe it does not make sense to have a special case here and nothing elsewhere)

@romen romen added the approval: review pending This pull request needs review by a committer label Apr 6, 2020
Member

@mattcaswell mattcaswell left a comment

I will defer to @romen on the maths - but the code looks good.

@mattcaswell mattcaswell added approval: done This pull request has the required number of approvals and removed approval: review pending This pull request needs review by a committer labels Apr 6, 2020

/* first randomize r->Z to blind s. */
do {
if (!BN_rand(&r->Z, BN_num_bits(&group->field), 0, 0)) {
Contributor

@bbbrumley bbbrumley Apr 6, 2020

Edit: sorry wrong place in the code for that comment :\

Here you want

BN_rand_range(&r->Z, &group->field);


/* now generate another random field element to blind (x1,z1) */
do {
if (!BN_rand(z1, BN_num_bits(&group->field), 0, 0)) {
Contributor

@bbbrumley bbbrumley Apr 6, 2020

I don't think this is correct. E.g. for B-233 the field polynomial will have 234 bits while field elements have 233 bits. I think it's

BN_rand(z1, BN_num_bits(&group->field) - 1, -1, 0)

where I think the top=-1 allows the top bit to be random and not fixed.

Can you verify in the debugger?
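The off-by-one can be reproduced on a toy field. This is a minimal sketch under stated assumptions: GF(2^8) with the polynomial 0x11B stands in for a real curve field, `toy_num_bits` plays the role of a bignum bit count, and `toy_lcg_next` is a deterministic stand-in for a random source that is not suitable for cryptographic use. All `toy_*` names are hypothetical. A degree-m field polynomial occupies m+1 bits, so field elements need only m random bits.

```c
#include <stdint.h>

/* Number of significant bits in v, like a bignum bit count. */
unsigned toy_num_bits(uint32_t v)
{
    unsigned n = 0;

    while (v != 0) {
        n++;
        v >>= 1;
    }
    return n;
}

/* Deterministic stand-in for a random source (NOT cryptographically
 * secure); a real implementation would call a CSPRNG here. */
static uint32_t toy_state = 1;

uint32_t toy_lcg_next(void)
{
    toy_state = toy_state * 1664525u + 1013904223u;
    return toy_state >> 8;
}

/* Draw a nonzero element of GF(2^m) for the field defined by a nonzero
 * polynomial poly, using next_word as the source of random words. */
uint32_t toy_random_nonzero_element(uint32_t poly, uint32_t (*next_word)(void))
{
    unsigned m = toy_num_bits(poly) - 1;     /* degree of the polynomial */
    uint32_t mask = (m >= 32) ? 0xFFFFFFFFu : ((1u << m) - 1u);
    uint32_t z;

    do {
        z = next_word() & mask;              /* m bits, top bit not forced */
    } while (z == 0);                        /* zero is useless for blinding */
    return z;
}
```

Here `toy_num_bits(0x11B)` is 9 while every element returned by `toy_random_nonzero_element(0x11B, toy_lcg_next)` fits in 8 bits, which mirrors the B-233 case of a 234-bit polynomial and 233-bit elements.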

Member

Uhm, good point, let me check!

Contributor

FWIW this is what we're doing in 1.1.1 and master

/* s blinding: make sure lambda (s->Z here) is not zero */
do {
    if (!BN_priv_rand_ex(s->Z, BN_num_bits(group->field) - 1,
                         BN_RAND_TOP_ANY, BN_RAND_BOTTOM_ANY, ctx)) {
        ECerr(EC_F_EC_GF2M_SIMPLE_LADDER_PRE, ERR_R_BN_LIB);
        return 0;
    }
} while (BN_is_zero(s->Z));

Author

BN_num_bits(group->field) - 1 makes total sense. We wanted to fix the top bit to guarantee that z1 and z2 had exactly the same length, but this is indeed not necessary anymore. I can commit and push both fixes shortly.

Member

@romen romen left a comment

Reconfirm (sorry this slipped under the radar)

Note to self/committer: we should squash all commits in this PR together.

@romen romen added the approval: done This pull request has the required number of approvals label Apr 14, 2020
@romen
Member

romen commented Apr 14, 2020

Draft of the commit message as I intend it to be written upon final merge; @dfaranha, please confirm that this is fine with you (I am trying to avoid asking you for a force-push with squash+msg_edit, as that would technically require reconfirmation and restart the grace period).

Ping to @mattcaswell and @mspncp as well as we discussed these editorial changes privately.

Note to @mattcaswell : please check that in the Co-authored-by: tags I used the emails included in the CLAs.


Implement blinding for EC scalar multiplication


This commit implements coordinate blinding for the generic
implementations of both binary and prime elliptic curves in 1.0.2, to
avoid leaking bits of the scalar and, potentially, bug attacks.

While blinding is implemented in the 1.1.1 and master branches, it was
deliberately decided to avoid backporting those changes as they were
originally written for the newer branches, as the solution adopted there
required major restructuring of code and structures that was deemed not
suitable for 1.0.2.

A group of security researchers and cryptographers from academia and
industry, listed below, reported a successful cache timing attack
in OpenSSL 1.0.2u against specific prime and binary curves whose order
or field length is close to a word boundary.

In this commit, as a possible fix the authors propose implementing
coordinate randomization to balance the two possibilities for the key
bit in the first loop iteration of the Montgomery ladder. This way, the
Z coordinates of both accumulator points will be non-trivial and the
multiplication latency will be similar, with a tiny performance penalty.

The original [PR] includes more details about the reported attack,
literature references and discussions on how the originally proposed fix
was incrementally edited to reflect the relevant details of the 1.1.1
and master branches regarding coordinate blinding.

The authors of the original report and fix are Diego F. Aranha and
Akira Takahashi (both from Aarhus University), Mehdi Tibouchi (NTT
Corporation) and Yuval Yarom (University of Adelaide).

[PR]: https://github.com/openssl/openssl/pull/11361

---

Co-authored-by: Akira Takahashi <takahashi@cs.au.dk>
Co-authored-by: Mehdi Tibouchi <tibouchi.mehdi@lab.ntt.co.jp>
Co-authored-by: Yuval Yarom <yval@cs.adelaide.edu.au>

@romen
Member

romen commented Apr 14, 2020

(also I noticed I was too hasty in applying the approval: done label, as I should wait for a formal reconfirm from Matt)

@dfaranha
Author

Fully agreed, thanks for the writeup!

@mspncp
Contributor

mspncp commented Apr 14, 2020

@romen the commit message looks great, only two nits:

  • Please add only a single empty line between subject line and body.
  • The separator line (---) before the footer might be common in emails but is rather uncommon in Git commit messages IMO. I would omit them.

@mspncp
Contributor

mspncp commented Apr 14, 2020

@romen another nit: instead of adding a markdown link to the pull request

The original PR includes more details about the reported attack,

via an explicit reference link

[PR]: https://github.com/openssl/openssl/pull/11361

I would just use the hashtag notation which GitHub autoconverts to links:

The original pull request #11361 includes more details about the reported attack,

Because the URL in your commit message would duplicate the merged-from annotation added by the addrev script.

(Side note: I am surprised that [PR] works for reference links; I thought the correct syntax for a collapsed reference link was [PR][])

@mspncp
Contributor

mspncp commented Apr 14, 2020

(Side note: I am surprised that [PR] works for reference links; I thought the correct syntax for a collapsed reference link was [PR][])

TIL: it's called a shortcut reference link.

@romen
Member

romen commented Apr 14, 2020

Here is the updated commit message taking into account @mspncp feedback:

Implement blinding for EC scalar multiplication

This commit implements coordinate blinding for the generic
implementations of both binary and prime elliptic curves in 1.0.2, to
avoid leaking bits of the scalar and, potentially, bug attacks.

While blinding is implemented in the 1.1.1 and master branches, it was
deliberately decided to avoid backporting those changes as they were
originally written for the newer branches, as the solution adopted there
required major restructuring of code and structures that was deemed not
suitable for 1.0.2.

A group of security researchers and cryptographers from academia and
industry, listed below, reported a successful cache timing attack
in OpenSSL 1.0.2u against specific prime and binary curves whose order
or field length is close to a word boundary.

In this commit, as a possible fix, the authors propose implementing
coordinate randomization to balance the two possibilities for the key
bit in the first loop iteration of the Montgomery ladder. This way, the
Z coordinates of both accumulator points will be non-trivial and the
multiplication latency will be similar, with a tiny performance penalty.

The original GitHub Pull Request #11361 includes more details about the
reported attack, literature references and discussions on how the
originally proposed fix was incrementally edited to reflect the relevant
details of the 1.1.1 and master branches regarding coordinate blinding.

The authors of the original report and fix are Diego F. Aranha and
Akira Takahashi (both from Aarhus University), Mehdi Tibouchi (NTT
Corporation) and Yuval Yarom (University of Adelaide).



Co-authored-by: Akira Takahashi <takahashi@cs.au.dk>
Co-authored-by: Mehdi Tibouchi <tibouchi.mehdi@lab.ntt.co.jp>
Co-authored-by: Yuval Yarom <yval@cs.adelaide.edu.au>

@openssl-machine
Collaborator

24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually.

@t8m t8m added approval: ready to merge The 24 hour grace period has passed, ready to merge and removed approval: done This pull request has the required number of approvals labels Apr 15, 2020
romen pushed a commit to romen/openssl that referenced this pull request Apr 24, 2020
@mattcaswell
Member

The other point to note is that 1.1.0 and 1.0.2 are both out of public support and therefore we're not pushing any fixes to the public branches for those releases. There are some people on extended support for 1.0.2, so we can potentially make this fix available to them.

I have now made this available to our support customers (via git), and it will be included the next time we do a 1.0.2 release for them.

Thank you very much for your contribution!

@romen
Member

romen commented May 26, 2020

@dfaranha is this the related paper?
https://eprint.iacr.org/2020/615

If it is, I would suggest to edit the description to add a link to the manuscript for future reference!

Thanks again to all the team for your contribution!

ghost pushed a commit to vmware/photon that referenced this pull request Jun 1, 2020
Updated openssl to 1.0.2v to include this PR:
openssl/openssl#11361

Also, removed changes in tools/c_rehash as this repo
have tools/c_rehash.in which will reflect the changes
in tools/c_rehash.

Change-Id: Iacba3c048365374b672fa9c8350d77d128c71fbb
Signed-off-by: Tapas Kundu <tkundu@vmware.com>
Reviewed-on: http://photon-jenkins.eng.vmware.com:8082/10154
Tested-by: gerrit-photon <photon-checkins@vmware.com>
Reviewed-by: Anish Swaminathan <anishs@vmware.com>
@mspncp
Contributor

mspncp commented Jun 12, 2020

Note: today I stumbled on Twitter over this blog post about the LadderLeak attack:

ECDSA: Handle with Care

laffer1 added a commit to MidnightBSD/src that referenced this pull request May 21, 2022
Obtained from: openssl/openssl#11361
@ytrezq

ytrezq commented Apr 17, 2024

@dfaranha Can your work be adapted for Koblitz curves (secp256k1)? Or does it also rely on how OpenSSL signs things, in addition to the bit leakage?

@bbbrumley
Contributor

@dfaranha Can your work be adapted for Koblitz curves (secp256k1)? Or does it also rely on how OpenSSL signs things, in addition to the bit leakage?

@ytrezq projective coordinate blinding, including secp256k1, has been around since about 2018 from .. v1.1.0+?

This particular PR was just about backporting that feature to older v1.0.2, with a very different OpenSSL API.

@ytrezq

ytrezq commented Apr 17, 2024

@bbbrumley I wasn't thinking about OpenSSL but about other projects where 1 bit of the nonce is leaked in general. So I wanted to know if the current exploit method could be repurposed for Koblitz curves.

@bbbrumley
Contributor

Ah I see.

Yes, I believe the answer is yes. You need Bleichenbacher-style techniques for less than 3 bits of nonce leakage. But historically speaking, yes -- nonce bias in ElGamal-family signatures doesn't end well, from the security perspective.

@dfaranha
Author

dfaranha commented Apr 17, 2024

Yes, an implementation of secp256k1 leaking one nonce bit per ECDSA signature would be vulnerable in the same way as LadderLeak describes. Please send me an email if you want me to take a look. :)

I should quickly add that curves with efficient endomorphisms such as secp256k1 might also be affected in a different way if they produce biased subscalars through the GLV method. We study such attacks in our Asiacrypt'14 paper.

For a similar vulnerability found mere days ago (more leakage -> lattice attacks) see https://cert.europa.eu/publications/security-advisories/2024-039/

@ytrezq
Copy link

ytrezq commented Apr 30, 2024

@dfaranha: just a question: knowing the variables disclosed at https://etherscan.io/address/0x271682deb8c4e0901d1a1550ad2e64d568e69909#code#F27#L562 would it be possible to do something similar on this special type of secp256k1 curve use?
