Apply nonce bit-length mitigation to stop timing leakage. #125
Conversation
Are you sure that this fixes Minerva? It may decrease the side channel, but I'm quite sure it doesn't eliminate it. See also: https://securitypitfalls.wordpress.com/2018/08/03/constant-time-compare-in-python/
@@ -86,6 +86,9 @@ def __eq__(self, other):
         else:
             return False

+    def __neg__(self):
+        return Point(self.__curve, self.__x, self.__curve.p() - self.__y)
+
Why is point negation needed?
Not really needed for the fix itself, but scalar multiplication can handle negative scalars now.
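The identity being relied on is (-k)·P = k·(-P). A toy sketch of that dispatch, using the additive group of integers modulo a hypothetical order N as a stand-in for the curve group (the names and values here are illustrative, not the library's actual code):

```python
N = 101  # hypothetical group order; the integers mod N stand in for the curve group

def neg(p):
    """Group negation, the analogue of Point.__neg__."""
    return (N - p) % N

def scalar_mult(k, p):
    """Multiply group element p by an integer scalar k; negative k allowed."""
    if k < 0:
        # (-k) * P == k * (-P): reduce to a non-negative scalar
        return scalar_mult(-k, neg(p))
    return (k * p) % N
```

With this in place, `scalar_mult(-3, 5)` and `scalar_mult(3, neg(5))` give the same group element.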
Branch updated from 01988ac to 4db08ae.
It is a comparison of rather small ints, both of which might be in the small-int cache that Python maintains; the leakage there is so small that the side channel is thwarted by other noise. The example you sent is a string compare, which obviously is not constant time. The conditional addition is what could be a problem, so I pushed a change that fixes that.
My article is not just about comparison: all integer operations in Python, be it addition, multiplication, or exponentiation, use variable bit-length arithmetic. I focused on small integers because that's what's needed for implementing Lucky 13 mitigations, but if small integers don't use a fast path, large integers won't use it either. That means it is impossible to create a constant-time implementation of any arithmetic in Python. And this is before even going beyond a naïve implementation of elliptic-curve arithmetic.
The https://eprint.iacr.org/2011/232.pdf article doesn't talk about a Python implementation, so I will need to see something more specific before considering it anything more than scientific theory. In Lucky 13 mitigation code written in Python, I am able to see correlation between time and the value of the padding-length byte, in code that hashes 64 KiB of data using SHA-256 in the process of verifying the TLS CBC ciphertext. I'd consider that a lot of noise.
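As a side note on the comparison point: for byte strings the standard library does provide a timing-safe comparison, `hmac.compare_digest`, whose running time depends only on the length of its inputs. This does not help with Python's variable-time big-integer arithmetic discussed above; it only covers the byte-compare case from the linked article:

```python
import hmac

# `==` on bytes can return as soon as a byte differs;
# hmac.compare_digest examines the full input regardless of where
# the first mismatch occurs.
ok = hmac.compare_digest(b"expected-mac", b"expected-mac")
bad = hmac.compare_digest(b"expected-mac", b"expected-mad")
print(ok, bad)  # prints: True False
```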
This assumes the rest of the code is constant-time; as the security section in the README states, that is not the case. See how multiplication by a scalar with the same bit length but higher Hamming weight requires more time:

In [14]: from ecdsa.ecdsa import generator_256
In [15]: %timeit generator_256 * 0x8000000000000000000000000000000000000000000000000000000000000000
16.8 ms ± 35.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [16]: %timeit generator_256 * 0xc000000000000000000000000000000000000000000000000000000000000000
16.9 ms ± 58.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [17]: %timeit generator_256 * 0x8000000000000000000000000000000000000000000000000000000000000000
16.3 ms ± 48.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [18]: %timeit generator_256 * 0xc000000000000000000000000000000000000000000000000000000000000000
16.7 ms ± 16.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [19]: %timeit generator_256 * 0x8000000000000000000000000000000000000000000000000000000000000000
16.4 ms ± 41.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [20]: %timeit generator_256 * 0xc000000000000000000000000000000000000000000000000000000000000000
16.8 ms ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [21]: %timeit generator_256 * 0x8000000000000000000000000000000000000000000000000000000000000000
16.9 ms ± 7.59 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [22]: %timeit generator_256 * 0xc000000000000000000000000000000000000000000000000000000000000000
16.9 ms ± 3.81 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [23]: %timeit generator_256 * 0x8000000000000000000000000000000000000000000000000000000000000000
16.6 ms ± 62.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [24]: %timeit generator_256 * 0xc000000000000000000000000000000000000000000000000000000000000000
16.7 ms ± 46.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
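The pattern above is what a textbook left-to-right double-and-add ladder would produce: the group addition runs only for set bits, so the scalar's Hamming weight shows up in the running time. A toy sketch of why, again using the additive group of integers mod a hypothetical order as a stand-in for the curve (this is the generic algorithm, not the library's exact code):

```python
N = 0xFFFFFFFF00000001  # hypothetical group order for the toy group

def scalar_mult(k, p):
    """Left-to-right double-and-add in the additive group mod N."""
    acc = 0   # identity element
    adds = 0  # count of conditional additions -- the observable leak
    for bit in bin(k)[2:]:
        acc = (acc + acc) % N      # double: runs once for every bit
        if bit == "1":
            acc = (acc + p) % N    # add: runs only for set bits
            adds += 1
    return acc, adds

# A scalar like 0x80...0 (Hamming weight 1) triggers one addition, while
# 0xc0...0 (weight 2) triggers two, so the higher-weight scalar does
# strictly more work over the same number of loop iterations.
```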
Yes, the scalar mult here obviously leaks Hamming weight, but that is not exploitable via an HNP-like construction, where you need to be sure of your guess of consecutive bits of many nonces. This PR fixes an easily exploitable full private-key recovery vulnerability; I am not saying that it fixes all secret-dependent timing leaks, just this one. I don't see how moving towards not leaking so much easily exploitable information is controversial: an attacker would have a much harder time exploiting the remaining leaks than this one. (If you could get an HNP-like private-key recovery attack from the Hamming weight of nonces, it would surely make for an interesting paper.)

Sure, the README has a note saying that this is not constant time and might leak, but that is almost useless; I can install this package and start using it without ever seeing the README (and/or understanding why timing is an issue).
Hmm, I didn't think about how strong the signal from the position of the second bit would be, and it doesn't look like it is very strong (I'm quite sure it's there, it's just harder to show):

In [34]: %timeit generator_256 * 0x9000000000000000000000000000000000000000000000000000000000000000
16.6 ms ± 38.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [35]: %timeit generator_256 * 0xc000000000000000000000000000000000000000000000000000000000000000
16.6 ms ± 38.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [36]: %timeit generator_256 * 0x8100000000000000000000000000000000000000000000000000000000000000
16.6 ms ± 32.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Because it is futile for a library that has "being implemented in pure Python" as its primary goal. But given how simple the fix is, I'm OK with merging it; please fix the Python 2.6 compatibility issue, though.
Well, if anybody uses software for security applications without reading its documentation, I have bad news for them... That being said, any suggestion for improving that text? Do you think adding something like "if the attacker is able to measure the time it takes to generate the key or make a signature, he or she will be able to recover the private key" would be enough, or should we go straight for "This library, like all cryptographic libraries implemented in pure Python, is insecure. Please use python-cryptography instead."?
Since bit_length and orderlen are utility methods, they should not be defined in main modules (curves) or in modules for specific features (rfc6979). Move them to the util module, but keep them importable from the old places.
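The suggested layout could be sketched like this (the file split and re-export line are an illustration of the pattern, and the helper body shown is the usual byte-length computation, not necessarily the library's exact source):

```python
# util.py -- single shared home for the small helpers
def orderlen(order):
    """Number of bytes needed to represent the curve order."""
    return (1 + len("%x" % order)) // 2

# curves.py and rfc6979.py would then keep the old import paths working
# by re-exporting instead of defining, e.g.:
#     from .util import orderlen, bit_length
```

Callers that did `from ecdsa.rfc6979 import bit_length` would keep working, since a re-export is indistinguishable from a local definition at the import site.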
I've slightly updated the use of bit_length so that it doesn't cause a circular dependency; the main code is unchanged.
This should be a mitigation for the Minerva attack. See https://minerva.crocs.fi.muni.cz for more info.
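The mitigation itself can be sketched as follows: because n·G is the group identity, adding the order n (once or twice) to the nonce k leaves the resulting point unchanged but fixes the scalar's bit length at one more than the order's, so the number of ladder iterations no longer depends on k's leading zero bits. This is a sketch of the general technique, not necessarily this PR's exact code; the P-256 order below is just an example value:

```python
def fixed_length_scalar(k, n):
    """Return k + n or k + 2n, always with exactly n.bit_length() + 1 bits.

    Since n*G is the identity, (k + m*n)*G == k*G, so the padded scalar
    produces the same point while its bit length is independent of k.
    """
    k += n
    if k.bit_length() <= n.bit_length():
        k += n  # k + n still fits in n.bit_length() bits; pad once more
    return k

# Example with the NIST P-256 group order:
n = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
```

Note this only removes the bit-length leak targeted by Minerva; as discussed above, Hamming-weight and other arithmetic-level leaks remain.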