New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RSA decryption API is very hard to use safely #13421
Comments
I think changing what the current RSA_PKCS1_PADDING does is
probably not a good idea. People will most likely rely on the
current behaviour where an error is returned, and I think your
proposal includes never returning an error. I have no idea how most
applications will then continue with that random data.
So if you want to change the behaviour, I think we need to add a
new option to the function that then documents that behaviour.
|
I agree changing the behavior to map the error away wouldn't work well. While any application that relies on it is likely already vulnerable to RSAES-PKCS1-v1_5's flaws, silently changing the success criteria won't make them any better. It's also not clear to me how it'd work anyway. Rather, I'd suggest:
We also need to be clear on what we can accomplish here. The swap to random isn't a general solution to make RSAES-PKCS1-v1_5 safe. It's something that specifically works for TLS (and SSH?) because of how the rest of the protocol is constructed. For instance, it doesn't work for SSL2. |
but when would that matter in regular use case, when not under attack?
the basic assumption of the attack is that the attacker can force the application to process an RSA ciphertext; the attacker generally also has access to the public key; so the application already must handle messages that will decrypt to unexpected lengths—the attacker can create ciphertexts that decrypt to arbitrary messages already |
that's exactly my point, we don't need to know what's the length application expects: if there are two cases that can generate message of given length, namely an actual valid PKCS#1 ciphertext that decrypts to such length, and a ciphertext for which we create a message of a given length, and the attacker can't differentiate between them, then they can't use it as an oracle in the Bleichenbacher attack, making the whole attack fail please read the opening post again: I'm proposing a modification that works even if the attacker can get access to the supposedly decrypted value
just because we document it, won't change code that already uses OpenSSL, it will remain insecure the only option is to remove support for PKCS#1 v1.5, but that's not realistic
Is there an SSH Key exchange that uses RSA encryption? I'm not familiar with it, but then, I've only looked at what OpenSSH and libssh support...
I don't think so
that's because nobody tried to implement workarounds in SSLv2, we just killed it |
Ah, I see. My initial thought is that the synthesized length will be unlikely to match the application's expected output size. (E.g. consider a naive TLS-like protocol using OpenSSL's API. It would check for But that's not really the probabilities we're comparing. The oracle will return 1 if:
PKCS-conforming and S-PKCS-conforming plaintexts are common enough that we can find them, but they're still fairly rare, so I think the second bucket is... usually larger. (Bleichenbacher's paper bounds the probability of PKCS-conforming ciphertext between 0.18 * 2^-16 and 0.97 * 2^-8 for RSA-512 and higher. For RSA-2048, the first term seems to be around 0.6. The second term is closer to 2^16 if your modulus is a multiple of 8 bits.) That might be enough to frustrate the attack? This probably needs some more analysis. (This also assumes we're okay blanket-removing the error case from all uses of RSAES-PKCS1-v1_5. See note below. If we're making a new API, I don't think there's any need to play games with lengths and we can just ask the caller to tell us the length they expect. An API tailored to session keys also nudges developers away from using RSAES-PKCS1-v1_5 as a general-purpose encryption scheme.)
Yeah, any fix to RSAES-PKCS1-v1_5's flaws requires changing the behavior of that existing code. Whether that behavior change comes from OpenSSL or the application, depends on the change. We can add a new API with the fixed behavior and get folks to migrate, or we can change the existing API's behavior. The latter gets us where we want faster, but carries a risk of breaking things if applications were relying on the old behavior. If, say, fixing a bug in the TLS implementation, it makes sense to just change the existing API. But here we're talking about dramatically changing the meaning of an existing API.
That's not the only option. We can remove support for that API, which would ask the remaining RSAES-PKCS1-v1_5 users move to the new one.
I'm not very familiar with SSH either. It looks like I may be wrong here (yay!). For some reason I remembered seeing DecryptPKCS1v15SessionKey used by Go's SSH implementation as well, but now I can't find it. I'm probably misremembering. (Or perhaps it was some old mode and they since removed it?)
Hehe, you've gotten me to read the DROWN paper again to remind myself how this works. :-) The SSLv3/TLS swap-with-random countermeasure works because we're effectively using the client Finished message (which immediately follows ClientKeyExchange) as an authenticator. That is, whether the counter measure works depends on the rest of the protocol. SSLv2 doesn't look like that. See section 4.1. It's got a ServerVerify message in between ClientMasterKey (SSLv2 ClientKeyExchange) and client Finished, where the server encrypts some data. When using an export cipher, the master key was too short so the attacker could observe the mitigation triggering some non-determinism. There's also a variation based on some OpenSSL bug (section 5.1). Of course, export ciphers and the OpenSSL bug are a problem in themselves, but the point remains that we depend on the rest of the protocol. DROWN also relied on the countermeasure using a random substitute. Perhaps some deterministic replacement, like the HMAC you suggest, would have worked. |
and that noisiness is precisely what helps us, there are two scenarios that can generate S-PKCS conformant plaintextext and the attacker can't differentiate between them, even if the hypothetical TLS implementation using the API leaks like a sieve
yes, it's enough if we synthesize the message deterministically; as the attack will fail if even one positive guess is incorrect and we will return "yes, the decryption went fine" for invalid ciphertexts far more often than for valid, so for the few tens of thousands of oracle calls necessary (in the optimistic scenario of system giving positive signal for positive decryptions of any length) with the improved algorithm we're basically certain that at least one guess by the attacker will be incorrect (I mean 0.5^20000 is as good as impossible, and attacker has lower chances than 0.5 of hitting real S-PKCS conforming ciphertexts)
yes, there might be uses that require to get a real error in case of decryption failure, but I hazard a guess that they are far and few in between, so it's better to document well and special case them than to special case regular use (and fix the 10000 code examples on the Internet that tell how to use the RSA decryption API)
I'm afraid it's more like "hope that at least some people migrate, maybe, after much time" than "get folks to migrate"
true, but I still worry that there may be users that don't know the size of the decrypted message a-priori... and an API that behaves like I've described will be still needed |
I did some back-of-the-envelope calculations, and if we assume 20000 oracle calls necessary (very optimistic, for bad version oracle we are usually talking about few million calls) to decrypt an RSA ciphertext, then an oracle having a true positive rate of 0.99555 (a false positive rate of just 0.00445) would still give us a a security margin of 128bits (log_2(0.99555^20000) ~= -128.69) the only realistic attack scenario is with an attacker having access to the decrypted message, and then looking for the ciphertexts that are max size (not uncommon) and then brute-forcing the padding bytes (embarrassingly parallelizable but not cheap at 2^64 RSA encryptions per oracle call²), since it's the attacker that decides if the decryption was successful or not, the few tens of thousands of oracle calls per decryption are realistic, let's assume the 20k calls, ie. 14 bits, and 2^16 cycles per RSA encryption (what I get from 2048 bit RSA on i7 4790K), we're talking about security margin of about 94 bits, which I think is good enough ² - I'm using 2^64 not the 2^32 birthday paradox because the vast majority of ciphertexts will be false positives (only few thousand true positive oracle calls are necessary to decrypt the ciphertext, most oracle calls are expected to fail) and we can prove that the oracle is a false positive only by exhaustion: by showing that there are no padding bytes that will encrypt to the matching ciphertext |
should I read it as "none of the core OpenSSL developers will work on this before 3.0.0 is out" or as "we will not accept a PR for 3.0.0 alpha that implements this" |
Strictly speaking it means the latter - but I'd argue we should should remove the milestone altogether to mean the former. I personally, have no objection to this being addressed, but we don't have the bandwidth within the dev team to take this on at the moment. Note if there are API changes then it would have be completed and merged before beta 1 to be eligible for inclusion in 3.0.0. |
would you consider addition of a new flag to be an API change? not a new symbol, just a new |
Yes - that would be considered an API change IMO. |
I've implemented the proposed fix in #13817, please review. |
merged |
It's well known that incorrect use of RSA decryption functions leads to Bleichenbacher oracles and thus vulnerability to Bleichenbacher attack, allowing for ciphertext decryption or signature forgery.
What's far less well known is how to use RSA decryption APIs correctly. Two most recent examples close to OpenSSL are CVE-2020-25659 (in pyca/cryptography) and CVE-2020-25657 (in M2Crypto).
As such, I think the issue is with the OpenSSL API: it's very hard to use correctly, so applications that do use it will get it wrong.
While the simple solution is to "just stop using PKCS#1 v1.5 decryption", I don't think it's a realistic approach for OpenSSL, it's too widely deployed for us to remove it completely (e.g. JSON Web Tokens use it).
So, I'd like to propose a change to the way the decryption API works, both the RSA_private_decrypt and EVP_PKEY_decrypt with
RSA_PKCS1_PADDING
.My proposal is similar to the one recommended in RFC5246: make decryption always generate a decrypted message, even if padding checks failed.
The Bleichenbacher attack depends on the attacker being able to tell apart the whole systems behaviour in case the
decryption failed because of RSA padding check and between the case where the decryption was successful but later use of the decrypted message failed (for example, because the later HMAC check failed). Note: during the attack, certain fraction of attacker-generated ciphertexts will have valid PKCS#1 padding, may decrypt to the message of expected length or even have expected structure (like the case of TLS pre-master secret starting with the TLS version).
What's important, is that the attacker doesn't know what the ciphertexts he or she sends decrypt to.
Let's consider a case of an attacker that can both perform the API call and get the decrypted value itself.
In such case, if it feeds the messages generated from the Bleichenbacher algorithm to the API and gets some random message
back, the only way of knowing if that value is an actual decrypted value or if it was the randomly generated secret is to generate all possible ciphertexts with that message using all possible padding values: task with a work factor of at least 2^64 (as the minimal padding length is 8 bytes) and far more for if the returned message is smaller.
If the attacker can't see the actual decrypted value, just that the application received a secret message of specific length, then there is no offline operation that it can perform to guess if that message length really came from the ciphertext or if it was the result of the random number generator.
So, I think that both
EVP_PKEY_decrypt
andRSA_private_decrypt
with PKCS#1 padding should use something likeHMAC(hash(private_key), ciphertext)
to seed an RNG that they then use to select the length of returned message and the contents of it.Any arguments against it?
(yes, I've read #8446)
The text was updated successfully, but these errors were encountered: