-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix unsafe bounds check in ssl_load_session() #2131
Conversation
Use `( x >> y ) & z` instead of `x >> y & z`. Both are equivalent by operator precedence, but the former is more readable and the commonly used idiom in the library.
04b012e
to
bfccb3c
Compare
I'm sorry, I don't think I like the casting to an While I agree it's correct in this case, we need to rely on external knowledge (that the value is unsigned 24-bit and the so far undocumented assumption that
For these reasons, I think I prefer Guido's proposal. It turns out it's also currently the most used in the library ( |
ChangeLog
Outdated
@@ -7,6 +7,8 @@ Bugfix | |||
invalidated keys of a lifetime of less than a 1s. Fixes #1968. | |||
* Fix failure in hmac_drbg in the benchmark sample application, when | |||
MBEDTLS_THREADING_C is defined. Found by TrinityTonic, #1095 | |||
* Fix an unsafe bounds check in ssl_load_session(). Reported by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "unsafe bounds check" is alarming enough that people might question why it's in the Bugfix section, not Security. We might want to make it clear that this is not exploitable without breaking the authentication of the session ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with the wording now, but I'd also be ok with classifying this under “Changes” with wording like “Improve robustness”, since it's fixing a second line of defense.
bfccb3c
to
1e4013f
Compare
@mpg Thanks for your comments, I changed the PR. @mpg @AndrzejKurek Please re-review. |
@hanno-arm Thanks for making the changes. However I'm not entirely happy yet, as correctness is still not as local as it could be (we need to remember that I'd like us to use the following idiom everywhere:
This idioms works a 100% of the time with 0 previous knowledge assumed. If, inside a particular function, the compiler can prove that (I think micro optimisation trumps readability only in the core loop of computation-heavy crypto primitives. Parsing code is quite the opposite: it's not nearly as CPU intensive, while being extremely sensitive to bounds issues. Please note that I'm only talking about micro optimisations here, obviously if we have a choice between a quadratic and a linear algorithm, or a memory usage of N vs 2N, that kind of optimisation is worth considering.) |
@mpg Thanks for your detailed comments! You persuaded me that in principle your proposal, matching Guido's, is the safest to use. I am not concerned about performance, but rather about code size - we have a large number of bounds checks in the library, and if we add |
@hanno-arm I see your point. Code size is indeed a more serious issue than performance, and worth considering. I must say I'm not sure what I think any more in that particular case. On one hand, you're making good points. On the other hand, I investigated a bit and found the code size impact for checking for Here's how I got this number, so you can criticize if you think I'm mistaken: #include <stdlib.h>
int hanno( size_t len, const unsigned char *p, const unsigned char *end ) {
return( (int) len > end - p );
}
int guido( size_t len, const unsigned char *p, const unsigned char *end ) {
return( p > end || len > (size_t)( end - p ) );
} and observed the size of the functions with: I would be interested in doing more tests to see if the compiler can optimize away some of the superfluous Of course in an ideal world we would have formal verification of bounds for all our parsing functions, so we could confidently remove any redundant check and be sure that we still have enough check, and that it will remain that way when adding code in the middle of the function later, but we don't live in that world yet. |
@mpg Thank you for doing that analysis and for explaining how you got the numbers! I think the fact that this simple example would allow us to save 300 Bytes of code indicates that we should take the underlying question seriously: Which implicit invariants are simple enough to reason about that we don't need to assert them, and which are complex enough that we do want to include an assertion? While I think we must be careful not to allow too complex informal reasoning in the former approach, I also think we shouldn't too easily fall back to the former. In this particular case, for example, I think it is justifiable to omit the Also, a check like (***) In this case, one might even be tempted to argue that |
library/ssl_ticket.c
Outdated
@@ -215,14 +215,17 @@ static int ssl_load_session( mbedtls_ssl_session *session, | |||
size_t cert_len; | |||
#endif /* MBEDTLS_X509_CRT_PARSE_C */ | |||
|
|||
if( p + sizeof( mbedtls_ssl_session ) > end ) | |||
if( end < p ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this check really strange. What is the semantics of the arguments buf
and len
? They designate a buffer, so buf
must be the start of a buffer of len
bytes. end
has to be >= buf
.
And below, end - p
is len
. The one check at this point should be sizeof( mbedtls_ssl_session ) <= len
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gilles-peskine-arm This again is a question of what we take as guaranteed and what not. If we assume that the caller uses sane arguments, then yes, we can omit the check. If not, we have to include it to guard a wraparound. OTOH, even if buf + len
doesn't overflow, we cannot check that buf, ..., buf + len - 1
is actually valid memory, we have to assume it. So I'd agree that it's cleaner to remove the check end < p
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And below, end - p is len. The one check at this point should be sizeof( mbedtls_ssl_session ) <= len.
I prefer to leave that to the compiler and aim at using a single idiom for bounds checking.
library/ssl_ticket.c
Outdated
return( MBEDTLS_ERR_SSL_BAD_INPUT_DATA ); | ||
|
||
memcpy( session, p, sizeof( mbedtls_ssl_session ) ); | ||
p += sizeof( mbedtls_ssl_session ); | ||
|
||
#if defined(MBEDTLS_X509_CRT_PARSE_C) | ||
if( p + 3 > end ) | ||
if( 3 > end - p ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remaining length in the buffer is len - sizeof( mbedtls_ssl_session )
. The current code is ok, but I'd find this bit more readable if it used len -= sizeof( mbedtls_ssl_session )
at the same time as incrementing p
, and then checked len >= 3
. However I'm ok with keeping the current structure because it's more uniform with other parsing code in the library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are two different idioms to use for bounds checks: Either incrementing p
, keeping end
, and somehow inferring the remaining size from p
and end
, or incrementing p
while decrementing a length parameter like len
. I think the former is more commonly used, so I'd prefer to stick to it here.
On the topic of |
@mpg @gilles-peskine-arm @AndrzejKurek Please re-review. |
Fixes Mbed-TLS#659 reported by Guido Vranken.
76de3e1
to
698c5d7
Compare
698c5d7
to
a7d2fa7
Compare
@@ -215,14 +215,14 @@ static int ssl_load_session( mbedtls_ssl_session *session, | |||
size_t cert_len; | |||
#endif /* MBEDTLS_X509_CRT_PARSE_C */ | |||
|
|||
if( p + sizeof( mbedtls_ssl_session ) > end ) | |||
if( sizeof( mbedtls_ssl_session ) > (size_t)( end - p ) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style: we normally put a space between (type)
and ( value )
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's the case: egrep '(int|char|size_t)\)\(' library/*.c | wc -l
prints 307 while egrep '(int|char|size_t)\) \(' library/*.c | wc -l
only prints 93. Admittedly, one of them should print 0 if we were fully consistent, but most of the time we're not putting a space here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. So it's (foo_t) single_word
with a space (~90% prevalence) but (foo_t)( more complex expression )
(~80% prevalence). Can we switch to a standard whitespace style in 3.0?
On the topic of checking for |
Summary: This fixes #659. See there for more information, as well as the commit message of aa7503d.
Internal Reference: IOTSSL-1046.