-
-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
statem: fix the alert sent for too large messages #8300
Conversation
This does not seem to have anything to do with the specification text cited in #8293, which is talking about a specific syntax error. As noted in #8189 (comment), decode_error is incorrect in the general case for this DoS check, which may overlap with any number of inputs, valid or invalid against the formal grammar. |
if you're saying that the test case is incorrect, I'd appreciate a bit more precision in stating how it is not creating a ClientHello with data following extensions field
which section of RFC states that? the only reference to DoS in RFC 8446 I could find is in relation to |
The problem is your test case does not consider inputs where multiple things may be wrong. In that case, the error emitted is a function of processing order. Consider instead a Certificate message which exceeds It is true that the specification does not include text for implementation-defined limits on message sizes. This is perhaps an error in the specification. They are necessary for any production TLS implementation. Fixating on that input for this code is quite unproductive, as this code is clearly not implementing the scenario you are describing. It is checking an additional local policy in the TLS implementation. You must realize this, as the actual PR to achieve your goals here would be to remove the check entirely. But this is, of course, a bad idea for DoS reasons, so you haven't done this. I assume your goal here is actually to smooth over network-visible differences between TLS implementations. That is a fine goal, but hopeless for TLS as it is. There are far too many low-level error scenarios with different alerts for the choice of alert to be unambiguous. A more constructive path towards this goal would be to produce a I-D to fold together all these needlessly separate alerts. The following should be one alert:
and perhaps others. I would happily support such a document. Arguing the philosophical differences between decode_error and illegal_parameter is a waste of time. I regret my part in trying to pin down their meanings for TLS 1.3. That was clearly a mistake. |
so the test case is wrong because it is testing the exact scenario described in the RFC?!
so you're saying that by changing that alert I'm changing behaviour that is not sanctioned by the RFC? How is that a problem?
please, can you stop attacking me and focus on the code?
I'm sorry to hear that, because as a software tester I find them invaluable. And I definitely don't mean in the sense of generating easy bugs. |
It is defined in OpenSSL's implementation. See here: OpenSSL has a configurable maximum certificate size. This limit is smaller than what the syntax of Certificate would otherwise prescribe and, as you say, is not sanctioned by the RFC. That limit is enforced by exactly the code this PR changes. That means one can construct inputs which trip this that additionally trip pretty much any other Certificate error condition, the exact same sort of scenario that your test input is targeting. By this PR's interpretation (unsanctioned limits cannot override the alert to send), there is no single alert one could possibly return from that codepath which would satisfy the specification. |
I feel like we're repeating the same over and over again. The code that was change did not have a problem parsing the protocol, for that code it seems wrong to return decode_error. The code that was changed also does not detect that there was data after the client hello, it just detects that the size is larger than some limit that's not in the RFC. I can't actually find a good error to send in that case, illegal_parameter seems like the closest match. Note that the test triggers behaviour that's not in the RFC, and so I think there is little point in changing the alert. You can probably still detect this behaviour regardless of what the actual alert is that is being send. Anyway, I think you should limit the amount of data that you send after the client hello so that the maximum size is not triggered and that parsing the data you send fails instead. |
how is that relevant? you can write code that is able to parse and interpret all kinds of malformed messages, just because a particular implementation is able to do something with it, doesn't mean that the standard prescribed alert should not be sent
there is no valid ClientHello of that size that can be created without adding data after extensions field, ergo checking the length of this kind of messages is enough to say that they are malformed
thing is, it is breaking behaviour that is specified by the RFC, see section quoted in #8293
please check again the tlsfuzzer output, I think you must have missed few things there while reading it the first time |
but I'm not arguing what is the correct handling of messages that have multiple errors in them, you are, and you are not providing any standard references to back need for any particular behaviour in such cases
I'm sure that if you study the test case again, you will notice that this is not what it is doing
well, duh? rejecting large Certificate messages is not allowed by the standard so of course there is no standard prescribed alert message for that condition. There is nothing to reconcile from specification perspective. I do agree that having limit on Certificate message is something that is an acceptable DoS mitigation and acceptable departure from the RFC. But this is just my opinion, not a fact. While the fact that current OpenSSL behaviour is inconsistent with the RFC remains. |
‚send‘ -> ‚sent‘ (commit message and pr title) |
bca986c
to
e935036
Compare
@mspncp good catch, updated |
With the current definition of the maximum acceptable length of client hello it So I do not think the patch as is is correct. It would have to special-case the messages for which the limit is derived from the spec and for these return decode_error and for the rest return invalid_parameter. I am not sure it is worth it to complicate the code. |
that looks more like a bug than like something what we want to leave in:
cases such like?
yes, this PR is not making the situation perfect, but I think it improves it. If you don't think so, let's go through the cases and see if this is the case or not. |
The reason for the limit is to prevent DoS attacks:
Yes, exactly. Increase the lmit if your system needs to handle this. 8K RSA keys are ridiculous in production environments.
When OpenSSL supports quantum-safe algorithms, then something will have to be done. but it doesn't, so we don't now. ;0 |
not any more "ridiculous" than 8k DH keys, also it's inconsistent with how such keys behave on server side, or with different key exchange methods
except people are already doing tests to check how PQC integrates with existing protocols and are modifying openssl to do that; fighting such arbitrary limitations makes that process much more frustrating anyway, point 3 stands and RFC still states
and
|
RFC 8446 states that ClientHello with trailing data must be rejected with decode_error alert Signed-off-by: Hubert Kario <hkario@redhat.com>
e935036
to
15b33d8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I was convinced by Hubert that this patch improves the situation in regards to testing of the RFC conformance of the response to incorrect ClientHello and it does not make the situation worse in other cases. I am thus approving this. Please @openssl/omc consider approving it.
Hubert, that quote from the RFC is predicated on the server "negotiating a version of TLS prior to 1.3". If the server decides that a ClientHello looks bogus and will be rejected because it's too big, it has not even gotten to the point of negotiating a version yet, and so it's not clear that the following requirement applies. |
@kaduk
second, which part of the RFC states that arbitrary data can be inserted at the end of a constructed type? how is it consistent with the rest of the RFC? third, RFC 5246 Section 7.4.1.2 is quite explicit in stating that this kind of following data is forbidden:
and as you've rightfully pointed out, at this point in the code, we don't know which version of TLS we are negotiating, but rejecting both TLS 1.2 and TLS 1.3 ClientHello such formatted with decode_error is valid with both RFCs, rejecting it with illegal_parameter is not |
I'm pretty sure the "if not" means "not [contains no data after legacy_compression_methods or contains valid extensions block]"; that's the more local binding, grammatically. That is, still conditional on negotiating 1.2.
Nothing says that such arbitrary data can be inserted. The point is that we don't even have to check if we already know that the packet is not something we can handle, before we start parsing it.
We don't know what version of TLS we're negotiating, but we do know that it's not something worth spending our processing time on. If we were considering negotiating any version of TLS at this point, then these RFC directives would apply, but we have already decided that we are not negotiating anything, and rejecting the connection. Would you be happier if we decreed that "the other end is clearly talking nonsense" and sent a TCP RST? It really feels like you're tilting at a windmill, here. |
exactly, just by looking at the length of the message, you can tell that its encoding is invalid, in TLS 1.0, TLS 1.1, TLS 1.2 and in TLS 1.3. This change makes sure that it is rejected with an alert description prescribed by the RFC 5246 (partially by 8446) and not contradicted by any other RFC.
No, I expect an implementation to be consistent in the way it handles errors. If it sends Alert messages, it should send Alert messages always.
Well, I consider this issue to be as clear-cut as they can be in TLS RFCs, and the change to address it, I think we can all agree, is quite trivial. If we can't agree on such obvious things, I'm second guessing myself if I should be trying to contribute to OpenSSL in the first place. I don't expect OpenSSL developers to work on fixes to make testing of OpenSSL easier to other people, but if I am proposing changes to make testing of OpenSSL easier, changes that don't impact anybody else, and they get this kind of treatment, I see it as OpenSSL not being interested in 3rd party testing of their code. |
@t8m since more than a month has passed since your review, I didn't dare to upgrade your review retroactively to an OTC review. Please reconfirm. |
I would like to see another OTC member review. I still think this change won't break anything, on the other hand from the discussion it is obvious it is a controversial change. |
I have now raised the OTC vote and will report back on the status of the vote result. |
to summarise my position for the vote:
|
The vote to approve this PR did not pass - i.e. the OTC did not vote to accept this PR. I suggested during the OTC discussion that we really should ask the IETF for a new alert to handle this situation as none of the existing alerts are actually a good match for this (their descriptions really don't align well with this context). I'd appreciate other views on the topic of asking for a new alert (we could hold an OTC vote to make this request on behalf of the project itself) ... |
Asking the IETF for a new alert seems reasonable. |
You want error messages to be as generic as possible, otherwise you get into oracles. You can ask, but the TLS WG will not give one (in my opionion). |
It might be more prudent to try to get a new alert that wholly supersedes both |
Or perhaps we just fold one into the other. |
Agree with #8300 (comment). The question is the timetable for getting such an RFC out. Certainly at least a year. |
It's not entirely clear it would have to take that long (though if it stayed in my review queue as FIFO it probably would |
Add 2-4 weeks to write the doc, 4-6 weeks for WG email review. Add time for crossing the Christmas/holdiay barrier. |
with such overhead I'd prefer to handle all the dark corners in a single RFC than to do it piecemeal
if an oracle though different error values is possible, then it definitely has a timing component and as such should get same kind of treatment as CBC mode got. Paving over it with a single error message only hides the issue, it doesn't fix it. |
No. Consider login which always asks for name/password but returns "no such user" or "wrong password" Nowadays they always return "Bad login" |
huh? If you return "bad login" after 1s when login is wrong but after 5s when the login is correct but password is wrong, then you have a timing oracle for user enumeration |
Yes, and if you have a bad username, but ask for a password and encrypt anyway there's no timing. |
oh, and the encrypted value is compared to to what? what if the server uses LDAP to retrieve those encrypted values? but I digress if the error values for different kinds of errors must be the same because otherwise they will leak data, the specification needs to be very explicit about it and point out side-channel attacks — which means that testing and programming that part will be hard and error prone but if the specification says that the different errors have to have different error messages, and many people looked at it and concluded that it doesn't leak any information, that means that the timing of those responses is irrelevant, making programming and testing of that code easier collecting different errors into one error code "just in case" they may turn into an oracle won't make programmers write side-channel secure code "just in case" too |
I doubt this is true. Shrug. |
the original Bleichenbacher paper from 1998 depends on alert descriptions, not their timing you really think that few other people looked at TLS specifications from that point of view since then? |
for all error/alerts? and able to look at implementations? yes, I do think that. |
It really doesn't take a lot of effort to read 100 pages of protocol specification with a specific set of mind. I'm sure it's far less than turning specification into a model that can be algorithmically proven.
there's a difference between an implementation bug and a specification bug; sure, implementation doesn't need to follow specification to the letter, but specification really needs to provide "known good" behaviour if we want to have even a remote chance of seeing correctly behaving implementations |
This discussion has gone way off topic. Let's drop the side-channel subtopic, it is not pertinent here. |
Both timing and the error code can be an oracle. But as long as
it's based on public information, the only thing it leaks is which
implementation it is.
I'm not concerned about which error the RFC says we should sent in
this case. Sending the other error code does not help the user to
fix the error, or get a better error messsage.
|
then why you're opposed to changing it?
why do you think it's unhelpful to send an alert indicating that the message is not well-formed upon receipt of a not well-formed message? |
Automated Ping: This issue is in a state where it requires action by @openssl/otc but the last update was 30 days ago |
This issue is in a state where it requires action by @openssl/otc but the last update was 60 days ago |
This issue is in a state where it requires action by @openssl/otc but the last update was 91 days ago |
This PR is in a state where it requires action by @openssl/otc but the last update was 122 days ago |
This PR is in a state where it requires action by @openssl/otc but the last update was 153 days ago |
Closing this PR due to the above decision. |
fixes #8293
Checklist