Mask CSRF tokens to mitigate BREACH attack #11729
Conversation
| @@ -0,0 +1,54 @@ | |||
| module ActionController | |||
| class AuthenticityToken | |||
| class << self | |||
egilburg
Aug 3, 2013
Contributor
I wouldn't make this a use class methods. It seems most of the methods here revolve around the session object, and thus it would make sense to make AuthenticityToken be initialized with a readable attribute session as follows:
token = AuthenticityToken.new(session)
token.generate_masked
# ...
class AuthenticityToken
attr_reader :session
def initialize(session)
@session = session
end
def generate_masked
# ...
end
# ...
end
On the same topic, based on the Single Responsibility Principle, perhaps you could make different classes for the token generator and the token itself:
token_builder = AuthenticityTokenGenerator.new(session)
token = token_builder.generate_masked
Of course, for consumers who only need the token object iself, the above can be simplified to:
token = AuthenticityTokenGenerator.new(session).generate_masked # => an AuthenticityToken object
Or an even simpler, with a class builder method:
token = AuthenticityTokenGenerator.generate(session) # => an AuthenticityToken object
This way, the methods related to generating the token can be on the builder class, while the methods related to validating it can be on the token class itself.
The xor_byte_strings method can still be a class method because it's accessed both by builder and by token itself, and does not operate on a session.
I wouldn't make this a use class methods. It seems most of the methods here revolve around the session object, and thus it would make sense to make AuthenticityToken be initialized with a readable attribute session as follows:
token = AuthenticityToken.new(session)
token.generate_masked
# ...
class AuthenticityToken
attr_reader :session
def initialize(session)
@session = session
end
def generate_masked
# ...
end
# ...
endOn the same topic, based on the Single Responsibility Principle, perhaps you could make different classes for the token generator and the token itself:
token_builder = AuthenticityTokenGenerator.new(session)
token = token_builder.generate_maskedOf course, for consumers who only need the token object iself, the above can be simplified to:
token = AuthenticityTokenGenerator.new(session).generate_masked # => an AuthenticityToken objectOr an even simpler, with a class builder method:
token = AuthenticityTokenGenerator.generate(session) # => an AuthenticityToken objectThis way, the methods related to generating the token can be on the builder class, while the methods related to validating it can be on the token class itself.
The xor_byte_strings method can still be a class method because it's accessed both by builder and by token itself, and does not operate on a session.
| def generate_masked(session) | ||
| one_time_pad = SecureRandom.random_bytes(LENGTH) | ||
| encrypted_csrf_token = xor_byte_strings(one_time_pad, master_csrf_token(session)) | ||
| masked_token = one_time_pad + encrypted_csrf_token |
egilburg
Aug 3, 2013
Contributor
Consider using .concat here instead of + to avoid extra object creation.
Consider using .concat here instead of + to avoid extra object creation.
| end | ||
|
|
||
| def valid?(session, encoded_masked_token, logger = nil) | ||
| return false if encoded_masked_token.nil? |
egilburg
Aug 3, 2013
Contributor
Since you don't expect false to be passed for encoded_masked_token with different semantics than nil, it's simpler to write:
return false unless encoded_masked_token
Since you don't expect false to be passed for encoded_masked_token with different semantics than nil, it's simpler to write:
return false unless encoded_masked_token|
|
||
| masked_token == master_csrf_token(session) | ||
|
|
||
| elsif masked_token.length == LENGTH * 2 |
egilburg
Aug 3, 2013
Contributor
You don't have an else clause here. While it will return nil which acts as falsy, given the boolean method name valid?, it's probably better to explicitly return else false here.
You don't have an else clause here. While it will return nil which acts as falsy, given the boolean method name valid?, it's probably better to explicitly return else false here.
| elsif masked_token.length == LENGTH * 2 | ||
| # Split the token into the one-time pad and the encrypted | ||
| # value and decrypt it | ||
| one_time_pad = masked_token[0...LENGTH] |
egilburg
Aug 3, 2013
Contributor
You can use masked_token.first(LENGTH) here.
You can use masked_token.first(LENGTH) here.
| # Split the token into the one-time pad and the encrypted | ||
| # value and decrypt it | ||
| one_time_pad = masked_token[0...LENGTH] | ||
| encrypted_csrf_token = masked_token[LENGTH..-1] |
egilburg
Aug 3, 2013
Contributor
You can use masked_token.last(LENGTH) here.
You can use masked_token.last(LENGTH) here.
| private | ||
|
|
||
| def xor_byte_strings(s1, s2) | ||
| s1.bytes.zip(s2.bytes).map { |(c1,c2)| c1 ^ c2 }.pack('c*') |
egilburg
Aug 3, 2013
Contributor
Can use map! to avoid extra object creation. Also, don't need parentheses around c1, c2.
If this is a hotspot, can optimize a bit more by extraciting 'c*' into a constant to avoid object creation each time.
Can use map! to avoid extra object creation. Also, don't need parentheses around c1, c2.
If this is a hotspot, can optimize a bit more by extraciting 'c*' into a constant to avoid object creation each time.
| end | ||
|
|
||
| def master_csrf_token(session) | ||
| session[:_csrf_token] ||= SecureRandom.base64(LENGTH) |
egilburg
Aug 3, 2013
Contributor
It doesn't seem worthwhile to pass around the entire session object where most methods (other than this one) just care about the token value itself, not mutating the session. I think that the code mutating the session should live separately from the methods/class that generates/validates the token - as early as possible, the token should be extracted and operated on, and as late as possible the session should be updated with the new token value.
It doesn't seem worthwhile to pass around the entire session object where most methods (other than this one) just care about the token value itself, not mutating the session. I think that the code mutating the session should live separately from the methods/class that generates/validates the token - as early as possible, the token should be extracted and operated on, and as late as possible the session should be updated with the new token value.
| end | ||
|
|
||
| # Sets the token value for the current session. | ||
| def form_authenticity_token | ||
| session[:_csrf_token] ||= SecureRandom.base64(32) | ||
| AuthenticityToken.generate_masked(session) |
egilburg
Aug 3, 2013
Contributor
I wonder if this will impact performance. This method, and all string manipulation logic inside it, will be executed on every request (as opposed to current implementation, which caches the value in the session). Perhaps add some benchmarks to this method?
I wonder if this will impact performance. This method, and all string manipulation logic inside it, will be executed on every request (as opposed to current implementation, which caches the value in the session). Perhaps add some benchmarks to this method?
Alamoz
Aug 3, 2013
Why not make it a configurable option, so users can decide which is more important to them?
config.csrf_masking - or - config.mask_csrf
The first form, though longer in length, hints the process recurs for each request. Perhaps also add a comment about decreased performance if decrease is significant.
Why not make it a configurable option, so users can decide which is more important to them?
config.csrf_masking - or - config.mask_csrf
The first form, though longer in length, hints the process recurs for each request. Perhaps also add a comment about decreased performance if decrease is significant.
|
it's SSL problem, isn't it. Can you imagine hassle of generating new token everytime? |
|
Doesn't look too bad. He is implementing the suggested solution. |
|
Would this have the side-effect of also protecting encrypted cookie sessions from this exploit as the ciphertext of the session cookie (containing the |
|
After thinking about it, wouldn't just adding random characters to the session every request (when using encrypted sessions) mitigate this attack entirely for all data in the session? Actually, on further reflection, would I be right in assuming that the session isn't at risk at all because HTTP headers aren't compressed? I guess this still effects the |
|
The paper states HTTP-level compression as one of the preconditions. That is something users may configure their web servers to do, so such users may want this proposed behavior in their rails apps. |
|
I would be worth having the masking/unmasking functions available as a library interface rather than baked into |
|
@waynerobinson The ciphertext of cookie-stored sessions is different every time anyway, because it uses a random IV during encryption. If you encrypt the same plaintext twice, you should get a different result each time since sending the same ciphertext over the wire tells the attacker that you sent the same data twice, which leaks information. |
|
Calling in the @NZKoz bat signal. |
|
We need something universal, maybe we should transfer csrf token as an additional cookie readable on client side.
It will be a header, not included in response body |
|
@waynerobinson Nothing in the HTTP headers is at risk because they're not in the same compression context as the body (you're right; they're not compressed at all). @homakov Not sending the CSRF token each time would definitely mitigate the attack (or sending it in a header). The problem is that this will break non-XHR form posts, which include the CSRF token as the |
|
@egilburg Thanks for the comments; I'll make these changes. I'll also do some benchmarking; we definitely need to figure out if this is expensive or not. If the masking is cheap, I'd just as soon do it on every request and not make it a configuration option (or at least make it opt-out) - users should not want to opt out of masking unless there's a performance hit, or they're doing something really exotic / custom with CSRF tokens. |
|
One other option to improve performance / reduce complexity is to use a different technique to obfuscate the authenticity token. There's nothing special about the OTP / XOR masking algorithm suggested in the paper; we just need a way to randomly obfuscate the data on each request in a way that the server can check. So something like |
|
Please have a look, this should be a faster and better protection https://gist.github.com/homakov/6147227 |
yes, if JS is off we need something different |
|
You can't implement CSRF tokens as a cookie; the weaknesses of cookies is what CSRF tokens are supposed to prevent. If you send the token as a cookie, the browser will attach it to all requests to your server regardless of which origin they come from, defeating the point of CSRF protection. Plus, any security solution that relies on JavaScript should be considered a weak solution. |
|
If you're concerned about the performance of XOR (which might be a reasonable concern but someone should benchmark it and find out), then write it in C. |
|
@jcoglan i know how CSRF works at my fingertips :) You probably misunderstood what I proposed: instead of plain tag we put it into Set-Cookie and only after page load (in runtime) we add CSRF tokens and other important information.
Very foggy argument, what exactly is wrong about this one? Besides XORing we need a way to hide ANY secret tokens (api_key in my demo). How are you going to hide it? un XORing with javascript? Set-Cookie is a simplest solution, with only one weakness - it requires JS on. |
|
Any solution that assumes the user agent will run JavaScript as part of the security process cannot be general-purpose, since not all user-agents run JavaScript. Invoking the user agent's JS runtime also means there's another component we need to place our trust in. XOR-masking seems like a totally reasonable approach. It's easily understood and easy to implement, and does not rely on client-side behaviour in order to protect the server. It's the same technique used by WebSocket on all data to prevent the client from constructing arbitrary byte sequences. The argument that XOR is slow has not been tested, and we should not make arguments from performance when we have no numbers to talk about. The JS solution may seem to require less code, but it's most complex, since it invokes more parts of the stack in order to work. We could protect any token by providing two server-side functions, All I'm saying is we should not lean on JavaScript when a workable solution can be done entirely server-side. |
| def initialize(session, logger = nil) | ||
| session[:_csrf_token] ||= SecureRandom.base64(LENGTH) | ||
| @master_csrf_token = Base64.strict_decode64(session[:_csrf_token]) | ||
| @logger = logger |
egilburg
Aug 5, 2013
Contributor
Would be friendly to make logger an attr_accessor so other code/tests can get/set it post-initialize.
Would be friendly to make logger an attr_accessor so other code/tests can get/set it post-initialize.
jmazzi
Aug 5, 2013
Contributor
Unless it's being used outside this class right now, I don't think that is needed, personally :)
Unless it's being used outside this class right now, I don't think that is needed, personally :)
| @@ -203,5 +203,9 @@ def form_authenticity_param | |||
| def protect_against_forgery? | |||
| allow_forgery_protection | |||
| end | |||
|
|
|||
| def authenticity_token | |||
| AuthenticityToken.new(session, logger) | |||
egilburg
Aug 5, 2013
Contributor
You don't need to generate this over and over in a single request, right? Please memoize this e.g. @authenticity_token ||= AuthenticityToken.new(session, logger), so this doesn't generate several times between calls in the same request.
You don't need to generate this over and over in a single request, right? Please memoize this e.g. @authenticity_token ||= AuthenticityToken.new(session, logger), so this doesn't generate several times between calls in the same request.
|
|
||
| masked_token == @master_csrf_token | ||
|
|
||
| elsif masked_token.length == LENGTH * 2 |
egilburg
Aug 5, 2013
Contributor
Might seem like a micro-optimization, but also for slightly better readability, I'd rename the constant LEGNTH to UNMASKED_LENGTH (or RAW_LENGTH), and right below it define MASKED_LENGTH = UNMASKED_LENGTH * 2. Then I'd just be checking against the two constants here. Also, make it a case statement and put the more commonly expected case further up:
case masked_token.length
when MASKED_LENGTH
# ...
when UNMASKED_LENGTH
# ...
else
false
end
Might seem like a micro-optimization, but also for slightly better readability, I'd rename the constant LEGNTH to UNMASKED_LENGTH (or RAW_LENGTH), and right below it define MASKED_LENGTH = UNMASKED_LENGTH * 2. Then I'd just be checking against the two constants here. Also, make it a case statement and put the more commonly expected case further up:
case masked_token.length
when MASKED_LENGTH
# ...
when UNMASKED_LENGTH
# ...
else
false
end
steveklabnik
Aug 5, 2013
Member
case statements return nil which is falsy if none of the whens happen so you wouldn't need the else clause.
case statements return nil which is falsy if none of the whens happen so you wouldn't need the else clause.
egilburg
Aug 5, 2013
Contributor
Do you think it's good practice for Boolean methods to return explicit true or false, not just truthy or falsy?
Do you think it's good practice for Boolean methods to return explicit true or false, not just truthy or falsy?
steveklabnik
Aug 6, 2013
Member
I think it's actively bad practice.
I think it's actively bad practice.
|
One thing to note regarding bootstrapping: only three characters are necessary to get it going. Consider <meta content="csrf_token_which_is_base64_encoded=" name="csrf-token" />Note that if an attacker's query param is reflected in an attribute value anywhere in the page as in <input type="hidden" name="who_cares_but_not_the_token" value="asdfHACKERGUESS" />then the Also, I emphasize b64 encoding here, since that gives the attacker some idea of how the secret will end, and allows them to use the same idea to bootstrap from the end, instead of the beginning. |
| if @logger | ||
| @logger.warn "The client is using an unmasked CSRF token. This " + | ||
| "should only happen immediately after you upgrade to masked " + | ||
| "tokens; if this persists, something is wrong." |
NZKoz
Aug 5, 2013
Member
This doesn't actually matter, if an attacker has the raw token, they can generate a valid masked token, I think we can safely ignore it rather than log.
This doesn't actually matter, if an attacker has the raw token, they can generate a valid masked token, I think we can safely ignore it rather than log.
homakov
Aug 5, 2013
Contributor
yes, such a notification is very rare. only for those who opened page just before deployment
yes, such a notification is very rare. only for those who opened page just before deployment
|
discussion above is pretty long but @NZKoz is right, w/o reflection it doesn't work! you need it to put guesses. |
|
One more comment: when we were doing our research, we definitely took a look at Rails to see how easy/difficult the attack would be there. It didn't make life easy for us (for essentially the reasons discussed above), and wouldn't have been a good candidate for a demo. That being said, as I think everyone here realizes: Rails is vulnerable in in principle. Anyway, the fact that @angeloprado, @ygluck, and I didn't find a good way to attack Rails shouldn't provide too much comfort. There are lots of people out there that are way more industrious and clever than we are. |
| "tokens; if this persists, something is wrong." | ||
| end | ||
|
|
||
| masked_token == @master_csrf_token |
tarcieri
Oct 2, 2013
Contributor
Is this vulnerable to a timing attack?
Is this vulnerable to a timing attack?
| encrypted_csrf_token = masked_token.last(LENGTH) | ||
| csrf_token = self.class.xor_byte_strings(one_time_pad, encrypted_csrf_token) | ||
|
|
||
| csrf_token == @master_csrf_token |
tarcieri
Oct 2, 2013
Contributor
Likewise, is this also vulnerable to a timing attack?
Likewise, is this also vulnerable to a timing attack?
NZKoz
Oct 2, 2013
Member
timing attacks are unlikely to be an issue here, CSRF tokens are particular to the session, not server side. The CSRF attacker doesn't have the ability to see the requests and time their responses unless you're talking about a fairly specific set of scenarios like an active MITM attacker. I'd love to seem someone demo that as it'd be quite a neat trick, but seems an unlikely vector at present.
We could change it, but it'd be belt-and-suspenders style stuff not a critical fix.
timing attacks are unlikely to be an issue here, CSRF tokens are particular to the session, not server side. The CSRF attacker doesn't have the ability to see the requests and time their responses unless you're talking about a fairly specific set of scenarios like an active MITM attacker. I'd love to seem someone demo that as it'd be quite a neat trick, but seems an unlikely vector at present.
We could change it, but it'd be belt-and-suspenders style stuff not a critical fix.
tarcieri
Oct 2, 2013
Contributor
Well, you're trying to mitigate BREACH here, which already requires the attacker MITM with a TCP proxy and can drive the victim's browser. A timing attack on CSRF would have the exact same setup, but rather than using the TCP proxy to measure the length of the response, they'd use it to measure timing information (which, granted, is harder than measuring the number of bytes in the response ;)
Well, you're trying to mitigate BREACH here, which already requires the attacker MITM with a TCP proxy and can drive the victim's browser. A timing attack on CSRF would have the exact same setup, but rather than using the TCP proxy to measure the length of the response, they'd use it to measure timing information (which, granted, is harder than measuring the number of bytes in the response ;)
NZKoz
Oct 2, 2013
Member
Sure, I'm not denying it's theoretically possible, nor rejecting the idea of changing the code. Just not FREAKING OUT at every use of String#== ;).
Sure, I'm not denying it's theoretically possible, nor rejecting the idea of changing the code. Just not FREAKING OUT at every use of String#== ;).
tarcieri
Oct 2, 2013
Contributor
Note that I am not one to FREAK OUT at every use of String#== either [1] :P But given the attack this patch is intending to mitigate, I think it's warranted
Note that I am not one to FREAK OUT at every use of String#== either [1] :P But given the attack this patch is intending to mitigate, I think it's warranted
btoews
Oct 2, 2013
Contributor
Just playing around a bit, you can get millisecond level timing info on a cross-domain POST, but that obviously isn't enough for a timing attack against string comparisons.
Just playing around a bit, you can get millisecond level timing info on a cross-domain POST, but that obviously isn't enough for a timing attack against string comparisons.
tarcieri
Oct 2, 2013
Contributor
What if the attacker is MitMing you with a TCP proxy (like they would have to do for BREACH anyway?)
What if the attacker is MitMing you with a TCP proxy (like they would have to do for BREACH anyway?)
| masked_token = SecureRandom.random_bytes(LENGTH) | ||
|
|
||
| # XOR the random bits with the real token and concatenate them | ||
| encrypted_csrf_token = self.class.xor_byte_strings(masked_token, @master_csrf_token) |
tarcieri
Oct 2, 2013
Contributor
"encrypted?"
"encrypted?"
NZKoz
Oct 2, 2013
Member
yeah, masked is a better name for it
yeah, masked is a better name for it
elithrar
Nov 11, 2013
It may make more sense to rename masked_token to one_time_pad as per #L45.
masked_token is likely a better name for the XOR of the real CSRF token and our one-time pad.
It may make more sense to rename masked_token to one_time_pad as per #L45.
masked_token is likely a better name for the XOR of the real CSRF token and our one-time pad.
|
Hey @bradleybuda have you had time to react to the feedback from @NZKoz and @tarcieri? Seems like this is something important to get fixed... |
essentially copied from rails#11729.
|
For people waiting for this to get merged consider using https://github.com/meldium/breach-mitigation-rails in the meantime |
|
See #16570 for the simple masking merged from breach-mitigation-rails. This PR nicely encapsulates the authenticity token, would be welcome to rebase and continue with its abstraction. |
|
Closing due inactivity and since an alternative solution was already merged. |
The BREACH attack described at Black Hat this year allows an attacker to recover plaintext from SSL sessions if they have some idea what they're looking for. One high-value thing to steal that has a predictable plaintext format is the CSRF token (because it always appears in a meta tag and frequently in form tags as well).
The researchers who discovered the attack suggest mitigating it by "masking" secret tokens so they are different on each request. This implements their suggested masking approach from section 3.4 of the paper (PDF). The authenticity token is delivered as a 64-byte string, instead of a 32-byte string. The first 32 bytes are a one-time pad, and the second 32 are an XOR between the pad and the "real" CSRF token. The point is not to hide the token from the client, but to make sure it is different on every request so it's impossible for an attacker to recover by measuring compressability.
The code should be backwards-compatible with existing Rails installs; the format of
session[:_csrf_token]is unchanged, and unmasked tokens will still be accepted from clients (with a warning) so that you don't invalidate all your users' sessions on deploy. However, if users have overriddenActionController#verfied_request?, this may break them (depending on whether or not they're callingsuper).This is not a blanket fix for BREACH, just a way of protecting against one particular variant of attack. I am not a security expert; I've just implemented the fix as suggested in the paper. This should be reviewed by someone who knows what they're doing.