Implement equal_const_time/2 as nif #2749

starbelly · 2020-09-15T01:16:02Z

This PR implements equal_const_time/2 as a nif via delegation to CRYPTO_memcmp and intends to expose this function as part of the public api via documentation. If this PR is approved I will gladly document the function as part of the public API.

Discussions around this PR were had with @josevalim , @voltone , and @peerst of the erlef security working group.

One particular topic that I feel is worth discussing and possibly adjusting this PR for is around having either a pure erlang implementation of a constant time comparisson function on binary as secure_compare/2 or as BIF on binary of the same name and arity. The idea for having this in addition to crypto:equal_const_time/2 is prompted by the case where one does not have crypto (openssl) available. This option may even be more advantageous with BeamAsm on the horizon, such that the nif implementation is not desired at all. However, no benchmarks were done using BeasmAsm against the NIF or BIF implementations.

I have implemented the BIF but did not include it in this PR initially but am happy to do so either in this PR or a subsequent PR. Likewise, if a BIF is undesirable I'm also happy to add an erlang implementation on binary either as part of this PR or a subsequent PR.

It should be noted that the BIF implementation was found to be twice as fast as the NIF implementation at a cursory glance, however, there is concern around maintaining the implementation, specifically around keeping up with compilers not respecting thevolatile qualifier (see openssl/openssl#102 for more details around this subject).

Benchmarks for the NIF implementation

Benchmarks were performed with erlperf and as with all benchmarks take the ones below with a grain of salt, but they give a good idea of the improvements.For brevity I did not include all my runs which measured function calls with more samples and in parallel (i.e., new nif impl vs old erlang impl). I'm happy to provide these if desired, but the quick benchmarks below show the same findings.

Findings for binaries

The performance improvement was found to be significant.

Original erl implementation:

$ erlperf 'crypto:old_equal_const_time(<<"onexxxxx">>, <<"twoxxxxx">>).'
Code                                                               ||        QPS     Rel
crypto:old_equal_const_time(<<"onexxxxx">>, <<"twoxxxxx">>).        1    1522 Ki    100%

New NIF implementation:

$erlperf 'crypto:equal_const_time(<<"onexxxxx">>, <<"twoxxxxx">>).'
Code                                                             ||        QPS     Rel
crypto:equal_const_time(<<"onexxxxx">>, <<"twoxxxxx">>).          1    5909 Ki    100%

Findings for strings

A performance decrease was observed for small strings, only at 64 byte strings was performance roughly the same. As such
equal_const_time/2 as part of this PR will only delegate to the nif for binaries.

Original erl implementation:

$ erlperf 'crypto:old_equal_const_time("onexxxxx", "twoxxxxx").'
Code                                                         ||        QPS     Rel
crypto:old_equal_const_time("onexxxxx", "twoxxxxx").          1    5267 Ki    100%

New NIF Implementation

$ erlperf 'crypto:equal_const_time("onexxxxx", "twoxxxxx").'
Code                                                     ||        QPS     Rel
crypto:equal_const_time("onexxxxx", "twoxxxxx").          1    2644 Ki    100%

- add C implementation for equal_const_binary/2 (delegates to CRYPTO_memcmp) - modify crypto.erl to delegate to nif implementation when arguments are binaries, otherwise go with the existing implementation for lists. - add minimal tests.

starbelly · 2020-09-15T01:17:36Z

lib/crypto/c_src/str.c

+    if (!enif_inspect_binary(env, argv[1], &s2))
+        goto bad_arg;
+
+    if (s1.size != s2.size)


This is where the erlang list implementation and the nif implementation differ currently (i.e., the length of the strings isn't checked explicitly in the erlang implementation and probably for good reason), but the behavior remains the same for both.

starbelly · 2020-09-15T01:18:13Z

lib/crypto/test/crypto_SUITE.erl

+equal_const_time() ->
+    [{doc, "Test equal_const_time"}].
+equal_const_time(Config) when is_list(Config) ->
+    true  = crypto:equal_const_time(<<"">>, <<"">>),


We can do better tests surely, but will wait for approval.

starbelly · 2020-09-15T01:54:30Z

Example erlang implementation provided should that be preferred over the NIF or along side it on binary as secure_compare/2 :

secure_compare(L, R) when byte_size(L) == byte_size(R) ->
   secure_compare(L, R, 0);
secure_compare(_L, _R) -> false.
    
secure_compare(<<X1/integer,L1/binary>>, <<Y1/integer,R1/binary>>, Acc) ->
    Xorred = X1 bxor Y1,
    secure_compare(L1, R1, Acc bor Xorred);
secure_compare(<<>>, <<>>, Acc) ->
    Acc =:= 0.

The implementation is based off the implementation in Plug.Crypto by @whatyouhide

lib/crypto/src/crypto.erl

josevalim · 2020-09-15T07:09:16Z

Thank you @starbelly for working on this! FWIW, I think there is an argument to make this binary:constant_compare/2 even if the implementation is a NIF. There are other native functions in the binary module and making this available without a need to add a dep on crypto can be a nice positive.

EDIT: however there is an argument that the NIF should delegate to the OpenSSL implementation of said functionality. If that's the case, then it obviously belongs in crypto. :)

voltone · 2020-09-15T07:19:35Z

FWIW, I think there is an argument to make this binary:constant_compare/2 even if the implementation is a NIF.

I understands the benefits of a dependency-free NIF, but such code would have to be carefully maintained. OpenSSL and other libraries that implement such a function often implement this function in assembly, to avoid getting bitten by compiler optimisations, and fallback implementations in C have to resort to features (such as volatile variables) that are not consistently implemented across compilers. There is also an argument to be made to delegate this to proven implementations, is what I'm saying.

EDIT: response written before the edit of the previous comment :)

starbelly · 2020-09-15T13:54:28Z

@josevalim @voltone To be clear I'd like to have both 😁 I don't see any reason why there can't be so long as the documentation is clear.

Co-authored-by: José Valim <jose.valim@gmail.com>

HansN · 2020-09-16T08:29:14Z

First of all, thanks for engaging in speeding up the "constant"-time check, a contribution is valuable and wanted!

However, the CRYPTO_memcmp is not documented in other OpenSSL versions than 1.1.1, not even the 3.0 that is soon in beta state. LibreSSL documents it. The function is available in all versions we have to support, but relying on a not documented function is not good. The CRYPTO_memcmp is a very simple implementation looping over the two strings OR-ing the comparison - no fancy stuff and I can't see any assembler support, but that might of course come later.

Having two functions in two modules both with the same functionality is something I dislike and can't see the benefit of, so I would prefer having it in just one module. If you don't have a cryptolib like OpenSSL you don't have the crypto erlang module either.

voltone · 2020-09-16T08:41:09Z

The CRYPTO_memcmp is a very simple implementation looping over the two strings OR-ing the comparison - no fancy stuff and I can't see any assembler support, but that might of course come later.

In theory, yes. In practice, you have to be wary of compiler optimisations that short-circuit the comparison loop, removing the constant-time property and therefore the justification for having such a function in the first place. That's why some crypto libraries implement this in assembly: not for additional speed, but to more reliably rule out such compiler optimisations.

Please have a look at the OpenSSL discussion @starbelly mentioned. That should convince you this is a non-trivial function to get right and to maintain.

The fact that CRYPTO_memcmp is not public is indeed a problem, I don't really know how to get past that...

starbelly · 2020-09-16T13:45:51Z

@HansN Let me offer two options per what you stated :

Implement this function as a BIF (binary:secure_compare/2) Yes, there can be pitfalls with compilers per the OpenSSL discussion, but I also think these are old issues (AFAIK most volatile issues fixed in all major compilers around 2008). New compiler bugs (optimizations) could come up, but so could a lot of things. Thus, I think it's manageable, and aside from the volatile issue, as you stated it is a trivial function.
Implement this function as pure erlang (binary:secure_compare/2)

With either option we would delegate to binary:secure_compare/2 within equal_const_time/2. Whether or not to even document equal_const_time/2 with the addition of binary:secure_compare/2 I'm not sure, I suppose that depends on how common and valuable the case for strings with such a function is.

If one of the options listed above is desired I will gladly revise this PR or open a new one.

okeuday · 2020-09-16T15:05:23Z

@starbelly If you go down the pure erlang path, it is best to not rely on a length check being equal, to ensure the comparison is constant time (bounded based on the input size):

https://github.com/CloudI/CloudI/blob/c2ff9b3fc236d2c1ca639eb3cb3280a0dc240fab/src/lib/cloudi_core/src/cloudi_string.erl#L283-L293

compare_constant(Test, [_ | _] = Correct) ->
    compare_constant(Test, Correct, 0) =:= 0.

compare_constant([], [], Bits) ->
    Bits;
compare_constant([], [_ | _], _) ->
    1;
compare_constant([C | Test], [] = Correct, Bits) ->
    compare_constant(Test, Correct, Bits bor (C bxor -1));
compare_constant([C1 | Test], [C2 | Correct], Bits) ->
    compare_constant(Test, Correct, Bits bor (C1 bxor C2)).

peerst · 2020-09-16T15:08:31Z

Implement this function as a BIF (binary:secure_compare/2) Yes, there can be pitfalls with compilers per the OpenSSL discussion, but I also think these are old issues (AFAIK most volatile issues fixed in all major compilers around 2008). New compiler bugs (optimizations) could come up, but so could a lot of things. Thus, I think it's manageable, and aside from the volatile issue, as you stated it is a trivial function.

Especially since it would not be too hard to add a test that checks if the comparison is constant time e.g. if two extreme cases are compared (equal vs. first byte different on a long binary)

ferd · 2020-09-16T15:53:15Z

@okeuday

If you go down the pure erlang path, it is best to not rely on a length check being equal, to ensure the comparison is constant time (bounded based on the input size):

Arguably those may not actually be constant time since you'll still bail out early if the first list is shorter: by gradually increasing the length of the second string, you can guess the size of the first one; by reducing the length of the first one you can guess the size of the second one as well. Your algorithm needs to do both types of comparisons, something like the following (for binaries):

verify_in_constant_time(X, Y) ->
    verify_in_constant_time(X, Y, 0).

verify_in_constant_time(<<X, RestX/binary>>, <<Y, RestY/binary>>, Result) ->
    verify_in_constant_time(RestX, RestY, (X bxor Y) bor Result);
verify_in_constant_time(<<X, RestX/binary>>, <<>>, Result) ->
    verify_in_constant_time(RestX, <<>>, (X bxor -1) bor Result);
verify_in_constant_time(<<>>, <<Y, RestY/binary>>, Result) ->
    verify_in_constant_time(<<>>, RestY, (Y bxor -1) bor Result);
verify_in_constant_time(<<>>, <<>>, Result) ->
    Result == 0.

Regardless, the type of operations vary (comparing immediate values like -1 instead of accessing them through a list iteration) and someone with knowledge of the algorithm should in theory be able to access it, and you'd need to demonstrate that recent compiler optimizations won't interfere with that scheme.

This is not a huge issue because size is generally not considered a problem since these comparison functions should generally be used on hashed on encrypted values that are expected to have a fixed size; the size being known does not leak significant amounts of information in such cases.

The one weird case where you end up caring about length being information is on more oddball cases like doing comparisons of cleartext passwords where the differences between the submitted and expected string suddenly becomes significant. Those should usually not be used either though.

starbelly · 2020-09-16T16:49:06Z

@starbelly If you go down the pure erlang path, it is best to not rely on a length check being equal, to ensure the comparison is constant time (bounded based on the input size):

Yeah, I originally had it written without a length check for that exact reason, but consulting with some, people tended to prefer checking length with the argument being the bad actor would most likely known the length of the input. IMO, we should not check the length though.

EDIT: I'd add that a bad actor knowing the length of an input your checking against is not always the case.

EDIT: @okeuday @ferd 😁

EDIT: The alternative is to check the length AFTER you cycle through the input your checking against to avoid blunder.

okeuday · 2020-09-16T17:24:55Z

@ferd The first list is always the external data input. With the external data input controlling the constant time it isn't possible to look at the timing results to determine the correct size for further brute-forcing.

It makes sense that it shouldn't be a huge issue in practice if the data is always using the same hash, but if other hashes are used then you would care more. With data likely needing to change the hash algorithm gradually as it ages, it is good to avoid leaking the length.

starbelly · 2020-09-20T20:23:55Z

@HansN I have pushed up an implementation of binary:secure_compare/2 as bif as an alternative. If accepted, I would document this function and possibly delegate to this bif from crypto in the case of binaries for equal_const_time/2. I hope this finds you will.

starbelly · 2020-09-20T20:25:33Z

erts/emulator/beam/erl_bif_binary.c

+
+    size1 = binary_size(bin1);
+
+    if (size1 != binary_size(bin2))


There has been discussion around whether to check the length or not. I have also suggested checking the length after the loop. The rationale behind checking the length at that point is a) we avoid leaking length information b). it acts as a safe guard in case a user has mixed up the argument order.

starbelly · 2020-09-20T20:27:52Z

lib/stdlib/test/binary_module_SUITE.erl

+
+    T2 =  erlang:convert_time_unit(erlang:monotonic_time(), native, microsecond),
+
+    %% Given a 10 megabyte binary we can say with a fair amount of certainty that


This test and comment might need some more thought. On an intel i9 it takes between 12 to 14 milliseconds to complete the op, so I think we can say even if you have a processor that is twice as fast it will still take 6 to 8. I would think that a bif call would never take over 1ms, but we could increase the comparison to => 3 if we are concerned about that. Note that I tested the sad path by putting a break after the first iteration in the loop. Mostly saw < 1ms, a few times saw 1ms.

garazdawi · 2020-09-21T06:17:54Z

erts/emulator/beam/erl_bif_binary.c

+        goto bad_arg;
+
+    ERTS_GET_BINARY_BYTES(bin1, bytes1, bitoffs1, bitsize1); 
+    ERTS_GET_BINARY_BYTES(bin2, bytes2, bitoffs2, bitsize2);


This macro will do a realloc of the binary if it is an un-aligned sub binary, which means that it will take time depending on the size of the binary. Come to think about it, so will enif_inspect_binary.

Unaligned subbinaries is what Bin is if you do this:

<<_:1, Bin:1024/binary, _/bitstring>> = large_binary().

Not sure what do want to do when you get such a binary, either badarg or do some other trick.

@garazdawi is the concern around performance or just general oddities that could happen as a result of unaligned subbinaries?

The concern is that the realloc will take time proportional to the size of the binary. So if you want to have a constant time compare, you should think about this case.

Gotcha. Let me do some spelunking.

HansN · 2020-09-24T07:15:57Z

@starbelly wrote:

@HansN Let me offer two options per what you stated :
1. Implement this function as a BIF (`binary:secure_compare/2`) Yes, there can be pitfalls with compilers per the OpenSSL discussion, but I also think these are old issues (AFAIK most volatile issues fixed in all major compilers around 2008). New compiler bugs (optimizations) could come up, but so could a lot of things. Thus, I think it's manageable, and aside from the volatile issue, as you stated it is a trivial function.

2. Implement this function as pure erlang (`binary:secure_compare/2`)
With either option we would delegate to binary:secure_compare/2 within equal_const_time/2. Whether or not to even document equal_const_time/2 with the addition of binary:secure_compare/2 I'm not sure, I suppose that depends on how common and valuable the case for strings with such a function is.

If one of the options listed above is desired I will gladly revise this PR or open a new one.

I think it is hard to motivate a new BIF (alt 1) for this very specialized purpose. I personally is happy with:

a new function in binary and not in crypto since there seem to be a need for it even if no cryptolib (and as a consequence no crypto application) is available
either in Erlnag or in C is good for me
I see no need for a function crypto:equal_const_time/2 delegating to the new one except for compatibility reasons. So we probably need one for people outside OTP that uses it (if any). It could be implemented, deprecated in OTP-24.0, with removal in OTP-25.0

starbelly · 2020-09-26T21:00:07Z

@HansN wrote:

I think it is hard to motivate a new BIF (alt 1) for this very specialized purpose. I personally is happy with:

Do you mean not doing a BIF for binary (i.e., erl_bif_binary.c) ? I didn't see any NIFs for binary... Can you clarify?

* a new function in binary and not in crypto since there seem to be a need for it even if no cryptolib (and as a consequence no crypto application) is available

* either in Erlnag or in C is good for me

Doing some tests in erlang showed that odds of deviation was much greater than compared to the C implementation, in particularly with unaligned sub binaries (differences >= 10 microseconds). With C, even with unaligned sub-binaries, the results were more consistent and of course faster (less chance of deviation, but generally only by a microsecond when it happens). Thus, C is the best way to go IMHO.

* I see no need for a function  crypto:equal_const_time/2 delegating to the new one except for compatibility reasons.  So we probably need one for people outside OTP that uses it (if any).  It could be implemented, deprecated in OTP-24.0, with removal in OTP-25.0

At this point I definitely agree and I think this should pull request should be renamed, etc. and what happens with the undocumented equal_const_time/2 function should happen in another PR or just handled by OTP team. It's only used by ssh (lib/ssh/src/ssh_auth.erl and lib/ssh/src/ssh_transport.erl) to note.

Would you prefer a new PR or this one to be renamed? I think a new PR might be cleaner since the branch name would reflect the intent and can close but reference this one in the PR for the history.

HansN · 2020-09-28T07:35:07Z

@starbelly

Would you prefer a new PR or this one to be renamed? I think a new PR might be cleaner since the branch name would reflect the intent and can close but reference this one in the PR for the history.

I agree that a new PR is preferred.

starbelly · 2020-09-29T15:54:16Z

Closing in favor of #2778

Implement equal_const_time/2 as nif

153a9fc

- add C implementation for equal_const_binary/2 (delegates to CRYPTO_memcmp) - modify crypto.erl to delegate to nif implementation when arguments are binaries, otherwise go with the existing implementation for lists. - add minimal tests.

starbelly commented Sep 15, 2020

View reviewed changes

josevalim reviewed Sep 15, 2020

View reviewed changes

lib/crypto/src/crypto.erl Outdated Show resolved Hide resolved

HansN self-assigned this Sep 15, 2020

HansN added the team:PS Assigned to OTP team PS label Sep 15, 2020

Update lib/crypto/src/crypto.erl

d7e63e8

Co-authored-by: José Valim <jose.valim@gmail.com>

add secure_compare/2 as binary bif

a1ed04a

starbelly commented Sep 20, 2020

View reviewed changes

garazdawi reviewed Sep 21, 2020

View reviewed changes

starbelly closed this Sep 29, 2020

starbelly mentioned this pull request Sep 29, 2020

Implement binary:linear_compare/2 #2778

Closed

jhogberg mentioned this pull request Apr 20, 2021

Add crypto:hash_equals/2 #4750

Merged

voltone mentioned this pull request Sep 2, 2022

Timing attack page describes non-existing function erlef/security-wg#24

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement equal_const_time/2 as nif #2749

Implement equal_const_time/2 as nif #2749

starbelly commented Sep 15, 2020 •

edited

Loading

starbelly Sep 15, 2020 •

edited

Loading

starbelly Sep 15, 2020

starbelly commented Sep 15, 2020 •

edited

Loading

josevalim commented Sep 15, 2020 •

edited

Loading

voltone commented Sep 15, 2020 •

edited

Loading

starbelly commented Sep 15, 2020

HansN commented Sep 16, 2020

voltone commented Sep 16, 2020

starbelly commented Sep 16, 2020

okeuday commented Sep 16, 2020

peerst commented Sep 16, 2020

ferd commented Sep 16, 2020

starbelly commented Sep 16, 2020 •

edited

Loading

okeuday commented Sep 16, 2020

starbelly commented Sep 20, 2020

starbelly Sep 20, 2020

starbelly Sep 20, 2020 •

edited

Loading

garazdawi Sep 21, 2020

starbelly Sep 21, 2020

garazdawi Sep 22, 2020

starbelly Sep 22, 2020

HansN commented Sep 24, 2020

starbelly commented Sep 26, 2020

HansN commented Sep 28, 2020

starbelly commented Sep 29, 2020


		T2 = erlang:convert_time_unit(erlang:monotonic_time(), native, microsecond),

		%% Given a 10 megabyte binary we can say with a fair amount of certainty that

Implement equal_const_time/2 as nif #2749

Implement equal_const_time/2 as nif #2749

Conversation

starbelly commented Sep 15, 2020 • edited Loading

Benchmarks for the NIF implementation

Findings for binaries

Original erl implementation:

New NIF implementation:

Findings for strings

Original erl implementation:

New NIF Implementation

starbelly Sep 15, 2020 • edited Loading

Choose a reason for hiding this comment

starbelly Sep 15, 2020

Choose a reason for hiding this comment

starbelly commented Sep 15, 2020 • edited Loading

josevalim commented Sep 15, 2020 • edited Loading

voltone commented Sep 15, 2020 • edited Loading

starbelly commented Sep 15, 2020

HansN commented Sep 16, 2020

voltone commented Sep 16, 2020

starbelly commented Sep 16, 2020

okeuday commented Sep 16, 2020

peerst commented Sep 16, 2020

ferd commented Sep 16, 2020

starbelly commented Sep 16, 2020 • edited Loading

okeuday commented Sep 16, 2020

starbelly commented Sep 20, 2020

starbelly Sep 20, 2020

Choose a reason for hiding this comment

starbelly Sep 20, 2020 • edited Loading

Choose a reason for hiding this comment

garazdawi Sep 21, 2020

Choose a reason for hiding this comment

starbelly Sep 21, 2020

Choose a reason for hiding this comment

garazdawi Sep 22, 2020

Choose a reason for hiding this comment

starbelly Sep 22, 2020

Choose a reason for hiding this comment

HansN commented Sep 24, 2020

starbelly commented Sep 26, 2020

HansN commented Sep 28, 2020

starbelly commented Sep 29, 2020

starbelly commented Sep 15, 2020 •

edited

Loading

starbelly Sep 15, 2020 •

edited

Loading

starbelly commented Sep 15, 2020 •

edited

Loading

josevalim commented Sep 15, 2020 •

edited

Loading

voltone commented Sep 15, 2020 •

edited

Loading

starbelly commented Sep 16, 2020 •

edited

Loading

starbelly Sep 20, 2020 •

edited

Loading