New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Argon2 KDF support. #9444
Add Argon2 KDF support. #9444
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Started to look at this, but ran out of time. Here's the comments I had so far.
Can you clarify why you think we should add this to OpenSSL? We only try to add new algorithms that have been standardized, or where we see a clear need from the community. |
For some discussion read issue #4091. The RFC draft is active, authors want to publish it (see P-H-C/phc-winner-argon2/issues/271) and the last process state change ftom the last week. We use Argon2 in disk encryption LUKS2 format, that format is actually default in several distros already (Fedora, Debian, Arch, RHEL8, ...). |
I'm not sure how to best go with thread support. We currently have knowledge of threads, but it's very limited. We currently don't do have support for creating our own threads. If thread support is needed, we should add support for starting and stopping threads, and have the KDF argon2 implementation start and stop the thread itself. But I assume that is just a minor detail we can deal with later after the rest of the code is there and we can just run it sequentially at the start. |
The thread support is one of the "cost" parameter of the KDF algorithm, I am afraid the function is not really usable without supporting native threads. (Yes, it supports sequential execution, but then it will take much longer time if thread cost is set.) For Argon the threading support should be quite simple - it needs to create "thread cost" number of threads, then it synchronously performs the parallel caclulation and all threads are joined. The original libargon in fact includes simple portable wrapper around posix and windows threads, I guess something like that could be used here. |
We also support platforms that don't support threads, so something will need to be written for that case anyway. |
Or possibly we just don't support Argon2 on such a platform |
Another option in regards to threads would be to have the calling application setup the threads itself and call something like PerformThreadWork() function from them. This would avoid having to add thread library calls from OpenSSL, however it would make the use of Argon from applications much more complicated. |
Note this is submission of code where the copyright is not owned by the submitter. Thanks, |
Thank you all for expressing your views and comments! I will force push the minimal changes requested in a short while, as well as try to sort out the appropriate ICLA. That said, I would be delighted if we could come to a compromise regarding thread usage in Argon2 -- this is really the crucial bit of Argon2 design and the family of algorithms suffers significant performance drawbacks without threading enabled. The reference implementation comes with threading enabled and it is a small matter backporting them to OpenSSL, using appropriate OpenSSL-provided thread wrappers. OpenSSL already compiles (at least on GNU/Linux) with the appropriate flags, so either of these approaches poses no change to build process as such (can't speak for Windows just yet). So far, two proposals were made in terms of threading:
While I have no issue with implementing the either of the two approaches myself as part of this backport, it doesn't fall upon me to determine which of these are ideal for OpenSSL design-wise; I'd be very grateful if we could start to discuss which way ought to be implemented (or if this is a topic more appropriate for a mailing list, we can discuss it there). Thank you! |
How about:
|
Isn't this the entire point of KDFs? 😁 |
Not if an attacker can use faster threaded version 🙄 |
@t-j-h So the original authors should send ICLA now, or what is the exaxt process? I can try to contact them... |
We need ICLAs at least from the copyright owners (which are the original authors) - see https://www.openssl.org/policies/cla.html for those details. And they need to submit the code themselves or indicate to the OMC that you are authorised to submit this code on their behalf. For your own changes we need at least an ICLA and if you are employed then a CCLA. |
IMO OMC should think about giving a general exception to a code that is licensed with CC0 - this license allows you to do anything with the code including licensing it under any license even a license incompatible with Apache 2.0 license. I understand that allowing any code with compatible license without ICLA is something OMC does not want but the CC0 should be effectively the same as the authors signing the ICLA for you so that should be taken differently. |
If something is licensed under the CC0 (or is in the public
domain), or really any other license, I don't see how we can
claim a different license. Maybe we can say someting like it's
based on CC0 licenced software, with a link to were to find
the original.
|
You can take the code and integrate it in any work which is licensed under any license. Yes, the parts of the code that were not modified will truly still have the CC0 license but that does not mean you have to mention that anywhere, the CC0 license does not require you to do that. |
Exactly. CC0 is effectively public domain. You can do anything you like with it, including relicense and remove any existing comments about licensing.
I don't think even this is true. Once integrated into some other product under new licence terms it no longer has the old licence terms at all. Of course the original work can still be obtained under those terms - but not the integrated work. |
I did not mean that the integrated work is still licensed under CC0 or that you somehow have rights to ask the copyright owner of the integrated work to extract the original CC0 work from the integrated one. I just meant that the fact that you integrate some CC0 work into something bigger under some arbitrary license does not mean that the original (unmodified parts) work somehow loses the CC0 license. |
@t-j-h According to email, the authors seem to agree with the ICLA, so I hope they can send it. But I am quite not sure how they can "submit the code themselves", could you please be more specific? Or send an email (where?) would be enough? |
It is sufficient for them to indicate to us in some way that they agree to you submitting this code on their behalf. That could be in the email they send us with their ICLA, or by commenting on this PR (note that all the authors would need to do this). |
If the authors provide the ICLA that is what we need - along with some indication you are approved to submit on their behalf. The form of communicating that approval isn't fixed - but the need for an ICLA from the authors under our policy is. Hopefully that is clearer. If they include a comment when they send in their ICLA noting you will be submitting that then it makes it easier for us to match things up. Thanks, |
@mattcaswell for reference - public domain has a very explicit meaning - and claiming copyright on a submission makes it non-public domain. If the author's intent was to make the contribution public domain then the form in which they have done that is problematic - it actually takes care to make something public domain. Public domain refers to items that are not protected by copyright, trademark, patent, or any other intellectual property laws - which means any statement or assertion of copyright, trademark, patent, or and other IP law defeats that intent. The explicit claim of copyright in the LICENSE file works against the reference implementation actually being in the public domain. It might be the authors intent - but it isn't the effect of the license as stated. |
As far as I understand, nobody can claim copyright over anything in the public domain. Of course, the USA has bizarre IP laws. |
@mattcaswell @t-j-h I hope all ICLA forms were sent, do we still missing anything from the CLA side? (If anything is missing for @ckalina-rh as a submitter, it will be fixed promptly as we can talk internally in Red Hat.) |
@mbroz - I sent you and email with the current CLA status, i.e. we're still missing one CLA from the original authors. We also need the CLA for @ckalina-rh. |
@levitte These |
@levitte As for moving the whole of Argon2 to providers. It is certainly possible. My reasons for keeping it separate were:
What do you think? |
Hi, For the Argon2 RFC, draft already moved to next step, I think just final minor change review are blocking the standard to become final, see https://mailarchive.ietf.org/arch/msg/cfrg/5JvKGs91HEO3kXqbck5W6Itz1SQ |
Throwing in my 2 cents and responding to a couple of the comments above:
I really don't think they belong in the async directory. I'd prefer a new crypto/thread sub-directory.
To me moving it into providers in the long term makes sense. Algorithms are currently implemented under "crypto/", and the provider implementations under "provider/implementations" where the latter use the code in the former. The reason for this structure is historic. Prior to OpenSSL 3.0 all the algorithms were implemented in libcrypto. Now we are moving all of that code to providers. The Argon2 implementation is somewhat different in that it is the first new crypto implementation which was never in the old libcrypto. However I don't think it right to hold this PR up while we figure out how we should be structuring things moving forwards. That really should be the subject of a different PR. Therefore I'd suggest that, for now at least, all the low-level code should remain under crypto. |
Some algorithms (e.g., Argon2 KDF) are designed to be executed in parallel. This patch introduces support for threading to OpenSSL. Threading support is implemented in two ways: - Internal threading: OpenSSL creates threads itself - External threading: user-application spawns workers for OpenSSL to use (POSIX threads only) Threading must be explicitly enabled by the user application. Support for signal masking is also introduced. Signed-off-by: Čestmír Kalina <ckalina@redhat.com>
Blake2 message digest, unlike MAC, does not allow for a variable sized output. Allow for an empty-key blake2 MAC. To be used within Argon2 implementation. Signed-off-by: Čestmír Kalina <ckalina@redhat.com>
Signed-off-by: Čestmír Kalina <ckalina@redhat.com>
Status overview:
With that, I'd like to re-open the discussion of threading support that this patchset requires/introduces -- specifically which approach will land in OpenSSL (I have no objections to using both of them/one or the other) -- and I'd be very grateful for your comments. We already had a good discussion regarding the pros and cons of the two approaches above, but nothing final. Thank you! |
Since the consensus is to focus on FIPS for 3.0, is it likely this will be happening for the next release? |
@richsalz code-wise Argon2 as such is ready to be merged, there are no outstanding requests for changes. One outstanding thing to resolve is centered around threading, as that was the primary concern expressed in the discussion above. To recap [1], two different conceptual approaches arose and both of them were implemented; so all that remains is to decide which do we want in OpenSSL or whether we wish to use both. So either one needs to be removed from the code, or we can simply support both in which case there is no action required. I would be very glad if there'd be time enough to resolve this last point to have it upstreamed,so that I could follow up with [1] One mechanism leaves OpenSSL free to create threads, the other relies on externally provided workers. Neither is enabled by default and requires explicit calls to enable. If both are enabled, the former is the preferred method. |
Has there been any development on this, please? Is there anything that's in your opinion missing or anything to rework? Thanks! |
@levitte @mattcaswell ping - I would really like to move forward with this PR but we need the decision on the approach for thread creation. My personal preference is to leave it up to OpenSSL to create the threads as that is much simpler and avoids needs for new API calls. Of course on platforms without thread support it will fallback to linear computation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have started to review this, but ran out of time. My detailed comments so far are below.
I am confused by all the ASYNC plumbing in the threading code which doesn't seem necessary (or I don't understand what you are attempting to do). I also think this PR needs to be split into two. The underlying threading code needs to be pulled out as a separate PR, so that we can properly focus on just that (and the review of that code may involve different people to those that review the argon2 code). My suspicion is that once we have an agreement on the underlying threading API, getting agreement on the argon2 bit should be relatively straight forward.
In general I like the idea of having both threading models available to end users. I need to think more about the specific details of how it is implemented - and seeing that pulled out as a separate PR will help with that. I also think we need to see much more testing focused just on the threading aspect. I foresee problems if we don't get this right.
|
||
void ossl_dealloc(uint8_t *memory, size_t bytes) | ||
{ | ||
if (bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bytes > 0
context->out = OPENSSL_zalloc(context->outlen+1); | ||
if (context->out == NULL) { | ||
return ARGON2_MEMORY_ALLOCATION_ERROR; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use {} for single line ifs. Here and numerous other instances below.
} | ||
|
||
/* 3. Initialization: Hashing inputs, allocating memory, filling first | ||
* blocks. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use multi-line comment style:
/*
* Comment here line 1
* Comment line 2
*/
argon2_type type; | ||
|
||
uint32_t flags; /* array of bool options */ | ||
} argon2_context; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intermediate type necessary? Why not just:
typedef struct {
...
} ARGON2_CTX;
Argon2_d = 0, | ||
Argon2_i = 1, | ||
Argon2_id = 2 | ||
} argon2_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're quite inconsistent about enum types, but I personally prefer uppercase: ARGON2_TYPE
int CRYPTO_THREAD_exit(CRYPTO_THREAD_RETVAL retval) | ||
{ | ||
if (CRYPTO_THREAD_EXTERN_enabled == 1 && ASYNC_is_capable()) { | ||
CRYPTO_THREAD_INTERN_exit(retval); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you mean CRYPTO_THREAD_EXTERN_exit() here?
|
||
task = (CRYPTO_THREAD_TASK*)arg; | ||
|
||
if ((currjob = ASYNC_get_current_job()) == NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why store this in currjob
if we never then refer to currjob
again?
return 0; | ||
|
||
ASYNC_block_pause(); | ||
task->retval = task->task(task->data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused by what you are attempting to achieve with all this ASYNC code. You seem to spend a lot of effort setting up an ASYNC job, only to then block pauses during it, which defeats the object. Why do you need to do anything with ASYNC at all?
static struct list CRYPTO_THREAD_EXTERN_task_done; | ||
static CRYPTO_MUTEX CRYPTO_THREAD_EXTERN_task_lock; | ||
static CRYPTO_CONDVAR CRYPTO_THREAD_EXTERN_task_cond_create; | ||
static CRYPTO_CONDVAR CRYPTO_THREAD_EXTERN_task_cond_finish; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these global variables should be stored in the OPENSSL_CTX instead, and then we should pass the OPENSSL_CTX to the various thread functions that need it.
@@ -160,6 +160,41 @@ static int test_kdf_pbkdf2(void) | |||
return ret; | |||
} | |||
|
|||
#ifndef OPENSSL_NO_ARGON2 | |||
static int test_kdf_argon2(void) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing seems very light. In particular I would expect to see separate tests for all the new threading plumbing that this PR puts in.
|
||
*memory = OPENSSL_zalloc(bytes); | ||
|
||
return memory != NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't that be "*memory" ?
My preference is also that OpenSSL creates the threads itself.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some real things, many nits. not fully reviewed.
if (datalen > UINT32_MAX) | ||
return 0; | ||
|
||
if (data) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicit NULL test
return result; | ||
} | ||
|
||
if (Argon2_d != context->type && Argon2_i != context->type && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linebreak before the conditional. indent the second line an extra shift-width.
/* 3. Initialization: Hashing inputs, allocating memory, filling first | ||
* blocks. */ | ||
result = initialize(&instance, context); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should really get rid of the extra blank line here, so things look like:
foo = calculate...
if (ARGON2_OK != foo)
...
The "constantvariable" construct is very uncommon in openssl code.
|
||
int ARGON2_Final(uint8_t *md, ARGON2_CTX *c) | ||
{ | ||
if (md == NULL || c == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general OpenSSL doesn'st test for NULL parameters. This is not a universal opinion, others will have to comment.
{ | ||
int ret = 0; | ||
|
||
EVP_MAC *mac; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initialize these variables to NULL then you can have just a single "goto" label.
if (is_endian.little) { | ||
uint64_t w; | ||
memcpy(&w, src, sizeof(w)); | ||
return w; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above.
uint8_t *p = (uint8_t *)dst; | ||
int i; | ||
|
||
for (i = 0; i < 4; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is "4" (and the "8" below) right? I guess so, but it looks ... not quite portable.
return x + y + 2 * xy; | ||
} | ||
|
||
# ifndef G |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you use about the ifndef ? Maybe just undef if it already exists?
@@ -0,0 +1,225 @@ | |||
/* | |||
* Argon2 reference source code package - reference C implementations | |||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably this has changes to fit into openssl. The copyright should look like other contribute-code copyrights.
|
||
/*************************Argon2 core functions********************************/ | ||
|
||
/* Allocates memory to the given pointer, uses the appropriate allocator as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See @mattcaswell's comment about multi-line comments.
If OpenSSL is going to create the threads, then an application should be able to specify a limit so that it doesn't get DoS'd. |
If OpenSSL is going to create the threads, then an application should be able to specify a limit so that it doesn't get DoS'd.
If we go that way, we need to add rate limits for EVP_* calls too.
I don't think it's up to the library to do that.
|
I don't understand this. |
Checklist
Hello,
this is an initial RFC pull request aiming at getting Argon2 support in OpenSSL. Currently only single threaded Argon2 mode is enforced. I tried to keep as close to reference implementation (which is dual licensed under the CC0 License and the Apache 2.0 License and as such should be compatible with OpenSSL) as possible to aid in future backports.
I'd appreciate any feedback you could provide (especially regarding threading).
Related: #4091
Thank you!
Best regards,
Cestmir