-
Notifications
You must be signed in to change notification settings - Fork 57
[RFC] Add argon2 backend. #113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Regarding the advices given in #104:
Things to consider:
I would love to get some feedback on this. You are free to edit my branch or add comments and let me know what you think! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid this is not acceptable as is. The most serious problem, called out in the code comments, is the use of sscanf
in crypt_argon2_i_rn
. Also, known-answer tests are missing (see test/ka-tester.c
and test/ka-table-gen.py
). Ideally, ka-table-gen.py would use a different implementation of argon2 to generate the known answers, but I understand that there may not be two implementations yet. I presume libargon2 has its own unit tests, so we don't need alg-*
tests for argon, but we do need enough known-answer tests to exercise setting string parsing thoroughly.
There are a few other, less important issues mentioned in the code comments.
Finally, before we merge this we need to make some decisions about these issues:
-
I said in the code comments that I don't think we want to switch the default for new hashes from yescrypt just yet. We might want to think about whether and when it would be appropriate. The main reason I don't want to switch immediately is that argon2 is not yet widely deployed enough for it to be safe to distribute
$argon2i$
hashes from network login servers (NIS+, LDAP, etc). There was, IIRC, also some dispute over whether argon2 is really better than yescrypt.(Hopefully, sometime in 2021 I will get around to implementing run-time configurability for default hash and cost parameters, which will make this much less of an issue,)
-
Should we support argon2d and argon2id as well as argon2i? If so, I'd want support for all variants to go in in the same release.
-
Re
I have reviewed the NetBSD implementation. They are not compatible with the output of argon2 executable. They store the salt as plaintext. It just looks like it is encoded in some base64 form, but in reality it is put into the argon2 function call as it is. Mixing a plaintext salt and a base64 encoded hash looks wrong to me and makes it difficult to experiment.
In principle, I agree that we should be compatible with the output of the argon2 executable. However, is the NetBSD implementation also using
$argon2i$
as its prefix? -
Re
I had to add some base64 encoding with a A-Za-z0-9/+ alphabet because that is what the reference implementation uses. The code is taken from reference implementation and marked with appropriate copyright. Maybe you prefer to switch to the base64 encoding used in scrypt?
No, we should definitely match the reference implementation (and the Internet-Draft)'s choice of base64 variant.
Refactoring out all the duplicate code for base64 variants has been on my to-do list for quite some time, but should not block this patch.
NEWS
Outdated
@@ -4,6 +4,7 @@ Please send bug reports, questions and suggestions to | |||
<https://github.com/besser82/libxcrypt/issues>. | |||
|
|||
Version 4.4.18 | |||
* Implement the argon2_i ($argon2i$) hashing algorithm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change "Implement" to "Add support for" -- we have not reimplemented argon ourselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurs to me that this is probably a significant enough change to call this version 4.5. (We won't go to 5.x until we drop support for --enable-obsolete-api
, which may never happen.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have adjusted the wording and the version.
configure.ac
Outdated
case "$hashes_enabled" in | ||
*argon2*) | ||
AC_SUBST(LIBARGON2) | ||
AC_CHECK_LIB(argon2, argon2i_hash_encoded, [argon2_lib=yes], [argon2_lib=no]) | ||
if test x"$argon2_lib" != xyes; then | ||
AC_MSG_ERROR(argon2_i requires libargon2, which was not found) | ||
fi | ||
LIBARGON2=-largon2 | ||
LIBS="$LIBS $LIBARGON2" | ||
;; | ||
esac |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has several problems: All AC_SUBST
s must be executed unconditionally. It's unreliable to put most other AC_*
macros inside hand-written shell control flow constructs at the top level of a configure.ac. Although it's unlikely to matter in this case, using AC_CHECK_LIB
instead of AC_SEARCH_LIBS
is almost always a mistake. And you haven't used enough M4 quotation.
You don't actually use the LIBARGON2 variable anywhere, so I'd suggest this rewrite:
AS_CASE(["$hashes_enabled"],
[*argon2*], [
AC_SEARCH_LIBS([argon2i_hash_encoded], [argon2])
AS_IF([test x"$ac_cv_search_argon2i_hash_encoded" = xno],
[AC_MSG_ERROR([Argon2 hashes require libargon2, which was not found])])
])
Using AS_CASE
instead of a bare case
makes the nested AC_SEARCH_LIBS
safe, and also gets rid of the unbalanced parentheses; AC_MSG_ERROR
is specifically documented as safe for use inside a hand-written shell if
, but since we are using AS_CASE
I think it's cleaner to use AS_IF
as well, rather than mixing and matching styles. AC_SEARCH_LIBS
automatically sets LIBS
, so no need to do it by hand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally the autoconf script would also try using PKG_CHECK_MODULES
to detect argon2. In addition to handling libargon2 installed in a nonstandard location but accessible via .pc
files, this would give builders the ability to use LIBARGON2_CFLAGS
and LIBARGON2_LIBS
to specify its location manually. But I'm not insisting on that change, we can always implement it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for your detailed explanation of the macros and the code example. I have applied it.
doc/crypt.5
Outdated
argon2_i is a password-hashing function that summarizes the state of the art | ||
in the design of memory-hard functions and can be used to hash passwords | ||
for credential storage, key derivation, or other applications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads like marketing copy from the argon2 website. Please think of a more NPOV way to describe it.
We should probably cite the argon2 Internet-Draft here, and make a note to update the citation once it gets finalized as an RFC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. The text has been copied from their website. I did not want to add a text written by myself to keep the description as precise as possible. I took the Internet-Draft as recommended now.
lib/hashes.conf
Outdated
argon2_i $argon2i$ 16 STRONG,DEFAULT | ||
yescrypt $y$ 16 STRONG,DEFAULT,ALT,FEDORA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DEFAULT flag should only appear on one entry in hashes.conf. (I should have put something in the build scripts to make them bomb out on this mistake.) I think we do not want to switch our default from yescrypt yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that DEFAULT can exist multiple times in this file. The last paragraph in the descriptive comment describes that the first entry wins. I do not mind moving argon2 further down. Also I removed the DEFAULT flag for argon2 hashes.
lib/crypt-argon2.c
Outdated
} | ||
|
||
/* Setting is invalid. */ | ||
if (sscanf (setting, "$argon2i$v=%u$m=%u,t=%u,p=%u$%s", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scanf
functions are broken-as-specified, and may not ever be used in libxcrypt proper (it's OK in tests, but only when we 100% control the input). I'm going to have to insist you hand-write code to parse the setting string.
The only C library functions for converting textual numbers to machine numbers, that are adequately defensive against malformed or malicious input, are the strto*
family. If we don't already have a convenience wrapper for strtoul
we should probably grow one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used sscanf to keep the code as small as possible to make code reviews and catching the idea of the code as easy as possible. The code uses strtoul now and follows macro handling from bcrypt code.
/* | ||
* Some macros for constant-time comparisons. These work over values in | ||
* the 0..255 range. Returned value is 0x00 on "false", 0xFF on "true". | ||
*/ | ||
#define EQ(x, y) ((((0U - ((unsigned)(x) ^ (unsigned)(y))) >> 8) & 0xFF) ^ 0xFF) | ||
#define GT(x, y) ((((unsigned)(y) - (unsigned)(x)) >> 8) & 0xFF) | ||
#define GE(x, y) (GT(y, x) ^ 0xFF) | ||
#define LT(x, y) GT(y, x) | ||
#define LE(x, y) GE(y, x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please explain why these conversion functions need to use constant-time comparisons?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an earlier version of my code I have used a lookup table just like ascii64 in crypto-common.c. Eventually I gave up that code for these reasons:
- The base64 code used in Argon2 differs from RFC because it skips the padding character '='. My code would therefore not follow RFC either.
- The file alg-argon2-encoding.c is a direct copy from Argon2 reference implementation.
- Having a separate file makes it easy to put proper copyright into that file as well, especially since the code in the file is not modified
I would not mind adjusting it if it is requested. Especially since a lookup table is easier to review and complies with rest of the code base.
Github isn't letting me add this to the specific code comment thread where it's relevant: I think the inadequate M4 quotation in configure.ac is why all the Travis builds bombed out with weird shell errors. Come to think of it, this PR should adjust the Travis configuration so we have libargon2 available at build time in most of the configurations. There should be one configuration that exercises the "no libargon2 at build time" code paths, but we already have so many Travis variant builds that I'd recommend piggybacking that on an existing variant rather than adding a new one. |
Argon2 is a memory-hard function for password-hashing and proof-of-work applications. It aims at the highest memory filling rate and effective use of multiple computing units, while still providing defense against tradeoff attacks. See: https://github.com/P-H-C/phc-winner-argon2 The argon2 hashes require libargon2, which is an optional dependency. Compatible with output of argon2 executable.
I extended the pull request to contain Argon2id, Argon2i, and Argon2d. Argon2id is used by NetBSD as I have to fix the Codacy findings. Therefore the pull request is WIP for now. Thank you for your valuable feedback @zackw! |
I have checked the reports but I do not know if I have to fix them or not. I think the argon2 code handles strings just like the other crypt files, e.g. bcrypt. In crypt-bcrypt.c, the size argument for setting is actually ignored. Do you have an advice about it or do you think as well that these findings can or should be ignored? |
Issues
======
+ Solved 1
- Added 5
See the complete overview on Codacy |
|
||
if (strncmp (nptr, arg, strlen (arg))) | ||
return 0; | ||
nptr += strlen (arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{ | ||
unsigned long val; | ||
|
||
if (strncmp (nptr, arg, strlen (arg))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GET_VALUE(t, "t=", ','); | ||
GET_VALUE(p, "p=", '$'); | ||
|
||
strncpy (buf, s, s_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errno = EINVAL; | ||
return; | ||
} | ||
s += strlen(type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
|
||
/* Setting for a different hash algorithm. */ | ||
if (strncmp (s, type, strlen(type))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I hope it is okay to ask for feedback again. Are there any news on this subject? |
I wanted to ask the same but... no activity in 2 years? :-/ |
@eleius Given the timing of your comment, I guess it's related to Arch Linux's decision to switch to yescrypt - https://archlinux.org/news/changes-to-default-password-hashing-algorithm-and-umask-settings/ - where among other things they wrote "Although the winner of the Password Hashing Competition has been argon2, this even more resilient algorithm is not yet available in libxcrypt (attempt one, attempt two)." Disclaimer: I've designed yescrypt, so obviously I am biased. I also was on PHC's panel (obviously not voting for my own entry), so I have studied all entries at least to some extent (and Argon2 pretty well, with both yescrypt and Argon2 influencing the final PHC versions of each other). I think calling Argon2 an "even more resilient algorithm" is wrong. For practical purposes, I think it is not - on the contrary, yescrypt is likely more resilient. We give a comparison of yescrypt vs. Argon2 on the yescrypt homepage and in documentation, including in terms of attack resilience. There were multiple criteria in the competition, with attack resilience being only one of them. Another was simplicity. Argon2 is simpler than yescrypt, and I accept that complexity is a drawback of yescrypt. I have quite a backlog of tasks related to both yescrypt and Argon2. I maintain the upstream implementation of yescrypt and have some things to improve in there. I also co-maintain John the Ripper password cracker, where I think we'll want to have similarly optimized implementations of both, on both CPU and GPU, to allow for better comparison. It's non-trivial to optimize these things for GPUs well. So in terms of my own time, these tasks I just mentioned are a higher priority than helping get Argon2 into libxcrypt. I think it eventually should be in libxcrypt, at least for compatibility with other systems that use Argon2. However, I think ideally the related JtR work (or/and similar in hashcat) would be completed first, so that people (and even more importantly distros' defaults) would not be switching from yescrypt to Argon2 based on the "even more resilient" likely misconception. ;-) As to eventual implementation of Argon2 support in here, I think it could be best to have parts of the upstream Argon2 tree https://github.com/P-H-C/phc-winner-argon2 merged into libxcrypt in a subdirectory and occasionally sync'ed to upstream, in the same way it's currently being done for yescrypt, instead of having a dependency on an external project. |
As this seems to lead to confusion/ skewed expectations, I have adjusted the wording in the mentioned news item. |
We've since got Argon2 supported on GPUs via OpenCL in John the Ripper bleeding-jumbo, here's an example of it in use on a hash with moderately high cost settings (above libxcrypt's current defaults for yescrypt): https://www.openwall.com/lists/john-users/2024/01/23/4 |
argon2 is a password-hashing function that summarizes the state of the
art in the design of memory-hard functions and can be used to hash
passwords for credential storage, key derivation, or other applications.
See: https://github.com/P-H-C/phc-winner-argon2
The argon2_i hash requires libargon2, which is an optional dependency.
Compatible with output of argon2 executable.