Skip to content

[RFC] Add argon2 backend. #113

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from
Open

[RFC] Add argon2 backend. #113

wants to merge 2 commits into from

Conversation

ferivoz
Copy link
Contributor

@ferivoz ferivoz commented Oct 10, 2020

argon2 is a password-hashing function that summarizes the state of the
art in the design of memory-hard functions and can be used to hash
passwords for credential storage, key derivation, or other applications.

See: https://github.com/P-H-C/phc-winner-argon2

The argon2_i hash requires libargon2, which is an optional dependency.

Compatible with output of argon2 executable.

@ferivoz
Copy link
Contributor Author

ferivoz commented Oct 10, 2020

Regarding the advices given in #104:

  • I have added libargon2 as an optional dependency. The library is added if argon2_i has been selected as a hash, which is selected by default.
  • I have reviewed the NetBSD implementation. They are not compatible with the output of argon2 executable. They store the salt as plaintext. It just looks like it is encoded in some base64 form, but in reality it is put into the argon2 function call as it is. Mixing a plaintext salt and a base64 encoded hash looks wrong to me and makes it difficult to experiment.
  • I had to add some base64 encoding with a A-Za-z0-9/+ alphabet because that is what the reference implementation uses. The code is taken from reference implementation and marked with appropriate copyright. Maybe you prefer to switch to the base64 encoding used in scrypt?

Things to consider:

  • Tests are missing yet. I can copy and paste argon2 encoded output into my /etc/shadow and login though
  • Only argon2_i is supported right now. Adding argon2_d and argon2_id is trivial. I decided to not do it right now so I can integrate feedback much faster and add the missing functions when everything is settled.

I would love to get some feedback on this. You are free to edit my branch or add comments and let me know what you think!

@ferivoz ferivoz mentioned this pull request Oct 10, 2020
@besser82 besser82 mentioned this pull request Oct 11, 2020
Copy link
Collaborator

@zackw zackw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid this is not acceptable as is. The most serious problem, called out in the code comments, is the use of sscanf in crypt_argon2_i_rn. Also, known-answer tests are missing (see test/ka-tester.c and test/ka-table-gen.py). Ideally, ka-table-gen.py would use a different implementation of argon2 to generate the known answers, but I understand that there may not be two implementations yet. I presume libargon2 has its own unit tests, so we don't need alg-* tests for argon, but we do need enough known-answer tests to exercise setting string parsing thoroughly.

There are a few other, less important issues mentioned in the code comments.

Finally, before we merge this we need to make some decisions about these issues:

  • I said in the code comments that I don't think we want to switch the default for new hashes from yescrypt just yet. We might want to think about whether and when it would be appropriate. The main reason I don't want to switch immediately is that argon2 is not yet widely deployed enough for it to be safe to distribute $argon2i$ hashes from network login servers (NIS+, LDAP, etc). There was, IIRC, also some dispute over whether argon2 is really better than yescrypt.

    (Hopefully, sometime in 2021 I will get around to implementing run-time configurability for default hash and cost parameters, which will make this much less of an issue,)

  • Should we support argon2d and argon2id as well as argon2i? If so, I'd want support for all variants to go in in the same release.

  • Re

    I have reviewed the NetBSD implementation. They are not compatible with the output of argon2 executable. They store the salt as plaintext. It just looks like it is encoded in some base64 form, but in reality it is put into the argon2 function call as it is. Mixing a plaintext salt and a base64 encoded hash looks wrong to me and makes it difficult to experiment.

    In principle, I agree that we should be compatible with the output of the argon2 executable. However, is the NetBSD implementation also using $argon2i$ as its prefix?

  • Re

    I had to add some base64 encoding with a A-Za-z0-9/+ alphabet because that is what the reference implementation uses. The code is taken from reference implementation and marked with appropriate copyright. Maybe you prefer to switch to the base64 encoding used in scrypt?

    No, we should definitely match the reference implementation (and the Internet-Draft)'s choice of base64 variant.

    Refactoring out all the duplicate code for base64 variants has been on my to-do list for quite some time, but should not block this patch.

NEWS Outdated
@@ -4,6 +4,7 @@ Please send bug reports, questions and suggestions to
<https://github.com/besser82/libxcrypt/issues>.

Version 4.4.18
* Implement the argon2_i ($argon2i$) hashing algorithm.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change "Implement" to "Add support for" -- we have not reimplemented argon ourselves.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me that this is probably a significant enough change to call this version 4.5. (We won't go to 5.x until we drop support for --enable-obsolete-api, which may never happen.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have adjusted the wording and the version.

configure.ac Outdated
Comment on lines 409 to 419
case "$hashes_enabled" in
*argon2*)
AC_SUBST(LIBARGON2)
AC_CHECK_LIB(argon2, argon2i_hash_encoded, [argon2_lib=yes], [argon2_lib=no])
if test x"$argon2_lib" != xyes; then
AC_MSG_ERROR(argon2_i requires libargon2, which was not found)
fi
LIBARGON2=-largon2
LIBS="$LIBS $LIBARGON2"
;;
esac
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has several problems: All AC_SUBSTs must be executed unconditionally. It's unreliable to put most other AC_* macros inside hand-written shell control flow constructs at the top level of a configure.ac. Although it's unlikely to matter in this case, using AC_CHECK_LIB instead of AC_SEARCH_LIBS is almost always a mistake. And you haven't used enough M4 quotation.

You don't actually use the LIBARGON2 variable anywhere, so I'd suggest this rewrite:

AS_CASE(["$hashes_enabled"],
    [*argon2*], [
        AC_SEARCH_LIBS([argon2i_hash_encoded], [argon2])
        AS_IF([test x"$ac_cv_search_argon2i_hash_encoded" = xno],
            [AC_MSG_ERROR([Argon2 hashes require libargon2, which was not found])])
    ])

Using AS_CASE instead of a bare case makes the nested AC_SEARCH_LIBS safe, and also gets rid of the unbalanced parentheses; AC_MSG_ERROR is specifically documented as safe for use inside a hand-written shell if, but since we are using AS_CASE I think it's cleaner to use AS_IF as well, rather than mixing and matching styles. AC_SEARCH_LIBS automatically sets LIBS, so no need to do it by hand.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the autoconf script would also try using PKG_CHECK_MODULES to detect argon2. In addition to handling libargon2 installed in a nonstandard location but accessible via .pc files, this would give builders the ability to use LIBARGON2_CFLAGS and LIBARGON2_LIBS to specify its location manually. But I'm not insisting on that change, we can always implement it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for your detailed explanation of the macros and the code example. I have applied it.

doc/crypt.5 Outdated
Comment on lines 170 to 172
argon2_i is a password-hashing function that summarizes the state of the art
in the design of memory-hard functions and can be used to hash passwords
for credential storage, key derivation, or other applications.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reads like marketing copy from the argon2 website. Please think of a more NPOV way to describe it.

We should probably cite the argon2 Internet-Draft here, and make a note to update the citation once it gets finalized as an RFC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. The text has been copied from their website. I did not want to add a text written by myself to keep the description as precise as possible. I took the Internet-Draft as recommended now.

lib/hashes.conf Outdated
Comment on lines 42 to 43
argon2_i $argon2i$ 16 STRONG,DEFAULT
yescrypt $y$ 16 STRONG,DEFAULT,ALT,FEDORA
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DEFAULT flag should only appear on one entry in hashes.conf. (I should have put something in the build scripts to make them bomb out on this mistake.) I think we do not want to switch our default from yescrypt yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that DEFAULT can exist multiple times in this file. The last paragraph in the descriptive comment describes that the first entry wins. I do not mind moving argon2 further down. Also I removed the DEFAULT flag for argon2 hashes.

}

/* Setting is invalid. */
if (sscanf (setting, "$argon2i$v=%u$m=%u,t=%u,p=%u$%s",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scanf functions are broken-as-specified, and may not ever be used in libxcrypt proper (it's OK in tests, but only when we 100% control the input). I'm going to have to insist you hand-write code to parse the setting string.

The only C library functions for converting textual numbers to machine numbers, that are adequately defensive against malformed or malicious input, are the strto* family. If we don't already have a convenience wrapper for strtoul we should probably grow one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used sscanf to keep the code as small as possible to make code reviews and catching the idea of the code as easy as possible. The code uses strtoul now and follows macro handling from bcrypt code.

Comment on lines +20 to +28
/*
* Some macros for constant-time comparisons. These work over values in
* the 0..255 range. Returned value is 0x00 on "false", 0xFF on "true".
*/
#define EQ(x, y) ((((0U - ((unsigned)(x) ^ (unsigned)(y))) >> 8) & 0xFF) ^ 0xFF)
#define GT(x, y) ((((unsigned)(y) - (unsigned)(x)) >> 8) & 0xFF)
#define GE(x, y) (GT(y, x) ^ 0xFF)
#define LT(x, y) GT(y, x)
#define LE(x, y) GE(y, x)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain why these conversion functions need to use constant-time comparisons?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In an earlier version of my code I have used a lookup table just like ascii64 in crypto-common.c. Eventually I gave up that code for these reasons:

  • The base64 code used in Argon2 differs from RFC because it skips the padding character '='. My code would therefore not follow RFC either.
  • The file alg-argon2-encoding.c is a direct copy from Argon2 reference implementation.
  • Having a separate file makes it easy to put proper copyright into that file as well, especially since the code in the file is not modified

I would not mind adjusting it if it is requested. Especially since a lookup table is easier to review and complies with rest of the code base.

@zackw
Copy link
Collaborator

zackw commented Oct 12, 2020

Github isn't letting me add this to the specific code comment thread where it's relevant:

I think the inadequate M4 quotation in configure.ac is why all the Travis builds bombed out with weird shell errors.

Come to think of it, this PR should adjust the Travis configuration so we have libargon2 available at build time in most of the configurations. There should be one configuration that exercises the "no libargon2 at build time" code paths, but we already have so many Travis variant builds that I'd recommend piggybacking that on an existing variant rather than adding a new one.

Argon2 is a memory-hard function for password-hashing and proof-of-work
applications. It aims at the highest memory filling rate and effective use
of multiple computing units, while still providing defense against tradeoff
attacks.

See: https://github.com/P-H-C/phc-winner-argon2

The argon2 hashes require libargon2, which is an optional dependency.

Compatible with output of argon2 executable.
@ferivoz
Copy link
Contributor Author

ferivoz commented Oct 14, 2020

I extended the pull request to contain Argon2id, Argon2i, and Argon2d. Argon2id is used by NetBSD as $argon2id$ and marked with NETBSD in hashes.conf. The NetBSD version is not compatible with argon2 output, regardless of specific hash.

I have to fix the Codacy findings. Therefore the pull request is WIP for now.

Thank you for your valuable feedback @zackw!

@ferivoz ferivoz changed the title Add argon2 backend. WIP: Add argon2 backend. Oct 14, 2020
@ferivoz ferivoz changed the title WIP: Add argon2 backend. [RFC] Add argon2 backend. Oct 16, 2020
@ferivoz
Copy link
Contributor Author

ferivoz commented Oct 16, 2020

I have checked the reports but I do not know if I have to fix them or not. I think the argon2 code handles strings just like the other crypt files, e.g. bcrypt. In crypt-bcrypt.c, the size argument for setting is actually ignored.

Do you have an advice about it or do you think as well that these findings can or should be ignored?

@besser82
Copy link
Owner

Codacy Here is an overview of what got changed by this pull request:

Issues
======
+ Solved 1
- Added 5
           

See the complete overview on Codacy


if (strncmp (nptr, arg, strlen (arg)))
return 0;
nptr += strlen (arg);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
unsigned long val;

if (strncmp (nptr, arg, strlen (arg)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GET_VALUE(t, "t=", ',');
GET_VALUE(p, "p=", '$');

strncpy (buf, s, s_size);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errno = EINVAL;
return;
}
s += strlen(type);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

/* Setting for a different hash algorithm. */
if (strncmp (s, type, strlen(type)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ferivoz
Copy link
Contributor Author

ferivoz commented Jun 1, 2021

Hi, I hope it is okay to ask for feedback again. Are there any news on this subject?

@eleius
Copy link

eleius commented Sep 23, 2023

Hi, I hope it is okay to ask for feedback again. Are there any news on this subject?

I wanted to ask the same but... no activity in 2 years? :-/

@jonas-w
Copy link

jonas-w commented Sep 23, 2023

#150

@solardiz
Copy link
Collaborator

@eleius Given the timing of your comment, I guess it's related to Arch Linux's decision to switch to yescrypt - https://archlinux.org/news/changes-to-default-password-hashing-algorithm-and-umask-settings/ - where among other things they wrote "Although the winner of the Password Hashing Competition has been argon2, this even more resilient algorithm is not yet available in libxcrypt (attempt one, attempt two)."

Disclaimer: I've designed yescrypt, so obviously I am biased. I also was on PHC's panel (obviously not voting for my own entry), so I have studied all entries at least to some extent (and Argon2 pretty well, with both yescrypt and Argon2 influencing the final PHC versions of each other).

I think calling Argon2 an "even more resilient algorithm" is wrong. For practical purposes, I think it is not - on the contrary, yescrypt is likely more resilient. We give a comparison of yescrypt vs. Argon2 on the yescrypt homepage and in documentation, including in terms of attack resilience.

There were multiple criteria in the competition, with attack resilience being only one of them. Another was simplicity. Argon2 is simpler than yescrypt, and I accept that complexity is a drawback of yescrypt.

I have quite a backlog of tasks related to both yescrypt and Argon2. I maintain the upstream implementation of yescrypt and have some things to improve in there. I also co-maintain John the Ripper password cracker, where I think we'll want to have similarly optimized implementations of both, on both CPU and GPU, to allow for better comparison. It's non-trivial to optimize these things for GPUs well. So in terms of my own time, these tasks I just mentioned are a higher priority than helping get Argon2 into libxcrypt. I think it eventually should be in libxcrypt, at least for compatibility with other systems that use Argon2. However, I think ideally the related JtR work (or/and similar in hashcat) would be completed first, so that people (and even more importantly distros' defaults) would not be switching from yescrypt to Argon2 based on the "even more resilient" likely misconception. ;-)

As to eventual implementation of Argon2 support in here, I think it could be best to have parts of the upstream Argon2 tree https://github.com/P-H-C/phc-winner-argon2 merged into libxcrypt in a subdirectory and occasionally sync'ed to upstream, in the same way it's currently being done for yescrypt, instead of having a dependency on an external project.

@dvzrv
Copy link

dvzrv commented Sep 23, 2023

I think calling Argon2 an "even more resilient algorithm" is wrong.

As this seems to lead to confusion/ skewed expectations, I have adjusted the wording in the mentioned news item.
@solardiz thanks for reaching out via email and pointing it out!

@solardiz
Copy link
Collaborator

solardiz commented Mar 27, 2024

I think ideally the related JtR work (or/and similar in hashcat) would be completed first, so that people (and even more importantly distros' defaults) would not be switching from yescrypt to Argon2 based on the "even more resilient" likely misconception. ;-)

We've since got Argon2 supported on GPUs via OpenCL in John the Ripper bleeding-jumbo, here's an example of it in use on a hash with moderately high cost settings (above libxcrypt's current defaults for yescrypt): https://www.openwall.com/lists/john-users/2024/01/23/4
The break-even point for CPU vs. GPU for Argon2 appears to be at 100 to 200+ MB per hash (depending on which CPUs vs. which GPUs we consider comparable), which is much higher than libxcrypt's current default of 16 MiB for yescrypt.
To test at 16 MiB, running against $argon2d$v=19$m=16384,t=3,p=1$c2hvcnRfc2FsdA$TLSTPihIo+5F67Y1vJdfWdB9 (note the t=3, which is 2.25 times more iterations over data than we currently use with yescrypt by default), I got 1742 c/s on GTX 1080 (before it fully heat up, so turbo speed, would reduce somewhat) vs. 438 c/s on 2x E5-2670 + 8x DDR3-1600 (pretty old hardware, I know). So ~4x speedup for a 180W, 2016 gaming GPU vs. 2x 115W = 230W 2012 server CPUs. Just sharing the data points we have so far. Edit: detail for these benchmarks is at openwall/john#5382 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants