Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

After implementation cleanup for yescrypt. #27

Merged
merged 16 commits into from
Sep 7, 2018

Conversation

besser82
Copy link
Owner

@besser82 besser82 commented Aug 26, 2018

This PR addresses (almost) all changes requested in #12 by @zackw.

The documentation changes in crypt.5 for scrypt will be in the next PR, which adds gensalt for scrypt and additional tests for the $7$ prefix.

@besser82 besser82 requested a review from zackw August 26, 2018 12:18
@besser82 besser82 force-pushed the besser82/yescrypt_cleanup branch 2 times, most recently from 4d5b9a2 to 09da602 Compare August 26, 2018 12:54
@codecov-io
Copy link

codecov-io commented Aug 26, 2018

Codecov Report

Merging #27 into develop will increase coverage by 0.98%.
The diff coverage is 86.84%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop      #27      +/-   ##
===========================================
+ Coverage    87.56%   88.54%   +0.98%     
===========================================
  Files           28       28              
  Lines         3055     3065      +10     
===========================================
+ Hits          2675     2714      +39     
+ Misses         380      351      -29
Impacted Files Coverage Δ
alg-yescrypt-opt.c 77.28% <ø> (ø)
alg-yescrypt-platform.c 93.93% <ø> (ø)
alg-yescrypt-sysendian.h 100% <ø> (ø)
alg-yescrypt-sha256.c 73.66% <ø> (ø)
alg-yescrypt-common.c 62.27% <ø> (ø)
crypt-yescrypt.c 87.17% <86.84%> (+42.35%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7c06a31...3a3ffba. Read the comment docs.

crypt-yescrypt.c Outdated
The author of yescrypt recommends in the documentation to use
r=8 for total sizes of 2 MiB and less, and r=32 above that.
This has to do with the typical per-core last-level cache sizes
of current CPUs. */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment have missed true explanation of parameters. count specifies memory usage of yescrypt. This memory usage is set up by setting N and r parameters. (In below code) that fact that 11 -> 262144 does not tell too much. Truly 11 is 262144 * 4096 i.e. 1073741824 i.e. 1 GiB of memory. It's better to say 11 -> 1 GiB there (correcting other values too).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a some more verbose comment in short.

@vt-alt vt-alt mentioned this pull request Aug 26, 2018
crypt-yescrypt.c Outdated
if (yescrypt_init_local (&intbuf->local))
{
if (errno == 0)
errno = ENOMEM;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think proposed change is invalid. As we should never rely on reading of errno, which is never set or reinitialized to zero. And it would be non-deterministic behavior for libcrypt user if function result is dependent on reading of uninitialized value (for example when called before or after of some syscall which changes errno). I think, as @zackw suggested and @solardiz agreed this call should "just return".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording in C11 relating to the initial value of errno leaves something to be desired, but I could argue that it's a conformance violation if it doesn't start out as zero upon entry to main. However, I agree that this makes the behavior nondeterministic based on what has happened earlier in the program, and we should "just return" instead.

@solardiz
Copy link
Collaborator

the next PR, which adds gensalt for scrypt

In case you really want a gensalt for $7$, here's what it was in yescrypt-0.8.1 (last submission to PHC):

uint8_t *
yescrypt_gensalt_r(uint32_t N_log2, uint32_t r, uint32_t p,
    yescrypt_flags_t flags,
    const uint8_t * src, size_t srclen,
    uint8_t * buf, size_t buflen)
{
        uint8_t * dst;
        size_t prefixlen = 3 + 1 + 5 + 5;
        size_t saltlen = BYTES2CHARS(srclen);
        size_t need;

        if (flags) {
                if (flags & ~0x3f)
                        return NULL;

                prefixlen++;
                if (flags != YESCRYPT_RW)
                        prefixlen++;
        }

        need = prefixlen + saltlen + 1;
        if (need > buflen || need < saltlen || saltlen < srclen)
                return NULL;

        if (N_log2 > 63 || ((uint64_t)r * (uint64_t)p >= (1U << 30)))
                return NULL;

        dst = buf;
        *dst++ = '$';
        *dst++ = '7';
        if (flags) {
                *dst++ = 'X'; /* eXperimental, subject to change */
                if (flags != YESCRYPT_RW)
                        *dst++ = itoa64[flags];
        }
        *dst++ = '$';

        *dst++ = itoa64[N_log2];

        dst = encode64_uint32(dst, buflen - (dst - buf), r, 30);
        if (!dst) /* Can't happen */
                return NULL;

        dst = encode64_uint32(dst, buflen - (dst - buf), p, 30);
        if (!dst) /* Can't happen */
                return NULL;

        dst = encode64(dst, buflen - (dst - buf), src, srclen);
        if (!dst || dst >= buf + buflen) /* Can't happen */
                return NULL;

        *dst = 0; /* NUL termination */

        return buf;
}

Since flags is 0 for classic scrypt, you can simplify this accordingly. In fact, if you insist on supporting this, I might reconsider and re-add this into yescrypt_encode_params_r instead of its current use of the $y$. prefix for classic scrypt - or otherwise we need a way to request which prefix/encoding to use, beyond the hash type request via flags. Hmm.

@besser82 besser82 force-pushed the besser82/yescrypt_cleanup branch 5 times, most recently from f005109 to 50a75e1 Compare August 26, 2018 18:29
@besser82
Copy link
Owner Author

In case you really want a gensalt for $7$, here's what it was in yescrypt-0.8.1 (last submission to PHC):

Thanks for the hint, but I've already written a simple implementation.

Since flags is 0 for classic scrypt, you can simplify this accordingly. In fact, if you insist on supporting this, I might reconsider and re-add this into yescrypt_encode_params_r instead of its current use of the $y$. prefix for classic scrypt - or otherwise we need a way to request which prefix/encoding to use, beyond the hash type request via flags. Hmm.

My current implementation lives in a separate file, which provides a proxy for the crypt function and a simple implementation for the gensalt function.

@besser82 besser82 force-pushed the besser82/yescrypt_cleanup branch 3 times, most recently from aaa6d2f to a44c9cf Compare August 27, 2018 09:09
Copy link
Collaborator

@zackw zackw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with just a few small adjustments described inline.

crypt-private.h Outdated
@@ -32,6 +32,9 @@ extern void gensalt_sha_rn (char tag, size_t maxsalt, unsigned long defcount,
const uint8_t *rbytes, size_t nrbytes,
uint8_t *output, size_t output_size);

/* For estimating the output size of base64 encoding. */
#define BASE64_LEN(bytes) ((((bytes) * 8) + 5) / 6)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to scratch my head a bit to realize what the + 5 was for. I suggest "Estimate the size of a base64 encoding of N bytes: 6 bits per output byte, rounded up."

Copy link
Collaborator

@solardiz solardiz Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While (...+(N-1))/N is an obvious to me idiom for rounding up, I think the word "estimate" is what's in fact wrong here - we're doing exact calculation (not estimating), and we probably rely on it being exact.

crypt-yescrypt.c Outdated
large enough to hold the maximum size of the generated salt and
is filled with '\0' bytes before use. */
uint8_t outbuf[CRYPT_GENSALT_OUTPUT_SIZE];
XCRYPT_SECURE_MEMSET(outbuf, CRYPT_GENSALT_OUTPUT_SIZE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why it's necessary to zero this buffer before use?

crypt-yescrypt.c Outdated
return;
}

memcpy(output, outbuf, o_size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copying o_size bytes is likely to read past the end of the string written by yescrypt_encode_params_r. Since strlcpy isn't generally available, I suggest instead

assert (o_size >= strlen (outbuf) + 1);
memcpy (output, outbuf, strlen (outbuf) + 1);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, memcpy (output, outbuf, strlen (outbuf) + 1); is just a cryptic way to write strcpy (output, outbuf);, so if that's what we mean we should in fact use strcpy() and not try to hide it.

There's a trivial alternative to strlcpy() through using strncat() into a buffer with NUL in its first char, but we probably don't want a truncating function here, so if we do any copying like this at all (I'm out of context as to why we need(?) that), then assert() + strcpy() sounds appropriate.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote assert and then memcpy because it's likely to trip stupid automated linters that ding you for using strcpy at all. Maybe we need another wrapper inline a la XCRYPT_SECURE_MEMSET: XCRYPT_STRCPY_OR_ABORT(dest, destsize, src).

The reason for the copy is that both yescrypt_encode_params_r and yescrypt_r might write to their output buffer even when they fail. outbuf already has a failure token set up, so we want to preserve that until we know the process is successful.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of copying, we should (re)introduce the failure token into outbuf once we know the process isn't successful?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what @vt-alt originally did. I would've been fine with that if it had taken care to preserve the *0/*1 alternation, but @vt-alt insisted that it didn't need to, and I guess @besser82 decided to avoid the issue entirely. Copying does also allow slightly simpler control logic in crypt_yescrypt_rn.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to side-step these issues could be to be (re)introducing the failure token by a call to a function that always takes care of the *0 / *1 alternation.

crypt-yescrypt.c Outdated
params.N = 1ULL << (count + 7); // 3 -> 1024, 4 -> 2048, ... 11 -> 262144
}

if (!yescrypt_encode_params_r(&params, rbytes, nrbytes, outbuf, o_size))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GNU style also dictates putting a space before the open paren of a function call.

crypt-yescrypt.c Outdated
if (yescrypt_free_local (&intbuf->local) || !intbuf->retval)
return;

memcpy(output, intbuf->outbuf, o_size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same kibitz here as for gensalt.

crypt.5 Outdated
.hash "$y$" "\e$y\e$[./A-Za-z0-9]+\e$[./A-Za-z0-9]*\e$[./A-Za-z0-9]{43}" unlimited 8 "" 256 "up to 512" "0 (default), 1 to 11 (logarithmic)"
yescrypt is a scalable password hashing scheme designed by
Solar Designer, which is based on Colin Percival's scrypt.
.hash "$y$" "\e$y\e$[./A-Za-z0-9]+\e$[./A-Za-z0-9]{25,…}\e$[./A-Za-z0-9]{43}" unlimited 8 256 256 "128 to (theoretically) 17895696" "1 to 11 (logarithmic)"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it actually safe to use raw non-ASCII characters in nroff source? I would have expected not. Since this is a POSIX ERE, {25,} would be the canonical way to indicate no upper limit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely should be a valid regexp, and besides the changes look wrong to me, while the original by @vt-alt looked correct to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I just took @besser82's word for these changes, but you're right, some of them appear to be wrong. Not all, though.

The changes I originally requested to the .hash line were correct. Changing 8 "" 256 to 8 256 256 makes the rendered manpage say "Hash size 256 bits" instead of "Hash size 256 bits (effectively )" [sic]. That middle argument is only for DES, where the output is 64 bits long but only 56 of them are meaningful -- hopefully we don't get any more like that. And I also asked for the removal of "0 (default), " for consistency with the other entries in the table.

The changes to the regex and the "up to 512" / "128 to theoretically 17895696" parameter are describing the supported salt size. Skimming yescrypt_encode_params_r, I don't see any lower limit on the salt size, and the upper limit is SIZE_MAX / 16 bytes, which is ABI-dependent; if my calculations are correct, it comes out to 1,073,741,816 bits with 32-bit size_t, and 4,611,686,018,427,387,896 bits with 64-bit size_t. Clearly nobody is going to be using salts anywhere near that huge. The practical limit on the salt is 192 bytes (CRYPT_GENSALT_OUTPUT_SIZE) - the length of the other pieces of the setting string ($y$j9T$$\0). I can't tell from the code what the maximum possible length of the parameters component could be for arbitrary calls to yescrypt_encode_params_r, but I can only manage to produce three-character parameters components by calling crypt_gensalt_rn with all the supported "count" values. So that gives us a practical upper limit of 183 bytes of base64-encoded salt, or 1098 bits. I have no idea where 17,895,696 came from.

It concerns me a little that there doesn't seem to be a lower limit on the salt size. I suppose even if we did, people could still subvert the salt mechanism by passing N bytes of binary zeroes to crypt_gensalt. The default (16 bytes drawn from urandom if you pass NULL as rbytes) is sane, that's probably good enough.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think I'm the original author of the .hash macro, first created for Owl in 2000, but I'm not yet familiar with how it's changed in libxcrypt since then - the word "effectively" isn't in my version, and instead my version doesn't confuse limit on effective input key size and hash size, which are not the same - maybe this is a bug introduced in libxcrypt that needs to be fixed (someone forgot about salts when thinking of the effect of key size on effective hash size?) Proper reporting for "Traditional DES-based" on Owl is.:

       prefix "" (empty string);
              a string matching ^[./0-9A-Za-z]{2} (see regex(7))

       Encoding syntax
              [./0-9A-Za-z]{13}

       Maximum password length
              8 (uses 7-bit characters)

       Effective key size
              up to 56 bits

       Hash size
              64 bits

       Salt size
              12 bits

       Iteration count
              25

yescrypt's salt size is limited to 512 bits max through unsigned char saltbin[64] in yescrypt_r(), which is chosen this way in part because yescrypt's builtin salt+hash encryption feature (which we don't expose here) also has this limit for more fundamental reason (it's max block size of the variable-block-size cipher used there, and there's no extra code to implement a block cipher mode in there, so we always encrypt/decrypt just one block).

I'd rather not introduce a minimum salt size in order to make the crypt(3)-encoding interface to yescrypt almost as capable as the KDF interface (where allowing empty salts is also needed for compatibility with scrypt and makes sense for some use cases), without artificial limitations. (The upper limit of 512 bits isn't entirely artificial - there would have to be some since the salt is encoded/decoded, and sometimes even encrypted/decrypted.)

Copy link
Collaborator

@zackw zackw Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@solardiz

For comparison, here is what we get for DES with the current .hash macro in libxcrypt's crypt.5:

       prefix "" (empty string)

       Encoded passphrase format
              [./0-9A-Za-z]{13}

       Maximum password length
              8 characters (ignores 8th bit)

       Hash size
              64 bits (effectively 56)

       Salt size
              12 bits

       CPU time cost parameter
              25

my version doesn't confuse limit on effective input key size and hash size, which are not the same - maybe this is a bug introduced in libxcrypt that needs to be fixed (someone forgot about salts when thinking of the effect of key size on effective hash size?)

I don't understand this. The "hash size" is the size of the output of the one-way function, which is independent of the size of the salt, isn't it? Anyway, let's split this off to a separate discussion: whether or not the hash size / key size terminology currently used in crypt.5 is ideal or even correct, I hope you'll agree that the table entry for yescrypt should not read

       Hash size
              256 bits (effectively )

and that's the only thing I was concerned with here. Would you please read through man -l crypt.5 using our crypt.5, and file a fresh issue describing any concerns you have?

yescrypt's salt size is limited to 512 bits max through unsigned char saltbin[64] in yescrypt_r(), which is chosen this way in part because yescrypt's builtin salt+hash encryption feature (which we don't expose here) also has this limit for more fundamental reason

Aha. Thanks. I think that means the correct regex is

\$y\$[./A-Za-z0-9]+\$[./A-Za-z0-9]{,86}\$[./A-Za-z0-9]{43}

and the salt size should remain "up to 512 bits" as it originally was. Why doesn't yescrypt_encode_params_r enforce that limit, though?

I'd rather not introduce a minimum salt size in order to make the crypt(3)-encoding interface to yescrypt almost as capable as the KDF interface

Fair enough.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree the "256 bits (effectively )" display is obviously wrong, and the reporting of "64 bits (effectively 56)" for descrypt, which I think is wrong, should be discussed separately. Your suggested regexp for yescrypt looks correct to me, although I'd consistently list 0-9 before A-Z in these regexp's as that's the order used in the encoding (except for bcrypt's, where what you wrote here for yescrypt would be "more correct"). Good point about possibly enforcing the salt size limit in yescrypt_encode_params_r(), I'll consider making this change.

zackw added a commit that referenced this pull request Sep 3, 2018
Solar Designer points out that a phrase used to describe DES-crypt,
“Hash size 64 bits (effectively 56),” is misleading.  Revert to
separate “hash size” and “effective key size” table entries as is
used in the Openwall version of this manpage, but omit the “effective
key size” table entry if it would be the same number as “hash size.”
Correct the `.hash` line for yescrypt as discussed in the review of
pull request #27.

Also swap two of the entries in the list of attacks that salt defeats,
for better prose flow; reference RFC 4648 when talking about ‘the
common “base64” encoding’ that hashes don’t use; and tighten up the
implementation of the `.hash` macro.
@besser82
Copy link
Owner Author

besser82 commented Sep 7, 2018

All small remarks should be adressed. Waiting for Travis and then merging.

@besser82
Copy link
Owner Author

besser82 commented Sep 7, 2018

Travis looks good. Merging on cli.

@besser82 besser82 merged commit 3a3ffba into develop Sep 7, 2018
@besser82 besser82 deleted the besser82/yescrypt_cleanup branch September 7, 2018 12:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants