test/drbgtest.c: Fix error check test #11195

Closed
wants to merge 7 commits into from

@ciz ciz commented Feb 27, 2020

The condition in test_error_checks() was inverted, so the test succeeded
as long as error_check() failed. Incidentally, error_check() contained
several bugs that ensured it always failed, which made the overall drbg
test succeed.

  • tests are added or updated
@@ -358,16 +358,18 @@ static int error_check(DRBG_SELFTEST_DATA *td)
goto err;

/* Test insufficient entropy */
if (!init(drbg, td, &t))
goto err;
t.entropylen = drbg->min_entropylen - 1;

ciz Feb 27, 2020
Author Contributor

t is reset to the defaults in init(), so t.entropylen needs to be set after the init(), otherwise the following generation will succeed.
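
The ordering problem described in this comment can be shown with a minimal sketch. The type `toy_test_ctx`, the two constants, and the functions `toy_init()` and `toy_instantiate()` are hypothetical simplifications invented for illustration; they are not the structures or helpers in test/drbgtest.c.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DEFAULT_ENTROPYLEN 32   /* assumed default, illustration only */
#define TOY_MIN_ENTROPYLEN     16   /* assumed minimum, illustration only */

/* Hypothetical, simplified test context. */
typedef struct {
    size_t entropylen;
} toy_test_ctx;

/* Like init() in the test: resets the context to its defaults. */
static void toy_init(toy_test_ctx *t)
{
    t->entropylen = TOY_DEFAULT_ENTROPYLEN;
}

/* Succeeds (returns 1) only if enough entropy is offered. */
static int toy_instantiate(const toy_test_ctx *t)
{
    return t->entropylen >= TOY_MIN_ENTROPYLEN;
}

/* Override set before init: it is overwritten, so the step succeeds. */
int override_before_init(void)
{
    toy_test_ctx t;

    t.entropylen = TOY_MIN_ENTROPYLEN - 1;
    toy_init(&t);
    return toy_instantiate(&t);
}

/* Override set after init: it survives, so the step fails as intended. */
int override_after_init(void)
{
    toy_test_ctx t;

    toy_init(&t);
    t.entropylen = TOY_MIN_ENTROPYLEN - 1;
    return toy_instantiate(&t);
}

void demo_init_order(void)
{
    assert(override_before_init() == 1);
    assert(override_after_init() == 0);
}
```

Calling `demo_init_order()` from a `main()` exercises both orderings; only the second one actually tests the insufficient-entropy path.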

@@ -414,7 +418,7 @@ static int error_check(DRBG_SELFTEST_DATA *td)
* failure.
*/
t.entropylen = 0;
-    if (TEST_false(RAND_DRBG_generate(drbg, buff, td->exlen, 1,
+    if (!TEST_false(RAND_DRBG_generate(drbg, buff, td->exlen, 1,

ciz Feb 27, 2020
Author Contributor

This is an expected fail. I must say these !TEST_something are really confusing and hard to parse.

mspncp Feb 27, 2020
Contributor

This is an expected fail.

Correct. This error seems to have slipped in already when this module was added.

I must say these !TEST_something are really confusing and hard to parse.

I fully agree. I had a hard time too reading these conditions at the beginning. My strategy is to just ignore the surrounding if (! .... ), because it simply means "break on failure of the test condition", so stripping it leaves only the test condition: Instead of

if (!TEST_false(RAND_DRBG_generate(...)))

I just read

TEST_false(RAND_DRBG_generate(...))
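
As a rough illustration of this reading strategy, here is a minimal sketch of the idiom. `MY_TEST_false`, `check_false()` and `generate_without_entropy()` are hypothetical stand-ins, not OpenSSL's testutil macros; they only mimic the convention that a check evaluates to nonzero when the expectation holds.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the expectation "value is false" holds, else logs and returns 0. */
static int check_false(const char *expr, int value)
{
    if (!value)
        return 1;
    fprintf(stderr, "FAIL: expected false: %s\n", expr);
    return 0;
}

#define MY_TEST_false(e) check_false(#e, (e) != 0)

/* Hypothetical operation that is expected to fail (returns 0). */
static int generate_without_entropy(void)
{
    return 0;
}

/* Correct form: bail out when the expectation "call fails" is NOT met. */
int expected_failure_correct(void)
{
    if (!MY_TEST_false(generate_without_entropy()))
        goto err;
    return 1;   /* test passed */
err:
    return 0;   /* test failed */
}

/* Inverted form: bails out precisely when the expectation IS met. */
int expected_failure_inverted(void)
{
    if (MY_TEST_false(generate_without_entropy()))
        goto err;
    return 1;
err:
    return 0;
}

void demo_test_idiom(void)
{
    assert(expected_failure_correct() == 1);
    assert(expected_failure_inverted() == 0);
}
```

In the inverted form the test reports a failure although the operation behaved exactly as expected, which is the kind of mistake the missing `!` produces.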
@@ -423,7 +427,7 @@ static int error_check(DRBG_SELFTEST_DATA *td)
if (!instantiate(drbg, td, &t))
goto err;
reseed_counter_tmp = drbg->reseed_gen_counter;
-    drbg->reseed_gen_counter = drbg->reseed_interval;
+    drbg->reseed_gen_counter = drbg->reseed_interval + 1;

ciz Feb 27, 2020
Author Contributor

This no longer works. Since 8bf3665#diff-9181ac017a6177a5f2619f65c9b7a346R682, the counter must be larger than the interval to trigger the reseeding.

mspncp Feb 27, 2020
Contributor

I'll need to check that tomorrow...

mspncp Feb 28, 2020
Contributor

IIRC this was a deliberate change by @slontis because in the NIST document the counter counts from 1 and not from 0. (see 8bf3665#diff-9181ac017a6177a5f2619f65c9b7a346R516.) Is that correct @slontis?

@slontis slontis requested a review from mspncp Feb 27, 2020
@mspncp mspncp self-assigned this Feb 27, 2020

@mspncp mspncp left a comment

Looks good at first sight. I'll have to do some more checks tomorrow, it's too late now. Here are some first comments.

(Note: the patch applies cleanly to the 1.1.1 stable branch as well).

/*
* Nice idea, however RAND_DRBG_uninstantiate() cleanses the data via
* drbg_ctr_uninstantiate(), but right after that it resets drbg->data.ctr
* using RAND_DRBG_set(), so the following memcmp will fail.
*/
#if 0
if (!TEST_mem_eq(zero, sizeof(drbg->data), &drbg->data, sizeof(drbg->data)))
goto err;
#endif
Comment on lines 498 to 506

mspncp Feb 27, 2020
Contributor

Interesting. Note that there is a similar test in providers/fips/self_test_kats.c, which was added only recently by me in #11111. To make it work, I delayed the reinitialization of the drbg->data until the next instantiation. I didn't know that this was tested here, too.

Meanwhile this test should succeed, since your branch already contains my fixes:

  • e704521 Check that the DRBG's internal state has been zeroized after uninstantiation
  • 75ff4f7 DRBG: delay initialization of DRBG method until instantiation

So please revert this change.
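
The difference between the two behaviours discussed here can be sketched as follows. This is a toy model under stated assumptions: `toy_drbg`, `toy_set_method()` and the two uninstantiate variants are hypothetical names, and `memset()` merely stands in for a secure cleanse routine.

```c
#include <assert.h>
#include <string.h>

#define STATE_LEN 16

/* Hypothetical, simplified DRBG with an embedded state. */
typedef struct {
    unsigned char data[STATE_LEN];
} toy_drbg;

/* Stands in for setting up the DRBG method: leaves non-zero defaults. */
static void toy_set_method(toy_drbg *d)
{
    d->data[0] = 0x01;
}

/* Stand-in for a secure cleanse of the internal state. */
static void toy_cleanse(toy_drbg *d)
{
    memset(d->data, 0, sizeof(d->data));
}

/* Eager variant: cleanse, then reinitialise right away. */
void uninstantiate_eager(toy_drbg *d)
{
    toy_cleanse(d);
    toy_set_method(d);
}

/* Delayed variant: cleanse only; the method is set at the next instantiation. */
void uninstantiate_delayed(toy_drbg *d)
{
    toy_cleanse(d);
}

/* The check performed by the test: the state must compare equal to zeros. */
int is_zeroized(const toy_drbg *d)
{
    static const unsigned char zero[STATE_LEN];

    return memcmp(zero, d->data, sizeof(d->data)) == 0;
}

void demo_zeroization(void)
{
    toy_drbg a, b;

    memset(a.data, 0xAA, sizeof(a.data));
    memset(b.data, 0xAA, sizeof(b.data));
    uninstantiate_eager(&a);
    uninstantiate_delayed(&b);
    assert(!is_zeroized(&a));   /* reinitialised state is not all zeros */
    assert(is_zeroized(&b));    /* delayed reinitialisation passes the check */
}
```

With the eager variant the memory comparison can never succeed, even though the secret state was cleansed; the delayed variant makes the zeroization observable.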

mspncp Feb 27, 2020
Contributor

Some part of your comment could remain, to explain the check.

ciz Feb 28, 2020
Author Contributor

You're right. It works on git master. My code was originally written for 1.1.1d, where the test failed. I'll re-enable the check, which should also fix the Travis build broken due to the unused variable zero.

mspncp Feb 28, 2020
Contributor

You are right, I was already thinking about that. But I did not have the time to write a reply yet.

We could indeed cherry-pick 75ff4f7 to 1.1.1 as a bug fix, since it breaks this test. @paulidale @slontis what's your opinion?

If this is not possible, we would have to disable or remove the test in 1.1.1. It might be necessary to craft a separate pull request for 1.1.1. But for the moment, let's concentrate on master.

paulidale Feb 28, 2020
Contributor

I'm inclined to not cherry-pick the delayed reinitialisation. 1.1.1 has no need to formally verify that the internal state has been zeroed and there isn't a security problem since we know it will have been.

mspncp Feb 28, 2020
Contributor

Ok, then we'll just drop the test when cherry-picking to 1.1.1.

@ciz ciz commented Feb 28, 2020

(Note: the patch applies cleanly to the 1.1.1 stable branch as well).

When applying this to 1.1.1, either the removal of the check
if (!TEST_mem_eq(zero, sizeof(drbg->data)
needs to stay, or we need to delay the drbg->data re-initialization similar to #11111.

Also, the reseed counter adjustment below isn't needed, as the breaking change was added to master only (commit 8bf3665)

-    drbg->reseed_gen_counter = drbg->reseed_interval;
+    drbg->reseed_gen_counter = drbg->reseed_interval + 1;

I can create a separate pull request once this one is accepted.

@mspncp mspncp commented Feb 29, 2020

@slontis, I'm beginning to doubt whether it was correct to change this reset value from zero to one:

drbg->reseed_gen_counter = 1;

If I understand it correctly, the reseed_gen_counter is supposed to count the number of generate calls since the last reseed. However, when executing the following test

openssl/test/drbgtest.c

Lines 430 to 439 in e0f2639

drbg->reseed_gen_counter = drbg->reseed_interval + 1;
/* Generate output and check entropy has been requested for reseed */
t.entropycnt = 0;
if (!TEST_true(RAND_DRBG_generate(drbg, buff, td->exlen, 0,
td->adin, td->adinlen))
|| !TEST_int_eq(t.entropycnt, 1)
|| !TEST_int_eq(drbg->reseed_gen_counter, reseed_counter_tmp + 1)
|| !uninstantiate(drbg))
goto err;

the counter ends up at 2, not 1, although there is only one generate call after the reseed. The reason is that this counter is incremented at the end of every RAND_DRBG_generate() call (second watchpoint hit). This is probably the reason why the counter used to be reset to zero.
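
The effect described above can be reproduced with a small model. This is only a sketch: `toy_drbg`, `toy_reseed()` and `toy_generate()` are hypothetical, and the interval of 256 is an assumption chosen so that interval + 1 matches the value 257 in the transcript below. The model uses a reset value of 1, the `>` stop condition, and an increment at the end of every generate call.

```c
#include <assert.h>

/* Hypothetical, simplified DRBG bookkeeping. */
typedef struct {
    unsigned int reseed_gen_counter;
    unsigned int reseed_interval;
    unsigned int entropycnt;        /* number of entropy requests */
} toy_drbg;

static void toy_reseed(toy_drbg *d)
{
    d->entropycnt++;
    d->reseed_gen_counter = 1;      /* the reset value under discussion */
}

int toy_generate(toy_drbg *d)
{
    if (d->reseed_gen_counter > d->reseed_interval)
        toy_reseed(d);
    /* ... output would be produced here ... */
    d->reseed_gen_counter++;        /* incremented at the end of every call */
    return 1;
}

void demo_counter_after_reseed(void)
{
    toy_drbg d = { 0, 256, 0 };

    /* Force a reseed on the next generate call, as the test does. */
    d.reseed_gen_counter = d.reseed_interval + 1;
    assert(toy_generate(&d) == 1);
    assert(d.entropycnt == 1);
    assert(d.reseed_gen_counter == 2);   /* 1 after the reseed, then ++ */

    /* With the `>` condition, counter == interval does not reseed. */
    d.entropycnt = 0;
    d.reseed_gen_counter = d.reseed_interval;
    assert(toy_generate(&d) == 1);
    assert(d.entropycnt == 0);
}
```

The second half of `demo_counter_after_reseed()` also shows why the test had to set the counter to `reseed_interval + 1` to trigger a reseed under the `>` condition.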

Here's a transcript of my debug session. The interesting parts are the two watchpoint hits.

Start of test

+bt
#0  error_check (td=0x5555558e2440 <drbg_test>) at test/drbgtest.c:433
#1  0x00005555555a388d in test_error_checks (i=0) at test/drbgtest.c:533
#2  0x00005555555a74d2 in run_tests (test_prog_name=0x7fffffffe371 "/home/msp/src/openssl/test/drbgtest") at test/testutil/driver.c:358
#3  0x00005555555a787b in main (argc=1, argv=0x7fffffffe048) at test/testutil/main.c:30
+p  drbg->reseed_gen_counter
$1 = 257
+info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00005555555a3272 in error_check at test/drbgtest.c:433
2       hw watchpoint  keep y                      -location drbg->reseed_gen_counter
3       breakpoint     keep y   0x00005555555a3368 in error_check at test/drbgtest.c:445
+c
Continuing.

First watchpoint hit

drbg->reseed_gen_counter = 1;

Hardware watchpoint 2: -location drbg->reseed_gen_counter

Old value = 257
New value = 1
RAND_DRBG_reseed (drbg=0x5555558f52a0, adin=0x5555557f3720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32, prediction_resistance=0) at crypto/rand/drbg_lib.c:752
752	    drbg->reseed_time = time(NULL);
+bt
#0  RAND_DRBG_reseed (drbg=0x5555558f52a0, adin=0x5555557f3720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32, prediction_resistance=0) at crypto/rand/drbg_lib.c:752
#1  0x00005555555ebe01 in RAND_DRBG_generate (drbg=0x5555558f52a0, out=0x7fffffffca60 ".\226pd\372\337\337W\265\202\356\326\355>e\302p'1ےp!\376\026\266\310Q4\207e\320N\375\376h\354\254ܓA8\222\220\264\224\371\r\244\367N\200\222gH@\247\bǼf", outlen=16, prediction_resistance=0, adin=0x5555557f3720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32) at crypto/rand/drbg_lib.c:930
#2  0x00005555555a32bb in error_check (td=0x5555558e2440 <drbg_test>) at test/drbgtest.c:434
#3  0x00005555555a388d in test_error_checks (i=0) at test/drbgtest.c:533
#4  0x00005555555a74d2 in run_tests (test_prog_name=0x7fffffffe371 "/home/msp/src/openssl/test/drbgtest") at test/testutil/driver.c:358
#5  0x00005555555a787b in main (argc=1, argv=0x7fffffffe048) at test/testutil/main.c:30
+c
Continuing.

Second watchpoint hit

drbg->reseed_gen_counter++;

Hardware watchpoint 2: -location drbg->reseed_gen_counter

Old value = 1
New value = 2
RAND_DRBG_generate (drbg=0x5555558f52a0, out=0x7fffffffca60 "f!'\206F", outlen=16, prediction_resistance=0, adin=0x0, adinlen=0) at crypto/rand/drbg_lib.c:946
946	    return 1;
+bt
#0  RAND_DRBG_generate (drbg=0x5555558f52a0, out=0x7fffffffca60 "f!'\206F", outlen=16, prediction_resistance=0, adin=0x0, adinlen=0) at crypto/rand/drbg_lib.c:946
#1  0x00005555555a32bb in error_check (td=0x5555558e2440 <drbg_test>) at test/drbgtest.c:434
#2  0x00005555555a388d in test_error_checks (i=0) at test/drbgtest.c:533
#3  0x00005555555a74d2 in run_tests (test_prog_name=0x7fffffffe371 "/home/msp/src/openssl/test/drbgtest") at test/testutil/driver.c:358
#4  0x00005555555a787b in main (argc=1, argv=0x7fffffffe048) at test/testutil/main.c:30
+c
Continuing.

End of test

Breakpoint 3, error_check (td=0x5555558e2440 <drbg_test>) at test/drbgtest.c:445
445	    t.entropylen = 0;
(gdb) info b
+info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00005555555a3272 in error_check at test/drbgtest.c:433
	breakpoint already hit 1 time
2       hw watchpoint  keep y                      -location drbg->reseed_gen_counter
	breakpoint already hit 2 times
3       breakpoint     keep y   0x00005555555a3368 in error_check at test/drbgtest.c:445
	breakpoint already hit 1 time
(gdb) p  drbg->reseed_gen_counter
+p  drbg->reseed_gen_counter
$2 = 2

@slontis slontis commented Feb 29, 2020

Hmm, it was back in July last year that I made this change, so forgive me if it's not really clear what I did in there.

The commit message does, however, discuss my reasoning,
i.e.:
renamed generate_counter back to reseed_counter so it is not confusing and different from standard. (The value is actually used by the Hash DRBG, so I didnt want to use generate_counter+1 in that code.)

So as long as the Hash DRBG doesn't get broken, I don't mind what happens. This one is the only type that used the value as part of its calculation.

@mspncp mspncp commented Feb 29, 2020

So as long as the Hash DRBG doesn't get broken, I don't mind what happens. This one is the only type that used the value as part of its calculation.

Ok, thanks for the explanation. I'll take a look - maybe do a comparison with the old FOM implementation - whether it is worth reverting the change or not. Maybe it's better to leave this question for a separate pr anyway.

@ciz ciz commented Mar 19, 2020

Is the pull request fine now or are there any changes required from me?

@mspncp mspncp commented Mar 19, 2020

Is the pull request fine now or are there any changes required from me?

Sorry for letting this pr stall. I had too much work lately due to corona lockout in my company. I'll try to take a look in the next days.

@ciz ciz commented Mar 20, 2020

Is the pull request fine now or are there any changes required from me?

Sorry for letting this pr stall. I had too much work lately due to corona lockout in my company. I'll try to take a look in the next days.

No problem at all. I was just wondering whether I had missed something.

@mspncp mspncp commented Mar 22, 2020

@ciz here is finally my reply to your comment:

This no longer works. Since 8bf3665#diff-9181ac017a6177a5f2619f65c9b7a346R682, the counter must be larger than the interval to trigger the reseeding.

After digging a little into history I came to the conclusion that it was me in commit a93ba40 and not @slontis in commit 8bf3665 who introduced the problem:

Original Code (OpenSSL-fips-2_0-stable)

In the original FIPS code, the counter was initialized to 1

dctx->reseed_counter = 1;

dctx->reseed_counter = 1;

and the stop condition was dctx->reseed_counter >= dctx->reseed_interval:

else if (dctx->reseed_counter >= dctx->reseed_interval)

if (dctx->reseed_counter >= dctx->reseed_interval)

In the next section you can see that I lowered the initial counter value from 1 to 0 but kept the stop condition. IIRC, I believed at that time that the stop condition was off-by-one such that the number of generate calls between consecutive reseeds was one less than it should have been. Revisiting the code, it seems that I was wrong. Maybe I was a little bit confused by the strange logic in FIPS_drbg_generate(), in conjunction with the DRBG_CUSTOM_RESEED logic:

if (dctx->iflags & DRBG_CUSTOM_RESEED)
    dctx->generate(dctx, NULL, outlen, NULL, 0);
else if (dctx->reseed_counter >= dctx->reseed_interval)
    dctx->status = DRBG_STATUS_RESEED;
if (dctx->status == DRBG_STATUS_RESEED || prediction_resistance)
{
    /* If prediction resistance request don't do health check */
    int hcheck = prediction_resistance ? 0 : 1;
    if (!drbg_reseed(dctx, adin, adinlen, hcheck))
    {
        r = FIPS_R_RESEED_ERROR;
        goto end;
    }
    adin = NULL;
    adinlen = 0;
}
if (!dctx->generate(dctx, out, outlen, adin, adinlen))
{
    r = FIPS_R_GENERATE_ERROR;
    dctx->status = DRBG_STATUS_ERROR;
    goto end;
}
if (!(dctx->iflags & DRBG_CUSTOM_RESEED))
{
    if (dctx->reseed_counter >= dctx->reseed_interval)
        dctx->status = DRBG_STATUS_RESEED;
    else
        dctx->reseed_counter++;
}

The condition dctx->reseed_counter >= dctx->reseed_interval is checked twice, once before the generate call (lines 422, 423) and once after it (lines 447, 448), both times under the condition !(dctx->iflags & DRBG_CUSTOM_RESEED). Anyway, the check is done before incrementing the counter, so I was wrong when I increased the number of generate calls.

After #4402 (a93ba40)

Here you can see the situation after my change. (The DRBG_CUSTOM_RESEED logic had already been removed at that time, probably by @richsalz, I didn't check).

drbg->generate_counter = 0;

drbg->generate_counter = 0;

if (drbg->generate_counter >= drbg->reseed_interval)

drbg->generate_counter++;

After #6779 (8bf3665)

When @slontis noticed that the starting count was wrong, he incremented it back to 1 and adjusted the stop condition accordingly.

drbg->reseed_gen_counter = 1;

drbg->reseed_gen_counter = 1;

if (drbg->reseed_gen_counter > drbg->reseed_interval)

drbg->reseed_gen_counter++;

Conclusion

So I think the proper fix is to leave the initial value at 1 and just revert @slontis' change of the stop condition and your change in the test. Would you mind doing this for me in your pull request? (Note that @slontis' change is only on master, not on 1.1.1, so I'd recommend reverting it in a separate commit which can be dropped for 1.1.1.)
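
To make the three combinations of initial value and stop condition from the sections above easier to compare, here is a small counting sketch. It is a deliberately simplified model (check, generate, increment) and not the OpenSSL code; `calls_per_seed()` and `cmp_kind` are hypothetical names, and the interval of 256 is an arbitrary example value. The sketch only counts calls and takes no position on which count is the intended one.

```c
#include <assert.h>

typedef enum { CMP_GE, CMP_GT } cmp_kind;

/*
 * Counts how many generate calls are served from a fresh seed before the
 * stop condition triggers a reseed. The counter starts at `initial` and is
 * incremented after every generate call.
 */
unsigned int calls_per_seed(unsigned int initial, cmp_kind cmp,
                            unsigned int interval)
{
    unsigned int counter = initial;
    unsigned int calls = 0;

    for (;;) {
        int reseed = (cmp == CMP_GE) ? (counter >= interval)
                                     : (counter > interval);

        if (reseed)
            return calls;
        calls++;        /* one generate call served from this seed */
        counter++;
    }
}

void demo_variants(void)
{
    const unsigned int interval = 256;

    /* initial value 1, condition >= (original FIPS code) */
    assert(calls_per_seed(1, CMP_GE, interval) == 255);
    /* initial value 0, condition >= (after a93ba40) */
    assert(calls_per_seed(0, CMP_GE, interval) == 256);
    /* initial value 1, condition >  (after 8bf3665) */
    assert(calls_per_seed(1, CMP_GT, interval) == 256);
}
```

In this model the second and third variants serve the same number of calls per seed and differ only in whether the counter starts from 0 or 1.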

@mspncp mspncp commented Mar 23, 2020

(@slontis could you please double-check my last post?)

@slontis slontis commented Mar 23, 2020

Your logic makes sense to me...

@ciz ciz commented Jun 1, 2020

Sorry for taking so long, I've completely missed your comments.
Thanks for the detailed write-up. A twist with a twist.

I fixed the condition in drbg_lib.c and the manual setting of reseed_gen_counter at both places in drbgtest.c.

@mspncp mspncp commented Jun 12, 2020

@ciz I was on vacation for two weeks. I'll take a look next week.

@mspncp mspncp commented Jul 3, 2020

@ciz unfortunately I let your pull request stall for too long, now your changes collided with @paulidale's heavy EVP_RAND replumbing (#11682). Instead of letting you suffer for something which was my fault, I did the rebase myself, which was quite nasty. In fact so nasty, that I decided to squash your fixup commits first before rebasing your branch (gh-11195-ori -> gh-11195-squashed -> gh-11195-squashed-and-rebased).

After some effort, the outcome looks very reasonable. In fact, it would probably have been easier to redo your changes on the tip of master instead of going through the pain of rebasing.

@ciz I would ask you to help me verify that I did not introduce any errors during my rebase. To make it easier for you, I published the three intermediate stages as pull requests in my personal fork:

You can verify my rebase in two steps:

  • Verify that the first pull request is identical to your current one and that there are no code changes between the first and the second.
  • Open the second and the third pull request side-by-side and compare the hunks. They should match exactly.

If you think gh-11195-squashed-and-rebased is ok, you can force-push it to your ciz:drbgtest branch. Alternatively, you can ask me to force-push it to your branch.

@mspncp mspncp commented Jul 3, 2020

Unfortunately, the test_rand test fails in a strange way:

msp@msppc:~/src/openssl$ make tests V=1 TESTS=test_rand
make depend && make _tests
make[1]: Entering directory '/home/msp/src/openssl'
make[1]: Leaving directory '/home/msp/src/openssl'
make[1]: Entering directory '/home/msp/src/openssl'
( SRCTOP=. \
  BLDTOP=. \
  PERL="/usr/bin/perl" \
  FIPSKEY="f4556650ac31d35461610bac4ed81b1a181b2d8a43ea2854cbae22ca74560813" \
  EXE_EXT= \
  /usr/bin/perl ./test/run_tests.pl test_rand )
05-test_rand.t .. 
1..2
# The results of this test will end up in test-runs/test_rand
    # Subtest: ../../test/drbgtest
    1..7
        # Subtest: test_kats
        1..1
        ok 1 - iteration 1
    ok 1 - test_kats
        # Subtest: test_error_checks
        1..16
../../util/wrap.pl ../../test/drbgtest => 139
not ok 1

#   Failed test at test/recipes/05-test_rand.t line 16.

Running the test in the debugger, I get a SIGSEGV apparently caused by some misalignment. Strange... 🤔

msp@msppc:~/src/openssl$ cd test-runs/test_rand/

msp@msppc:~/src/openssl/test-runs/test_rand$ ../../util/wrap.pl gdb --args ../../test/drbgtest
GNU gdb (Gentoo 9.1 vanilla) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ../../test/drbgtest...
(gdb) run
Starting program: /home/msp/src/openssl/test/drbgtest 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
1..7
    # Subtest: test_kats
    1..1
    ok 1 - iteration 1
ok 1 - test_kats
    # Subtest: test_error_checks
    1..16

Program received signal SIGSEGV, Segmentation fault.
__memset_avx2_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:151
151	../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(gdb) bt
#0  __memset_avx2_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:151
#1  0x000055555564d98a in drbg_ctr_generate (drbg=0x55555594c0c0, out=0x7fffffffc270 "", outlen=65536, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32) at providers/implementations/rands/drbg_ctr.c:406
#2  0x00005555557afa80 in PROV_DRBG_generate (drbg=0x55555594c0c0, out=0x7fffffffc270 "", outlen=65536, strength=0, prediction_resistance=0, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32) at providers/implementations/rands/drbg.c:768
#3  0x000055555564db61 in drbg_ctr_generate_wrapper (vdrbg=0x55555594c0c0, out=0x7fffffffc270 "", outlen=65536, strength=0, prediction_resistance=0, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adin_len=32) at providers/implementations/rands/drbg_ctr.c:455
#4  0x00005555555d2a50 in EVP_RAND_generate (ctx=0x555555a127f0, out=0x7fffffffc270 "", outlen=65537, strength=0, prediction_resistance=0, addin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, addin_len=32) at crypto/evp/evp_rand.c:451
#5  0x00005555555eb3f6 in RAND_DRBG_generate (drbg=0x55555593f630, out=0x7fffffffc270 "", outlen=65537, prediction_resistance=0, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32) at crypto/rand/drbg_lib.c:522
#6  0x00005555555a96c4 in error_check (td=0x55555592c460 <drbg_test>) at test/drbgtest.c:495
#7  0x0000000000000000 in ?? ()
(gdb) 

@mspncp mspncp commented Jul 3, 2020

Ok, I found the reason for the memset crash: The following RAND_DRBG_generate() call actually gets executed, even though the requested buffer size is max_request(drbg) + 1 = 65537, but sizeof(buff) is only 1024. This explains why the test crashes instead of reporting a failure.

    /* Request too much data for one request */
    if (!TEST_false(RAND_DRBG_generate(drbg, buff, max_request(drbg) + 1, 0,
                                       td->adin, td->adinlen)))
        goto err;

(all tests were done on commit 7befa2a of my gh-11195-squashed-and-rebased branch)
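
The test above is only safe if the generate function rejects the oversized request before it writes anything to the output buffer. A minimal sketch of that contract follows, assuming a hypothetical `toy_generate()`; the limit 65536 is the max_request value quoted above and the buffer size 1024 mirrors `buff`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_REQUEST 65536u

/* Rejects oversized requests BEFORE touching the output buffer. */
int toy_generate(unsigned char *out, size_t outlen)
{
    if (outlen > TOY_MAX_REQUEST)
        return 0;
    memset(out, 0x5A, outlen);      /* placeholder for real random output */
    return 1;
}

void demo_oversized_request(void)
{
    unsigned char buff[1024];

    memset(buff, 0, sizeof(buff));

    /* Passing a too-small buffer is safe only because nothing is written. */
    assert(toy_generate(buff, TOY_MAX_REQUEST + 1) == 0);
    assert(buff[0] == 0 && buff[sizeof(buff) - 1] == 0);

    /* A request that fits both the limit and the buffer succeeds. */
    assert(toy_generate(buff, sizeof(buff)) == 1);
    assert(buff[0] == 0x5A);
}
```

If the size check is missing or bypassed, the same call writes far beyond the 1024-byte buffer, which is consistent with a crash inside memset rather than a reported test failure.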

@mspncp mspncp commented Jul 4, 2020

I wanted to see whether it was an off-by-one error, so I tried max_request(drbg) + 2, and finally 2*max_request(drbg), both crashed.

msp@msppc:~/src/openssl/test-runs/test_rand$ git diff
diff --git a/test/drbgtest.c b/test/drbgtest.c
index d65185e2cf..a68849baa8 100644
--- a/test/drbgtest.c
+++ b/test/drbgtest.c
@@ -492,7 +492,7 @@ static int error_check(DRBG_SELFTEST_DATA *td)
         goto err;
 
     /* Request too much data for one request */
-    if (!TEST_false(RAND_DRBG_generate(drbg, buff, max_request(drbg) + 1, 0,
+    if (!TEST_false(RAND_DRBG_generate(drbg, buff, 2*max_request(drbg), 0,
                                        td->adin, td->adinlen)))
         goto err;

It looks like the random generator is truncating outlen from 131072 to 65536 instead of failing. If that is true, that would be an unacceptable behaviour. But I still don't understand all details. To be continued...

(gdb) bt
#0  __memset_avx2_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:151
#1  0x000055555564d98a in drbg_ctr_generate (drbg=0x55555594c0c0, out=0x7fffffffc270 "", outlen=65536, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32) at providers/implementations/rands/drbg_ctr.c:406
#2  0x00005555557afa80 in PROV_DRBG_generate (drbg=0x55555594c0c0, out=0x7fffffffc270 "", outlen=65536, strength=0, prediction_resistance=0, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32) at providers/implementations/rands/drbg.c:768
#3  0x000055555564db61 in drbg_ctr_generate_wrapper (vdrbg=0x55555594c0c0, out=0x7fffffffc270 "", outlen=65536, strength=0, prediction_resistance=0, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adin_len=32) at providers/implementations/rands/drbg_ctr.c:455
#4  0x00005555555d2a50 in EVP_RAND_generate (ctx=0x555555a127f0, out=0x7fffffffc270 "", outlen=131072, strength=0, prediction_resistance=0, addin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, addin_len=32) at crypto/evp/evp_rand.c:451
#5  0x00005555555eb3f6 in RAND_DRBG_generate (drbg=0x55555593f630, out=0x7fffffffc270 "", outlen=131072, prediction_resistance=0, adin=0x55555582c720 <aes_128_no_df_additionalinput> "K\"F\030\002{\322\033\"B|7\331\366\350\233\022\060_\351\220\350\b$O\006f\333\031+\023\225.\226pd\372\337\337W\265\202\356\326\355>", <incomplete sequence \302>, adinlen=32) at crypto/rand/drbg_lib.c:522
#6  0x00005555555a96c4 in error_check (td=0x55555592c460 <drbg_test>) at test/drbgtest.c:495
#7  0x0000000000000000 in ?? ()

@mspncp mspncp commented Jul 4, 2020

On master (bb2d726) the test succeeds, so it must have something to do with some change on gh-11195-squashed-and-rebased.

@mspncp mspncp commented Jul 4, 2020

Ok, the explanation why the test doesn't crash on master is simple: due to the errors in the test logic, the test Request too much data for one request is never executed. So the error is probably already present in master, but it didn't show up until now.

@mspncp mspncp commented Jul 4, 2020

I found the root cause. It has nothing to do with your pull request. Need some time to think about the fix, though.

@paulidale paulidale commented Jul 5, 2020

The current master breaks down long requests into smaller pieces.
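
A sketch of the general technique, splitting one long request into pieces of at most max_request bytes, might look like this. It is not the code on master; `inner_generate()`, `chunked_generate()` and the call counter are hypothetical, and 65536 is the limit seen in the backtraces above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_REQUEST 65536u

static unsigned int inner_calls;    /* counts calls to the inner generator */

/* Inner generator: enforces the per-request limit. */
static int inner_generate(unsigned char *out, size_t outlen)
{
    if (outlen > TOY_MAX_REQUEST)
        return 0;
    memset(out, 0x5A, outlen);      /* placeholder for real random output */
    inner_calls++;
    return 1;
}

/* Wrapper: serves a long request as a sequence of smaller ones. */
int chunked_generate(unsigned char *out, size_t outlen)
{
    while (outlen > 0) {
        size_t chunk = outlen < TOY_MAX_REQUEST ? outlen : TOY_MAX_REQUEST;

        if (!inner_generate(out, chunk))
            return 0;
        out += chunk;
        outlen -= chunk;
    }
    return 1;
}

void demo_chunking(void)
{
    static unsigned char big[TOY_MAX_REQUEST + 1];

    inner_calls = 0;
    assert(chunked_generate(big, sizeof(big)) == 1);
    assert(inner_calls == 2);               /* 65536 bytes, then 1 byte */
    assert(big[TOY_MAX_REQUEST] == 0x5A);   /* last byte was written too */
}
```

With such a wrapper a request for max_request + 1 bytes no longer fails; it succeeds and writes the full length, so a test that passes a smaller buffer and expects a failure will overrun that buffer.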

@mspncp mspncp commented Jul 5, 2020

Yes thanks, I know. I looked in the wrong place for a while, because I thought that the test worked on master and failed in this branch only. But when I noticed that the test wasn't executed on master because of the broken test logic (which this pr fixes), I started looking in your code and it was easy to find.

@mspncp mspncp commented Jul 12, 2020

Congratulations @ciz, it looks like your fix of the test logic uncovered another bug which slipped into master because the test suite didn't detect it properly before. I took a short look at it yesterday night and I think I know what's going on there. It has nothing to do with your changes, they are fine. Your pull request only needs to hang around a little more, until I have the fix for the error. I'll take care of it, all you need to do is to wait (unless you want to hunt the bug for yourself).

@mspncp mspncp commented Jul 13, 2020

Next bugfix, still not the last one. I'm already hunting the next one.

@mspncp mspncp commented Jul 13, 2020

@paulidale with the next bug I need your assistance: The upcoming drbgtest failure (Travis is still running) is caused by the fact that the callbacks of the PROV_DRBG (get_entropy_fn, cleanup_entropy_fn, get_nonce_fn, cleanup_nonce_fn) get cleared when the RAND_DRBG is uninstantiated via RAND_DRBG_uninstantiate(). This deviates from the behaviour of the original RAND_DRBG and is not expected by the test suite. That's the reason why the test fails.

Actually, the callbacks get cleared by RAND_DRBG_set() (which is called from RAND_DRBG_uninstantiate()) because it replaces its entire EVP_RAND_CTX. This is the callstack immediately before it happens:

(gdb) bt
#0  RAND_DRBG_uninstantiate (drbg=0x55555593b600) at crypto/rand/drbg_lib.c:470
#1  0x00005555555a86b8 in uninstantiate (drbg=0x55555593b600) at test/drbgtest.c:282
#2  0x00005555555a930b in error_check (td=0x555555928440 <drbg_test>) at test/drbgtest.c:451
#3  0x00005555555a9ee8 in test_error_checks (i=0) at test/drbgtest.c:619
#4  0x00005555555ad2b8 in run_tests (test_prog_name=0x7fffffffdbd8 "/home/msp/src/openssl-gh-11195-drbgtest/test/drbgtest") at test/testutil/driver.c:354
#5  0x00005555555ad6cd in main (argc=1, argv=0x7fffffffd848) at test/testutil/main.c:30

The RAND_DRBG callbacks are kat_entropy and kat_nonce

(gdb) p *drbg
$4 = {
  lock = 0x0,
  libctx = 0x0,
  parent = 0x0,
  type = 904,
  flags = 1,
  ex_data = {
    ctx = 0x0,
    sk = 0x0
  },
  rand = 0x555555a0e7c0,
  get_entropy = 0x5555555a85d2 <kat_entropy>,
  cleanup_entropy = 0x0,
  get_nonce = 0x5555555a862a <kat_nonce>,
  cleanup_nonce = 0x0,
  callback_data = 0x7fffffffc250
}

and the associated PROV_DRBG callbacks are rand_drbg_{get,cleanup}_entropy_cb, rand_drbg_{get,cleanup}_nonce_cb

(gdb) p *(struct prov_drbg_st *)(drbg->rand->data)
$5 = {
  lock = 0x0,
  provctx = 0x555555949cc0,
  instantiate = 0x55555564bbbf <drbg_ctr_instantiate>,
  uninstantiate = 0x55555564c127 <drbg_ctr_uninstantiate>,
  reseed = 0x55555564bcfb <drbg_ctr_reseed>,
  generate = 0x55555564be19 <drbg_ctr_generate>,
  parent = 0x0,

  ...
  
  state = DRBG_ERROR,
  data = 0x5555559d8a60,
  callback_arg = 0x55555593b600,
  get_entropy_fn = 0x5555555e9a05 <rand_drbg_get_entroy_cb>,
  cleanup_entropy_fn = 0x5555555e9bee <rand_drbg_cleanup_entropy_cb>,
  get_nonce_fn = 0x5555555e9cd5 <rand_drbg_get_nonce_cb>,
  cleanup_nonce_fn = 0x5555555e9e71 <rand_drbg_cleanup_nonce_cb>
}

Here is the situation after RAND_DRBG_set(): the RAND_DRBG callbacks are still there

(gdb) p *drbg
$14 = {
  lock = 0x0,
  libctx = 0x0,
  parent = 0x0,
  type = 904,
  flags = 1,
  ex_data = {
    ctx = 0x0,
    sk = 0x0
  },
  rand = 0x555555a0e7c0,
  get_entropy = 0x5555555a85d2 <kat_entropy>,
  cleanup_entropy = 0x0,
  get_nonce = 0x5555555a862a <kat_nonce>,
  cleanup_nonce = 0x0,
  callback_data = 0x7fffffffc250
}

but the PROV_DRBG callbacks have been cleared:

(gdb) p *(struct prov_drbg_st *)(drbg->rand->data)
$13 = {
  lock = 0x0,
  provctx = 0x555555949cc0,
  instantiate = 0x55555564bbbf <drbg_ctr_instantiate>,
  uninstantiate = 0x55555564c127 <drbg_ctr_uninstantiate>,
  reseed = 0x55555564bcfb <drbg_ctr_reseed>,
  generate = 0x55555564be19 <drbg_ctr_generate>,
  parent = 0x0,

  ...
  
  state = DRBG_UNINITIALISED,
  data = 0x5555559d8a60,
  callback_arg = 0x0,
  get_entropy_fn = 0x0,
  cleanup_entropy_fn = 0x0,
  get_nonce_fn = 0x0,
  cleanup_nonce_fn = 0x0
}

Now my question: is there a way to uninstantiate the RAND_DRBG without having to call RAND_DRBG_set() (e.g., just call EVP_RAND_uninstantiate(drbg->rand) and reset a few more RAND_DRBG members if necessary)? If not, what is the best way to save and restore the PROV_DRBG callbacks?
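To make the save-and-restore option concrete, here is a minimal sketch of the pattern in isolation. It is not OpenSSL code: `TOY_OUTER`, `TOY_INNER`, `toy_set_naive()` and `toy_set_keep_callbacks()` are hypothetical stand-ins for the wrapper object, the inner context it recreates, and two variants of the "set" operation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback type; not an OpenSSL signature. */
typedef size_t (*toy_entropy_fn)(unsigned char *out, size_t len);

/* Hypothetical inner context that gets recreated on every "set". */
typedef struct {
    int state;                      /* 0 means uninitialised */
    toy_entropy_fn get_entropy_fn;  /* callback installed by the wrapper */
    void *callback_arg;
} TOY_INNER;

/* Hypothetical outer wrapper that owns the inner context. */
typedef struct {
    TOY_INNER *rand;
} TOY_OUTER;

/* Dummy callback used only to have a non-NULL pointer to track. */
size_t toy_entropy(unsigned char *out, size_t len)
{
    (void)out;
    return len;
}

/* Naive variant: replacing the inner context drops its callbacks. */
int toy_set_naive(TOY_OUTER *drbg)
{
    TOY_INNER *fresh = calloc(1, sizeof(*fresh));

    assert(drbg != NULL);
    if (fresh == NULL)
        return 0;
    free(drbg->rand);
    drbg->rand = fresh;
    return 1;
}

/* Variant that carries the callbacks over to the new inner context. */
int toy_set_keep_callbacks(TOY_OUTER *drbg)
{
    TOY_INNER *fresh = calloc(1, sizeof(*fresh));

    assert(drbg != NULL);
    if (fresh == NULL)
        return 0;
    if (drbg->rand != NULL) {
        /* Copy only the callback fields; state starts from scratch. */
        fresh->get_entropy_fn = drbg->rand->get_entropy_fn;
        fresh->callback_arg = drbg->rand->callback_arg;
    }
    free(drbg->rand);
    drbg->rand = fresh;
    return 1;
}
```

In this toy model the state field is reset by both variants, but only the second one leaves the callback pointers intact after the inner context has been replaced.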

@mspncp mspncp commented Jul 13, 2020

@paulidale as I already anticipated in #11195 (comment),

@ciz unfortunately I let your pull request stall for too long, now your changes collided with @paulidale's heavy EVP_RAND replumbing (#11682).

I am paying a bitter penance for letting @ciz's pull request stall too long. If it had gone into master before #11682, then you would have had to fix your bugs yourself ;-).

@mspncp mspncp force-pushed the ciz:drbgtest branch Jul 21, 2020
ciz and others added 7 commits Feb 27, 2020
The condition in test_error_checks() was inverted, so it succeeded
as long as error_check() failed. Incidentally, error_check() contained
several bugs that assured it always failed, thus giving overall drbg
test success.
The reseed counter condition was broken since a93ba40, where the
initial value was wrongly changed from one to zero.
Commit 8bf3665 fixed the initialization, but also adjusted the check,
so the problem remained.
This change restores original (OpenSSL-fips-2_0-stable) behavior.
The behaviour of RAND_DRBG_generate() has changed. Previously, it
would fail for requests larger than max_request, now it automatically
splits large input into chunks (which was previously done only
by RAND_DRBG_bytes() before calling RAND_DRBG_generate()).

So this test has not only become obsolete, the fact that it succeeded
unexpectedly also caused a buffer overflow that terminated the test.
Fixes a compiler warning which was treated as an error:

test/drbgtest.c:179:13: error: unused function 'max_request' [-Werror,-Wunused-function]
DRBG_SIZE_T(max_request)
It's the generate counter (drbg->reseed_gen_counter), not the
reseed counter which needs to be raised above the reseed_interval.
This mix-up was partially caused by some recent renamings of DRBG
member variables, but that will be dealt with in a separate commit.
The RAND_DRBG callbacks are wrappers around the EVP_RAND callbacks.
During uninstantiation, the EVP_RAND callbacks got lost while the
RAND_DRBG callbacks remained, because RAND_DRBG_uninstantiate()
calls RAND_DRBG_set(), which recreates the EVP_RAND object.
This was causing drbgtest failures.

This commit fixes the problem by adding code to RAND_DRBG_set() for
saving and restoring the EVP_RAND callbacks.
@mspncp mspncp force-pushed the ciz:drbgtest branch to a7c264d Jul 21, 2020

@mspncp mspncp commented Jul 21, 2020

The second force-push was a rebase on master without further changes in order to pick up @paulidale's change ce3080e.

mspncp approved these changes Jul 21, 2020

@mspncp mspncp left a comment

My approval holds for @ciz's commits, in addition to the implicit approval of my own commits, which I added to this pull request. :-)

@mspncp mspncp requested a review from paulidale Jul 21, 2020

@paulidale paulidale left a comment

Looks good.

@openssl-machine openssl-machine commented Jul 22, 2020

This pull request is ready to merge

openssl-machine pushed a commit that referenced this pull request Jul 22, 2020
The condition in test_error_checks() was inverted, so it succeeded
as long as error_check() failed. Incidentally, error_check() contained
several bugs that assured it always failed, thus giving overall drbg
test success.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
(Merged from #11195)
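The way two inverted conditions can mask each other is easier to see in a stripped-down form. The sketch below is only an illustration: `MINI_TEST_false` is a hypothetical miniature of a test macro that yields 1 when its expectation holds, and all function names are made up for the example.

```c
#include <assert.h>

/* Hypothetical miniature of a test macro: 1 if the expectation holds. */
#define MINI_TEST_false(e) ((e) == 0)

/* Stand-in for an operation that is expected to fail (returns 0). */
int generate_without_entropy(void)
{
    return 0;
}

/* Correct check: bail out only when the expectation is NOT met. */
int check_correct(void)
{
    if (!MINI_TEST_false(generate_without_entropy()))
        goto err;
    return 1;
 err:
    return 0;
}

/* Inverted check: bails out exactly when the expectation IS met. */
int check_inverted(void)
{
    if (MINI_TEST_false(generate_without_entropy()))
        goto err;
    return 1;
 err:
    return 0;
}

/* Inverted caller: reports success when the check it runs fails. */
int caller_inverted(int (*check)(void))
{
    return !check();
}

/* Small self-check showing how the two inversions cancel out. */
void demo(void)
{
    assert(check_correct() == 1);
    assert(check_inverted() == 0);
    /* A failing check plus an inverted caller looks like a pass. */
    assert(caller_inverted(check_inverted) == 1);
    /* Fixing only the check makes the inverted caller report failure. */
    assert(caller_inverted(check_correct) == 0);
}
```

The last assertion shows why fixing only one side surfaces the other problem: once the inner check is right, the inverted caller starts reporting a failure.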
openssl-machine pushed a commit that referenced this pull request Jul 22, 2020
The reseed counter condition was broken since a93ba40, where the
initial value was wrongly changed from one to zero.
Commit 8bf3665 fixed the initialization, but also adjusted the check,
so the problem remained.
This change restores original (OpenSSL-fips-2_0-stable) behavior.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
(Merged from #11195)
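As a rough illustration of why the initial value and the comparison have to match, here is a toy counter model. It assumes the convention named in the commit message (counter starts at one) together with a "reseed when the counter exceeds the interval" check, as in NIST SP 800-90A; the `TOY_DRBG` type and its functions are hypothetical and not taken from the OpenSSL sources.

```c
#include <assert.h>

/* Hypothetical toy state; field names are illustrative only. */
typedef struct {
    unsigned int reseed_counter;   /* 1 + generate calls since last (re)seed */
    unsigned int reseed_interval;  /* generate calls allowed per seed */
    unsigned int reseeds;          /* number of reseeds performed so far */
} TOY_DRBG;

void toy_instantiate(TOY_DRBG *d, unsigned int interval)
{
    d->reseed_counter = 1;         /* starts at one, not zero */
    d->reseed_interval = interval;
    d->reseeds = 0;
}

void toy_reseed(TOY_DRBG *d)
{
    d->reseed_counter = 1;
    d->reseeds++;
}

void toy_generate(TOY_DRBG *d)
{
    /* Reseed first if the allowed number of calls is used up. */
    if (d->reseed_counter > d->reseed_interval)
        toy_reseed(d);
    /* ... output generation would happen here ... */
    d->reseed_counter++;
}

/* With an interval of 3, the fourth generate call triggers a reseed. */
void demo(void)
{
    TOY_DRBG d;
    int i;

    toy_instantiate(&d, 3);
    for (i = 0; i < 3; i++)
        toy_generate(&d);
    assert(d.reseeds == 0);
    toy_generate(&d);
    assert(d.reseeds == 1);
}
```

In this model, starting the counter at zero while keeping the same comparison would allow one extra generate call per seed, so the initial value and the check cannot be changed independently.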
openssl-machine pushed a commit that referenced this pull request Jul 22, 2020
The behaviour of RAND_DRBG_generate() has changed. Previously, it
would fail for requests larger than max_request, now it automatically
splits large input into chunks (which was previously done only
by RAND_DRBG_bytes() before calling RAND_DRBG_generate()).

So this test has not only become obsolete, the fact that it succeeded
unexpectedly also caused a buffer overflow that terminated the test.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from #11195)
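The chunking behaviour described in this commit message can be sketched as follows. This is a simplified model under stated assumptions, not the library implementation: `TOY_MAX_REQUEST`, `toy_generate_once()` and `toy_generate()` are hypothetical names, and the "random" bytes are a fixed placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_REQUEST 16   /* hypothetical per-call limit */

/* Hypothetical low-level generator: refuses requests above the limit. */
int toy_generate_once(unsigned char *out, size_t outlen)
{
    if (outlen > TOY_MAX_REQUEST)
        return 0;
    memset(out, 0xA5, outlen);   /* placeholder bytes, not random */
    return 1;
}

/* Splits a large request into chunks of at most TOY_MAX_REQUEST bytes. */
int toy_generate(unsigned char *out, size_t outlen)
{
    while (outlen > 0) {
        size_t chunk = outlen < TOY_MAX_REQUEST ? outlen : TOY_MAX_REQUEST;

        if (!toy_generate_once(out, chunk))
            return 0;
        out += chunk;
        outlen -= chunk;
    }
    return 1;
}

/* An oversized request fails at the low level but succeeds when chunked. */
void demo(void)
{
    unsigned char buf[41];

    buf[40] = 0;                              /* sentinel past the request */
    assert(toy_generate_once(buf, 40) == 0);  /* 40 > TOY_MAX_REQUEST */
    assert(toy_generate(buf, 40) == 1);
    assert(buf[0] == 0xA5 && buf[39] == 0xA5);
    assert(buf[40] == 0);                     /* nothing written beyond 40 */
}
```

Once oversized requests succeed, the full requested length really is written, so a caller that passes a large length together with a small buffer because it expects an error will now overrun that buffer.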
openssl-machine pushed a commit that referenced this pull request Jul 22, 2020
It's the generate counter (drbg->reseed_gen_counter), not the
reseed counter which needs to be raised above the reseed_interval.
This mix-up was partially caused by some recent renamings of DRBG
member variables, but that will be dealt with in a separate commit.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from #11195)
openssl-machine pushed a commit that referenced this pull request Jul 22, 2020
The RAND_DRBG callbacks are wrappers around the EVP_RAND callbacks.
During uninstantiation, the EVP_RAND callbacks got lost while the
RAND_DRBG callbacks remained, because RAND_DRBG_uninstantiate()
calls RAND_DRBG_set(), which recreates the EVP_RAND object.
This was causing drbgtest failures.

This commit fixes the problem by adding code to RAND_DRBG_set() for
saving and restoring the EVP_RAND callbacks.

Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from #11195)

@mspncp mspncp commented Jul 22, 2020

Merged to the master branch. Thank you for your patience, @ciz!

a27cb95 Fix: uninstantiation breaks the RAND_DRBG callback mechanism
d1768e8 test/drbgtest.c: set the correct counter to trigger reseeding
8e3e1df test/drbgtest.c: Remove error check for large generate requests
9fb6692 Fix DRBG reseed counter condition.
11a6d6f test/drbgtest.c: Fix error check test

@mspncp mspncp closed this Jul 22, 2020
openssl-machine pushed a commit that referenced this pull request Jul 22, 2020
The condition in test_error_checks() was inverted, so the test succeeded
as long as error_check() failed. Incidentally, error_check() contained
several bugs that assured it always failed, thus giving overall drbg
test success.

Remove the broken explicit zero check.
RAND_DRBG_uninstantiate() cleanses the data via drbg_ctr_uninstantiate(),
but right after that it resets drbg->data.ctr using RAND_DRBG_set(),
so TEST_mem_eq(zero, sizeof(drbg->data)) always failed.

(backport from #11195)

Signed-off-by: Vitezslav Cizek <vcizek@suse.com>

Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
(Merged from #12517)
@mspncp mspncp mentioned this pull request Jul 22, 2020