Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.0 FIPS provider is not compatible with tests in 3.0.5 #19171

Closed
slontis opened this issue Sep 8, 2022 · 28 comments
Closed

3.0 FIPS provider is not compatible with tests in 3.0.5 #19171

slontis opened this issue Sep 8, 2022 · 28 comments
Labels
hold: need omc decision The OMC needs to make a decision triaged: bug The issue/pr is/fixes a bug

Comments

@slontis
Copy link
Member

slontis commented Sep 8, 2022

I am not sure why we ended up testing the stable 3.0.X branch against master and NOT
3.0.0 source tarball against latest stable 3.0.X branch..

The consequence of this is that the end user can not run tests without failures if they run
3.0.0 source tarball against latest stable 3.0.X branch..
I have verified that there is at least one breaking change (see comment below related to key mismatch) when using the OpenSSL 3.0 fips provider with OpenSSL 3.0.5.

The first failure I looked at happens because the tests were updated to match changes inside the provider..
e.g:

Inside provider/exchange/kdf_exch.c the new code is

static int kdf_derive(void *vpkdfctx, unsigned char *secret, size_t *secretlen, size_t outlen)
{
    PROV_KDF_CTX *pkdfctx = (PROV_KDF_CTX *)vpkdfctx;
    size_t kdfsize;
    int ret;

    if (!ossl_prov_is_running())
        return 0;

    kdfsize = EVP_KDF_CTX_get_kdf_size(pkdfctx->kdfctx);

    if (secret == NULL) {
        *secretlen = kdfsize;
        return 1;
    }

    if (kdfsize != SIZE_MAX) {
        if (outlen < kdfsize) {
            ERR_raise(ERR_LIB_PROV, PROV_R_OUTPUT_BUFFER_TOO_SMALL);
            return 0;
        }
        outlen = kdfsize;
    }

    ret = EVP_KDF_derive(pkdfctx->kdfctx, secret, outlen, NULL);
    if (ret <= 0)
        return 0;

    *secretlen = outlen;
    return 1;
}

Inside the 3.0 fips provider the code is much simpler and does not do the

if (kdfsize != SIZE_MAX) check..

As evp_test was then updated to deliberately double the len this will fail with the 3.0 fips provider...

How do we fix this?
What is the process that users should follow to deploy and test the OpenSSL 3.0 FIPS provider?


Test failures

04-test_encoder_decoder.t (Wstat: 256 Tests: 3 Failed: 1)
Failed test: 3
Non-zero exit status: 1
25-test_verify.t (Wstat: 512 Tests: 163 Failed: 2)
Failed tests: 115-116
Non-zero exit status: 2
30-test_evp.t (Wstat: 3072 Tests: 103 Failed: 12)
Failed tests: 38, 48, 51-52, 54, 57-63
Non-zero exit status: 12
65-test_cmp_client.t (Wstat: 256 Tests: 3 Failed: 1)
Failed test: 3
Non-zero exit status: 1
65-test_cmp_protect.t (Wstat: 256 Tests: 3 Failed: 1)
Failed test: 3
Non-zero exit status: 1
80-test_cms.t (Wstat: 1792 Tests: 12 Failed: 7)
Failed tests: 2-6, 9, 11
Non-zero exit status: 7
80-test_ssl_new.t (Wstat: 7168 Tests: 30 Failed: 28)
Failed tests: 1-21, 23-28, 30
Non-zero exit status: 28
80-test_ssl_old.t (Wstat: 512 Tests: 11 Failed: 2)
Failed tests: 7-8
Non-zero exit status: 2
90-test_sslapi.t (Wstat: 256 Tests: 2 Failed: 1)
Failed test: 2
Non-zero exit status: 1
Files=243, Tests=3437, 474 wallclock secs ( 6.86 usr 0.60 sys + 375.70 cusr 41.42 csys = 424.58 CPU)

NOTE there are quite a few.. keypair mismatch errors such as..
X509_check_private_key:key values mismatch:crypto/x509/x509_cmp.c:405 happens quite

@slontis slontis added issue: bug report The issue was opened to report a bug hold: need otc decision The OTC needs to make a decision labels Sep 8, 2022
@paulidale
Copy link
Contributor

We should be testing the 3.0.0 FIPS provider against 3.0.x and master. We wouldn't necessarily have to build the 3.0.0 FIPS provider every time, it's immutable after all.

It would be nice to keep our current test of the 3.0.x and master FIPS providers against each other (but not critical IMO).

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

The key mismatch issue is a BREAKING CHANGE..

EVP_PKEY_eq() was changed from just checking the public key to potentially checking the private key...
This changed the selection to KEYPAIR inside EVP_PKEY_eq() (it used to select just the publickey - as that was how the function was documented)
and then also changed the keymanagers match function to handle this...

e..g.
rsa_match does this in the old fips provider...

  ok = e value matches.
  if (selection & publickey)
    ok = ok && n value matches
 if (selection & privatekey)
   ok = ok  && d value matches

This will break on the d value check..

The newer 3.0.5 provider code for rsa_match() (as well as other match functions) conditionally checks the private key only if there is no public key.. This was to cater for keys that only had a private component.

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

@paulidale are you less confused now - I updated the comment

@paulidale
Copy link
Contributor

I was more concerned about us missing this change being breaking 😭

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

Not sure that changing the selection was a good idea. It was documented to do something, and it was changed..

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

This kind of thing will happen if there is no test, no matter what you do..

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

With the EVP_PKEY_eq() changed back to just select public I get a smaller set of failures..

04-test_encoder_decoder.t        (Wstat: 256 Tests: 3 Failed: 1)
  Failed test:  3
  Non-zero exit status: 1
25-test_verify.t                 (Wstat: 512 Tests: 163 Failed: 2)
  Failed tests:  115-116
  Non-zero exit status: 2
30-test_evp.t                    (Wstat: 1280 Tests: 103 Failed: 5)
  Failed tests:  38, 48, 51, 54, 57
  Non-zero exit status: 5
30-test_evp_extra.t              (Wstat: 512 Tests: 3 Failed: 2)
  Failed tests:  1-2

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

04 and 25 seem to be related to ECExplicit curves
evp_extra seems to be related to the EVP_PKEY_eq() check that I guess was checking private only keys..

The 30_test_evp ones I will check tomorrow but it looks bad if they are failing KAT's..

@t-j-h
Copy link
Member

t-j-h commented Sep 8, 2022

Our CI is meant to have a test in it explicitly for checking that we don't make any breaking changes in the provider ABI.
This was one of the requirements for 3.0 and how we handle things.
This change is going to need to be backed out - or implemented in a different form - so that we can always continue to use providers built with earlier releases no matter what. That is one of the fundamental promises of the approach we have taken.

And that applies for all of 3.x ... earlier release binaries must be useable with the newer releases.

@t-j-h t-j-h added the hold: need omc decision The OMC needs to make a decision label Sep 8, 2022
@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

Yeah I was suprised when I learned what the CI loop was actually doing.. Basically we have a fips provider that cant currently be used with the latest security fixes.

@mattcaswell
Copy link
Member

I am not sure why we ended up testing the stable 3.0.X branch against master and NOT
3.0.0 source tarball against latest stable 3.0.X branch..

Yeah, that doesn't seem right. We should probably do both of those things.

@t8m
Copy link
Member

t8m commented Sep 8, 2022

One thing are true API breaking changes. We should not allow them. On the other hand, we really CANNOT use the unmodified 3.1 or even 3.0.x test suite to test against the 3.0.0 FIPS provider because it for sure will contain bugs, that were fixed in later versions. These WILL make the test suite fail unless each and every of such newly added testcases are excluded from the test if 3.0.0 FIPS provider is used.

@levitte
Copy link
Member

levitte commented Sep 8, 2022

I think I agree with @t8m. We should be able to argue that "doesn't behave according to spec" is subject for correction, and that the correction can't be deemed an API breaking change. It's a bit unlucky that the behavior fault was on both sides at the same time, sure, but that shouldn't matter.
"Bug compatibility" can only be taken so far before become a complete blockage of going forward and fixing things.

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

This also implies that no-one is using the fips provider correctly currently.
This probably means that the instructions for how to build are not up to scratch.
Users must be either:

  1. Just not be testing, when they use this scenario (very bad that the tests dont work)
  2. Just using the tarball from 3.0.0 (this is ok except for all the bug fixes and speedups in the core/default provider)
  3. Just using the tarball from latest 3.0.5 (not fips)

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

Once these issues are addressed the instructions required to build need to be very clear...

@t-j-h
Copy link
Member

t-j-h commented Sep 8, 2022

One thing are true API breaking changes. We should not allow them. On the other hand, we really CANNOT use the unmodified 3.1 or even 3.0.x test suite to test against the 3.0.0 FIPS provider because it for sure will contain bugs, that were fixed in later versions. These WILL make the test suite fail unless each and every of such newly added testcases are excluded from the test if 3.0.0 FIPS provider is used.

We must test that it works and is not broken for that usage. How we do that isn't relevant - that we add tests that don't let us use the test suite as-is happens to be problem that needs a solution.

We must maintain ABI (not just API) compatibility.

Now don't get me wrong on this but people simply do not use the test suite for operational checks. They will build a release and may decide to run the test suite. But they do not do that (in general) in the production environment where they will install a fips provider from one release and use it with a later OpenSSL release build. Also always without running the test suite on that combination - the production environment is not generally a development environment and testing would be done at build not at production time.

If using the test suite from later versions for this sort of testing is the strategy then the test suite needs to be able to cope with precisely this context when being used in that manner. i.e. we have to separate out those tests that cannot run on the older provider builds. This isn't just a fips thing - this is meant to work for all providers for all earlier releases - whether they are fips or non fips. We have an ABI guarantee here for a reason - so that providers from older releases can be used always on newer releases.

Any way you look at it - this is a major mistake in the testing which must be fixed.

@slontis
Copy link
Member Author

slontis commented Sep 8, 2022

evp_test 38

evp_test -config ../../test/fips-and-base.cnf evpciph_des3_common.txt'

The des failures are related to the fix for the incorrect ivlen in
d450eb8
These 2 tests would need to be moved out of evpciph_des3_common.txt since evp_test.c
cipher_test_run() does this check

if (!cdat->iv && EVP_CIPHER_get_iv_length(cdat->cipher)) {
    /* IV is optional and usually omitted in wrap mode */
    if (EVP_CIPHER_get_mode(cdat->cipher) != EVP_CIPH_WRAP_MODE) {
        t->err = "NO_IV";
        return 0;
    }
}

Which will fail with the old provider..

@slontis
Copy link
Member Author

slontis commented Sep 9, 2022

evp_test 48

The first failure if here (There could be more!!)

./test/evp_test -config ./test/fips-and-base.cnf ./test/recipes/30-test_evp_data/evpmac_common.txt

Code was added for reiniting macs inside the providers
c9ddc5a

Evp_test was updated with this PR to test reinit
e58ba18

Reinit obviously doesnt work with the 3.0 FIPS provider... If the fips provider ever gets updated this patch would be a good addition.

@slontis
Copy link
Member Author

slontis commented Sep 9, 2022

evp_test 54

./test/evp_test -config ./test/fips-and-base.cnf ./test/recipes/30-test_evp_data/evppkey_ffdhe.txt

Code was added here for always adding padding when using PROV_DH_KDF_X9_42_ASN1
01b1877

The offending test is:

Derive=ffdhe2048-2
PeerKey=ffdhe2048-1-pub
KDFType=X942KDF-ASN1
KDFOutlen=32
KDFDigest=SHA-256
CEKAlg=id-aes128-wrap
Ctrl = dh_pad:0
SharedSecret=89A249DF4EE9033B89C2B4E52072A736D94F51143A1ED5C8F1E91FCBEBE09654

i.e the old 3.0 provider used the padding set in the Ctrl. The new provider ignores this value and always does padding.

@slontis
Copy link
Member Author

slontis commented Sep 9, 2022

evp_test 57

test/evp_test -config ./test/fips-and-base.cnf ./test/recipes/30-test_evp_data/evpmac_cmac_des.txt

This has the same issue with the mac reinit as 48

@slontis
Copy link
Member Author

slontis commented Sep 9, 2022

evp_test 51

./test/evp_test -config ./test/fips-and-base.cnf ./test/recipes/30-test_evp_data/evppkey_kdf_hkdf.txt

Is the kdf_derive() issue described in the intro comments.

@slontis
Copy link
Member Author

slontis commented Sep 9, 2022

The Explicit curve errors are related to this:
#17998

For both 04-test_encoder_decoder.t and 25-test_verify.t
this is related to expecting a failure in fips mode when decoding explicit EC..
e.g:
One test failure in 25-test_verify.t

OPENSSL_CONF=test/fips-and-base.cnf ./apps/openssl verify -auth_level 1 -provider-path providers -trusted ./test/certs/root-cert.pem -untrusted ./test/certs/ca-cert-ec-named.pem ./test/certs/ee-cert-ec-explicit.pem
./test/certs/ee-cert-ec-explicit.pem

The expected output is (and the output from the default provider is)

CN=server.example
error 94 at 0 depth lookup: Certificate public key has explicit ECC parameters
error ./test/certs/ee-cert-ec-explicit.pem: verification failed

We get a unexpected pass with the old fips provider

./test/certs/ee-cert-ec-explicit.pem: OK

@slontis
Copy link
Member Author

slontis commented Sep 9, 2022

I noticed there is some dup code added into providers for either kdf or mac.. If we test this anywhere it would also fail.

@slontis
Copy link
Member Author

slontis commented Sep 9, 2022

I think that covers all the errors..

@t8m t8m added triaged: bug The issue/pr is/fixes a bug and removed issue: bug report The issue was opened to report a bug labels Sep 9, 2022
@slontis
Copy link
Member Author

slontis commented Sep 11, 2022

Trying to be tricky in EVP_PKEY_eq() by doing something like

if (EVP_PKEY_get_raw_public_key(a, NULL, &len) && len > 0) 
    selection |= OSSL_KEYMGMT_SELECT_PUBLIC_KEY; 
else                                                            
   selection |= OSSL_KEYMGMT_SELECT_KEYPAIR; 

Doesnt quite work since RSA keys dont work with EVP_PKEY_get_raw_public_key().. Arrgh!!

@t8m
Copy link
Member

t8m commented Sep 12, 2022

We could use keymgmt has() function to check whether the public key and/or private key is present.

@paulidale
Copy link
Contributor

I don't see a need for either hold on this issue. The tests need to be update and additional CIs run to catch similar sooner.

The only item here that constitutes a regression IMO is the EVP_PKEY_eq() change, perhaps this needs an OTC decision. All the rest are the result of a bug fix including a test case to trigger the bug.

@slontis
Copy link
Member Author

slontis commented Sep 12, 2022

I think the main driver for this is that a release needs to be done in order to fix this..

paulidale added a commit to paulidale/openssl that referenced this issue Sep 13, 2022
@t8m t8m removed the hold: need otc decision The OTC needs to make a decision label Sep 13, 2022
paulidale added a commit to paulidale/openssl that referenced this issue Sep 13, 2022
paulidale added a commit to paulidale/openssl that referenced this issue Sep 13, 2022
paulidale added a commit to paulidale/openssl that referenced this issue Sep 14, 2022
openssl-machine pushed a commit that referenced this issue Sep 15, 2022
Fixes #19171

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
(Merged from #19201)

(cherry picked from commit eaac058)
sftcd pushed a commit to sftcd/openssl that referenced this issue Sep 24, 2022
Fixes openssl#19171

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
(Merged from openssl#19201)
beldmit pushed a commit to beldmit/openssl that referenced this issue Dec 26, 2022
Fixes openssl#19171

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
(Merged from openssl#19201)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hold: need omc decision The OMC needs to make a decision triaged: bug The issue/pr is/fixes a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants