make check hangs in Running case 'TLS 1.3/key exchange groups' in 9.4.3 #752

Closed
pemensik opened this issue Nov 11, 2021 · 23 comments

@pemensik
Contributor

System (please complete the following information):

  • OS: Fedora 34
  • Kernel version (if applicable): 5.13.12-200.fc34.x86_64
  • strongSwan version(s): 5.9.4
  • Tested/confirmed with the latest version: yes

Describe the bug
make check hangs and does not continue even after timeout

To Reproduce
Steps to reproduce the behavior:

  1. make check

Expected behavior
It should fail or pass, not hang. When this test case is commented out, all others are successful.

Logs/Backtraces
Backtrace from gdb:

    Running case 'TLS 1.3/key exchange groups': [New Thread 0x7ffff52d1640 (LWP 2005409)]
[New Thread 0x7ffff5ad2640 (LWP 2005410)]
[New Thread 0x7ffff62d3640 (LWP 2005411)]
...
(gdb) thread apply all bt

Thread 131 (Thread 0x7ffff5ad2640 (LWP 2005442) "tls_tests"):
#0  0x00007ffff7ee9a7f in __libc_accept (fd=fd@entry=6, addr=..., addr@entry=..., len=len@entry=0x0) at ../sysdeps/unix/sysv/linux/accept.c:26
#1  0x00005555555591ed in serve_echo (config=0x55555557ac50) at suites/test_socket.c:408
#2  0x00007ffff7f5e247 in execute () from /usr/lib64/strongswan/libstrongswan.so.0
#3  0x00007ffff7f65721 in process_jobs () from /usr/lib64/strongswan/libstrongswan.so.0
#4  0x00007ffff7f720be in thread_main () from /usr/lib64/strongswan/libstrongswan.so.0
#5  0x00007ffff7ee0299 in start_thread (arg=0x7ffff5ad2640) at pthread_create.c:481
#6  0x00007ffff7e08353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7ffff7899880 (LWP 2005309) "tls_tests"):
#0  futex_wait (private=0, expected=2, futex_word=0x5555555fbe78) at ../sysdeps/nptl/futex-internal.h:146
#1  __lll_lock_wait (futex=futex@entry=0x5555555fbe78, private=0) at lowlevellock.c:52
#2  0x00007ffff7ee2553 in __GI___pthread_mutex_lock (mutex=0x5555555fbe78) at ../nptl/pthread_mutex_lock.c:80
#3  0x00007ffff7f739de in lock.lto_priv () from /usr/lib64/strongswan/libstrongswan.so.0
#4  0x00007ffff7f5e550 in cancel.lto_priv () from /usr/lib64/strongswan/libstrongswan.so.0
#5  0x00007ffff7f5e6bb in destroy.lto_priv () from /usr/lib64/strongswan/libstrongswan.so.0
#6  0x00007ffff7f48d9b in library_deinit () from /usr/lib64/strongswan/libstrongswan.so.0
#7  0x000055555555ccae in post_test (check_leaks=check_leaks@entry=false, failures=failures@entry=0x5555556046e0, name=<optimized out>, i=i@entry=4, leaks=leaks@entry=0x7fffffffd2b4, init=0x55555555c7b0 <test_runner_init>) at ../../libstrongswan/tests/test_runner.c:426
#8  0x0000555555556d90 in run_case (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_SILENT, cfg=0x0, tcase=0x5555555f8040) at ../../libstrongswan/tests/test_runner.c:613
#9  run_suite (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_SILENT, cfg=0x0, suite=0x5555555f7e00) at ../../libstrongswan/tests/test_runner.c:698
#10 test_runner_run (name=0x55555556a968 "libtls", configs=0x555555570400 <tests>, init=0x55555555c7b0 <test_runner_init>, init=0x55555555c7b0 <test_runner_init>, configs=0x555555570400 <tests>, name=0x55555556a968 "libtls") at ../../libstrongswan/tests/test_runner.c:758
#11 main (argc=<optimized out>, argv=<optimized out>) at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tests/tls_tests.c:61

(gdb) thread apply all bt full

Thread 131 (Thread 0x7ffff5ad2640 (LWP 2005442) "tls_tests"):
#0  0x00007ffff7ee9a7f in __libc_accept (fd=fd@entry=6, addr=..., addr@entry=..., len=len@entry=0x0) at ../sysdeps/unix/sysv/linux/accept.c:26
        sc_ret = -512
        sc_cancel_oldtype = 1
        sc_ret = <optimized out>
#1  0x00005555555591ed in serve_echo (config=0x55555557ac50) at suites/test_socket.c:408
        tls = <optimized out>
        sfd = 6
        cfd = <optimized out>
        server = 0x7fffec000ba0
        client = 0x0
        len = <optimized out>
        total = <optimized out>
        done = <optimized out>
        buf = '\000' <repeats 16 times>, " \000\000\000\000\000\000\000\030\000\000\000\000\000\000\000\002\000\000\000:", '\000' <repeats 11 times>, "\001", '\000' <repeats 15 times>, "\002\000\000\000\060", '\000' <repeats 27 times>, "|\000\000\000w\000\000\000n\000\000\000[", '\000' <repeats 18 times>
#2  0x00007ffff7f5e247 in execute () from /usr/lib64/strongswan/libstrongswan.so.0
No symbol table info available.
#3  0x00007ffff7f65721 in process_jobs () from /usr/lib64/strongswan/libstrongswan.so.0
No symbol table info available.
#4  0x00007ffff7f720be in thread_main () from /usr/lib64/strongswan/libstrongswan.so.0
No symbol table info available.
#5  0x00007ffff7ee0299 in start_thread (arg=0x7ffff5ad2640) at pthread_create.c:481
        ret = <optimized out>
        pd = 0x7ffff5ad2640
        unwind_buf = {
          cancel_jmp_buf = {{
              jmp_buf = {140737315153472, -6506757615473399773, 140737488342894, 140737488342895, 0, 140737315153472, 6506770409015470115, 6506775330474420259},
              mask_was_saved = 0
            }},
          priv = {
            pad = {0x0, 0x0, 0x0, 0x0},
            data = {
              prev = 0x0,
              cleanup = 0x0,
              canceltype = 0
            }
          }
        }
        not_first_call = 0
#6  0x00007ffff7e08353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.

Thread 1 (Thread 0x7ffff7899880 (LWP 2005309) "tls_tests"):
#0  futex_wait (private=0, expected=2, futex_word=0x5555555fbe78) at ../sysdeps/nptl/futex-internal.h:146
        __ret = -512
        err = <optimized out>
        err = <optimized out>
        __ret = <optimized out>
        resultvar = <optimized out>
        __arg4 = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a4 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
#1  __lll_lock_wait (futex=futex@entry=0x5555555fbe78, private=0) at lowlevellock.c:52
No locals.
#2  0x00007ffff7ee2553 in __GI___pthread_mutex_lock (mutex=0x5555555fbe78) at ../nptl/pthread_mutex_lock.c:80
        __futex = 0x5555555fbe78
        type = <optimized out>
        __PRETTY_FUNCTION__ = "__pthread_mutex_lock"
        id = <optimized out>
#3  0x00007ffff7f739de in lock.lto_priv () from /usr/lib64/strongswan/libstrongswan.so.0
No symbol table info available.
#4  0x00007ffff7f5e550 in cancel.lto_priv () from /usr/lib64/strongswan/libstrongswan.so.0
No symbol table info available.
#5  0x00007ffff7f5e6bb in destroy.lto_priv () from /usr/lib64/strongswan/libstrongswan.so.0
No symbol table info available.
#6  0x00007ffff7f48d9b in library_deinit () from /usr/lib64/strongswan/libstrongswan.so.0
No symbol table info available.
#7  0x000055555555ccae in post_test (check_leaks=check_leaks@entry=false, failures=failures@entry=0x5555556046e0, name=<optimized out>, i=i@entry=4, leaks=leaks@entry=0x7fffffffd2b4, init=0x55555555c7b0 <test_runner_init>) at ../../libstrongswan/tests/test_runner.c:426
        data = {
          failures = 0x5555556046e0,
          name = 0x55555556554b "test_tls13_ke_groups",
          i = 4,
          leaks = 0
        }
#8  0x0000555555556d90 in run_case (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_SILENT, cfg=0x0, tcase=0x5555555f8040) at ../../libstrongswan/tests/test_runner.c:613
        start = {
          tv_sec = 0,
          tv_nsec = 366997826
        }
        ok = false
        leaks = 0
        i = 4
        rounds = 4
        enumerator = <optimized out>
        total_time = 0
        tests = 7
        passed = 0
        tfun = 0x5555555f8310
        times = 0x55555559aad0
        ti = 4
        failures = 0x5555556046e0
        warnings = 0x5555556048c0
        enumerator = <optimized out>
        tfun = <optimized out>
        times = <optimized out>
        total_time = <optimized out>
        tests = <optimized out>
        ti = <optimized out>
        passed = <optimized out>
        failures = <optimized out>
        warnings = <optimized out>
        i = <optimized out>
        rounds = <optimized out>
        start = <optimized out>
        ok = <optimized out>
        leaks = <optimized out>
#9  run_suite (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_SILENT, cfg=0x0, suite=0x5555555f7e00) at ../../libstrongswan/tests/test_runner.c:698
        enumerator = <optimized out>
        tcase = 0x5555555f8040
        passed = 3
        enumerator = <optimized out>
        tcase = <optimized out>
        passed = <optimized out>
#10 test_runner_run (name=0x55555556a968 "libtls", configs=0x555555570400 <tests>, init=0x55555555c7b0 <test_runner_init>, init=0x55555555c7b0 <test_runner_init>, configs=0x555555570400 <tests>, name=0x55555556a968 "libtls") at ../../libstrongswan/tests/test_runner.c:758
        enumerator = <optimized out>
        passed = 1
        runners = <optimized out>
        verbosity = <optimized out>
        suites = 0x5555555f7c60
        suite = 0x5555555f7e00
        result = <optimized out>
        level = LEVEL_SILENT
        cfg = 0x0
        suites = <optimized out>
        suite = <optimized out>
        enumerator = <optimized out>
        passed = <optimized out>
        result = <optimized out>
        level = <optimized out>
        cfg = <optimized out>
        runners = <optimized out>
        verbosity = <optimized out>
#11 main (argc=<optimized out>, argv=<optimized out>) at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tests/tls_tests.c:61
No locals.

Additional context
I tried to enable make check to validate the build in Fedora. However, one case hangs indefinitely on my 4-core machine. Could something be wrong in Fedora's libraries?

Is there a fix available to make it pass?

@tobiasbrunner
Member

You might be missing some crypto plugins. It's possible that you can see what problem the TLS stack has if you run this with TESTS_VERBOSITY=2 TESTS_CASES="TLS 1.3/key exchange groups"

@pemensik
Contributor Author

Indeed. It seems it should have failed instead:

Running function 'test_tls13_ke_groups' [4]:
builder L0 CRED_PRIVATE_KEY - RSA of plugin 'pkcs1'
builder L0 CRED_PRIVATE_KEY - RSA of plugin 'pkcs8'
builder L0 CRED_PRIVATE_KEY - RSA of plugin 'pgp'
builder L0 CRED_PRIVATE_KEY - RSA of plugin 'openssl'
builder L0 CRED_PRIVATE_KEY - RSA of plugin 'gcrypt'
builder L0 CRED_PRIVATE_KEY - RSA of plugin 'gmp'
builder L0 CRED_PRIVATE_KEY - RSA of plugin 'pem'
builder L1 CRED_PRIVATE_KEY - RSA of plugin 'pkcs1'
L0 - RSAPrivateKey:
L1 - version:
L1 - modulus:
L1 - publicExponent:
L1 - privateExponent:
L1 - prime1:
L1 - prime2:
L1 - exponent1:
L1 - exponent2:
L1 - coefficient:
builder L2 CRED_PRIVATE_KEY - RSA of plugin 'pkcs1'
builder L2 CRED_PRIVATE_KEY - RSA of plugin 'pkcs8'
builder L2 CRED_PRIVATE_KEY - RSA of plugin 'pgp'
builder L2 CRED_PRIVATE_KEY - RSA of plugin 'openssl'
builder L0 CRED_PRIVATE_KEY - ANY of plugin 'pkcs11'
builder L0 CRED_PRIVATE_KEY - ANY of plugin 'pkcs1'
builder L0 CRED_PRIVATE_KEY - ANY of plugin 'pkcs8'
builder L0 CRED_PRIVATE_KEY - ANY of plugin 'pgp'
builder L0 CRED_PRIVATE_KEY - ANY of plugin 'openssl'
builder L0 CRED_PRIVATE_KEY - ANY of plugin 'openssl'
builder L0 CRED_PRIVATE_KEY - ANY of plugin 'pem'
builder L1 CRED_PRIVATE_KEY - ANY of plugin 'pkcs11'
builder L1 CRED_PRIVATE_KEY - ANY of plugin 'pkcs1'
builder L2 CRED_PRIVATE_KEY - ECDSA of plugin 'pkcs8'
L0 - encryptedPrivateKeyInfo:
L1 - encryptionAlgorithm:
L0 - privateKeyInfo:
L1 - version:
L1 - privateKeyAlgorithm:
builder L2 CRED_PRIVATE_KEY - ECDSA of plugin 'openssl'
builder L0 CRED_CERTIFICATE - X509 of plugin 'x509'
builder L0 CRED_CERTIFICATE - X509 of plugin 'openssl'
builder L0 CRED_CERTIFICATE - X509 of plugin 'pem'
builder L1 CRED_CERTIFICATE - X509 of plugin 'x509'
L0 - x509:
L1 - tbsCertificate:
L2 - DEFAULT v1:
L3 - version:
  X.509v3
L2 - serialNumber:
L2 - signature:
L3 - algorithmIdentifier:
L4 - algorithm:
  'sha256WithRSAEncryption'
L4 - parameters:
L2 - issuer:
  'C=CH, O=strongSwan, CN=tls-rsa'
L2 - validity:
L3 - notBefore:
L4 - utcTime:
  'Mar 25 14:29:27 UTC 2020'
L3 - notAfter:
L4 - utcTime:
  'Mar 25 14:29:27 UTC 2023'
L2 - subject:
  'C=CH, O=strongSwan, CN=tls-rsa'
L2 - subjectPublicKeyInfo:
-- > --
builder L2 CRED_PUBLIC_KEY - ANY of plugin 'pkcs1'
L0 - subjectPublicKeyInfo:
L1 - algorithm:
L2 - algorithmIdentifier:
L3 - algorithm:
  'rsaEncryption'
L3 - parameters:
L1 - subjectPublicKey:
-- > --
builder L3 CRED_PUBLIC_KEY - RSA of plugin 'pkcs1'
L0 - RSAPublicKey:
L1 - modulus:
L1 - publicExponent:
builder L4 CRED_PUBLIC_KEY - RSA of plugin 'pkcs1'
builder L4 CRED_PUBLIC_KEY - RSA of plugin 'pgp'
builder L4 CRED_PUBLIC_KEY - RSA of plugin 'dnskey'
builder L4 CRED_PUBLIC_KEY - RSA of plugin 'openssl'
-- < --
-- < --
L2 - optional extensions:
L3 - extensions:
L4 - extension:
L5 - extnID:
  'subjectAltName'
L5 - critical:
  FALSE
L5 - extnValue:
L6 - generalNames:
L7 - generalName:
L8 - ipAddress:
  '127.0.0.1'
L1 - signatureAlgorithm:
L2 - algorithmIdentifier:
L3 - algorithm:
  'sha256WithRSAEncryption'
L3 - parameters:
L1 - signatureValue:
builder L0 CRED_CERTIFICATE - X509 of plugin 'x509'
builder L0 CRED_CERTIFICATE - X509 of plugin 'openssl'
builder L0 CRED_CERTIFICATE - X509 of plugin 'pem'
builder L1 CRED_CERTIFICATE - X509 of plugin 'x509'
L0 - x509:
L1 - tbsCertificate:
L2 - DEFAULT v1:
L3 - version:
  X.509v3
L2 - serialNumber:
L2 - signature:
L3 - algorithmIdentifier:
L4 - algorithm:
  'ecdsa-with-SHA384'
L2 - issuer:
  'C=CH, O=strongSwan, CN=tls-ecdsa'
L2 - validity:
L3 - notBefore:
L4 - utcTime:
  'Mar 25 14:30:24 UTC 2020'
L3 - notAfter:
L4 - utcTime:
  'Mar 25 14:30:24 UTC 2023'
L2 - subject:
  'C=CH, O=strongSwan, CN=tls-ecdsa'
L2 - subjectPublicKeyInfo:
-- > --
builder L2 CRED_PUBLIC_KEY - ANY of plugin 'pkcs1'
L0 - subjectPublicKeyInfo:
L1 - algorithm:
L2 - algorithmIdentifier:
L3 - algorithm:
  'id-ecPublicKey'
L3 - parameters:
builder L3 CRED_PUBLIC_KEY - ECDSA of plugin 'openssl'
-- < --
L2 - optional extensions:
L3 - extensions:
L4 - extension:
L5 - extnID:
  'subjectAltName'
L5 - critical:
  FALSE
L5 - extnValue:
L6 - generalNames:
L7 - generalName:
L8 - ipAddress:
  '127.0.0.1'
L1 - signatureAlgorithm:
L2 - algorithmIdentifier:
L3 - algorithm:
  'ecdsa-with-SHA384'
L1 - signatureValue:
spawning 8 worker threads
created thread 01 [2252143]
started worker thread 01
created thread 03 [2252145]
started worker thread 03
created thread 04 [2252146]
started worker thread 04
created thread 02 [2252144]
created thread 05 [2252147]
started worker thread 05
created thread 06 [2252148]
created thread 07 [2252150]
started worker thread 07
started worker thread 02
no events, waiting
started worker thread 06
created thread 08 [2252149]
started worker thread 08
41 supported TLS cipher suites:
  TLS_AES_256_GCM_SHA384
  TLS_AES_128_GCM_SHA256
  TLS_CHACHA20_POLY1305_SHA256
  TLS_AES_128_CCM_SHA256
  TLS_AES_128_CCM_8_SHA256
  TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
  TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384
  TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
  TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
  TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256
  TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
  TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
  TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384
  TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
  TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
  TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
  TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
  TLS_DHE_RSA_WITH_AES_256_GCM_SHA384
  TLS_DHE_RSA_WITH_AES_256_CBC_SHA256
  TLS_DHE_RSA_WITH_AES_256_CBC_SHA
  TLS_DHE_RSA_WITH_CAMELLIA_256_CBC_SHA256
  TLS_DHE_RSA_WITH_CAMELLIA_256_CBC_SHA
  TLS_DHE_RSA_WITH_AES_128_GCM_SHA256
  TLS_DHE_RSA_WITH_AES_128_CBC_SHA256
  TLS_DHE_RSA_WITH_AES_128_CBC_SHA
  TLS_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA256
  TLS_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA
  TLS_RSA_WITH_AES_256_GCM_SHA384
  TLS_RSA_WITH_AES_256_CBC_SHA256
  TLS_RSA_WITH_AES_256_CBC_SHA
  TLS_RSA_WITH_AES_128_GCM_SHA256
  TLS_RSA_WITH_AES_128_CBC_SHA256
  TLS_RSA_WITH_AES_128_CBC_SHA
  TLS_RSA_WITH_CAMELLIA_256_CBC_SHA256
  TLS_RSA_WITH_CAMELLIA_256_CBC_SHA
  TLS_RSA_WITH_CAMELLIA_128_CBC_SHA256
  TLS_RSA_WITH_CAMELLIA_128_CBC_SHA
  TLS_ECDHE_ECDSA_WITH_NULL_SHA
  TLS_ECDHE_RSA_WITH_NULL_SHA
  TLS_RSA_WITH_NULL_SHA256
  TLS_RSA_WITH_NULL_SHA
sending extension: supported groups
sending extension: supported versions
sending extension: signature algorithms
sending extension: signature algorithms cert
sending extension: key-share
sending fatal TLS alert 'internal error'
sending TLS Alert record (2 bytes)
processing TLS Alert record (2 bytes)
received fatal TLS alert 'internal error'

@pemensik
Contributor Author

check.log

Can you find the reason for the failure? I guess it might be the pkcs11 plugin, because I sometimes run tests with it. But I failed to spot the exact reason; the internal error does not help.

@tobiasbrunner
Member

> It seems it should have failed instead

Due to how the TLS tests are structured (and in part how the test framework operates), they will hang for some TLS protocol failures.

Here it seems generating the key-share extension already on the client fails. Most likely because a DH group was selected that couldn't get instantiated. This particular test goes through all DH groups announced by plugins and tries to do a key exchange with each, which will fail if none of the plugins can actually instantiate it.

Looking at the tests that were successful, we see ecp256, ecp384, ecp521 and ecp224. The next DH group in that list is probably ecp192. Of the plugins you have loaded only the openssl plugin announces it (while the pkcs11 plugin theoretically supports it, DH handling via PKCS#11 is disabled by default).

The problem is that plugin features, including supported DH groups, are mostly statically configured at compile-time (in the openssl plugin based on OPENSSL_VERSION_NUMBER and in some cases OPENSSL_NO_* compile flags). So while the OpenSSL library you built against theoretically supports this group, it could still fail to actually instantiate it at runtime e.g. due to some security restrictions.
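
The gap between what is announced at compile time and what can be instantiated at runtime can be illustrated with a small probe. The following is a minimal sketch and not strongSwan code: `ke_group_t`, `create_group()` and `usable_groups()` are hypothetical names, and the rejection of the 192-bit group is hard-coded to mimic the behaviour suspected in this thread.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical group identifiers, in the order the groups appear in this thread. */
typedef enum {
    KE_ECP_256,
    KE_ECP_384,
    KE_ECP_521,
    KE_ECP_224,
    KE_ECP_192
} ke_group_t;

/* Hypothetical stand-in for a crypto factory: returns NULL if the backend
 * refuses the group at runtime even though it was announced at build time.
 * The rejection of the 192-bit group is hard-coded for illustration only. */
void *create_group(ke_group_t group)
{
    if (group == KE_ECP_192)
    {
        return NULL;
    }
    return malloc(1);
}

/* Copy the announced groups that can actually be instantiated into 'out'
 * (which must hold at least 'count' entries) and return how many there are. */
size_t usable_groups(const ke_group_t *announced, size_t count, ke_group_t *out)
{
    size_t i, usable = 0;

    for (i = 0; i < count; i++)
    {
        void *obj = create_group(announced[i]);

        if (obj)
        {
            free(obj);
            out[usable++] = announced[i];
        }
    }
    return usable;
}
```

A test that iterated only over the result of such a probe would skip a group the backend rejects instead of running into a failed handshake. Whether skipping or failing is the better behaviour for a test suite is a separate design question.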

What version of OpenSSL do you use? Was it compiled in a way that disables ecp192 specifically (not sure if there is such an option, other crypto libraries have e.g. options to configure minimum bits for keys)? Or is FIPS-mode active even though the log says it isn't (maybe that reporting isn't working correctly)? (Not sure, though, if FIPS-mode would even disable it.) OpenSSL also has the concept of security levels that might affect this. But this is generally something that applies to TLS contexts, which we don't use (e.g. configurable via DEFAULT@SECLEVEL in cipher strings, the API and the compile option for the default also reference TLS specifically). So not sure if there is an effect on low-level crypto primitives too if the default is changed (e.g. on EC_KEY_new_by_curve_name or EC_KEY_generate_key, either of which probably fails here). Anyway, a global default level of 2 would prohibit the use of ECC keys < 224 bits.
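
For reference, OpenSSL documents its security levels in bits of security: level 1 corresponds to 80 bits, level 2 to 112, level 3 to 128, level 4 to 192 and level 5 to 256. An elliptic-curve key offers roughly half its size in bits of security, so the arithmetic behind "level 2 prohibits ECC keys < 224 bits" can be sketched as below. This is a standalone illustration with hypothetical helper names that does not call into OpenSSL; whether libcrypto applies such a check outside of TLS contexts remains the open question raised above.

```c
#include <stdbool.h>

/* Minimum bits of security per OpenSSL security level (0..5), as documented
 * for SSL_CTX_set_security_level(). Level 0 permits everything. */
int level_min_bits(int level)
{
    static const int min_bits[] = { 0, 80, 112, 128, 192, 256 };

    if (level < 0)
    {
        return 0;
    }
    if (level > 5)
    {
        level = 5;
    }
    return min_bits[level];
}

/* Hypothetical helper: an EC key of 'curve_bits' bits is assumed to offer
 * curve_bits / 2 bits of security. */
bool ec_curve_allowed(int curve_bits, int level)
{
    return curve_bits / 2 >= level_min_bits(level);
}
```

With these assumptions a 192-bit curve yields 96 bits of security, which is below the 112 bits required at level 2, while a 224-bit curve just meets the threshold.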

@tobiasbrunner
Member

Also, you haven't built with --enable-test-vectors, so the test suite for crypto primitives didn't find this issue. It should, as there is a test vector for ecp192, which is enabled if a plugin announces support for it. That suite is run by the libstrongswan test runner, so tests would have failed way before the libtls tests.

@pemensik
Contributor Author

I am using openssl-1.1.1l-2.fc34.x86_64 on Fedora 34. With crypto-policies set to DEFAULT, but I doubt that is mapped to strongswan modules directly.

It seems there is also some different issue just on 32 bit systems, because only those builds had a failed test during the build. (That link will stay valid for ~5 days.) All the configure parameters used are also visible there. It is possible that some weird combinations are used; I left whatever previous maintainers enabled. FIPS mode is not enabled on my machine.

My build passed also with --enable-test-vectors, it still hangs only on this socket test. All others passed.

@tobiasbrunner
Member

> With crypto-policies set to DEFAULT, but I doubt that is mapped to strongswan modules directly.

Well, if it impacts OpenSSL (or rather libcrypto), it might. What exactly does DEFAULT correspond to in that regard?

> It seems there is also some different issue just on 32 bit systems,

Looks like a timeout in the NewHope tests (very well possible on a slow 32-bit machine). Probably doesn't make sense to enable that plugin (--enable-newhope) as it will get removed some time in the future.

> My build passed also with --enable-test-vectors, it still hangs only on this socket test. All others passed.

Did you use make clean before running make check again? I don't think the test runner is rebuilt automatically to pick up the new plugin list. Running make clean is recommended whenever configure options are changed that don't modify config.h, in particular enabling/disabling plugins.

@pemensik
Contributor Author

Yes, I did a new build of the RPM package, during which it happens. I made a fresh build with a reconfigure, so it should not be affected. The crypto policy switches the configuration of multiple services, but the openssl configuration should be the most important. Not sure how other defaults are set during the build.

$ cat /usr/share/crypto-policies/DEFAULT/opensslcnf.txt
CipherString = @SECLEVEL=2:kEECDH:kRSA:kEDH:kPSK:kDHEPSK:kECDHEPSK:kRSAPSK:-aDSS:-3DES:!DES:!RC4:!RC2:!IDEA:-SEED:!eNULL:!aNULL:!MD5:-SHA384:-CAMELLIA:-ARIA:-AESCCM8
Ciphersuites = TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:TLS_AES_128_CCM_SHA256
MinProtocol = TLSv1.2
MaxProtocol = TLSv1.3
SignatureAlgorithms = ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:ed25519:ed448:rsa_pss_pss_sha256:rsa_pss_pss_sha384:rsa_pss_pss_sha512:rsa_pss_rsae_sha256:rsa_pss_rsae_sha384:rsa_pss_rsae_sha512:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA224:RSA+SHA224
$ cat /usr/share/crypto-policies/FUTURE/opensslcnf.txt
CipherString = @SECLEVEL=3:kEECDH:kEDH:kPSK:kDHEPSK:kECDHEPSK:-kRSAPSK:-kRSA:-aDSS:-AES128:-SHA256:-3DES:!DES:!RC4:!RC2:!IDEA:-SEED:!eNULL:!aNULL:-SHA1:!MD5:-SHA384:-CAMELLIA:-ARIA:-AESCCM8
Ciphersuites = TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
MinProtocol = TLSv1.2
MaxProtocol = TLSv1.3
SignatureAlgorithms = ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:ed25519:ed448:rsa_pss_pss_sha256:rsa_pss_pss_sha384:rsa_pss_pss_sha512:rsa_pss_rsae_sha256:rsa_pss_rsae_sha384:rsa_pss_rsae_sha512:RSA+SHA256:RSA+SHA384:RSA+SHA512

$ cat /usr/share/crypto-policies/LEGACY/opensslcnf.txt
CipherString = @SECLEVEL=1:kEECDH:kRSA:kEDH:kPSK:kDHEPSK:kECDHEPSK:kRSAPSK:!DES:!RC4:!RC2:!IDEA:-SEED:!eNULL:!aNULL:!MD5:-SHA384:-CAMELLIA:-ARIA:-AESCCM8
Ciphersuites = TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:TLS_AES_128_GCM_SHA256:TLS_AES_128_CCM_SHA256
MinProtocol = TLSv1
MaxProtocol = TLSv1.3
SignatureAlgorithms = ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:ed25519:ed448:rsa_pss_pss_sha256:rsa_pss_pss_sha384:rsa_pss_pss_sha512:rsa_pss_rsae_sha256:rsa_pss_rsae_sha384:rsa_pss_rsae_sha512:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA224:RSA+SHA224:DSA+SHA256:DSA+SHA384:DSA+SHA512:DSA+SHA224:ECDSA+SHA1:RSA+SHA1:DSA+SHA1

@tobiasbrunner
Member

Does it work if you switch to LEGACY?

@pemensik
Contributor Author

It does not seem to help, but I don't want to reboot my own system yet. A reboot might be required for the change to be applied properly.

@pemensik
Contributor Author

pemensik commented Nov 11, 2021

I tried rebuilding it on a fresh VM; it seems this problem is somehow specific to my machine. A fresh F35 machine fails on a different test:

# openssl-1.1.1l-2.fc34.x86_64
  Running suite 'vectors':
    Running case 'transforms': +++-++++-
      Failure in 'test_vectors': success: test vector for HASH_MD2 from 'openssl' plugin failed (suites/test_vectors.c:48, i = 3)
      Failure in 'test_vectors': success: test vector for ECP_192 from 'openssl' plugin failed (suites/test_vectors.c:48, i = 8)
  Passed 0/1 'vectors' test cases
  Running suite 'ecdsa':
    Running case 'generate': +++
    Running case 'load': +++

But the reported test passes without issues.

I tried to catch the sending of the fatal TLS alert in gdb. It is sent from this backtrace:

(gdb) bt
#0  get (this=0x5555555f15c0, level=0x7fffffff8620, desc=0x7fffffff8624)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls_alert.c:162
#1  0x00007ffff7fb2e2e in check_alerts.isra.0 (data=data@entry=0x7fffffff8710, this=<optimized out>, this=<optimized out>)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls_fragmentation.c:321
#2  0x00007ffff7fa4ffb in build (data=0x7fffffff8710, type=0x7fffffff870c, this=<optimized out>)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls_fragmentation.c:462
#3  build (this=0x555555601050, type=0x7fffffff870c, data=0x7fffffff8710)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls_fragmentation.c:420
#4  0x00007ffff7fa3fb1 in build (this=0x5555555f4560, type=0x7fffffff870c, data=0x7fffffff8710)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls_protection.c:99
#5  0x00007ffff7fb1a29 in build (this=0x5555555ce870, buf=0x7fffffff8780, buflen=0x7fffffff8778, msglen=0x0)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls.c:389
#6  0x00007ffff7fa9f4d in exchange (this=this@entry=0x555555583680, wr=wr@entry=true, block=block@entry=false)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls_socket.c:173
#7  0x00007ffff7faa234 in write_ (this=0x555555583680, buf=<optimized out>, len=<optimized out>)
    at /usr/src/debug/strongswan-5.9.4-3.fc36.x86_64/src/libtls/tls_socket.c:302
#8  0x000055555555bbc8 in run_echo_client (config=config@entry=0x5555555fad80) at suites/test_socket.c:499
#9  0x000055555555c4df in test_tls_ke_groups (version=TLS_1_3, port=5664, cauth=false, i=4) at suites/test_socket.c:593
#10 test_tls13_ke_groups (_i=4) at suites/test_socket.c:702
#11 0x000055555555cbd0 in run_test (tfun=0x5555555fbe10, i=i@entry=4) at ../../libstrongswan/tests/test_runner.c:278
#12 0x0000555555557057 in run_case (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_RAW, cfg=0x0, tcase=0x5555555fbb40)
    at ../../libstrongswan/tests/test_runner.c:601
#13 run_suite (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_RAW, cfg=0x0, suite=0x5555555fb930)
    at ../../libstrongswan/tests/test_runner.c:698
#14 test_runner_run (name=0x55555556a978 "libtls", configs=0x555555570400 <tests>, init=0x55555555c7b0 <test_runner_init>, 
    init=0x55555555c7b0 <test_runner_init>, configs=0x555555570400 <tests>, name=0x55555556a978 "libtls")
    at ../../libstrongswan/tests/test_runner.c:758
#15 main (argc=<optimized out>, argv=<optimized out>)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tests/tls_tests.c:61

@tobiasbrunner
Member

> I tried rebuilding it on a fresh VM; it seems this problem is somehow specific to my machine. A fresh F35 machine fails on a different test:
> ...
> But the reported test passes without issues.

Wait, even though we see a failure to use ecp192 when the test vectors are applied (which, as mentioned, you should see on your machine too), you say the TLS tests are all working if you run them individually? Even the one that uses ecp192? (You'll see the used groups in the TLS tests with TESTS_VERBOSITY=1. If you see using key exchange SECP192R1 there, that'd be very weird - if you see the other groups but not this one, that would be weird too because no test vector should have been applied then either.)

Weird that MD2 fails here while it doesn't with OpenSSL 3, I guess it's completely disabled in that build, i.e. OPENSSL_NO_MD2 is set, and it's not just an issue with the legacy provider that we saw in #753 for MD4 (there was no such legacy provider in OpenSSL 1.x, the inability to instantiate it while OPENSSL_NO_MD2 is not defined might again be due to the security level setting, or it's just a bug in OpenSSL 1.1.1). I guess if that's generally the case for the Fedora builds of OpenSSL, manually passing -DOPENSSL_NO_MD2 via CFLAGS when building strongSwan could be an option. Or we could just remove support for MD2 completely, as it's only required for some legacy PKCS#12 files (I doubt such files are really around anymore as the contained certificates have certainly expired by now).

> I tried to catch the sending of the fatal TLS alert in gdb. It is sent from this backtrace:

Unfortunately, it's not really helpful, because this only shows the code path where the alert is sent, not where it was triggered (where it just causes a state change that later causes the alert to get sent). But it's pretty clear to me that it happens here:

    if (!tls_write_key_share(&key_share, this->dh))
    {
        this->alert->add(this->alert, TLS_FATAL, TLS_INTERNAL_ERROR);
        extensions->destroy(extensions);
        return NEED_MORE;
    }

because this->dh is NULL due to the call at

    this->dh = lib->crypto->create_dh(lib->crypto, group);

failing because instantiating ecp192 isn't possible.

@pemensik
Contributor Author

I tried a breakpoint on that line. It stops there only once, and the group seems to be ecp256.

1375				this->dh = lib->crypto->create_dh(lib->crypto, group);
(gdb) p lib->crypto
$1 = (crypto_factory_t *) 0x5555555759d0
(gdb) p lib->crypto->create_dh
$2 = (diffie_hellman_t *(*)(crypto_factory_t *, diffie_hellman_group_t, ...)) 0x7ffff7f4d0e0 <create_dh>
(gdb) p group
$3 = ECP_256_BIT
(gdb) bt
#0  send_client_hello (this=0x555555589e90, type=0x7fffffff85c0, writer=0x5555555d1100)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_peer.c:1375
#1  0x00007ffff7fa515c in build_handshake (this=0x5555555f5fe0)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_fragmentation.c:348
#2  build (data=0x7fffffff8660, type=0x7fffffff865c, this=<optimized out>)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_fragmentation.c:456
#3  build (this=0x5555555f5fe0, type=0x7fffffff865c, data=0x7fffffff8660)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_fragmentation.c:420
#4  0x00007ffff7fa3fb1 in build (this=0x5555555fb2c0, type=0x7fffffff865c, data=0x7fffffff8660)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_protection.c:99
#5  0x00007ffff7fb1a29 in build (this=0x5555555f6320, buf=0x7fffffff86d0, buflen=0x7fffffff86c8, msglen=0x0)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls.c:389
#6  0x00007ffff7fa9f4d in exchange (this=this@entry=0x555555593590, wr=wr@entry=true, block=block@entry=false)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_socket.c:173
#7  0x00007ffff7faa234 in write_ (this=0x555555593590, buf=<optimized out>, len=<optimized out>)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_socket.c:302
#8  0x000055555555bbc8 in run_echo_client (config=config@entry=0x5555555e7170) at suites/test_socket.c:499
#9  0x000055555555c4df in test_tls_ke_groups (version=TLS_1_3, port=5664, cauth=false, i=0) at suites/test_socket.c:593
#10 test_tls13_ke_groups (_i=0) at suites/test_socket.c:702
#11 0x000055555555cbd0 in run_test (tfun=0x5555555ec600, i=i@entry=0) at ../../libstrongswan/tests/test_runner.c:278
#12 0x0000555555557057 in run_case (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_RAW, cfg=0x0, tcase=0x5555555ec330)
    at ../../libstrongswan/tests/test_runner.c:601
#13 run_suite (init=0x55555555c7b0 <test_runner_init>, level=LEVEL_RAW, cfg=0x0, suite=0x5555555ec0f0)
    at ../../libstrongswan/tests/test_runner.c:698
#14 test_runner_run (name=0x55555556a978 "libtls", configs=0x555555570400 <tests>, init=0x55555555c7b0 <test_runner_init>, 
    init=0x55555555c7b0 <test_runner_init>, configs=0x555555570400 <tests>, name=0x55555556a978 "libtls")
    at ../../libstrongswan/tests/test_runner.c:758
#15 main (argc=<optimized out>, argv=<optimized out>)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tests/tls_tests.c:61
(gdb) frame
#0  send_client_hello (this=0x555555589e90, type=0x7fffffff85c0, writer=0x5555555d1100)
    at /home/pemensik/fedora/strongswan/strongswan-5.9.4/src/libtls/tls_peer.c:1375
1375				this->dh = lib->crypto->create_dh(lib->crypto, group);

(gdb) info local
suites = 0x5555555fd6a0
extensions = 0x55555557d070
curves = 0x5555555fd7e0
versions = <optimized out>
key_share = 0x55555559a790
signatures = <optimized out>
version_max = 4294935888
version_min = TLS_1_0
group = ECP_256_BIT
curve = TLS_SECP256R1
enumerator = <optimized out>
count = <optimized out>
i = <optimized out>
v = <optimized out>
rng = <optimized out>

@tobiasbrunner
Member

That's the first one that will be tested. You'd need to continue until the group is ECP_192_BIT (it will be the fifth), then the result should be NULL. But this won't really help as the problem lies somewhere within OpenSSL's libcrypto where for some reason 192-bit ECC is rejected when we call one of these functions here:

key = EC_KEY_new_by_curve_name(NID_X9_62_prime192v1);


if (!this->key || !EVP_PKEY_assign_EC_KEY(this->key, key))

@pemensik
Contributor Author

I see two different problems here. The first is the failure on ECP_192_BIT; I have trouble getting gdb to step through it correctly.

The second is the design of the TLS socket tests. I think the server should be terminated in the fixture, not inside the test. I think it hangs because the server is not stopped when an assertion fails, and catching failed assertions is the whole point of tests. It seems run_echo_client(echo_server_config_t *config) should not destroy the server but leave that to the teardown_creds function, where it would be stopped both on success and on assertion failure. A test suite that can block is not a good idea on builders: it might stall important rebuilds over some harmless error.

Changing that would require non-trivial changes, I am afraid. I can wrap make check in the timeout command to ensure it ends within minutes one way or the other. It is a workaround, but better than nothing.
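A rough sketch of what stopping the server from a fixture could look like. This is not strongSwan's test code: echo_fixture_t, fixture_setup and fixture_teardown are hypothetical names, and waking the server through a self-pipe watched with poll() is a technique swapped in for illustration. The point is that the server thread never sits in a bare accept(), so teardown can always stop it, whether the test body passed or hit a failed assertion.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical fixture state, not strongSwan's echo_server_config_t. */
typedef struct {
    int listen_fd;      /* listening TCP socket on loopback */
    int wake[2];        /* self-pipe: teardown writes, server polls */
    pthread_t thread;   /* server thread */
} echo_fixture_t;

/* Server loop: waits for a client or a wake-up request from teardown. */
static void *serve(void *arg)
{
    echo_fixture_t *fx = arg;
    struct pollfd pfd[2] = {
        { .fd = fx->listen_fd, .events = POLLIN },
        { .fd = fx->wake[0], .events = POLLIN },
    };

    for (;;)
    {
        if (poll(pfd, 2, -1) < 0 || pfd[1].revents)
        {   /* poll error or teardown asked us to stop */
            break;
        }
        if (pfd[0].revents & POLLIN)
        {
            int cfd = accept(fx->listen_fd, NULL, NULL);

            if (cfd >= 0)
            {   /* a real test would echo data here */
                close(cfd);
            }
        }
    }
    return NULL;
}

/* Called before each test: start the server on a kernel-chosen port. */
static int fixture_setup(echo_fixture_t *fx)
{
    struct sockaddr_in addr;

    assert(fx != NULL);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);

    if (pipe(fx->wake) < 0)
    {
        return -1;
    }
    fx->listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fx->listen_fd >= 0 &&
        bind(fx->listen_fd, (struct sockaddr*)&addr, sizeof(addr)) == 0 &&
        listen(fx->listen_fd, 5) == 0 &&
        pthread_create(&fx->thread, NULL, serve, fx) == 0)
    {
        return 0;
    }
    /* undo the partial setup */
    if (fx->listen_fd >= 0)
    {
        close(fx->listen_fd);
    }
    close(fx->wake[0]);
    close(fx->wake[1]);
    return -1;
}

/* Called after each test, on success and on failure alike. */
static int fixture_teardown(echo_fixture_t *fx)
{
    char c = 'x';
    int rc = -1;

    assert(fx != NULL);
    if (write(fx->wake[1], &c, 1) == 1)
    {   /* the server leaves poll() and returns, so join cannot hang */
        rc = pthread_join(fx->thread, NULL);
    }
    close(fx->listen_fd);
    close(fx->wake[0]);
    close(fx->wake[1]);
    return rc;
}
```

A test runner would call fixture_setup() before the test body and fixture_teardown() afterwards, even if the body bailed out early. As noted in the reply below, this only covers failures on the client side; an error inside the server thread needs separate handling.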

@tobiasbrunner
Member

OK, I see what the problem with ecp192 is. Apparently, the Fedora OpenSSL package removes support for EC curves < 224 bits with a patch. Checking for that is kinda tricky (without instantiating it), as there is nothing like wolfSSL's ECC_MIN_KEY_SZ. So I guess we could either remove support for ecp192 in the openssl plugin (it really shouldn't be used anymore, so that might not be the worst - the botan plugin doesn't support it either and the wolfssl plugin probably has it disabled, by default, as the mentioned variable is set to 224 if not configured and not building in FIPS mode, where it strangely is 192), or you could just patch out the line that announces support for it in the Fedora strongSwan package (I guess you have to do that anyway for 5.9.4):

PLUGIN_PROVIDE(DH, ECP_192_BIT),

Regarding the stuck socket tests, shutting down the server socket in the teardown function is a good idea. But it isn't enough if the error happens on the server (not on the client as in this case). I've pushed a couple of commits to the 752-libtls-tests branch.

@pemensik
Contributor Author

Oh, if that is a downstream patch in Fedora, then I will patch the package to skip that algorithm. It would be nice if openssl_diffie_hellman_create could report a more detailed failure reason. The test might then just print a warning and continue if the whole algorithm is not supported. It should fail with an error if the key can be created but cannot be used (if that makes sense).

For example, FIPS mode on RHEL can change the available features at runtime. Detecting supported ciphers at compile time and relying on them to stay available does not always work. The same goes for the crypto policy I have already noted.

I think the server part should maybe use different 'delayed' checks, which would save the place of the failure in a variable and abort the server setup. The test would check the server status before the client is started; if the server failed to start, no client test is needed. We would just have to find a way to cancel the server thread on any failure occurring inside it. I am not sure the fixture can stop it reliably. If stopping the server were moved there, it might work on both success and failure, as long as nothing starts waiting for an action that never happens.
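To make the 'delayed check' idea a bit more concrete, here is a minimal sketch using plain pthreads. All names (srv_status_t, SRV_CHECK, server_started, the simulate_error knob) are made up for illustration and do not exist in strongSwan. The server thread records where its first check failed and returns, and the test thread waits for either "ready" or "failed" before deciding whether to start a client.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SRV_STARTING, SRV_READY, SRV_FAILED } srv_state_t;

/* Hypothetical status shared between the test thread and the server thread. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    srv_state_t state;
    const char *file;       /* location of the first failed server check */
    int line;
    bool simulate_error;    /* stands in for a real setup failure */
} srv_status_t;

/* Publish a new state and wake up anyone waiting for it. */
static void srv_set(srv_status_t *st, srv_state_t state,
                    const char *file, int line)
{
    pthread_mutex_lock(&st->lock);
    st->state = state;
    st->file = file;
    st->line = line;
    pthread_cond_broadcast(&st->changed);
    pthread_mutex_unlock(&st->lock);
}

/* Delayed check: remember the failure location and leave the server thread
 * instead of failing the test from inside a worker thread. */
#define SRV_CHECK(st, cond) \
    do { \
        if (!(cond)) \
        { \
            srv_set(st, SRV_FAILED, __FILE__, __LINE__); \
            return NULL; \
        } \
    } while (0)

static void *server_main(void *arg)
{
    srv_status_t *st = arg;

    SRV_CHECK(st, !st->simulate_error); /* e.g. key exchange setup failed */
    srv_set(st, SRV_READY, NULL, 0);
    /* ... accept and serve clients here ... */
    return NULL;
}

/* Runs on the test thread: block until the server is ready or has failed. */
static srv_state_t srv_wait(srv_status_t *st)
{
    srv_state_t state;

    pthread_mutex_lock(&st->lock);
    while (st->state == SRV_STARTING)
    {
        pthread_cond_wait(&st->changed, &st->lock);
    }
    state = st->state;
    pthread_mutex_unlock(&st->lock);
    return state;
}

/* Start the server and report whether it makes sense to start a client. */
static bool server_started(srv_status_t *st, bool simulate_error)
{
    pthread_t thread;
    srv_state_t state;

    assert(st != NULL);
    pthread_mutex_init(&st->lock, NULL);
    pthread_cond_init(&st->changed, NULL);
    st->state = SRV_STARTING;
    st->file = NULL;
    st->line = 0;
    st->simulate_error = simulate_error;

    if (pthread_create(&thread, NULL, server_main, st) != 0)
    {
        return false;
    }
    state = srv_wait(st);
    /* in this sketch the server returns right away, so joining is safe */
    pthread_join(thread, NULL);
    pthread_cond_destroy(&st->changed);
    pthread_mutex_destroy(&st->lock);
    return state == SRV_READY;
}
```

When server_started() returns false, the test can report st->file and st->line as the failure location and skip the client entirely, so nothing ends up waiting for a connection that will never come.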

Anyway, your change changes hang to failure, which is much better.

  Running suite 'socket':
    Running case 'TLS [1.0..1.3] client to TLS 1.3 server': ++++
    Running case 'TLS 1.3 client to TLS [1.0..1.3] server': ++++
    Running case 'TLS [1.0..1.3] client to TLS 1.2 server': ++++
    Running case 'TLS 1.3/key exchange groups': ++++-++
      Failure in 'test_tls13_ke_groups': len >= 0 (suites/test_socket.c:513, i = 4)

# passed with commented openssl ECP_192_BIT support
  Running suite 'socket':
    Running case 'TLS [1.0..1.3] client to TLS 1.3 server': ++++
    Running case 'TLS 1.3 client to TLS [1.0..1.3] server': ++++
    Running case 'TLS [1.0..1.3] client to TLS 1.2 server': ++++
    Running case 'TLS 1.3/key exchange groups': ++++++
    Running case 'TLS 1.3/signature schemes': ++++++++

If that is caused by a downstream change in Fedora, there is no point in adding a configure check for its availability. I will just comment it out.

@pemensik
Contributor Author

I have dug a bit into the documentation and found that there is a function EC_get_builtin_curves. It returns the list of supported curves, which differs quite a lot between Fedora and Debian. Could it be used to initialize only the supported EC types in the get_features method of openssl_plugin.c? Is it possible to initialize the features in a more dynamic way?

Here is the snippet I used to list the contents. I found a very surprising difference: Debian has 82 curves, while Fedora has only 5! Could the plugin register just the enabled curve types during initialization, so that unsupported ones are skipped automatically both in tests and at runtime?

// gcc -Wall test.c -o test $(pkg-config --libs --cflags openssl)
#include <stdio.h>
#include <openssl/ec.h>

int main(int argc, char *argv[])
{
        size_t i;
        size_t n;
        const unsigned max = 90;
        EC_builtin_curve info[max];

        /* returns the total number of built-in curves, which may exceed max */
        n = EC_get_builtin_curves(info, sizeof(info)/sizeof(info[0]));

        for (i=0; i<n && i<max; i++) {
                printf("%02zu nid %X: %s\n", i, info[i].nid, info[i].comment);
        }
        /* only complain if the array really was too small */
        if (n > max)
                printf("Not all printed, returned count is %zu\n", n);

        return 0;
}

I used the following diff to disable the failing test vectors.

diff --git a/src/libstrongswan/plugins/openssl/openssl_plugin.c b/src/libstrongswan/plugins/openssl/openssl_plugin.c
index 5009f4e3f..211c2b434 100644
--- a/src/libstrongswan/plugins/openssl/openssl_plugin.c
+++ b/src/libstrongswan/plugins/openssl/openssl_plugin.c
@@ -528,7 +528,7 @@ METHOD(plugin_t, get_features, int,
                /* hashers */
                PLUGIN_REGISTER(HASHER, openssl_hasher_create),
 #ifndef OPENSSL_NO_MD2
-                       PLUGIN_PROVIDE(HASHER, HASH_MD2),
+                       //PLUGIN_PROVIDE(HASHER, HASH_MD2),
 #endif
 #ifndef OPENSSL_NO_MD4
                        PLUGIN_PROVIDE(HASHER, HASH_MD4),
@@ -639,8 +639,8 @@ METHOD(plugin_t, get_features, int,
                        PLUGIN_PROVIDE(DH, ECP_384_BIT),
                        PLUGIN_PROVIDE(DH, ECP_521_BIT),
                        PLUGIN_PROVIDE(DH, ECP_224_BIT),
-                       PLUGIN_PROVIDE(DH, ECP_192_BIT),
-#if OPENSSL_VERSION_NUMBER >= 0x10002000L
+                       //PLUGIN_PROVIDE(DH, ECP_192_BIT),
+#if OPENSSL_VERSION_NUMBER >= 0x10002000L && false
                        PLUGIN_PROVIDE(DH, ECP_256_BP),
                        PLUGIN_PROVIDE(DH, ECP_384_BP),
                        PLUGIN_PROVIDE(DH, ECP_512_BP),

@pemensik
Contributor Author

The above program reports only these curves on Rawhide:

00 nid 2C9: NIST/SECG curve over a 224 bit prime field
01 nid 2CA: SECG curve over a 256 bit prime field
02 nid 2CB: NIST/SECG curve over a 384 bit prime field
03 nid 2CC: NIST/SECG curve over a 521 bit prime field
04 nid 19F: X9.62/SECG curve over a 256 bit prime field
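To illustrate the filtering idea with the list above, here is a small sketch that does not link against OpenSSL. It takes the five NIDs printed on Rawhide as sample data and keeps only those candidate groups whose curve is in that list. The ec_group_t table and the filter_groups helper are hypothetical, and the decimal NIDs in the table are the values I know from OpenSSL's obj_mac.h, so double-check them against your headers. In a real plugin the available list would come from EC_get_builtin_curves at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table entry: a DH group name and the NID of its curve. */
typedef struct {
    const char *name;
    int nid;
} ec_group_t;

/* Candidate ECP groups in the order the plugin announces them. */
static const ec_group_t candidates[] = {
    { "ECP_256_BIT", 415 }, /* NID_X9_62_prime256v1 */
    { "ECP_384_BIT", 715 }, /* NID_secp384r1 */
    { "ECP_521_BIT", 716 }, /* NID_secp521r1 */
    { "ECP_224_BIT", 713 }, /* NID_secp224r1 */
    { "ECP_192_BIT", 409 }, /* NID_X9_62_prime192v1 */
};

/* NIDs reported on Rawhide in this thread. */
static const int rawhide_nids[] = { 0x2C9, 0x2CA, 0x2CB, 0x2CC, 0x19F };

/* Linear search is fine for a handful of curves. */
static bool nid_available(int nid, const int *avail, size_t navail)
{
    size_t i;

    for (i = 0; i < navail; i++)
    {
        if (avail[i] == nid)
        {
            return true;
        }
    }
    return false;
}

/* Store pointers to the supported candidates in out (room for ncand
 * entries is required) and return how many were kept. */
static size_t filter_groups(const ec_group_t *cand, size_t ncand,
                            const int *avail, size_t navail,
                            const ec_group_t **out)
{
    size_t i, kept = 0;

    assert(out != NULL);
    for (i = 0; i < ncand; i++)
    {
        if (nid_available(cand[i].nid, avail, navail))
        {
            out[kept++] = &cand[i];
        }
    }
    return kept;
}
```

Calling filter_groups(candidates, COUNT(candidates), rawhide_nids, COUNT(rawhide_nids), out) keeps the 224, 256, 384 and 521 bit groups and drops ECP_192_BIT, which matches what the commented-out PLUGIN_PROVIDE line in the diff achieves by hand.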

@tobiasbrunner
Member

tobiasbrunner commented Nov 16, 2021

Could it be used to initialize only supported EC types in openssl_plugin.c, get_features method?

Yes, that's possible, but requires some refactoring. I've pushed a couple of patches to the 752-libtls-tests branch.

By the way, that limitation to 5 curves is not only due to the patch (which only removes weak curves) but also due to building in FIPS mode. I wonder if those other curves (in our case the Brainpool curves) are available in non-FIPS mode.

@pemensik
Contributor Author

No, they are not available in non-FIPS mode either. I tested it on my machine, which does not have FIPS mode enabled.

@tobiasbrunner
Member

Interesting, I wonder if that was an intentional decision or just due to how the curves are stored/provided (in particular in OpenSSL 3, where FIPS mode is handled as a separate provider). It seems weird that curves that are considered strong, other than not being FIPS-approved, are not available in non-FIPS mode.

@pemensik
Contributor Author

Yes, it seems to be an intentional decision in Fedora and RHEL. It has been that way for a long time and I doubt it will change soon. I admit it seems strange to me too.

@tobiasbrunner added this to the 5.9.5 milestone on Dec 8, 2021