Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connecting to the same listener from the same place twice simultaneously is failing frequently with passphrase error... #2412

Closed
oviano opened this issue Jul 20, 2022 · 7 comments
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Milestone

Comments

@oviano
Copy link
Contributor

oviano commented Jul 20, 2022

Remote server starts listening.

Client application opens up two connections at the same time to the server:

  1. With encryption enabled, via MbedTLS 2.x.x (i.e. supplying passphrase), most times one or both of the connection attempts fail with invalid passphrase (BADSECRET) on server. I double checked I was definitely passing the correct passphrase into the socket option, and I was.

  2. With encryption disabled, connections work.

  3. With encryption enabled, but a delay of 1s between connection attempts, allows it to work.

Is this an allowable use-case? Maybe there is some client and/or server side state relating to MbedTLS? Since the two connections will have a unique source port, they must be seen as separate connections, but maybe there is some sort of state associated with only the IP-address on the server or something that gets accessed/written to at the same time?

@oviano oviano added the Type: Bug Indicates an unexpected problem or unintended behavior label Jul 20, 2022
@oviano oviano changed the title Connecting to the same listener from the same place at the same time is failing frequently with passphrase error... Connecting to the same listener from the same place twice simultaneously is failing frequently with passphrase error... Jul 20, 2022
@maxsharabayko maxsharabayko added the [core] Area: Changes in SRT library core label Jul 20, 2022
@maxsharabayko maxsharabayko added this to the Next Release milestone Jul 20, 2022
@oviano
Copy link
Contributor Author

oviano commented Jul 20, 2022

Attaching heavy server + client logs from an example of it happening.

server_log.log
client_log.log

And two logs where there is a 1s delay on the client, but otherwise the exact same code, showing the issue not occurring:

server_log_success.log
client_log_success.log

@oviano
Copy link
Contributor Author

oviano commented Jul 20, 2022

I think I may have figured this out. Applying the change here #2413 appears to fix the issue, although I need to test it some more as this solution was identified by eyeballing the code rather than anything systematic.

The mbedtls crypto SPR currently uses a static MD context in the PBKDF; the change above creates one for each call so that multiple threads can call the PBKDF at the same time without interfering with each other, which I think is what was happening.

For reference, here is the RIST equivalent code. With my change, SRT now does it the same way as RIST.

static void _librist_crypto_aes_key(struct rist_key *key)
{
    uint8_t aes_key[256 / 8];
#if !HAVE_MBEDTLS
    fastpbkdf2_hmac_sha256(
            (const void *) key->password, strlen(key->password),
            (const void *) &key->gre_nonce, sizeof(key->gre_nonce),
            RIST_PBKDF2_HMAC_SHA256_ITERATIONS,
            aes_key, key->key_size / 8);
#else
    mbedtls_md_context_t sha_ctx;
    const mbedtls_md_info_t *info_sha;
    int ret = -1;
    /* Setup the hash/HMAC function, for the PBKDF2 function. */
    mbedtls_md_init(&sha_ctx);
    info_sha = mbedtls_md_info_from_type(MBEDTLS_MD_SHA256);
    if (info_sha == NULL)
    {
        //rist_log_priv(cctx, RIST_LOG_ERROR, "Failed to setup Mbed TLS hash info\n");
    }

    ret = mbedtls_md_setup(&sha_ctx, info_sha, 1);
    if (ret != 0)
    {
        //rist_log_priv(cctx, RIST_LOG_ERROR, "Failed to setup Mbed TLS MD ctx");
    }

    ret = mbedtls_pkcs5_pbkdf2_hmac(&sha_ctx,
                                    (const unsigned char *)key->password, strlen(key->password),
                                    (const uint8_t *)&key->gre_nonce, sizeof(key->gre_nonce),
                                    RIST_PBKDF2_HMAC_SHA256_ITERATIONS, key->key_size /8, aes_key);
    if (ret != 0)
    {
        //rist_log_priv(cctx, RIST_LOG_ERROR, "Mbed TLS pbkdf2 function failed\n");
    }
    mbedtls_md_free(&sha_ctx);
#endif
#if HAVE_MBEDTLS
    mbedtls_aes_setkey_enc(&key->mbedtls_aes_ctx, aes_key, key->key_size);
#elif defined(LINUX_CRYPTO)
    if (key->linux_crypto_ctx)
        linux_crypto_set_key(aes_key, key->key_size / 8, key->linux_crypto_ctx);
    else
        aes_key_setup(aes_key, key->aes_key_sched, key->key_size);
#else
    aes_key_setup(aes_key, key->aes_key_sched, key->key_size);
#endif
    key->used_times = 0;
}

There are still a couple of static objects in the MbedTLS SPR, maybe someone can comment on whether these are unsafe too.

The other Crypto SPRs do not have any static data.

@oviano
Copy link
Contributor Author

oviano commented Aug 4, 2022

I think there is still an issue relating to concurrent handshaking and the MbedTLS SPR.

In this crash T46 has got hold of the SPR instance, while T44 appears to be still initialising it.

I imagine you can probably reproduce this quite easily by just spinning up multiple connections all at once, using the MbedTLS.

In my case, the issue occurs occasionally with 4 connections.

Screenshot 2022-08-04 at 10 50 43

@oviano
Copy link
Contributor Author

oviano commented Aug 4, 2022

So looking at the three SPRs, MbedTLS is the only one with static data.

However, all three do a check for the "open" function having been filled in, to prevent multiple initialisation.

This still doesn't seem safe though as multiple threads could still be in this function at the same time, if the "open" function has not yet been set. i.e. T1 could have got as far as setting "open" but not the other functions, while T2 enters the function, sees that "open" has been set and so assumes everything else has been set, even though T1 hasn't finished yet. It might then call a function which hadn't yet been set.

Most likely it would very rarely been an issue with the OpenSSL and GNUTLS SPRs, because they only do some trivial setting of functions, but it's still technically flawed. MbedTLS obviously does further static initialisation which increases the likelihood of multiple threads trampling on each other.

Safest option appears to be a static mutex with a scoped lock at the top of this function:

    static srt::sync::Mutex s_mtxCrysprInit;
    srt::sync::ScopedLock lck(s_mtxCrysprInit);

This necessitates including "sync.h" though, which in-turn requires cryspr-mbedtls.c (and the OpenSSL and GNUTLS equivalents) to either be changed to ".cpp" files or compiled as CPP files.

If you are happy with this solution @maxsharabayko then I don't mind making a PR with this additional lock + renaming the files to.cpp.

@oviano
Copy link
Contributor Author

oviano commented Aug 6, 2022

See PR #2425

@maxsharabayko
Copy link
Collaborator

Hi @oviano.
Did PRs #2413 and #2425 resolve the issue?

@oviano
Copy link
Contributor Author

oviano commented Sep 5, 2022

Yes I’ve had no problems since those fixes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

No branches or pull requests

2 participants