Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed bug with GCM retransmits when server in AUTO AEAD mode #2703

Closed
wants to merge 3 commits into from

Conversation

oviano
Copy link
Contributor

@oviano oviano commented Mar 30, 2023

After further testing with the new Botan cryspr I noticed that there were sporadic decryption failures occurring on the client when in AES-GCM mode.

Further investigation revealed that re-transmitted packets from the server were not having space accounted for the GCM auth tag and this was leading to failure to decrypt on the client.

The reason is this piece of code:

        if (m_config.iCryptoMode == CSrtConfig::CIPHER_MODE_AES_GCM)
        {
            w_packet.setLength(w_packet.getLength() + HAICRYPT_AUTHTAG_MAX);
        }

In the configuration I was testing, my server was set to auto mode; i.e. m_config.iCryptoMode was set to CSrtConfig::CIPHER_MODE_AUTO so the packet length wasn't getting adjusted.

The fix is to use the actual crypto mode being used, from the crypto control:

        if (m_pCryptoControl->getCryptoMode() == CSrtConfig::CIPHER_MODE_AES_GCM)
        {
            w_packet.setLength(w_packet.getLength() + HAICRYPT_AUTHTAG_MAX);
        }

After the above change, the decrypt failures disappeared.

Presumably the same bug exists when using OpenSSL-EVP.

Copy link
Collaborator

@maxsharabayko maxsharabayko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @oviano
Thank you for noticing and fixing this!

srtcore/core.cpp Outdated
@@ -9193,7 +9193,7 @@ int srt::CUDT::packLostData(CPacket& w_packet)

// The packet has been ecrypted, thus the authentication tag is expected to be stored
// in the SND buffer as well right after the payload.
if (m_config.iCryptoMode == CSrtConfig::CIPHER_MODE_AES_GCM)
if (m_pCryptoControl->getCryptoMode() == CSrtConfig::CIPHER_MODE_AES_GCM)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check if crypto control exists. Also m_ConnectionLock needs to be locked, and it is locked in packData(..) up the stack, so all good there.

Suggested change
if (m_pCryptoControl->getCryptoMode() == CSrtConfig::CIPHER_MODE_AES_GCM)
if (m_pCryptoControl && m_pCryptoControl->getCryptoMode() == CSrtConfig::CIPHER_MODE_AES_GCM)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is this block of code susceptible to the same issue too? It seems to be doing some initialisation at the start, does it matter if the mode is auto at this stage?

    try
    {
        const int authtag = m_config.iCryptoMode == CSrtConfig::CIPHER_MODE_AES_GCM ? HAICRYPT_AUTHTAG_MAX : 0;
        m_pSndBuffer = new CSndBuffer(32, m_iMaxSRTPayloadSize, authtag);
        SRT_ASSERT(m_iISN != -1);
        m_pRcvBuffer = new srt::CRcvBuffer(m_iISN, m_config.iRcvBufSize, m_pRcvQueue->m_pUnitQueue, m_config.bMessageAPI);
        // after introducing lite ACK, the sndlosslist may not be cleared in time, so it requires twice space.
        m_pSndLossList = new CSndLossList(m_iFlowWindowSize * 2);
        m_pRcvLossList = new CRcvLossList(m_config.iFlightFlagSize);
    }

This is inside srt::CUDT::prepareConnectionObjects.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The difficulty here is that CryptoControl is not yet created. Probably worth checking against CIPHER_MODE_AES_CTR and adding auth tag both for CIPHER_MODE_AES_AUTO and CIPHER_MODE_AES_GCM. 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or create CryptoControl first... Let me check.

Copy link
Collaborator

@maxsharabayko maxsharabayko Mar 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside would be 16 bytes less space for the payload in Auto mode: 1440 instead of 1456.

Also, how is it working currently without the above - does it reserve the extra space when it needs it, after ending up in GCM mode? I mean, it seems to work correctly...

Currently, I believe if you are testing with MTU 1500 and 1316 payload sizes, you just have enough space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so it just reduces the available space, but it's not like something is going to overwrite memory or something?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the payload might take two packets instead of one in that case, which is unwanted in live mode.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no, sorry. This has been fixed in #2589.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've committed a change to reserve the space in both AUTO and GCM modes, for now.

Maybe your proper patch can follow later? Well, up to you guys obviously.

@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Mar 31, 2023
@maxsharabayko maxsharabayko added this to the v1.5.2 milestone Mar 31, 2023
@maxsharabayko
Copy link
Collaborator

Steps to reproduce/test.

Apply some packet loss on the network.
Then run:

(Sender. Use crypto mode 'auto' to get the buggy behavior)
srt-xtransmit generate "srt://:4200?cryptomode=0&passphrase=abcdefghijk" --sendrate 2Mbps --enable-metrics

(Receiver)
srt-xtransmit receive "srt://127.0.0.1:4200?cryptomode=2&passphrase=abcdefghijk" --enable-metrics

@maxsharabayko
Copy link
Collaborator

Potential patch involving reordering of buffer allocation after HS is interpreted and CryptoControl deduces the crypto mode (GCM/CTR).
Changes a bit too much and would require retesting all combinations of connections including caller-listener and rendezvous 🤔

diff --git a/srtcore/core.cpp b/srtcore/core.cpp
index e6da157..83aa01f 100644
--- a/srtcore/core.cpp
+++ b/srtcore/core.cpp
@@ -4176,7 +4176,7 @@ EConnectStatus srt::CUDT::processRendezvous(
 
     // The CryptoControl must be created by the prepareConnectionObjects() before interpreting and creating HSv5 extensions
     // because the it will be used there.
-    if (!prepareConnectionObjects(m_ConnRes, m_SrtHsSide, NULL))
+    if (!prepareConnectionObjects(m_ConnRes, m_SrtHsSide, NULL) || !prepareBuffers(NULL))
     {
         // m_RejectReason already handled
         HLOGC(cnlog.Debug,
@@ -4731,6 +4731,7 @@ EConnectStatus srt::CUDT::postConnect(const CPacket* pResponse, bool rendezvous,
         // In this situation the interpretation of handshake was already done earlier.
         ok = ok && pResponse->isControl();
         ok = ok && interpretSrtHandshake(m_ConnRes, *pResponse, 0, 0);
+        ok = ok && prepareBuffers(eout);
 
         if (!ok)
         {
@@ -5570,9 +5571,21 @@ bool srt::CUDT::prepareConnectionObjects(const CHandShake &hs, HandshakeSide hsd
         }
     }
 
+    if (!createCrypter(hsd, bidirectional)) // Make sure CC is created (lazy)
+    {
+        m_RejectReason = SRT_REJ_RESOURCE;
+        return false;
+    }
+
+    return true;
+}
+
+bool srt::CUDT::prepareBuffers(CUDTException* eout)
+{
     try
     {
-        const int authtag = m_config.iCryptoMode == CSrtConfig::CIPHER_MODE_AES_GCM ? HAICRYPT_AUTHTAG_MAX : 0;
+        // CryptoControl has to be initialized and in case of RESPONDER the KM REQ must be processed (interpretSrtHandshake(..)) for the crypto mode to be deduced.
+        const int authtag = (m_pCryptoControl && m_pCryptoControl->getCryptoMode() == CSrtConfig::CIPHER_MODE_AES_GCM) ? HAICRYPT_AUTHTAG_MAX : 0;
         m_pSndBuffer = new CSndBuffer(32, m_iMaxSRTPayloadSize, authtag);
         SRT_ASSERT(m_iISN != -1);
         m_pRcvBuffer = new srt::CRcvBuffer(m_iISN, m_config.iRcvBufSize, m_pRcvQueue->m_pUnitQueue, m_config.bMessageAPI);
@@ -5590,13 +5603,6 @@ bool srt::CUDT::prepareConnectionObjects(const CHandShake &hs, HandshakeSide hsd
         m_RejectReason = SRT_REJ_RESOURCE;
         return false;
     }
-
-    if (!createCrypter(hsd, bidirectional)) // Make sure CC is created (lazy)
-    {
-        m_RejectReason = SRT_REJ_RESOURCE;
-        return false;
-    }
-
     return true;
 }
 
@@ -5707,6 +5713,19 @@ void srt::CUDT::acceptAndRespond(const sockaddr_any& agent, const sockaddr_any&
         throw CUDTException(MJ_SETUP, MN_REJECTED, 0);
     }
 
+    if (!prepareBuffers(NULL))
+    {
+        HLOGC(cnlog.Debug,
+            log << CONID() << "acceptAndRespond: prepareConnectionObjects failed - responding with REJECT.");
+        // If the SRT buffers failed to be allocated,
+        // the connection must be rejected.
+        //
+        // Respond with the rejection message and exit with exception
+        // so that the caller will know that this new socket should be deleted.
+        w_hs.m_iReqType = URQFailure(m_RejectReason);
+        throw CUDTException(MJ_SETUP, MN_REJECTED, 0);
+    }
+
    // Synchronize the time NOW because the following function is about
    // to use the start time to pass it to the receiver buffer data.
     bool have_group = false;
@@ -9193,7 +9212,7 @@ int srt::CUDT::packLostData(CPacket& w_packet)
 
         // The packet has been ecrypted, thus the authentication tag is expected to be stored
         // in the SND buffer as well right after the payload.
-        if (m_config.iCryptoMode == CSrtConfig::CIPHER_MODE_AES_GCM)
+        if (m_pCryptoControl && m_pCryptoControl->getCryptoMode() == CSrtConfig::CIPHER_MODE_AES_GCM)
         {
             w_packet.setLength(w_packet.getLength() + HAICRYPT_AUTHTAG_MAX);
         }
diff --git a/srtcore/core.h b/srtcore/core.h
index ce0e539..a39df8e 100644
--- a/srtcore/core.h
+++ b/srtcore/core.h
@@ -491,6 +491,10 @@ private:
     SRT_ATR_NODISCARD SRT_ATTR_REQUIRES(m_ConnectionLock)
     bool prepareConnectionObjects(const CHandShake &hs, HandshakeSide hsd, CUDTException *eout);
 
+    /// Create the CryptoControl object based on the HS packet. Allocates sender and receiver buffers and loss lists.
+    SRT_ATR_NODISCARD SRT_ATTR_REQUIRES(m_ConnectionLock)
+    bool prepareBuffers(CUDTException* eout);
+
     SRT_ATR_NODISCARD SRT_ATTR_REQUIRES(m_ConnectionLock)
     EConnectStatus postConnect(const CPacket* response, bool rendezvous, CUDTException* eout) ATR_NOEXCEPT;
 

@bpolic
Copy link
Collaborator

bpolic commented Apr 17, 2023

  • Tested on latest master.
  • Packet lost 5% (endpoint A and B)
./srt-xtransmit generate "srt://:5555?cryptomode=0&passphrase=abcdefghijk" --sendrate 2Mbps --enable-metrics --duration 20s -v
./srt-xtransmit receive "srt://192.168.3.1:5555?cryptomode=2&passphrase=abcdefghijk" --enable-metrics

Testing results:
Working as expected

5 % packet testing results.zip

@bpolic
Copy link
Collaborator

bpolic commented Apr 18, 2023

  • Tested on latest master.
  • Packet lost 1% (endpoint A and B)
  • Results shared before and after fix

Stats and Metrics.zip
Working as expected

@oviano oviano deleted the gcm-rexmit-auth-tag-bug branch April 18, 2023 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants