# Transport layer key rotation race condition #633

opened this issue Jul 9, 2019

I'm reimplementing the transport layer once again, this time in Python, so that we can write a better test suite. While implementing key rotation I noticed something strange: there might be a race condition in how we rotate keys. The current specification states:

> Changing keys regularly and forgetting previous keys is useful to prevent the decryption of old messages, in the case of later key leakage (i.e. backwards secrecy).
>
> Key rotation is performed for each key (`sk` and `rk`) individually. A key is to be rotated after a party encrypts or decrypts 1000 times with it (i.e. every 500 messages). This can be properly accounted for by rotating the key once the nonce dedicated to it exceeds 1000.
>
> Key rotation for a key `k` is performed according to the following steps:
>
> 1. Let `ck` be the chaining key obtained at the end of Act Three.
> 2. `ck', k' = HKDF(ck, k)`
> 3. Reset the nonce for the key to `n = 0`.
> 4. `k = k'`
> 5. `ck = ck'`

Notice that we are managing 4 keys in total: `node1.{rk,sk}` and `node2.{rk,sk}`, with the added constraint that `node1.rk == node2.sk` and `node1.sk == node2.rk`.

Assume that both nodes have sent and received 499 messages (`node1.rn = node1.sn = node2.rn = node2.sn = 998`), so both sending and receiving the next message will trigger a rotation. Now both nodes send a message asynchronously, so each of them independently rotates the key on its sending side and resets its sending nonce. Upon receiving the other side's message, they rotate their receiving keys as well.

Notice, however, that there is a crossover in the rotation: sending the message that triggered the sending-key rotation also modified the chaining key `ck`, so when the receiving key is rotated it uses `ck'` instead of `ck`. This breaks the `node1.rk == node2.sk` and `node1.sk == node2.rk` constraints.
The following illustrates how the keys change during this example:

1. `node1` and `node2` send their 500th message asynchronously.
2. `node1` and `node2` rotate their sending keys. Up to this point `ck` was in sync; it gets desynced here. Notice that `ck_1' != ck_2'`:
   - `node1`: `ck_1', node1.sk' = HKDF(ck, node1.sk)`
   - `node2`: `ck_2', node2.sk' = HKDF(ck, node2.sk)`
3. `node1` and `node2` receive each other's message, triggering a receive rotation:
   - `node1`: `ck_1_1', node1.rk' = HKDF(ck_1', node1.rk)`
   - `node2`: `ck_2_2', node2.rk' = HKDF(ck_2', node2.rk)`

At this point none of the keys match up anymore, making decryption impossible, all because we have asynchronous updates to the chaining key. This is unlikely to be very severe, since we first need to get into a situation in which both sides rotate at the same time, but it is annoying, especially since decryption will fail only on the next (501st) message. Can someone confirm my suspicion?
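The suspected desync can be reproduced with a small sketch. It assumes a single chaining key shared by both directions (the reading of the spec described above), and it assumes `HKDF` is RFC 5869 HKDF-SHA256 with an empty info field producing 64 bytes split into two halves. The class name `SharedCkNode` and the placeholder key values are made up for illustration.

```python
import hashlib
import hmac
from typing import Tuple


def hkdf(salt: bytes, ikm: bytes) -> Tuple[bytes, bytes]:
    """RFC 5869 HKDF-SHA256, empty info, 64 output bytes split in two."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()          # extract
    t1 = hmac.new(prk, b"\x01", hashlib.sha256).digest()        # expand, block 1
    t2 = hmac.new(prk, t1 + b"\x02", hashlib.sha256).digest()   # expand, block 2
    return t1, t2


class SharedCkNode:
    """Hypothetical model: ONE chaining key used for both sk and rk rotation."""

    def __init__(self, ck: bytes, sk: bytes, rk: bytes) -> None:
        self.ck, self.sk, self.rk = ck, sk, rk

    def rotate_send(self) -> None:
        self.ck, self.sk = hkdf(self.ck, self.sk)

    def rotate_recv(self) -> None:
        self.ck, self.rk = hkdf(self.ck, self.rk)


# Placeholder handshake output (not real Act Three values).
ck = bytes(32)
key_1_to_2 = b"\x01" * 32
key_2_to_1 = b"\x02" * 32

node1 = SharedCkNode(ck, sk=key_1_to_2, rk=key_2_to_1)
node2 = SharedCkNode(ck, sk=key_2_to_1, rk=key_1_to_2)

# Both nodes send their 500th message at the same time: send rotation first.
node1.rotate_send()
node2.rotate_send()

# Then each receives the other's message: receive rotation on a diverged ck.
node1.rotate_recv()
node2.rotate_recv()

print("node1.sk == node2.rk:", node1.sk == node2.rk)
print("node1.rk == node2.sk:", node1.rk == node2.sk)
```

Under this shared-`ck` assumption both comparisons come out `False`, which matches the failure described in the steps above.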

Never mind. After further inspection, it turns out that we keep two copies of the chaining key, so there is no crossover between the sending and receiving keys.
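For comparison, here is a minimal sketch of the two-copy arrangement. It assumes that "two copies" means each direction owns its own chaining key, both seeded from the same Act Three value, and that `HKDF` is RFC 5869 HKDF-SHA256 with an empty info field. The names `Direction` and `Node` and the placeholder keys are hypothetical, not taken from any implementation.

```python
import hashlib
import hmac
from typing import Tuple


def hkdf(salt: bytes, ikm: bytes) -> Tuple[bytes, bytes]:
    """RFC 5869 HKDF-SHA256, empty info, 64 output bytes split in two."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    t1 = hmac.new(prk, b"\x01", hashlib.sha256).digest()
    t2 = hmac.new(prk, t1 + b"\x02", hashlib.sha256).digest()
    return t1, t2


class Direction:
    """One direction of traffic with its own chaining key, key and nonce."""

    def __init__(self, ck: bytes, k: bytes) -> None:
        self.ck, self.k, self.n = ck, k, 0

    def rotate(self) -> None:
        self.ck, self.k = hkdf(self.ck, self.k)
        self.n = 0  # reset the nonce for the rotated key


class Node:
    """Hypothetical node holding independent send and receive state."""

    def __init__(self, ck: bytes, sk: bytes, rk: bytes) -> None:
        self.send = Direction(ck, sk)
        self.recv = Direction(ck, rk)


# Placeholder handshake output (not real Act Three values).
ck = bytes(32)
key_1_to_2 = b"\x01" * 32
key_2_to_1 = b"\x02" * 32

node1 = Node(ck, sk=key_1_to_2, rk=key_2_to_1)
node2 = Node(ck, sk=key_2_to_1, rk=key_1_to_2)

# Same interleaving as in the race scenario: both send, then both receive.
node1.send.rotate()
node2.send.rotate()
node1.recv.rotate()
node2.recv.rotate()

print("node1.sk == node2.rk:", node1.send.k == node2.recv.k)
print("node1.rk == node2.sk:", node1.recv.k == node2.send.k)
```

Because each direction only ever feeds its own chaining key into `hkdf`, the order in which send and receive rotations happen on a node no longer matters, and both comparisons hold.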