diff --git a/proposals/1687-encrypted-recovery-keys.md b/proposals/1687-encrypted-recovery-keys.md
new file mode 100644
index 00000000000..e8e8df4e3d3
--- /dev/null
+++ b/proposals/1687-encrypted-recovery-keys.md
@@ -0,0 +1,156 @@
+# Proposal for storing an encrypted recovery key on the server to aid recovery of megolm key backups
+
+## Problem
+
+[MSC1219](https://github.com/matrix-org/matrix-doc/issues/1219) proposes an API
+for optionally storing encrypted megolm keys on your homeserver, so if a user
+loses all their devices, they can still recover their history. The megolm keys
+are public-key encrypted using a private Curve25519 key that only the end-user
+has.
+
+However, there are usability concerns about users having to store their
+Curve25519 recovery private key in a secure manner. Casual users are likely to
+be scared away by having to file away a relatively long (e.g. 10 word)
+generated recovery key.
+
+We would like to give the user the option to access their key backup using a
+passphrase in addition to their recovery key. We can take inspiration from
+Apple's [FileVault 2](https://hal.inria.fr/hal-01460615/document) where Apple
+store encrypted copies of your FileVault AES key on your hard disk, encrypted
+by your UNIX account password, or a passphrased SSH private key on a server for
+convenience.
+
+## Proposed solution
+
+Three solutions are given here (two of which are viable, one included for
+completeness), varying in the implications of the user changing their
+passphrase.
+
+Option 1 has been chosen, on the basis that we do not require the user to
+be able to change their passphrase without also changing their recovery key.
+
+### Recovery Key
+
+In all options below, the process for generating a recovery key from a byte
+string, b, is as follows:
+ * Prepend the two bytes 0x8B, 0x01 to the byte string b
+ * Compute a parity byte by XORing all bytes of the resulting string (i.e.
+   prefix + `byte string`)
+ * Append the parity byte to the prefix + b
+ * base58 encode the resulting byte string with alphabet
+   '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'.
+ * Format the resulting ASCII string into groups of 4 characters separated by
+   spaces.
+
+### Option 1
+
+The user provides a passphrase, P. The client generates the backup encryption
+private key, K-1, by running PBKDF on this passphrase. The PBKDF
+parameters (salt and iteration count) are stored in the auth_data of the key
+backup under the 'private_key_salt' and 'private_key_iterations' keys,
+respectively:
+
+```json
+{
+    [...]
+    "private_key_salt": "MmMsAlty",
+    "private_key_iterations": 100000
+}
+```
+
+The backup public encryption key, K, is determined by running the curve25519
+function on K-1 with basepoint {9}. The recovery key is then
+generated by encoding K-1 as above.
+
+To change the passphrase, a client creates a completely new backup version,
+performing the steps above with the new passphrase. The client then re-encrypts
+all session keys and uploads them to the new backup. The user will always get
+a new recovery key whenever they change their passphrase.
+
+In this option, the recovery key is generated directly from the passphrase
+using PBKDF. This means the ciphertext of the backed up keys is more vulnerable
+to dictionary attacks. Option 2b attempts to offer a mitigation against this.
+
+### Option 2a
+
+The backup encryption private key, K-1, is generated by a secure
+random number generator. A private key, K-1p, is generated
+by running PBKDF on the passphrase. K-1p' is generated by
+XORing K-1 with K-1p.
+K-1p' is stored on the server along with the key backup in the
+`private_key` object above. The recovery key is generated by encoding
+K-1 as above.
+
+To change the passphrase, the client generates the new
+K-1p from the new passphrase then computes a new
+K-1p'. It then updates the backup information with this
+new K-1p'.
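The Option 2a scheme described so far can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the proposal: the text only says "PBKDF", so PBKDF2-HMAC-SHA-512 with a 32-byte output is assumed here, and all function names, the example passphrases, and the 32-byte key length are hypothetical choices made for the example. The salt and iteration count are taken from the JSON example in Option 1.

```python
import hashlib
import secrets

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Base58-encode data using the alphabet given in the Recovery Key section."""
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = BASE58_ALPHABET[rem] + encoded
    # By the usual base58 convention, each leading zero byte maps to the
    # first alphabet character (never triggered here, since the prefix is 0x8B).
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return BASE58_ALPHABET[0] * leading_zeros + encoded


def encode_recovery_key(b: bytes) -> str:
    """Prefix 0x8B 0x01, append the XOR parity byte, base58-encode, group by 4."""
    payload = bytes([0x8B, 0x01]) + b
    parity = 0
    for byte in payload:
        parity ^= byte
    encoded = base58_encode(payload + bytes([parity]))
    return " ".join(encoded[i:i + 4] for i in range(0, len(encoded), 4))


def derive_passphrase_key(passphrase: str, salt: str, iterations: int) -> bytes:
    """Derive K-1p. ASSUMPTION: PBKDF2-HMAC-SHA-512, 32-byte output."""
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt.encode("utf-8"),
        iterations, dklen=32,
    )


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# K-1: random backup private key (32 bytes assumed).
k_priv = secrets.token_bytes(32)
recovery_key = encode_recovery_key(k_priv)

# K-1p and K-1p' for a hypothetical passphrase.
k_p = derive_passphrase_key("example passphrase", "MmMsAlty", 100000)
k_p_prime = xor_bytes(k_priv, k_p)  # this value would be stored on the server

# Recovery from the passphrase: XOR the stored K-1p' with a re-derived K-1p.
assert xor_bytes(k_p_prime, k_p) == k_priv

# Passphrase change: K-1 is unchanged, only K-1p' is replaced.
new_k_p = derive_passphrase_key("new example passphrase", "MmMsAlty", 100000)
new_k_p_prime = xor_bytes(k_priv, new_k_p)
assert xor_bytes(new_k_p_prime, new_k_p) == k_priv
```

Because K-1 stays the same across passphrase changes in this sketch, every stored K-1p' value unwraps to the same private key when combined with the passphrase it was created from.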
+
+This would require the API to support updating the metadata stored with a
+backup (or the key parameters to be stored elsewhere, e.g. in account data).
+
+This option, however, allows the server to obtain K-1 by obtaining
+any one of the user's previous passphrases, assuming it keeps copies of the
+previous versions of the key parameters. This option is therefore not viable,
+but included for completeness.
+
+### Option 2b
+
+A variant on option 2a is to regenerate K-1 when the passphrase is
+changed, meaning the recovery key does change when the passphrase is changed,
+making it identical feature-wise to option 1 and without the problem of any
+previous passphrase being sufficient to obtain K-1. It differs,
+however, in that K-1 is generated randomly and therefore not
+vulnerable to dictionary attacks. However, K-1p is still
+vulnerable to dictionary attacks and is stored in the same place with the same
+protection, and, if compromised, gives access to K-1. This option
+therefore offers no significant security benefit over option 1.
+
+### Option 3
+
+The backup encryption private key, K-1, and a private,
+passphrase-derived key, K-1p, are generated as above. The
+passphrase key counterpart, K-1p', is also generated as
+above from K-1 XOR K-1p. Another private
+key, K-1r, is also generated by a secure random number
+generator and encoded to give the recovery key as above.
+K-1r' is generated by XORing K-1r
+with K-1. Both K-1p' and
+K-1r' are stored in the `private_key` in the backup under
+keys `passphrase_counterpart` and `recovery_key_counterpart` respectively.
+
+To change the passphrase, the client starts a new backup version as in option 1
+(generating a new K-1), but additionally computes a new
+K-1r' by XORing K-1r with the new
+K-1. This refreshes all keys, but allows the user to keep the same
+recovery key for their backup, on the assumption that the recovery key itself
+has not been compromised.
+If it has, the client generates a new backup with a
+completely fresh recovery key instead.
+
+## Security considerations
+
+The proposal above is vulnerable to a malicious server admin performing a
+dictionary attack against the encrypted passphrases stored on their server to
+access history. (It's worth bearing in mind that the server admin can also
+always hijack their users' accounts; the thing stopping them from
+impersonating their users is E2E device verification.)
+
+## Possible extensions
+
+In future, we could consider supporting authenticating users for login based on
+their encrypted passphrase, meaning that users only have to remember one
+password for their Matrix account rather than a login password and a
+history-access passphrase. However, this of course exposes the user's whole
+E2E history to the risk of dictionary attacks by public attackers (i.e. not
+just server admins), keysniffer-at-login attacks or clients which are lazy
+about storing account passwords securely. There's also a risk that because
+login passwords are much more commonly entered than history passwords, users
+might be encouraged to choose a weaker password. It's unclear whether this
+reduction in security-in-depth is worth the UX benefits of a single master
+password, so we suggest checking how this proposal goes first (given in general
+we expect key recovery to happen by cross-verifying devices at login rather
+than by entering a recovery key or passphrase).
+
+## See also:
+
+Notes from discussing this IRL are at
+https://docs.google.com/document/d/11fF1rbX5eTkrfxXRS8UhpW5sBENOCydYlLWzB8X1IuU/edit