Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3270: Symmetric megolm backup #3270

Closed
wants to merge 9 commits into from
Closed
135 changes: 135 additions & 0 deletions proposals/3270-symmetric-megolm-backup.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,135 @@
# MSC3270: Symmetric megolm backup
The current megolm backup uses asymmetric encryption. This was chosen so that
clients without the private key can still add their own megolm sessions to the
backup. This, however, allows a homeserver admin to inject their own malicious
megolm session into someone’s backup and then send an encrypted message as a user
that they wish to impersonate. Due to this, some clients such as Element [warn the
user](https://github.com/vector-im/element-web/issues/14323#issuecomment-740855963)
that a message cannot be authenticated when the megolm session for that
message was obtained from backup.

Using symmetric encryption for megolm backup would fix this attack vector,
since keys added by untrusted devices would be undecryptable, thus allowing keys
obtained from backup to be trusted. Additionally, many clients cache the
megolm private key anyway, making the original reason for choosing asymmetric
encryption obsolete.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation details of some clients don't make "the original factor for choosing asymmetric
encryption obsolete" at the spec level. As long as a user doesn't use such a client, this doesn't affect them at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thought behind this paragraph was probably that some x-signing keys typically need to be cached anyways so it is trivial to toss the megolm key into said cache. online megolm backup outdates x-signing.


**Credits:** This proposal was originally written by @sorunome.

## Proposal
This proposal introduces a new megolm backup algorithm, `m.megolm_backup.v1.aes-hmac-sha2`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stays as v1, because it uses the same API, only the algorithm is different? (I would have expected a v2, but it doesn't really seem necessary, just surprising!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My rationale is that it's v1 of the aes-hmac-sha2 algorithm. We probably should have put the version number after the name originally (so that we would end up with m.megolm_backup.aes-hmac-sha2.v1), but we did it this way to begin with.

I'm open to being persuaded that it should be different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see a reason for v2, apart from being able to distinguish the backup format the key name. (See other thread)

Copy link
Contributor

@poljar poljar Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a couple of reasons:

  1. Newcomers can quickly know how many versions there were in total.
  2. It's not ambiguous which backup algorithm I mean if I just say the version number N.
  3. Since it's unlikely that the algorithm won't change when the version number changes we're going to have only v1 algorithms, which renders the version number pointless.
  4. Since the key itself didn't change but only the way we use it has, we can have a RecoveryKey struct that now has a decrypt_v1() and decrypt_v2() method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Soru commented:

Fair point, calling it v2 will be trivial

(It's too bad GitHub doesn't automatically put email replies in the right thread.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that it's that important that people can easily tell how many versions there are. Especially since the old algorithm will eventually be deprecated and (hopefully eventually) will be able to be ignored. So in practice we'll end up with just one version in use.

Also, I don't think that we would refer to backup algorithms by number -- I think that it would be more common to refer to them by name, or as "symmetric" versus "asymmetric". And I think it would be confusing to have two different ways (number and name) to distinguish the two algorithms embedded in the same string.

If people don't like the m.megolm_backup.v1.aes-hmac-sha2, my current first preference would be switching to m.megolm_backup.aes-hmac-sha2.v1 (so that it's clearer that we're talking about version 1 of the aes-hmac-sha2 method, rather than version 1 of megolm backup) and my second preference would be to drop the number altogether (m.megolm_backup.aes-hmac-sha2) so that we avoid the confusion altogether.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it would be more common to refer to them by name, or as "symmetric" versus "asymmetric".

I agree with this and I think it does have merit to name things in a way that makes it easy to refer to them, both from code (in the form of method names) and in casual conversation.

So I'm wondering if it's even important to include the crypto construct name in the version at all? It's clunky to spell out so it's unlikely to ever get used in conversation, and then you have to mentally translate between symmetric and aes-hmac-sha2. Would it perhaps make more sense to name it m.megolm_backup.symmetric.v1?

Barring that, I would prefer moving the version number to the end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally try to include the algorithm names, as explained in https://spec.matrix.org/unstable/client-server-api/#messaging-algorithm-names

The backup method `m.megolm_backup.v1.curve25519-aes-sha2`
is deprecated.

### `session_data`
The session data of the megolm backup is an object containing the `iv` property
and the `ciphertext` property. Below is described how to generate these from
the megolm backup key. The data to be encoded is the same JSON-encoded data as
in `m.megolm_backup.v1.curve25519-aes-sha2`.

As such, a complete `KeyBackupData` object could look as follows:

```json
{
"first_message_index": 0,
"forwarded_count": 0,
"is_verified": true,
"session_data": {
"iv": "cL/0MJZaiEd3fNU+I9oJrw",
"ciphertext": "WL73Pzdk5wZdaaSpaeRH0uZYKcxkuV8IS6Qa2FEfA1+vMeRLuHcWlXbMX0w"
}
}
```

### Encryption
Room keys are encrypted using AES-GCM using the following process:

1. Encode the session key to be backed up as a JSON object with the same
properties as with `m.megolm_backup.v1.curve25519-aes-sha2`, with the
addition of an optional property `untrusted`, which is a boolean indicating
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I discovered when writing matrix-org/matrix-spec#1294, the spec uses the term "trusted" to mean different things is different contexts, so we may want to consider using a different term. Suggestions welcome.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we maybe store what the session was signed by?

whether the Megolm session should be considered as untrusted, for example
because it came from an untrusted source. If the `untrusted` property is
absent, the key should be considered as trusted.
2. Given the megolm backup key, generate 32 bytes by performing an HKDF with
SHA-256 as the hash, a salt of 32 bytes of 0, and with `<session ID>|<backup
version>` as the info. This is the AES encryption key.
3. Generate 16 random bytes, and use this as the AES initialization
vector. This becomes the `iv` property, encoded using unpadded base64.
4. Stringify the JSON object and encrypt it using AES-GCM-256 using the AES key
and initialization vector generated above. This encrypted data, encoded
using unpadded base64, becomes the `ciphertext` property.

### `auth_data`
Similar to symmetric SSSS, the `auth_data` object in the `versions` reply
contains information to verify that you indeed have the correct private
key. For that, you generate a random initialization vector and encrypt the
empty string using the method above, and put the `iv` and the `ciphertext` into
the `auth_data` object. For session ID and backup version, an empty string is
used. As such, a reply to the `versions` endpoint could look as follows:

```json
{
"algorithm": "m.megolm_backup.v1.aes-hmac-sha2",
"auth_data": {
"iv": "cL/0MJZaiEd3fNU+I9oJrw",
"ciphertext": "+xozp909S6oDX8KRV8D8ZFVRyh7eEYQpPP76f+DOsnw"
},
"count": 42,
"etag": "meow",
"version": "foxies"
}
```

### Secret key storage
The secret key for symmetric megolm is stored in SSSS base64-encoded with the
name `m.megolm_backup.v1.<backup version>`.

### Transitioning to symmetric backup

Clients that implement this backup method should consider older clients,
which will not be able to use backups created using this method. For example,
clients can initially be updated so that they will be able to use backups
created using this method, but not yet create new backups using this method.
Some time later, once the client author deems that a sufficient number of
clients have been updated to use the new backup method, the client can be
modified such that new backups are created using this method.

## Potential issues
Many users already have an asymmetric megolm backup and rely on it. In order to
not lose any megolm keys, clients would have to implement a migration to
symmetric megolm backup. When doing so, the existing keys should be marked as
untrusted.

## Alternatives
An earlier version of this MSC used the same encryption as is currently used in
[symmetric
SSSS](https://github.com/matrix-org/matrix-spec-proposals/pull/2472). However,
it was
[later](https://matrix.org/blog/2022/09/28/upgrade-now-to-address-encryption-vulns-in-matrix-sdks-and-clients)
noted that this method is not "IND-CCA2 secure". As a result, it now uses
AES-GCM-256 instead.

The authenticity of keys could be established in backups using the
`m.megolm_backup.v1.curve25519-aes-sha2` algorithm by adding a signature or a
MAC. However, this would require managing another key for the key backup. It
would be easier for clients to only need to manage one key.

## Security considerations
The proposed method requires clients to cache the encryption key for the key
backup. This means that an attacker who compromises a client that uses the key
backup will have access to all the key stored in the backup. Since most
clients already cache the decryption key for current backups, this is not a
change from current practice.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't disagree with this, I don't think that current implementation practices should legitimize increasing the impact of a client being compromised at the spec level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current (unstable) spec with asymmetric megolm backup introduces a security issue (not because of the specific asymmetric implementation, but because of using asymmetric cryptography in a place where symmetric one is more appropriate), which lead to this MSC, as outlined in the first paragraph.

So, we are closing one bigger security issue with this. Clients who don't want to cache the key could always have the user manually input it for a one-shot sync or similar.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the risks here depend on your threat model. if you trust your server admin not to inject malicious history (but you want the legit history to be as secure as possible), then using a client with assymetric crypto which doesn’t cache the megolm key may be most appropriate. The use case here might be a govt server or something where you assume the server is generally trusted but you still don’t want your messages to be exposed if the server was compromised - and are happy for msgs whose keys came from msg backup to say “untrusted”.

Conversely, if you are a random user on a relatively untrusted server (eg using a public hosting solution), you might be very worried about the server sending you faked history when you log in on a new device (eg some malware attachment which appears to come from a verified user but was actually inserted by the server admin). In which case risks of an endpoint attack stealing your symmetric backup key might be less than the risk of a malicious server admin attack, making this MSC worthwhile.

I feel pretty uncomfortable that after all the effort the double ratchet takes to mitigate the risk of key exfiltration by having a self healing ratchet, we end up a) storing all our megolm keys locally anyway, b) storing our megolm backup key locally too. That said, given the megolm keys are stored locally anyway… perhaps storing the backup key is okay. (At which point, why are we even using the double ratchet in the first place?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are storing the backup key and all sessions you know about locally, I think symmetric backup is fine. If you don't want past messages to get compromised, wouldn't you rather just not use any backup at all and as soon as you log out, all your history is burned and lost? That gives you the best forward secrecy.

Asymmetric backup is somewhere in between, where only the device with the decryption key can decrypt the past messages, while other devices can only upload keys into the backup. But that backup can still get compromised, you could just theoretically require the client to always prompt for a password to decrypt past messages. That is however weakened by other clients readily sharing the backup key, if a verified device asks for it. So really, I think the best approach here is to disable the online key backup and rely on key gossiping instead to access past messages. Because then at least you always have the control of burning old decryption keys. Once you upload a backup, that is always there and once someone gets access to the key, those messages are all forever compromised.


To migrate an existing backup to the new method, clients will need the SSSS key
to read the existing backup and to store the key for the new backup. When the
user enters the SSSS key, the client will have access to all of the other
secrets stored in SSSS. In general, most users already trust their clients
with their secrets, or could select a trusted client to perform the migration.

## Unstable prefix
While this feature is in development, implementations should use
`org.matrix.msc3270.v2.aes-hmac-sha2` as the backup algorithm.

`org.matrix.msc3270.v1.aes-hmac-sha2` was used in a previous version that used
AES-CTR-256 for encryption.