Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3270: Symmetric megolm backup #3270

Closed
wants to merge 9 commits into from

Conversation

uhoreg
Copy link
Member

@uhoreg uhoreg commented Jul 7, 2021

@uhoreg uhoreg changed the title MSCxxxx: Symmetric megolm backup MSC3270: Symmetric megolm backup Jul 7, 2021
@uhoreg uhoreg added kind:maintenance MSC which clarifies/updates existing spec proposal A matrix spec change proposal proposal-in-review labels Jul 7, 2021
**Credits:** This proposal was originally written by @sorunome.

## Proposal
This proposal introduces a new megolm backup algorithm, `m.megolm_backup.v1.aes-hmac-sha2`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stays as v1, because it uses the same API, only the algorithm is different? (I would have expected a v2, but it doesn't really seem necessary, just surprising!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My rationale is that it's v1 of the aes-hmac-sha2 algorithm. We probably should have put the version number after the name originally (so that we would end up with m.megolm_backup.aes-hmac-sha2.v1), but we did it this way to begin with.

I'm open to being persuaded that it should be different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see a reason for v2, apart from being able to distinguish the backup format the key name. (See other thread)

Copy link
Contributor

@poljar poljar Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a couple of reasons:

  1. Newcomers can quickly know how many versions there were in total.
  2. It's not ambiguous which backup algorithm I mean if I just say the version number N.
  3. Since it's unlikely that the algorithm won't change when the version number changes we're going to have only v1 algorithms, which renders the version number pointless.
  4. Since the key itself didn't change but only the way we use it has, we can have a RecoveryKey struct that now has a decrypt_v1() and decrypt_v2() method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Soru commented:

Fair point, calling it v2 will be trivial

(It's too bad GitHub doesn't automatically put email replies in the right thread.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that it's that important that people can easily tell how many versions there are. Especially since the old algorithm will eventually be deprecated and (hopefully eventually) will be able to be ignored. So in practice we'll end up with just one version in use.

Also, I don't think that we would refer to backup algorithms by number -- I think that it would be more common to refer to them by name, or as "symmetric" versus "asymmetric". And I think it would be confusing to have two different ways (number and name) to distinguish the two algorithms embedded in the same string.

If people don't like the m.megolm_backup.v1.aes-hmac-sha2, my current first preference would be switching to m.megolm_backup.aes-hmac-sha2.v1 (so that it's clearer that we're talking about version 1 of the aes-hmac-sha2 method, rather than version 1 of megolm backup) and my second preference would be to drop the number altogether (m.megolm_backup.aes-hmac-sha2) so that we avoid the confusion altogether.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it would be more common to refer to them by name, or as "symmetric" versus "asymmetric".

I agree with this and I think it does have merit to name things in a way that makes it easy to refer to them, both from code (in the form of method names) and in casual conversation.

So I'm wondering if it's even important to include the crypto construct name in the version at all? It's clunky to spell out so it's unlikely to ever get used in conversation, and then you have to mentally translate between symmetric and aes-hmac-sha2. Would it perhaps make more sense to name it m.megolm_backup.symmetric.v1?

Barring that, I would prefer moving the version number to the end.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally try to include the algorithm names, as explained in https://spec.matrix.org/unstable/client-server-api/#messaging-algorithm-names

```

### Secret key storage
The secret key for symmetric megolm is stored in SSSS base64-encoded with the name `m.megolm_backup.v1`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the old encryption key be reused, that is already stored under this key? Or should a new one be generated? How does a client tell in that case, that this key is for the new or old backup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the same issue that we have when you create a new backup version -- you can't tell if the thing stored in SSSS is for the new backup or the old backup. The best we can say right now is that the key stored in SSSS should be for the current backup, which means that there is a race when you're migrating from one backup to the other.

We could fix this by saying that we should store it using the name m.megolm_backup.v1.<backup version>, but it's unclear if we should fix this as part of this MSC, or as a separate MSC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this special instance, we could also name the key v2, which would make it obvious, but we may want to reserve v2 for proper naming of the keys? Although I would prefer to only migrate all key backups once, in which case a new naming scheme may make sense being introduced in this as well.

@deepbluev7
Copy link
Contributor

Until https://github.com/uhoreg/matrix-js-sdk/tree/symmetric_backup is ready, I think you need to also add the need-implementation label.

Co-authored-by: Sorunome <mail@sorunome.de>
since keys added by untrusted devices be undecryptable, thus allowing keys
obtained from backup to be trusted. Additionally, many clients cache the
megolm private key anyways, making the original factor for choosing asymmetric
encryption obsolete.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation details of some clients don't make "the original factor for choosing asymmetric
encryption obsolete" at the spec level. As long as a user doesn't use such a client, this doesn't affect them at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thought behind this paragraph was probably that some x-signing keys typically need to be cached anyways so it is trivial to toss the megolm key into said cache. online megolm backup outdates x-signing.

backup. This means that an attacker who compromises a client that uses the key
backup will have access to all the key stored in the backup. Since most
clients already cache the decryption key for current backups, this is not a
change from current practice.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I don't disagree with this, I don't think that current implementation practices should legitimize increasing the impact of a client being compromised at the spec level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current (unstable) spec with asymmetric megolm backup introduces a security issue (not because of the specific asymmetric implementation, but because of using asymmetric cryptography in a place where symmetric one is more appropriate), which lead to this MSC, as outlined in the first paragraph.

So, we are closing one bigger security issue with this. Clients who don't want to cache the key could always have the user manually input it for a one-shot sync or similar.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the risks here depend on your threat model. if you trust your server admin not to inject malicious history (but you want the legit history to be as secure as possible), then using a client with assymetric crypto which doesn’t cache the megolm key may be most appropriate. The use case here might be a govt server or something where you assume the server is generally trusted but you still don’t want your messages to be exposed if the server was compromised - and are happy for msgs whose keys came from msg backup to say “untrusted”.

Conversely, if you are a random user on a relatively untrusted server (eg using a public hosting solution), you might be very worried about the server sending you faked history when you log in on a new device (eg some malware attachment which appears to come from a verified user but was actually inserted by the server admin). In which case risks of an endpoint attack stealing your symmetric backup key might be less than the risk of a malicious server admin attack, making this MSC worthwhile.

I feel pretty uncomfortable that after all the effort the double ratchet takes to mitigate the risk of key exfiltration by having a self healing ratchet, we end up a) storing all our megolm keys locally anyway, b) storing our megolm backup key locally too. That said, given the megolm keys are stored locally anyway… perhaps storing the backup key is okay. (At which point, why are we even using the double ratchet in the first place?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are storing the backup key and all sessions you know about locally, I think symmetric backup is fine. If you don't want past messages to get compromised, wouldn't you rather just not use any backup at all and as soon as you log out, all your history is burned and lost? That gives you the best forward secrecy.

Asymmetric backup is somewhere in between, where only the device with the decryption key can decrypt the past messages, while other devices can only upload keys into the backup. But that backup can still get compromised, you could just theoretically require the client to always prompt for a password to decrypt past messages. That is however weakened by other clients readily sharing the backup key, if a verified device asks for it. So really, I think the best approach here is to disable the online key backup and rely on key gossiping instead to access past messages. Because then at least you always have the control of burning old decryption keys. Once you upload a backup, that is always there and once someone gets access to the key, those messages are all forever compromised.

@turt2live turt2live added the needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. label Jul 18, 2021
Co-authored-by: Denis Kasak <dkasak@termina.org.uk>
@Sorunome

This comment has been minimized.

uhoreg and others added 2 commits November 16, 2021 15:02
Co-authored-by: Matthew Hodgson <matthew@arasphere.net>

1. Encode the session key to be backed up as a JSON object with the same
properties as with `m.megolm_backup.v1.curve25519-aes-sha2`, with the
addition of an optional property `untrusted`, which is a boolean indicating
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I discovered when writing matrix-org/matrix-spec#1294, the spec uses the term "trusted" to mean different things is different contexts, so we may want to consider using a different term. Suggestions welcome.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we maybe store what the session was signed by?

@richvdh
Copy link
Member

richvdh commented Mar 1, 2024

@uhoreg I gather we are planning to do something different to this? (#4048) Should this MSC be closed?

@uhoreg
Copy link
Member Author

uhoreg commented Mar 7, 2024

Closed in favour of #4048

@uhoreg uhoreg closed this Mar 7, 2024
@turt2live turt2live added the obsolete A proposal which has been overtaken by other proposals label Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:maintenance MSC which clarifies/updates existing spec needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. obsolete A proposal which has been overtaken by other proposals proposal A matrix spec change proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants