[WIP] MSC1756: cross-signing devices using a master identity key #1756

Open
wants to merge 8 commits into
base: master
from

8 participants
@uhoreg
Member

uhoreg commented Dec 14, 2018

uhoreg added some commits Dec 14, 2018

@uhoreg uhoreg added the proposal label Dec 14, 2018

@uhoreg uhoreg changed the title from "[WIP] MSC: cross-signing devices using a master identity key" to "[WIP] MSC1756: cross-signing devices using a master identity key" Dec 15, 2018

@erikjohnston

Member

erikjohnston commented Dec 16, 2018

Some observations from a security perspective:

  1. The power to attest new devices resides in a single place, the master key. In MSC 1680 all devices have this power.
  2. Having a separate key for attestations means that a user has flexibility over how to store the master key, and the trade-off they wish to make between security and convenience. For example, they can choose to have the master key on every device, or instead store the master key offline and only use it when signing new devices.
  3. If only the server has the power to distribute new attestations across the network, then an attacker needs both the master key and the ability to pass UIA (i.e. have the account's password). This reduces the consequences of the master key being compromised, as the attacker would not be able to create new devices (or rotate the master key, etc.) without also compromising the user account.
  4. Even if a user decides to distribute the master key across all devices, we should probably strongly recommend that the master key is stored in an encrypted manner. While an attacker would be able to eventually crack the password, it would give time for users to rotate the master key if they became aware their key was compromised.
  5. In this model there is a separation between a single device being compromised and the identity being compromised. For the former a compromised device's attestation can simply be revoked, while for the latter the entire master key needs to be revoked, meaning other users need to verify a new master key for the compromised user. In MSC 1680 these two cases are muddled together due to the graph nature of the attestations, making it harder for a user to figure out the correct response to a device being compromised.
@erikjohnston

Member

erikjohnston commented Dec 16, 2018

I wonder if the POST /_matrix/client/r0/keys/query API can be simply changed to:

{
  "failures": {},
  "device_keys": {
    "@alice:example.com": {
      "JLAFKJWSCS": {
        ...,
        "unsigned": {
          "device_display_name": "Alice's mobile phone",
          "attestations": {
            "base64+encoded+public+key": "base64+encoded+signature"
          }
        }
      }
    }
  },
  "verified_master_keys": {
    "@user:example.com": {
      "key": "base64+encoded+public+key",
      "signature": "base64+encoded+signature"
    }
  }
}

Where:

  • "attestations" is a map from master key to the signature of the device's public key signed by the master key
  • "verified_master_keys" is a map of users' master keys signed by our master key

In particular, I'm leaning towards simply signing the actual keys, rather than signed JSON of a bunch of stuff. This is because a) it's the keys we care about, and b) signed JSON is a bit of a PITA and I'd prefer it if we avoid it where possible.
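
A minimal sketch of how a client might read the response shape proposed above. The function name, the `verify` callable, and the idea of passing the whole device object to it are assumptions made for illustration; none of them come from the proposal, and a real client would perform an actual signature check where `verify` is called.

```python
from typing import Callable, List

# Hypothetical signature check: (master_key, signature, device_object) -> bool.
# A real client would verify the signature cryptographically here.
Verify = Callable[[str, str, dict], bool]


def attested_devices(response: dict, user_id: str, master_key: str,
                     verify: Verify) -> List[str]:
    """Return the device IDs of `user_id` that carry a valid attestation
    from `master_key` in a keys/query-style response."""
    devices = response.get("device_keys", {}).get(user_id, {})
    trusted = []
    for device_id, device in devices.items():
        # "attestations" maps a master key to its signature of the device key.
        attestations = device.get("unsigned", {}).get("attestations", {})
        signature = attestations.get(master_key)
        if signature is not None and verify(master_key, signature, device):
            trusted.append(device_id)
    return sorted(trusted)
```

Devices with no attestation from the given master key, or with one that fails the check, are simply left out of the result.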

@ara4n

Member

ara4n commented Dec 16, 2018

At a high level this is looking good to me - thanks for writing it up.

My main concerns are:

  • How does this interact with incremental key backups? It feels like we're solving a very similar problem: creating a single keypair which is used to encrypt a backup, storing it encrypted on the server behind a passphrase, and then loading it & unencrypting it on clients on demand. Should it be the same key so we don't double the complexity (which is already pretty bad for the incremental keybackup stuff?) and to keep a simpler security model? i.e. "This is the one true key which allows the owner to impersonate me by creating new devices... and also access all my online history. So keep it safe."
  • The proposal of using an m.master algorithm to identify the master key feels very weird & hacky. I think this would go away if the master key isn't a device but instead the same as your online backups key? The disadvantage being that we drift further from the shape of the current API.
@erikjohnston

Member

erikjohnston commented Dec 16, 2018

Agree with @ara4n, and thanks again for writing it up! 👍

  • How does this interact with incremental key backups? It feels like we're solving a very similar problem: creating a single keypair which is used to encrypt a backup, storing it encrypted on the server behind a passphrase, and then loading it & unencrypting it on clients on demand. Should it be the same key so we don't double the complexity (which is already pretty bad for the incremental keybackup stuff?) and to keep a simpler security model? i.e. "This is the one true key which allows the owner to impersonate me by creating new devices... and also access all my online history. So keep it safe."

Yup. Though we'll probably still want a separate backup that is instead simply encrypted by the master key when stored on the server, as:

  1. We want to be able to rotate the master key, and it doesn't seem feasible to re-encrypt all the backups.
  2. The backup key is stored unencrypted on devices, whereas we'd want the master key to be encrypted, etc
@ara4n

Member

ara4n commented Dec 17, 2018

Though we'll probably still want a separate backup that is instead simply encrypted by the master key when stored on the server

I think this is how the incremental keybackup stuff works already? (as per #1703 and its predecessors)

@turt2live turt2live added the T-Core label Dec 17, 2018

@uhoreg

Member

uhoreg commented Dec 18, 2018

@ara4n, @erikjohnston: I've updated the proposal with the alternative API. I'll do some more thinking about how to integrate with the key backup stuff.

when logging in or registering; if a client tries to log in using this device
ID, then the server must respond with an error. (FIXME: what error?)

Uploading a new master key should invalidate any previous master key.

@erikjohnston

erikjohnston Dec 18, 2018

Member

I think we need to make this a bit less of a footgun. It'd be very annoying if a random client could upload a new master key and blow away your identity. I guess that would look something like having upload_new, invalidate and rotate APIs, where you need to explicitly call invalidate before you can upload_new a new master key.

@ara4n

ara4n Dec 18, 2018

Member

Making developers jump through an invalidate before they can upload_new feels like theatre to me. Or is the idea that invalidate requires UI Auth or something? In which case, why not do UI Auth on upload instead?

@erikjohnston

erikjohnston Dec 18, 2018

Member

All of this should require UIA to protect against a stolen device.

But basically, overwriting a master key will break all attestations and so will require all remote users to verify the new master key, which is something you really don't want to happen accidentally or due to a bug. The only time this should be necessary is when the user has lost their old master key and is going through a full-blown recovery process, which a client should very much be handling differently than simply uploading a master key for the first time (i.e. big scary red flashing warnings requiring a blood oath that you have actually lost your master key). I.e., upload_new and rotate should be safe and fine for clients to do, invalidate should be scary and incredibly rare.

I'm mainly worried about new clients blindly overwriting master keys due to some bug/didn't understand the spec/haven't implemented checking for existing keys, etc, and the carnage that could potentially cause.

@erikjohnston

erikjohnston Dec 18, 2018

Member

To be clear: I think developers should happily ignore the existence of invalidate, safe in the knowledge that upload_new is absolutely 100% safe to call at any point, which will support pretty much all use cases (including registration). If they later want to implement full-blown key recovery, they can do so fairly easily.

Really, I see this as making life easier for developers, rather than having to put scary warnings on uploading new master keys.

@uhoreg

uhoreg Jan 11, 2019

Member

I'm not too fond of having a separate invalidate step followed by an upload_new, but I understand the need to prevent users from accidentally destroying their identity. I'm thinking of having the upload endpoint handling three different cases:

  • uploading just a plain key with no other information - will fail if there is an existing key
  • uploading a key signed by the previous key - replaces the previous key (assuming the signature is fine). (The signature is also included when others query the key, so that people who previously verified the old key can automatically verify the new key.)
  • uploading a key with an extra flag saying "yes, I know that there's an existing key with ID 'foo'" - replaces the previous key

@erikjohnston what do you think?
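
The three cases above could be sketched as a single handler along these lines. This is only an illustration under stated assumptions: the class names, field names (`signed_by_previous`, `replaces_key_id`), the `KeyConflict` error, and the `verify` callable are all hypothetical and not taken from the proposal.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class KeyConflict(Exception):
    """Raised when an upload would silently overwrite an existing key."""


@dataclass
class KeyUpload:
    key_id: str
    public_key: str
    # Signature of the new key made with the previous key, if any.
    signed_by_previous: Optional[str] = None
    # Explicit acknowledgement of the existing key being discarded, if any.
    replaces_key_id: Optional[str] = None


@dataclass
class StoredKey:
    key_id: str
    public_key: str
    signed_by_previous: Optional[str] = None


def handle_upload(current: Optional[StoredKey], upload: KeyUpload,
                  verify: Callable[[str, str, str], bool]) -> StoredKey:
    """Apply the three cases. `verify(old_public_key, signature, new_public_key)`
    is a stand-in for a real signature check."""
    new = StoredKey(upload.key_id, upload.public_key, upload.signed_by_previous)
    if current is None:
        return new  # first upload: nothing to overwrite
    if upload.signed_by_previous is not None:
        if verify(current.public_key, upload.signed_by_previous, upload.public_key):
            return new  # signed by the previous key: earlier verifications carry over
        raise KeyConflict("signature by the previous key is invalid")
    if upload.replaces_key_id == current.key_id:
        return new  # explicit replacement: other users must re-verify
    raise KeyConflict("a key already exists: " + current.key_id)
```

A plain upload with no extra information fails whenever a key already exists, so a client that has not implemented the other two cases cannot overwrite an identity by accident.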

@erikjohnston

erikjohnston Jan 13, 2019

Member

That sounds like the three APIs hidden behind the single API tbh. I don't really mind, but given both the client and servers have to handle each case separately I don't really see the big deal with having them as separate API endpoints.

@uhoreg

uhoreg Jan 15, 2019

Member

Fair enough. I kind of like having it in one endpoint because all the necessary information is already encoded in the request. I'll think about it and see whether I feel the same way after trying to implement it. ;) The bigger question I think, though, is whether you still want an invalidate followed by an upload_new, or if you'd be fine with just a replace request.

@erikjohnston

erikjohnston Jan 16, 2019

Member

Having a replace is absolutely fine. Though let's give it a scarier name.

Fair enough. I kind of like having it in one endpoint because all the necessary information is already encoded in the request.

FWIW I'm not a fan of decoding intent from what happens to be in a request. It's easy for bugs or bad devs to accidentally add/remove the wrong fields and then suddenly the API does something completely different rather than returning an error (though your proposal doesn't suffer from that so much).

Also, from a documentation perspective, I think it's easier to say "This API uploads a new key and is always safe. This API rotates the key and is safe. This one nukes your account, use extreme caution." And then you can easily document the failure modes etc. of each in a nice clean way. With the swagger stuff we use, it's easy to say "all these APIs share this common structure" too.

@erikjohnston

Member

erikjohnston commented Dec 18, 2018

Looking good! FTR, I much prefer proposal 2 where we differentiate between devices and the master key, as really the only thing devices and master keys have in common is that they share a key ID namespace.

@ara4n

Member

ara4n commented Dec 21, 2018

@richvdh is there any chance you could give this a quick sanity check, so we can avoid another situation where we get valuable but last-minute feedback after the impl has already happened? O:-)

@richvdh
Member

richvdh left a comment

Looks broadly good to me. My main concern is how we get hold of the private master key so that we can sign other users' keys, without the user having to type in their recovery password every 30 seconds.

"key": "base64+public+key",
"signatures": {
"@alice:example.com": {
"ed25519:ABCDEFG": "base64+self+signature"

@richvdh

richvdh Dec 27, 2018

Member

what is the advantage of having the master key be self-signed?

@uhoreg

uhoreg Dec 27, 2018

Member

There isn't really any advantage. This could be taken out.

@caev

caev commented Jan 1, 2019

As I understand it, the user must choose, (a) "store master key on device" vs. (b) "store wrapped master key on server."

(b) is a non-starter because most users won't participate in the feature. The overall feature only benefits a user's friends, not themselves, so they will not go to any effort to learn about it, nor ever achieve full understanding of the problem you are solving even while they are annoyed by the problem.

In ~all cases of a new login, the proposed feature should be used, so it should happen as part of the regular login flow similar to Google's "did you sign in from device X?" notifications on Android. At most using the feature can require a quick "[confirmed]" on an old device to add a new one. It can't require a tier 2 emergency password to do something that will happen exactly as often as the tier 1 ordinary password gets used; if optional it won't happen, and if mandatory the tiers are meaningless.

IMHO, (b) should not be implemented, optionally or otherwise. The alternate use-cases will be too confusing on our side, and we will tend to be overly generous in evaluating ourselves if the escape hatch exists, then be surprised when things don't typically go well in the wild.

(a) doesn't handle revocations well.

1. Alice has 3 devices in the normal state (master key on all of them, in accordance with (a)).
2. Alice loses 1 device.
3. Alice notices the device is gone and uses one of the remaining two devices to revoke it.
4. A hacker recovers the lost device and extracts key material from it.

For a protocol to have "revocation" in a meaningful way, (3) must mitigate (4). Yes, there are other scenarios to worry about, for example where Alice doesn't do revocation at all, or where (4) and (3) are inverted in time. But the perfect should not be allowed to be the enemy of the good. Baseline "meaningful" revocation is valuable because the gap between 2 and 4 is likely large, ex. a recycled hard drive.

In this proposal, (3) seems to have become basically meaningless because the master key can't be revoked at all, or can't be revoked without nuking Alice's trust graph anchored to the non-lost devices, which in practice will often mean worse security than ignoring the lost device because in many cases the lost device will never be recovered by an actual hacker (it's probably something that got wiped), and nuking accounts frequently has a cost of teaching people to accept unverified keys, which become the more realistic actual attack than trawling for lost devices.

In my opinion, a good revocation step would have this basic property:

  • A revoked device can't sign new devices with the master key. A master key can't be extracted from a revoked device and then used, somehow, to authenticate a new device, whether by OOB upload, injection of messages by spoofing the server, new-device-login followed by new-device-master-signing, mischievous backup-restore, or any other nefarious means.

and these advanced properties:

  • Revocation respects a quorum. For example, in the 3 & 4 inverted case where the hacker uses Alice's lost device's credentials before Alice notices and revokes them, the hacker will only be able to revoke one of Alice's two non-lost devices. Alice can then use the surviving device to revoke the hacker's device, then add back more devices so she can retain ultimate control of the account.

  • Actions taken by a device between when it was lost and when it was revoked can be undone. For example, at revocation time, the user declares the last time she's certain she had the device, either by timestamp, or by choosing a message she remembers sending from it. Then:
    • a lost device in the hands of a hacker can't add 10 devices, establish quorum, then delete Alice's remaining two devices. This implies there has to be a vesting period before a device contributes to quorum and has revocation privileges, and a staleness period for users who destroy their own quorum by spamming their accounts with fake web devices they no longer actually have, ex. by clearing cookies repeatedly and logging in again, or some such flail.
    • Any device signed by a lost device's master key after it was lost can be conveniently revoked along with the lost device (this is just UI sugar, suggestions of what other devices should be marked "lost" based on the last-message-I-remember-sending watermark, plus the "vesting" rules).
    • Any messages sent by a revoked device between when it was lost and when it was revoked can be retroactively marked untrusted in Alice's friends' histories.
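
As a rough illustration of the vesting and staleness idea only, the set of devices that count toward quorum could be computed like this. The `Device` fields, the function name, and the use of plain timestamps are assumptions made for the sketch; the comment itself leaves the actual quorum rules open and notes further down that timestamps may be forgeable.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    device_id: str
    added_at: int      # seconds since epoch when the device was added
    last_seen_at: int  # seconds since epoch of the device's last activity


def quorum_members(devices: List[Device], now: int,
                   vesting: int, staleness: int) -> List[str]:
    """Devices that count toward quorum: added long enough ago to have
    vested, and seen recently enough not to be stale."""
    return [
        d.device_id for d in devices
        if now - d.added_at >= vesting and now - d.last_seen_at <= staleness
    ]
```

With a vesting period of, say, a week, ten devices added a minute ago by whoever holds a lost device would not yet contribute to quorum.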

I don't know the best way to hit these revocation goals, especially the stuff around quorum.

One degenerate implementation that falls short, but is an improvement:

  • Two levels of master key like GnuPG. The true master key (L1) can be stored on a paper wallet and is required for "survivable" revocation. The regular master key (L2) is stored on all devices.
    • without the paper key: non-"survivable" revocation. Any device may revoke the entire identity. If that has not happened yet, any device may add another device. Revoking a single device without the paper key is not possible; it's all or nothing.
    • with the paper key: it's possible to create an "L2-master-key rotation blob."
      • L1 signature of a fresh L2 key
      • private half of L2 key, wrapped to the device key of every device that is NOT revoked
      • high-water mark for each device that IS revoked, saying when it was lost, "messages older than this are still ok".
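
The contents of the rotation blob listed above might be modelled as follows. This is a sketch under assumptions: the class, field, and function names are invented for illustration, and `sign_with_l1` and `wrap` are stand-ins for real signing and key-wrapping operations.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class RotationBlob:
    new_l2_public: str
    l1_signature: str                    # L1 signature of the fresh L2 key
    wrapped_l2_private: Dict[str, str]   # device ID -> L2 private key wrapped to that device
    high_water_marks: Dict[str, str]     # revoked device ID -> last marker still trusted


def build_rotation_blob(new_l2_public: str, new_l2_private: str,
                        device_keys: Dict[str, str], revoked: Dict[str, str],
                        sign_with_l1: Callable[[str], str],
                        wrap: Callable[[str, str], str]) -> RotationBlob:
    """`device_keys` maps every known device ID to its device key;
    `revoked` maps each revoked device ID to its high-water mark."""
    wrapped = {
        device_id: wrap(device_key, new_l2_private)
        for device_id, device_key in device_keys.items()
        if device_id not in revoked  # revoked devices never receive the new L2 key
    }
    return RotationBlob(new_l2_public, sign_with_l1(new_l2_public),
                        wrapped, dict(revoked))
```

The point of the shape is that a revoked device appears only in `high_water_marks`, so it has no way to unwrap the new L2 private key.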

The workflow with the paper key is:

1. Alice has 3 devices in the normal state (master key on all of them, in accordance with (a)).
2. Alice loses 1 device.
3. Alice notices the device is gone and revokes her entire account.
4. A hacker recovers the lost device and extracts key material from it, but it's useless because Alice revoked herself from matrix entirely.
5. Alice finds her paper key and recovers her account.

One thing to watch for is whether timestamps can be forged. It's probably better to express the watermark as a position in the crypto ratchet, not a timestamp, regardless of how the UI surfaces the feature, which is why I mention using a previously-sent message as a marker. This means device cross-signing needs to be put on the same ratchet somehow so any signatures made by the hacker can be revoked without the hacker evading that by pre-dating the signature, which might not be possible. : / Timestamp attacks vs ratchets may also apply to the quorum timers. : (

I think this degenerate scheme won't work well for users compared to the "quorum" rules because either users won't keep the paper key, or they will lose the paper key (which can't be itself rotated), but at least the degenerate scheme degrades gracefully enough to be strictly better than the existing proposal. I'm afraid I would make a mistake if I tried to implement the quorum rules.

@anoadragon453 anoadragon453 removed the T-Core label Jan 4, 2019

@uhoreg uhoreg referenced this pull request Jan 8, 2019

Open

Implement cross-signing proposal #4110

0 of 7 tasks complete
devices, and Bob has *m* devices, then for Alice to be able to communicate with
Bob on any of their devices, this involves *n×m* key verifications.

One way to addresss this is for each user to use a "master key" for their

@uhoreg

uhoreg Jan 11, 2019

Member

I'm thinking of changing the terminology, as "master key" will surely get confused with the user keys from MSC1228. I'm thinking maybe something like "device signing key", though that seems somewhat unsatisfactory. Any suggestions?

@erikjohnston

erikjohnston Jan 13, 2019

Member

Device signing key works. Though given this comment I wonder whether we should have a "master key" and have sub keys for various functions? So folks would still verify other users' master key, but they would use a separate signing key to sign the verification? I.e. there shouldn't be any keys that aren't signed by the master key?

@erikjohnston

Member

erikjohnston commented Jan 13, 2019

Thanks @caev for the thoughts, especially around how this would actually be used in the real world. It's taken me a while to digest it, but a few quick notes:

As I understand it, the user must choose, (a) "store master key on device" vs. (b) "store wrapped master key on server."

Technically there's also option c) store master key in a safe and only take it out on special occasions like adding a new device or recovering your account, etc. Though this is more for power user folks, so doesn't really change anything re your following points.

(b) is a non-starter because most users won't participate in the feature. The overall feature only benefits a user's friends, not themselves, so they will not go to any effort to learn about it, nor ever achieve full understanding of the problem you are solving even while they are annoyed by the problem.

This is a very interesting way of looking at it. Certainly I've been assuming that clients would suitably prompt users to Do The Right Thing, e.g. prompting users to save their master key offline and/or upload an encrypted version to the server, with suitable UX to push people into not just skipping those steps. However this raises two questions: 1) is this enough to get people to use it and 2) will client implementors get it right (assuming Riot gets it right so can be used for reference)?

Certainly it's a bit unfortunate that there hasn't been a greater discussion around likely real world UX in this MSC, though I know it's being considered elsewhere.

In ~all cases of a new login, the proposed feature should be used, so it should happen as part of the regular login flow similar to Google's "did you sign in from device X?" notifications on Android. At most using the feature can require a quick "[confirmed]" on an old device to add a new one. It can't require a tier 2 emergency password to do something that will happen exactly as often as the tier 1 ordinary password gets used; if optional it won't happen, and if mandatory the tiers are meaningless.

Yup, I believe the idea is to do exactly this 👍. Again, it's unfortunate that the UX/UI proposals aren't linked to this MSC.

IMHO, (b) should not be implemented, optionally or otherwise. The alternate use-cases will be too confusing on our side, and we will tend to be overly generous in evaluating ourselves if the escape hatch exists, then be surprised when things don't typically go well in the wild.

Hopefully it will get used in the wild, as above.

(Revocation...)

I think the ideas in this comment and elsewhere help with a lot of those concerns, though they aren't as fully featured as your excellent suggestions. Basically, if you can pass user interactive auth (UIA) you can always revoke the master key and either a) blow away your master key and start again or b) if you have the old master key you can rotate to a new one.

I'm not so sure about the quorum proposal, as it sounds easy to game by just adding enough new devices.

@caev

caev commented Jan 13, 2019

@richvdh richvdh self-requested a review Jan 14, 2019

@caev

caev commented Jan 15, 2019

@uhoreg

Member

uhoreg commented Jan 15, 2019

IMO the old master key needs to sign the new key. There must be a chain of signatures going back to the original master key when the account was created, which we will always have to be ready to replay for the benefit of the user's offline friends. If this chain is broken, it's as if the matrix ID was abandoned and recycled for a different person.

There are two possible ways to replace a key: one is signing the new key with the old key, and the second way is without. If the new key is signed, then users who have already verified with you will be able to maintain trust. If the new key is not signed, then users will have to re-verify. We have to support the use case where the user doesn't have the master key because people lose passwords.
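
From the point of view of someone who verified an earlier key, the two cases could be checked roughly like this. The function name, the `Replacement` tuple shape, and the `verify` callable are hypothetical and only serve the illustration.

```python
from typing import Callable, List, Optional, Tuple

# One replacement step: (new_public_key, signature_by_previous_key or None).
Replacement = Tuple[str, Optional[str]]


def trust_carries_over(verified_key: str, replacements: List[Replacement],
                       verify: Callable[[str, str, str], bool]) -> bool:
    """Walk the replacements made since `verified_key` was verified.
    `verify(old_key, signature, new_key)` stands in for a real signature check."""
    current = verified_key
    for new_key, signature in replacements:
        if signature is None or not verify(current, signature, new_key):
            return False  # unsigned or badly signed step: the user must re-verify
        current = new_key
    return True
```

A single unsigned replacement anywhere in the sequence is enough to require re-verification, which matches the case where the old key was lost.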

@uhoreg

Member

uhoreg commented Jan 17, 2019

re: #1756 (comment)

In particular, I'm leaning towards simply signing the actual keys, rather than signed JSON of a bunch of stuff. This is because a) it's the keys we care about, and b) signed JSON is a bit of a PITA and I'd prefer it if we avoid it where possible.

The signature needs to be bound to the user's ID and device ID, otherwise someone could take a signature intended for one user and forge a device for a different user.
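
To illustrate the binding, here is a small sketch of a payload that includes the user ID and device ID alongside the key. The field names and the function are assumptions for the example, not the format used by the proposal.

```python
import json


def signing_payload(user_id: str, device_id: str, public_key: str) -> bytes:
    """Bytes to sign for a device attestation, bound to its owner and device ID."""
    body = {"user_id": user_id, "device_id": device_id, "key": public_key}
    # Sorted keys and compact separators give a deterministic encoding.
    return json.dumps(body, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False).encode("utf-8")


alice = signing_payload("@alice:example.com", "JLAFKJWSCS", "base64+device+key")
mallory = signing_payload("@mallory:example.com", "JLAFKJWSCS", "base64+device+key")
# Same key, different owner: the signed bytes differ, so a signature made over
# one payload is not a signature over the other.
assert alice != mallory
```

If only the raw key bytes were signed, both payloads would be identical and the same signature would be valid for either user.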

devices, and Bob has *m* devices, then for Alice to be able to communicate with
Bob on any of their devices, this involves *n×m* key verifications.

One way to address this is for each user to use a device signing key to signs

@dbkr

dbkr Jan 21, 2019

Member

sign
