New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document missing parts of E2E #1284

Merged
merged 27 commits into from Aug 20, 2018

Conversation

@Zil0
Contributor

Zil0 commented Jun 7, 2018

This is a step towards fixing #501, by merging what has already be written long ago by @richvdh in https://github.com/matrix-org/matrix-doc/tree/drafts/e2e.
Since this branch has long diverged, I had to resort to copy-pasting...

Opening this because I think I will need feedback as I go. Please do not merge until I remove the WIP tag. I think it will need squashing at some point anyway.

Signed-off-by: Valentin Deniaud <valentin.deniaud@inpt.fr>

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from 2bd118a to 8f80e13 Jun 11, 2018

@Zil0

This comment has been minimized.

Contributor

Zil0 commented Jun 11, 2018

Some specific questions:

  • 9973445 is this correct? if yes, is it clear?
  • e83a955 does this need more detail? Is it the last thing that was missing from the olm section?

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from 7d75dd4 to 068967e Jul 18, 2018

@Zil0 Zil0 changed the title from [WIP][RFC] Document missing parts of E2E to Document missing parts of E2E Jul 18, 2018

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from 068967e to 0785f94 Jul 18, 2018

@@ -1,9 +1,3 @@
---
allOf:

This comment has been minimized.

@Zil0

Zil0 Jul 18, 2018

Contributor

I suppose this is not a good solution according to the comment below? But I can't really see why.

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from 0785f94 to e088261 Jul 18, 2018

}
The type and content of the plaintext message event are given in the payload.
We include the room ID in the payload, because otherwise the homeserver would
be able to change the room a message was sent in.
.. TODO: claimed_keys
Clients must confirm that the ``sender_key`` belongs to the user that sent the

This comment has been minimized.

@Zil0

Zil0 Jul 18, 2018

Contributor

I didn't want to change the beginning of the sentence, but I don't think it is very clear since it feels weird to say "clients must do something they maybe can't".
How about starting the sentence with "When a device is verified, clients must confirm..."?

This comment has been minimized.

@uhoreg

uhoreg Jul 25, 2018

Member

One way to deal with this would be to separate key verification from verifying that it matches result /keys/query. So you could say something like "Clients must confirm that the sender_key matches the keys returned by /keys/query for the given user. This can be done by ..."

This comment has been minimized.

@Zil0

Zil0 Jul 26, 2018

Contributor

How about:

Clients must confirm that the sender_key and the ed25519 field value under the keys property matches the keys returned by /keys/query for the given user, and verify the signature of the payload. Without this check, a client cannot be sure that the sender device owns the private part of the ed25519 key it claims to have in the Olm payload.
This is crucial when the ed25519 key corresponds to a verified device.

This comment has been minimized.

@uhoreg

uhoreg Jul 26, 2018

Member

I would change "and verify" to "and must also verify" since the first sentence is a bit long, and the "must also" will link it with the "Clients must" at the beginning of the sentence. But other than that, it sounds good.

This comment has been minimized.

@Zil0

Zil0 Jul 26, 2018

Contributor

Done :)

present.
ciphertext:
type:
- object

This comment has been minimized.

@Zil0

Zil0 Jul 18, 2018

Contributor

I'll need swagger/yaml advice here. This results in the Type column saying object or string, when I want it to say CiphertextInfo or string, documenting the CiphertextInfo object as I would do with

type: object
title: CiphertextInfo
properties:
  ciphertext: ...

How can I do this?

This comment has been minimized.

@turt2live

turt2live Jul 25, 2018

Member

Could try putting title at the same level as type:

type:
  - object
  - string
title: CiphertextInfo

Otherwise I think you'll also need to specify the properties.

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 26, 2018

Contributor

@Zil0, this is not quite about Swagger, rather about JSON Schema which Swagger uses to define object schemas. You need oneOf: https://spacetelescope.github.io/understanding-json-schema/reference/combining.html

This comment has been minimized.

@Zil0

Zil0 Jul 26, 2018

Contributor

Thank you both for your input.

@turt2live when doing this, CiphertextInfo is simply ignored.

@KitsuneRal it fails with Error reading property None.ciphertext: 'type' in scripts/templating/matrix_templates/units.py:280 when I do:

      ciphertext:
        description: |-
          Normally required. The encrypted content of the event.
        oneOf:
          - type: string
          - type: object
            title: CiphertextInfo
            properties:
              sender_key:
                type: string

type seems like a required key. If I put object as a wild guess, oneOf is ignored.

For the record the first thing I tried fails with mapping values are not allowed here which is thrown by yaml:

      ciphertext:
        description: |-
          Normally required. The encrypted content of the event.
        type:
          - string
          - object
            title: CiphertextInfo
            properties:
              sender_key:
                type: string

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 26, 2018

Contributor

My first thought was the same as yours except that you can't miss strings and mapping pairs, so it should have been type: object instead of just object. Given that it's type inside type I'm unsure of that works either. As for the case with oneOf, you may try to wrap it in another type - I believe chances should be better then.

This comment has been minimized.

@KitsuneRal

This comment has been minimized.

@Zil0

Zil0 Jul 28, 2018

Contributor

Thanks for looking into this! Sadly it doesn't build master for me (I cloned your fork).
local variable 'prop_title' referenced before assignment, and other errors after adding back the else:. I don't feel like debugging this right now :/

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 29, 2018

Contributor

Sorry for throwing dysfunctional code at you. I've fixed this and a couple of other errors, gendoc.py should work fine now (just pull the latest changes from the same branch of mine).

There's a formal problem with this, however: turned out that Swagger/OpenAPI 2 only uses allOf from JSON Schema but not oneOf. OpenAPI 3 fixes this, adding oneOf, anyOf etc. to the list of borrowings from JSON Schema. So basically the options are:

  1. Assume and document an "extension" to OpenAPI 2 (similar to #1414)
  2. Wait until we switch to OpenAPI 3 (watch #1446) - this will require all the scripts to be upgraded anyway to support its new features.
  3. Stay with the limited [ object, string ] syntax (it's not the first such case, by the way - I see something similar in client-server/definitions/push_rule.yaml).

Given that it's a pure formality, I'd probably recommend option 1.

This comment has been minimized.

@Zil0

Zil0 Jul 29, 2018

Contributor

Thanks! It works great. Option 1 seems good, will you open a separate PR? I'll rebase this one on your branch for now.

This comment has been minimized.

@KitsuneRal

KitsuneRal Aug 2, 2018

Contributor

I will, and I guess it's a good idea to rebase your branch on mine (now that mine is rebased on master).

type: string
description: |-
The encryption algorithm to be used to encrypt messages sent in this
room. For example, ``m.megolm.v1.aes-sha2``.

This comment has been minimized.

@Zil0

Zil0 Jul 18, 2018

Contributor

I have a very general question that applies to other parts too: what is the rule of thumb here for saying "For example"?
The way I see it, there is no other supported encryption protocol. If a client implements one, it would be a spec extension, hence not covered by the spec. Another way of putting this is, if a client implements, say rot13 as an encryption algorithm to be used in rooms, it would conform to the spec, while having a non-working end-to-end encryption behavior in the Matrix world; isn't that a problem?
As a result of those thoughts I would have put at least "Should be" instead of "For example", had it not be written before.
Note that the guide clearly states that only Megolm is supported as a value here https://matrix.org/docs/guides/e2e_implementation.html#id32

This comment has been minimized.

@uhoreg

uhoreg Jul 25, 2018

Member

This is usually done by using the enum property, which is just an array of allowed values. You can see an example of it at https://github.com/matrix-org/matrix-doc/blob/master/api/client-server/receipts.yaml#L52, or you can find other examples by just grepping for "enum".

This comment has been minimized.

@Zil0

Zil0 Jul 26, 2018

Contributor

Yup, totally what I was looking for, thanks.

@Zil0

This comment has been minimized.

Contributor

Zil0 commented Jul 18, 2018

I'm mostly done with this, modulo eventual omissions and wording issues. With #1420, I think the last missing bit is key sharing, and then #501 can hopefully be closed?

Please answer the questions I left as review comments (even the ones marked Outdated by GitHub).

Fix #589, #590 (see here), #1200, #1171, #1157. Partially fix #1212, as it does not include the S2S part.

@uhoreg uhoreg self-assigned this Jul 18, 2018

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch 2 times, most recently from 1c51cc8 to 465fab5 Jul 26, 2018

type: integer
description: The Olm message type.
description: |-
Normally required. The encrypted content of the event. Either directly

This comment has been minimized.

@Zil0

Zil0 Jul 29, 2018

Contributor

Why "normally"?

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

good question.The spec now says "Required. Normally required."

Let's get rid of "Normally required."

"ciphertext": {
"7qZcfnBmbEGzxxaWfBjElJuvn7BZx+lSz/SvFrDF/z8": {
"type": 0,
"body": "AwogGJJzMhf/S3GQFXAOrCZ3iKyGU5ZScVtjI0KypTYrWG0SIKkPAW8wAPNo9PAo915ex5TEpervpLtTmqbS4K9YgEorGiBLOXb2Sxb8vzJkZYBf7x1jVfIWL6LnCb4NGFMaRsyLBSKgBgMKIEfYtTaCQrwqh/grPMxKFFalrBLMlgez5S7W4mDYDARIEAEi8AV09qcmXzqJboIEUt8Z3O/mH4FhAO+wAYYcpe/Pz/1DQDkh8yhucyk8cxT1lJIIBIS22sRxmC6Rc98HH2g2dzHzszpM20/x68T4OurJ57DZXj5Ts/T3MQIYSAL8wkk/nfnnwrytWSSaAZle0cVqw4uOHq9FcjoNx47oug+lXhpCQX3vC50SygbsTy7LalO/qSiAo272FPOwKJ0NQVB/VHOnf1+jL6fl+v4skiJG1t/flFuCyWFyTSKIZRhY3jwIhXE09Tgtg1TFPYlgbGZXaKJBh3wQaiZ0PgS/Elc7qSkwrS67lQhvGytJKDR0uCxlcK4IXgHv0UTMkruZ/Ce5n8B/X7DT1SHHECF25SjuufyCM08IrcuhCvwukI6K8u0ppRJQGyaDAlQJpwn2SgQXuuYN8Czpj710babJAeYsK7QX8/xZrTWt+0navY/azOr7nScIkBBP7frPxWKVP+rJwwA886+0dmuJ9bGID8AwF4BhMzPvQhaN/ZR7ZCRggFJsl+yfhg7E4QsCAFrxmCMxmzm0RA0PmyPiyg280XCFnsi8iUJck3ipyJFemyiw2OdWuphwO96R/XudAoU4jdZ6JFD+F/RvnO7Uf9ftSG0MNw4dr16uabBrxIkOUZvViHJzyJGTNeHXTq9wyhwjDAV1hG7fSx7sqgM5pavbGb2c1rSe3v1K5toUzLV8xH/suqtFS3Ozea/njHBRoARkdwfblSNkydKLFC75b8e338qpjYntcyD5ZA+9rA4r3oJZLl1Qtd2RKoNhjqu4f70W/oSDM7a1FMUAh0Y9NZXQCaDsy/H8sTlaDJH6ypx3CPYJvZdbS7zE7nRsO6/3C3N3OulwQ/b0JtS5dpo7yELY/WMCFlPeuwwUA5n0lghstm+9uCeGqjV38FMe+uiEppKU2PCWtBszRwIts6+JU9nQf5VK7w75BH5QP5nXsgm/CzJuyjH7Vzncg0UrYC4P/YZpAxAZiu7CW8dUa0w4vkhaCRvHwvNWxKeSXnVnAuLH"

This comment has been minimized.

@Zil0

Zil0 Jul 31, 2018

Contributor

We probably don't want to keep those super-long strings, right?

This comment has been minimized.

@uhoreg

uhoreg Jul 31, 2018

Member

Yeah, for examples, keep it short.

@uhoreg uhoreg referenced this pull request Jul 31, 2018

Open

Federation API r0 megathread #1464

35 of 35 tasks complete

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from 8354d4d to 26d97a6 Aug 1, 2018

@uhoreg uhoreg referenced this pull request Aug 6, 2018

Closed

Device List Update Stream #1212

@KitsuneRal KitsuneRal referenced this pull request Aug 12, 2018

Merged

Support oneOf #1503

@turt2live turt2live added this to Needs review in August 2018 r0 Aug 14, 2018

@uhoreg

Looks pretty good so far, as far as I can tell. My suggestions are mostly minor wording changes.

session_id:
type: string
description: |-
The ID of the session used to encrypt the message. Only present with

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

Maybe say "Required with Megolm" rather than "Only present with Megolm"?

algorithm:
type: string
description: |-
The encryption algorithm the key in this event is to be used with.

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

add enum: ["m.megolm.v1.aes-sha2"]

description: |-
This event type is used to exchange keys for end-to-end encryption. Typically
it is encrypted as an ``m.room.encrypted`` event.

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

Also mention that they're typically sent as to-device events.

description: The room where the key is used.
session_id:
type: string
description: The ID of the session holding the key.

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

"The ID of the session that the key is for."?

Messaging algorithm names use the extensible naming scheme used throughout this
specification. Algorithm names that start with ``m.`` are reserved for
algorithms defined by this specification. Implementations wanting to experiment
with new algorithms are encouraged to pick algorithm names that start with

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

I'd say "Implementations wanting to experiment with new algorithms must be uniquely globally namespaced following Java's package naming conventions." (with the wording copied from https://matrix.org/docs/spec/#events)

description: The Olm message type.
description: |-
Normally required. The encrypted content of the event. Either directly
the encrypted payload in the case of a Megolm event, or a map from the

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

Maybe replace the last sentence with a pointer to the "Messaging Algorithms" section, because I think it needs more than what fits in the description. Perhaps "For more details, see ..."

This comment has been minimized.

@Zil0

Zil0 Aug 16, 2018

Contributor

Appended rather than replaced, since this last part was used to explain the Type column.

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

"directly" doesn't really make sense here. You could say "Either the encrypted payload itself, in the case of...", or just get rid of "directly".

.. Note::
When Bob and Alice share a room, with Bob tracking Alice's devices, she may leave
the room and then add a new device. Bob expects to be notified of this change, but

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

I'm not sure that "Bob expects to be notified" is accurate or necessary. I'd just say "Bob will not be notified of thes change as he doesn't share a room anymore with Alice."

room again, Bob has an out-of-date list of Alice's devices. In order to address
this issue, Bob's homeserver will add Alice's user ID to the ``changed`` property of
the ``device_lists`` field, thus Bob will update his list of Alice's devices as part
of its normal processing. Note that Bob can also be notified when he stops sharing

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

The "its" in "its normal processing" should probably be changed to "his" as it refers to Bob.

========= ========= =============================================
.. NOTE::
A user is added to ``changed`` everytime they join a room where the client is.

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

"every time" should be two words, and maybe add ", and vice versa" to the end of this sentence. I don't really understand this paragraph, though.

This comment has been minimized.

@Zil0

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

OK, I see. It's a bit unclear the way it's worded right now, but it's a bit tricky to word correctly. Maybe something like: "For optimal performance, Alice should be added to changed in Bob's sync only when she adds a new device, or when Alice and Bob now share a room but didn't share any room previously. However, for the sake of simpler logic, a server may add Alice to changed when Alice and Bob share a new room, even if they previously already shared a room."

@@ -266,9 +508,18 @@ device_lists DeviceLists Optional. Information on e2e device updates. Note:
Parameter Type Description
========= ========= =============================================
changed [string] List of users who have updated their device identity keys
since the previous sync response.
since the previous sync response, or who have just joined

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

Bob will also be added to Alice's the changed set if Alice joins a new room with Bob, not just if Bob joins a new room with Alice. So I'd change it to: "List of users who have updated their device identity keys, or who now share an encrypted room with the client since the previous sync response."

This comment has been minimized.

@Zil0

Zil0 Aug 16, 2018

Contributor

Right, don't think I ever noticed this :/

* Curve25519 for the initial key agreement.
* HKDF-SHA-256 for ratchet key derivation.
* Curve25519 for the DH ratchet.

This comment has been minimized.

@uhoreg

uhoreg Aug 16, 2018

Member

I'm not sure if "DH ratchet" is the right term.

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

https://signal.org/docs/specifications/doubleratchet/#diffie-hellman-ratchet does describe the root key ratchet as a "Diffie-Hellman ratchet", so I'd say it's a reasonable term.

On the other hand maybe it would be clearer to say "Curve25519 for the root key ratchet" / "HMAC-SHA-256 for the chain key ratchet".

<shrug> I don't suppose it really matters. The point is simply to list the encryption primitives so that if someone finds a vulnerability in SHA-256 tomorrow you can see what's affected.

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from ed28080 to dd578d0 Aug 16, 2018

@Zil0

This comment has been minimized.

Contributor

Zil0 commented Aug 16, 2018

Applied your suggestions and cleaned-up commit history.

@richvdh richvdh self-requested a review Aug 16, 2018

@richvdh

looks awesome! A couple of niggles it would be great to fix if you get a minute

type: integer
description: The Olm message type.
description: |-
Normally required. The encrypted content of the event. Either directly

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

good question.The spec now says "Required. Normally required."

Let's get rid of "Normally required."

description: The Olm message type.
description: |-
Normally required. The encrypted content of the event. Either directly
the encrypted payload in the case of a Megolm event, or a map from the

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

"directly" doesn't really make sense here. You could say "Either the encrypted payload itself, in the case of...", or just get rid of "directly".

Normally required. The encrypted content of the event. Either directly
the encrypted payload in the case of a Megolm event, or a map from the
recipient Curve25519 identity key to ciphertext information in the case
of an Olm event. For more details, see `Messaging Algorithm Names`_.

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

I think you want "Messaging Algorithms" rather than "Messaging Algorithm Names".

"origin_server_ts": 1476648761524,
"sender": "@example:localhost",
"type": "m.room.encrypted",
"unsigned": {

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

most of the other event examples don't include "unsigned", and I think that's a good pattern to follow, because it will only appear when you use certain APIs to retrieve the event.

Bob). She will not send sensitive material to that device, and cannot trust
messages apparently received from it.
* Alice may choose to skip the device verification process. She is not be able

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

"She is not able"

* Curve25519 for the initial key agreement.
* HKDF-SHA-256 for ratchet key derivation.
* Curve25519 for the DH ratchet.

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

https://signal.org/docs/specifications/doubleratchet/#diffie-hellman-ratchet does describe the root key ratchet as a "Diffie-Hellman ratchet", so I'd say it's a reasonable term.

On the other hand maybe it would be clearer to say "Curve25519 for the root key ratchet" / "HMAC-SHA-256 for the chain key ratchet".

<shrug> I don't suppose it really matters. The point is simply to list the encryption primitives so that if someone finds a vulnerability in SHA-256 tomorrow you can see what's affected.

The type and content of the plaintext message event are given in the payload.
We include the room ID in the payload, because otherwise the homeserver would

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

this is only actually true if we were sending in-room events; if it's a to-device event it's omitted. We should probably say that.

This comment has been minimized.

@Zil0

Zil0 Aug 17, 2018

Contributor

Since we're never sending Olm in-room event, I think it would just be confusing. I'll move the sentence to the Megolm subsection where it should have been in the first place.

A user is added to ``changed`` every time they join a room where the client is,
and vice versa. Instead, for optimal performance, ``changed`` could be populated
only when a user starts sharing a room with the client on the condition they
didn't before. This may be hard to garantee, hence this behavior is not enforced.

This comment has been minimized.

@richvdh

richvdh Aug 16, 2018

Member

guarantee.

And "behaviour", since we prefer British English

This comment has been minimized.

@Zil0

Zil0 Aug 17, 2018

Contributor

I'll change this according to what @uhoreg has written here #1284 (comment).

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from dd578d0 to e20b289 Aug 17, 2018

@Zil0 Zil0 force-pushed the Zil0:e2e_doc branch from fa5c861 to 98e2e8d Aug 18, 2018

@Zil0

This comment has been minimized.

Contributor

Zil0 commented Aug 18, 2018

Rebased on master and made an attempt at newsfragment.

Apparently newsfragments as specified in Towncrier's doc use issue numbers but we are using PR numbers, so issue_format in https://github.com/matrix-org/matrix-doc/blob/master/changelogs/client_server/pyproject.toml should probably be changed to /pull/.

@turt2live

This comment has been minimized.

Member

turt2live commented Aug 18, 2018

Looks like your feature fragment got an rst extension.

A bit of background on why /issues/ was chosen: in the unlikely/rare case we need to reference something by issue then we don't need to go spelunking in the depths of the build process. Github does support an automatic redirect, which is being (ab)used here.

@richvdh richvdh merged commit fe3f3b6 into matrix-org:master Aug 20, 2018

4 checks passed

ci/circleci: build-docs Your tests passed on CircleCI!
Details
ci/circleci: build-swagger Your tests passed on CircleCI!
Details
docs Click details to preview the HTML documentation.
Details
swagger Click to preview the swagger build.
Details

August 2018 r0 automation moved this from Reviewer approved to Done (this list will be incomplete) Aug 20, 2018

@richvdh

This comment has been minimized.

Member

richvdh commented Aug 20, 2018

\o/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment