Add authenticated_data to the mls message. #208

Merged
merged 5 commits into from
Oct 11, 2019

@psla psla commented Sep 16, 2019

As previously discussed, this is a proposal to add AAD to the application messages.

There are many motivations to do this:
• It's a 'cheap' feature to add (although it could potentially be misused).
• It avoids duplicate content: if there is content that needs to be authenticated but also needs to be visible to the server, the only solution today is to repeat it, once in the header (for the server) and again in the encrypted body.
• Modern ciphers already provide support for AAD, and MLS takes advantage of this. In fact, most (all?) of the fields in the header are authenticated already.
• This field is optional (as in: it can be empty), which means that an implementation doesn't have to use it.

The primary benefit is that the server can read fields that are authenticated but are not part of the MLS message itself. Typically, the server has another encryption mechanism with the client (e.g. TLS), so client-server communication is already secure. As a matter of fact, handshake messages can already be transported in plaintext (in case the server needs to examine their content), but application messages are not allowed to have any plaintext content, even though the server may need to examine some metadata as well.
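To make the mechanism concrete, here is a minimal Python sketch of how associated data behaves in an AEAD-style construction: the AAD is not part of the sealed output and travels in the clear, so a server can read it, yet any change to it makes opening fail for recipients. This is a toy encrypt-then-MAC stand-in built from the standard library, not the cipher MLS uses; the names `seal` and `open_sealed`, the key handling, and the fixed 12-byte nonce are assumptions for illustration. A real implementation would call a vetted AEAD such as AES-GCM.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustration only, not for real use."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _tag(mac_key: bytes, nonce: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    # Length-prefix the AAD so the AAD/ciphertext boundary is unambiguous.
    # The nonce is assumed to have a fixed length (12 bytes here).
    msg = nonce + len(aad).to_bytes(4, "big") + aad + ciphertext
    return hmac.new(mac_key, msg, hashlib.sha256).digest()


def seal(enc_key: bytes, mac_key: bytes, nonce: bytes,
         plaintext: bytes, aad: bytes) -> bytes:
    """Encrypt plaintext and authenticate both the ciphertext and the AAD."""
    ks = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, ks))
    return ciphertext + _tag(mac_key, nonce, aad, ciphertext)


def open_sealed(enc_key: bytes, mac_key: bytes, nonce: bytes,
                sealed: bytes, aad: bytes) -> bytes:
    """Verify the tag over (AAD, ciphertext), then decrypt. Raises on mismatch."""
    ciphertext, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(mac_key, nonce, aad, ciphertext)):
        raise ValueError("authentication failed")
    ks = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ k for c, k in zip(ciphertext, ks))
```

In this sketch the sender would transmit the AAD next to the sealed bytes. A server that rewrites the AAD in transit cannot forge a matching tag, so `open_sealed` raises on the recipient's side.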

A couple of thoughts that may be worth discussing:

  • what the max size of authenticated_data should be (I assumed 32KB, just like an application message)
  • whether authenticated data should be offered only as part of an application message, or as part of both handshake and application messages. It is fair to consider it only for application messages, though for simplicity I added it to both message types. Happy to change it based on general preferences.

@@ -1273,6 +1273,7 @@ struct {
opaque application_data<0..2^32-1>;
}

opaque authenticated_data<0..2^32-1>;
Collaborator

Nit: I would put this just below the content_type field, just to preserve parallelism (MLSCiphertext.ciphertext -> MLSPlaintext.operation/application_data)

Member

Also it would avoid interleaving plaintexts and ciphertexts...

Contributor Author

I agree. Done!

@eomara
Collaborator

eomara commented Oct 9, 2019

@bifurcation per our discussion in London, we agreed to accept this change. You mentioned other places that need to be updated; which ones?

Member

@beurdouche beurdouche left a comment

Two big caveats:

  • this AAD field is AEAD authenticated but does not seem to be covered by the signature, though it probably should be.
  • we agreed to this change on the condition of a big flashing warning that the send/receive_group_operation_with_aad functions MUST be separate from the normal ones at the API level. This is missing here.

@@ -1388,6 +1390,7 @@ struct {
ContentType content_type;
opaque sender_data_nonce<0..255>;
opaque encrypted_sender_data<0..255>;
opaque authenticated_content[length_of_authenticated_content];
Member

opaque already encompasses the length, I believe.

Contributor Author

I was inspired by

    opaque content[length_of_content];

above (in MLSCiphertextContent). Is that one also incorrect?

Collaborator

This can just be the same definition as in the MLSCiphertext object. The only reason for the [] notation w.r.t. content is the weird encoding of MLSCiphertextContent, which IIRC is about to get reverted.
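The difference between the two notations can be sketched in Python. In the TLS presentation language that the MLS draft borrows, a variable-length vector `opaque x<0..N>` is encoded with a length prefix just wide enough to hold N, while a fixed-length vector `opaque x[n]` is written as raw bytes with no prefix. The helper names below are hypothetical.

```python
def encode_variable(data: bytes, max_len: int) -> bytes:
    """Encode `opaque x<0..max_len>`: big-endian length prefix, then the bytes."""
    if len(data) > max_len:
        raise ValueError("vector exceeds its declared maximum length")
    # The prefix is as wide as needed to represent max_len
    # (1 byte for 255, 4 bytes for 2^32-1).
    prefix_size = (max_len.bit_length() + 7) // 8
    return len(data).to_bytes(prefix_size, "big") + data


def encode_fixed(data: bytes, n: int) -> bytes:
    """Encode `opaque x[n]`: exactly n raw bytes, no length prefix."""
    if len(data) != n:
        raise ValueError("fixed-length vector has the wrong size")
    return data
```

With the `<>` form, the length travels with the field, so a decoder can find the field boundary without a separate length value supplied from elsewhere.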

Contributor Author

Done.
I also renamed it to authenticated_data instead of authenticated_content, to keep the naming consistent. (I was in a pickle here, since the struct is called 'MLSCiphertextContentAAD' while the field inside it is called authenticated_data. I initially thought that calling it authenticated_content would be better, but I believe it's better to keep the same name everywhere.)


@psla
Contributor Author

psla commented Oct 9, 2019

"this AAD field is AEAD authenticated but does not seem to be covered by the signature, though it probably should be."

I don't quite understand how to address this comment. I believe I added it to the signature, but maybe I missed something:

** Sign the plaintext metadata -- the group ID, epoch, sender index, and content type as well as the authenticated data and message content

Can you clarify which part is missing the AAD?

@bifurcation
Collaborator

@beurdouche I think you're wrong here. In {{content-signing-and-encryption}}, we have the following:

The signature covers the plaintext metadata and message content, i.e., all fields of MLSPlaintext except for the signature field.

I think this is ready to go as soon as @psla fixes the two minor comments we have (order of fields and [] vs <> notation).
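As a rough illustration of why the signature already covers the new field, the sketch below serializes every MLSPlaintext field except the signature into the byte string that would be signed. The field order follows the discussion in this thread; the integer widths and the `group_id` length prefix are assumptions, the function names are hypothetical, and the signature algorithm itself is left out.

```python
import struct


def vec(data: bytes, prefix_size: int) -> bytes:
    """Length-prefixed byte vector (prefix_size bytes, big-endian)."""
    return len(data).to_bytes(prefix_size, "big") + data


def to_be_signed(group_id: bytes, epoch: int, sender: int, content_type: int,
                 authenticated_data: bytes, content: bytes) -> bytes:
    """All MLSPlaintext fields except `signature`, in order (widths assumed)."""
    return (
        vec(group_id, 1)
        # epoch and sender as 32-bit integers, content_type as one byte (assumed)
        + struct.pack(">IIB", epoch, sender, content_type)
        # the new field sits just below content_type
        + vec(authenticated_data, 4)
        + vec(content, 4)
    )
```

Because `authenticated_data` is part of these bytes, two messages that differ only in their AAD produce different signing inputs, so a signature made over one would not verify for the other under any sound signature scheme.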

Member

@beurdouche beurdouche left a comment

I missed that indeed. Feel free to fix the <> vs [] notation and we'll merge the PR; I'll put the API recommendation in the architecture document. For the record, even though people at the interim more or less agreed that this mechanism was OK, I am still very skeptical about it because I am certain people will misuse it...

@beurdouche
Member

beurdouche commented Oct 11, 2019

Note that in general, the MLSCiphertextSenderDataAAD and MLSCiphertextContentAAD are the prefix of their ciphertext which means you have to put the new authenticated_data field you added at the correct position in both of those too... : )

@psla
Contributor Author

psla commented Oct 11, 2019

Are you saying that MLSCiphertextSenderDataAAD needs it as well? That would lead to two AAD blobs, right? One of which would be encrypted? (I am a bit fuzzy about the relation between MLSCiphertextContentAAD and encrypted_sender_data, which is the encrypted MLSCiphertextSenderDataAAD, right?)

I added it there, but I don't fully understand it. Please take another look.

@bifurcation
Collaborator

@beurdouche - I agree with @psla that it doesn't seem necessary to include the authenticated_data in the MLSCiphertextSenderDataAAD. Do you have a specific issue in mind?

@beurdouche
Member

beurdouche commented Oct 11, 2019 via email

@bifurcation
Collaborator

If we espouse that theory (not clear that it necessarily holds, but let's go with it for now), the question is where you draw the line between metadata and content. Compare the current state with two cases (a) and (b):

NOW              A.               B.

group_id         group_id         group_id
epoch            epoch            epoch
sender           sender           sender
content_type     content_type     content_type
============     ============     aad
content          aad              ============
signature        content          content
                 signature        signature

Basically, the ===== turns into the sender_data in the MLSCiphertext. In case (B), you would want the AAD in the SenderDataAAD; in case (A) you would not. I don't really see a reason why we need to assume case (B), so since case (A) is a bit simpler, I'm inclined to go that way. Simpler in the sense that the AAD is used as AAD exactly once, together with the protected content.

@beurdouche
Member

beurdouche commented Oct 11, 2019 via email

@bifurcation
Collaborator

Ah, even better illustration than mine! OK, when you put it that way, I can see the appeal of including the AAD in the sender data. Though I admit this is mostly an aesthetic point, not one that I have any security analysis to back up.

@psla would you mind moving the AAD to above the sender_data_nonce in all cases? Then I think this is ready to merge.
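The layout requested here can be sketched as follows. With `authenticated_data` placed above `sender_data_nonce`, both AAD structures are plain prefixes of the serialized MLSCiphertext. The widths of `group_id`, `epoch`, `content_type`, and `ciphertext` are assumptions; the bounds for the other fields follow the diff hunks quoted in this thread, and the function names are hypothetical.

```python
import struct


def vec(data: bytes, prefix_size: int) -> bytes:
    """Length-prefixed byte vector (prefix_size bytes, big-endian)."""
    return len(data).to_bytes(prefix_size, "big") + data


def sender_data_aad(group_id: bytes, epoch: int, content_type: int,
                    authenticated_data: bytes, sender_data_nonce: bytes) -> bytes:
    """AAD used when protecting the sender data (prefix up to the nonce)."""
    return (
        vec(group_id, 1)
        + struct.pack(">IB", epoch, content_type)
        + vec(authenticated_data, 4)   # above sender_data_nonce, as requested
        + vec(sender_data_nonce, 1)
    )


def content_aad(group_id: bytes, epoch: int, content_type: int,
                authenticated_data: bytes, sender_data_nonce: bytes,
                encrypted_sender_data: bytes) -> bytes:
    """AAD used when protecting the content: the previous prefix plus one field."""
    return (
        sender_data_aad(group_id, epoch, content_type,
                        authenticated_data, sender_data_nonce)
        + vec(encrypted_sender_data, 1)
    )


def mls_ciphertext(group_id: bytes, epoch: int, content_type: int,
                   authenticated_data: bytes, sender_data_nonce: bytes,
                   encrypted_sender_data: bytes, ciphertext: bytes) -> bytes:
    """Full serialized message: the content AAD followed by the ciphertext."""
    return (
        content_aad(group_id, epoch, content_type, authenticated_data,
                    sender_data_nonce, encrypted_sender_data)
        + vec(ciphertext, 4)
    )
```

Keeping each AAD a strict prefix means an implementation can serialize the message once and slice it, rather than building three separate encodings.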

@psla
Contributor Author

psla commented Oct 11, 2019

It makes sense to me too now. Thanks. I think I addressed it.


4 participants