Remote Attachment Content Type Proposal #18

nakajima · 2023-02-20T01:36:37Z

A suggestion for how we can handle larger messages but maintain E2E encryption.

neekolas · 2023-02-20T16:48:15Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+}
+```
+
+The content of the encoded message is a URL that points to encoded Envelope data that would not be limited by the 1MB limit, since it's not being sent on the network. By using the existing `Envelope` format, we can still ensure E2E encryption for this data.


Would be helpful to specify the encryption scheme. How do we pass keys to the reader? What types of keys are allowed?

It would use the same encryption we use for publishing envelopes to the network, but instead of publishing to the network it would be to some other host.

Which type, though? For V1 messages we use X3DH and for V2 messages we use random TopicKeys. In either case, we would need to figure out where to store the salt and the nonce. Those values are unique and random, and we wouldn't want to re-use the same values that went into the message since nonce re-use is an attack vector that could lead to other messages being compromised.

So, we have a few options:

In the message payload, include the nonce and salt and use whatever secret was used to encrypt the message. The remote content would just be the encrypted bytes of the message

Keep the message payload light, and just include a link to the content. The remote content would need to be formatted as xmtp.Ciphertext proto message. Re-use the shared secret for the message (either V1 X3DH or V2 TopicKey)

Use an ephemeral key. The message payload would include the key (maybe 32 bytes of randomness), nonce, and salt. The remote message would be the encrypted payload only.

Of those options (those are just the three off the top of my head), I'd lean towards 3. Gives the most security, since each payload is encrypted with a different key, and means that we don't have to decode another protobuf to read the message. And it's fully self-describing. You don't need to know anything about the conversation the message came from the decrypt the content.

@jazzz @richardhuaaa @michaelx11 curious what the ProtoPod thinks

I think maybe I'm missing something with how our encryption works. My thought for the remote payload is that it's an xmtp.Envelope with a topic, timestamp and message. In the same way we can send those as push notifications, I was thinking we could also make them the remote payload. It was my understanding that anyone can query for those but the only way to decrypt it is if you have the conversation key material as well (my thinking was that we'd be using the V2 message format).

I'd lean towards option 3. where:

PointerMessage contains a symmetricKey and all the parameters to decrypt it.

Server contains the content payload only.

I am weary of xmtp.Envelope as the abstraction interface.

All possible Envelopes stored remotely requires exponentially more testing.

"What happens if a Key Revocation message is stored off network?",

"what happens if an Invite is no longer accessible?"

"What if a remoteContent points to another removeContent?"

Coupling these messages to a specific XMTP Message version would make future migrations more complicated.

I hear you @jazzz on the potential complexity of having the remote payloads be some sort of Content Type. I'm unsold as xmtp.Envelope being the right wrapper. But an encrypted EncodedContent payload will make things a lot more usable for developers than a pile of unstructured bytes. Especially if these are messages that are meant to be displayed inline, where we would want to have the regular set of tools for decoding available that we offer for regular messages.

Otherwise you are going to have one flow to handle remote images, and a totally different decoding flow for on-network images. That seems less than ideal. And you lose things like fallback content for messages that can't be decoded by the client.

"What happens if a Key Revocation message is stored off network?"

Since we don't have Key Revocation messages yet, hard for me to answer conclusively. Does seem like we may want certain meta-messages to live outside the normal message encoding flow and be typed based on contentTopic. We do this with Invites currently, which are not a ContentType at all.

"what happens if an Invite is no longer accessible?"

Are you asking what happens if someone sends a conversation invite as a remote message? That should not be possible, as invites are separate from messages and are not EncodedContent or sent in message channels.

"What if a remoteContent points to another removeContent?"

Both the spec and the example implementation need a mitigation for this kind of attack. Seems like we can have some sort of recursion limit in the SDK and throw an error after N loops.

But an encrypted EncodedContent payload will make things a lot more usable for developers than a pile of unstructured bytes.

I'll start by saying I can get behind this. EncodedContent payload seems like the an acceptable place an alevates many of my concerns.

To round out my thoughts: If the xmtp.Envelope wrapper is excluded then I see two options:

OptionA -- Making remoteContent a composable type Leads to maximum utility. Any type can now be stored remotely with no changes required.

OptionB -- Making RemoteContent an implementation pattern for other type means it could be handled on a type by type basis. Larger types (Images, Video, File) could support remoteContent but this is not required of all types.

My argument for B is not particularly strong: There are only few types that could really benefit from OptionA, yet potential remote storage may become a burden for future ContentType Developers . Any future ContentTypes will now need to be evaluated against a wider threat model, as independent ContentServers are outside of the protection of the Network consensus protocol. I don't have any hard examples as there are few formal XIP ContentTypes defined. Given this Option A is compelling.

We do this with Invites currently, which are not a ContentType at all.

Great point.

That should not be possible

Great point. Despite the payload being able to hold an invite, it would not be possible to get an EncodedContent payload into the invite processing flow.

I'm with @jazzz in that I'm hesitant to define a RemoteContent type that can wrap any normal XMTP content type. I worry about the possible attack surface caused by remote fetches such as SSRF (e.g. attacker puts 127.0.0.1:[some important port]/[some sensitive path]). We'd have to consider the cross product of All RemoteContent actions X All ContentTypes. I'd be more comfortable if we keep RemoteContent's capabilities very well described and limited. For example: must be a GET over HTTPS.

neekolas · 2023-02-20T16:50:51Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+
+The content of the encoded message is a URL that points to encoded Envelope data that would not be limited by the 1MB limit, since it's not being sent on the network. By using the existing `Envelope` format, we can still ensure E2E encryption for this data.
+
+The envelope itself can contain any type of message already allowed on the network. The reference implementation uses the `Attachment` type from XIP 15, but if we introduce richer types for things like images or video, those would work here as well, since clients should be able to understand those types once they're settled.


How is the envelope serialized? Protobuf?

I think protobuf works since clients already need to speak it.

neekolas · 2023-02-20T21:33:29Z

One thing I'm wondering is whether we actually want the remote payload to be able to be stored as a full EncodedContent with XIP-5 compatible content-types. That allows you to store any message compatible with XMTP remotely and clients already have a flexible set of decoding options. Of course, one of those content types could be the proposed attachment content type that relies on MIME, but you could also use other XIP-5 content types that have built in decoders.

Not sure I've fully thought through the implications, but it seems like it would let us use all the content decoding tools we already have. Just with much larger payloads.

nakajima · 2023-02-20T22:22:39Z

One thing I'm wondering is whether we actually want the remote payload to be able to be stored as a full EncodedContent with XIP-5 compatible content-types.

Ya, this was my thinking.

nakajima · 2023-02-21T01:35:17Z

Added a reference implementation example PR in the Inbox iOS app to try to show what I've got in mind here: ephemeraHQ/xmtp-inbox-ios#83

jazzz

As currently written this seems problematic.

If this path is desired I would suggest

The remote content type should point to contentbytes only, not another message.
Server stored content should be completely independent from XMTP -- e.g. Raw Image bytes. Message Versions and content versions should be able to increment independently without consideration.
Server stored content should be encrypted using a onetime symmetric key.
RemoteContent type should contain a digest of the expected stored content.
RemoteContent should require a user action prior to fetching from server

jazzz · 2023-02-21T16:43:01Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+
+## Security considerations
+
+Having arbitrary data anywhere can be risky, but this is already the case for our messages, since there's no server side validation of message contents (besides size). The same protections we have now would be in place while the same pitfalls we have would still be there as well.


The same protections we have now would be in place

I'm not sure this is true, as content servers are not held to the same requirements as a Node is.

What is stopping a server from serving different messages to different participants? Currently XMTP uses immutable shared history which makes this attack vector difficult. In this case the pointerMessage should also contain a digest of the content stored on the server to lock the message at the time of sending.

jazzz · 2023-02-21T16:48:12Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+
+Having arbitrary data anywhere can be risky, but this is already the case for our messages, since there's no server side validation of message contents (besides size). The same protections we have now would be in place while the same pitfalls we have would still be there as well.
+
+## Copyright


Is there any guidance on how to preserve user privacy? Whats stopping a malicious sender from using a FB "pixel" like approach to tracking user actions?

Are the attachments auto-loaded? are users required to "Click to Load" messages from untrusted servers?

I think we should strongly encourage developers using this content type to implement click to load.

Easy to imagine someone sending 100 messages with 1gb remote attachments as an attack.

At the app level you could then store an "always download attachments from this sender" preference

jazzz · 2023-02-21T17:16:14Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+}
+```
+
+The content of the encoded message is a URL that points to encoded Envelope data that would not be limited by the 1MB limit, since it's not being sent on the network. By using the existing `Envelope` format, we can still ensure E2E encryption for this data.


I'd lean towards option 3. where:

PointerMessage contains a symmetricKey and all the parameters to decrypt it.

Server contains the content payload only.

I am weary of xmtp.Envelope as the abstraction interface.

All possible Envelopes stored remotely requires exponentially more testing.

"What happens if a Key Revocation message is stored off network?",

"what happens if an Invite is no longer accessible?"

"What if a remoteContent points to another removeContent?"

Coupling these messages to a specific XMTP Message version would make future migrations more complicated.

jazzz · 2023-02-21T17:18:01Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+
+## Security considerations
+
+Having arbitrary data anywhere can be risky, but this is already the case for our messages, since there's no server side validation of message contents (besides size). The same protections we have now would be in place while the same pitfalls we have would still be there as well.


Is there a strategy for avoiding circular loops? If the RemoteContent contains an envelope, can it also contain a another RemoteContent object?

Ooh, that's a very good point. Would be an easy DOS vector against clients if protections aren't in place.

nakajima · 2023-02-21T17:34:36Z

@jazzz Thanks for the feedback! I'll take another pass at this and hopefully address all your concerns.

nakajima · 2023-02-21T18:28:28Z

Updated the proposal to go with option 3, adding a required checksum field as well. I also added a note about potential pixel tracker concerns.

jazzz · 2023-02-21T18:52:45Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+```js
+{
+	// The SHA256 hash of the remote content
+	checksum: string,


checkSums are usually used in the context of data corruption. In this context the goal is message Integrity: Can we use "contentDigest" (or something similar) to reflect that?

Is SHA256 an ok hash to use?

Could be a good use-case for CIDs from https://github.com/multiformats/js-multiformats (similar to IPFS)

Curious since I'm not super familiar with IPFS, what would that CIDs get us that SHA256 doesn't?

Multihash/CID is kind of just an abstraction around different hashing methods for content. This is how IPFS docs describe it: https://docs.ipfs.tech/concepts/content-addressing/

jazzz · 2023-02-21T18:57:17Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+```js
+{
+  authorityId: "xmtp.org"
+  typeId: "remoteAttachment"


Thoughts on limiting this specification to the scope of StaticContent only?

Currently it is ambiguous whether the ContentServers can return Dynamic content. By explicitly only supporting static content then caching and Post-Delivery-storage approaches can avoid complications from cache Invalidation etc.

Adding of the contentDigest mostly achieves this anyways, being explicit would make things cleaner

What do you think of maybe remoteStaticContent?

michaelx11 · 2023-02-21T19:38:11Z

Jumping in on this a bit late with some dumb questions. My personal focus when evaluating changes like these is always on 1) clarity and 2) easily analyzable security properties.

In that sense, two questions:

If we just bump our message limit up to 10MB, do we need a remote content type at all for our current needs i.e. small videos, photos, etc?
If we separate the RemoteContent concept into its own thing, I think we run the risk of ambiguity about how to fetch things. In my mind, the most unambiguous way to handle this is to have a new Content-Type for each remote use case.

Example:

For webm videos meant to be fetched over HTTPS, we can have an xmtp.org/https_webm:1.0
For webm videos meant to be fetched over IPFS: xmtp.org/ipfs_webm:1.0

The idea here is that given just a content-type, a developer knows exactly how to fetch and interpret the content. Additionally, organizations can allowlist content-types if there are concerns around remote content (e.g. IP leakage, read receipt leakage, malicious payload vectors). Unifying at the content-type level will avoid folks having to do logic like "allow remote content, but only for these content-types, and only fetched in these ways". The logic would instead be "if content type not in this allowlist, do not process"

nakajima · 2023-02-21T19:54:35Z

If we just bump our message limit up to 10MB, do we need a remote content type at all for our current needs i.e. small videos, photos, etc?

I think partially it does, but then clients need to contend with huge messages being fetched and decoded unless they use a separate topic for these types of things.

If we separate the RemoteContent concept into its own thing, I think we run the risk of ambiguity about how to fetch things. In my mind, the most unambiguous way to handle this is to have a new Content-Type for each remote use case.

I think maybe the case could be made for separate remote content types based off how they're fetched, like one for HTTP vs one for IPFS. I still think the remote payloads being EncodedContent keeps things simplest for clients though. We could also add a content type field to the metadata that gets sent and ensure it matches that of the remote EncodedContent. Clients could use this to ignore remote messages that they aren't cool with.

nakajima · 2023-02-21T20:46:11Z

Thinking about it some more, if an organization ever did want to block certain types of content, they could just block all remoteStaticContent messages and if they still had a need for remote content, they could just make their own content type.

nakajima · 2023-02-21T21:45:10Z

SDK implementation updated: xmtp/xmtp-ios#68
Client usage updated: ephemeraHQ/xmtp-inbox-ios#83

Note that neither are particularly production ready yet (need to handle a bunch more cases) but I was able to get messages sending with encrypted remote payloads and receive them in the app:

nakajima · 2023-02-21T22:13:32Z

Are there any more pressing security concerns with this approach or would it be alright to ship and learn? Just trying to avoid making perfect the enemy of the good while still keeping things safe for folks.

michaelx11 · 2023-02-21T22:19:37Z

I'm okay with shipping the proposal but with a few fixed URL schemes to start, such as https://. We should explicitly state this in the proto/spec if possible, since schemes like file:// should be absolutely not allowed. Any time code is processing an untrusted URL there is a rich world of possible attacks. We can do our developers a favor by making these things (like expected scheme) an explicit part of the protocol. Potentially even pulling scheme into a specific enum, then only keeping the scheme-less URI.

nakajima · 2023-02-21T22:26:37Z

@michaelx11 What do you think of making the scheme part of the parameters? To start the only valid value would be https and SDKs could ensure that URLs only start with that. Later if we wanted to add IPFS in a newer version of this content type, we could put it there?

michaelx11 · 2023-02-21T22:29:42Z

Sounds good to me! We can make it an enum and add more supported schemes as we go

nakajima · 2023-02-21T22:31:34Z

Rad, I updated the proposal to include the scheme parameter as well as a note that the content must be acccessed via GET request.

nakajima · 2023-02-22T01:23:04Z

The reference PRs have been updated to include the mandatory scheme parameter, enforcing https:// and GET requests only.

nakajima · 2023-02-22T16:55:34Z

If we just bump our message limit up to 10MB, do we need a remote content type at all for our current needs i.e. small videos, photos, etc?

I think partially it does, but then clients need to contend with huge messages being fetched and decoded unless they use a separate topic for these types of things.

Given @nmalzieu's comment here, I actually don't think bumping the message limit to 10 mb is the best course of action anymore.

neekolas · 2023-02-23T23:13:16Z

I like where this landed

jazzz

@nakajima great job incorporating all the feedback on this. I think this has ended up in a good place.

Excited to see what developers do with this new power

🚢

jazzz · 2023-02-23T23:48:21Z

XIPs/xip-17-remote-attachment-content-type-proposal.md

+```
+
+The content of the encoded message is a URL that points to an encrypted `EncodedContent` object. The content must be accessible by an HTTP `GET` request to the URL. The `EncodedContent`'s content type MUST not be another `RemoteAttachment`.
+


content type MUST not be another RemoteAttachment.

I am glad to see this here.

nakajima · 2023-02-23T23:59:15Z

Thanks everyone for your feedback!

Create xip-17-remote-attachment-content-type-proposal.md

a47c029

nakajima mentioned this pull request Feb 20, 2023

Attachment content type proposal #15

Merged

neekolas reviewed Feb 20, 2023

View reviewed changes

Add client usage reference

4de3ad6

jazzz reviewed Feb 21, 2023

View reviewed changes

nakajima added 2 commits February 21, 2023 10:25

Update xip-17-remote-attachment-content-type-proposal.md

b81fa9d

Update xip-17-remote-attachment-content-type-proposal.md

6ef5f85

jazzz reviewed Feb 21, 2023

View reviewed changes

Update xip-17-remote-attachment-content-type-proposal.md

79d9fbc

Update xip-17-remote-attachment-content-type-proposal.md

513278a

Why do u hate tabs linter

cc2835a

neekolas approved these changes Feb 23, 2023

View reviewed changes

jazzz approved these changes Feb 23, 2023

View reviewed changes

nakajima merged commit 624890e into xmtp:main Feb 23, 2023

nakajima mentioned this pull request May 8, 2023

Attachments proposal xmtp/xmtp-ios#62

Closed


		The content of the encoded message is a URL that points to encoded Envelope data that would not be limited by the 1MB limit, since it's not being sent on the network. By using the existing `Envelope` format, we can still ensure E2E encryption for this data.

		The envelope itself can contain any type of message already allowed on the network. The reference implementation uses the `Attachment` type from XIP 15, but if we introduce richer types for things like images or video, those would work here as well, since clients should be able to understand those types once they're settled.


		## Security considerations

		Having arbitrary data anywhere can be risky, but this is already the case for our messages, since there's no server side validation of message contents (besides size). The same protections we have now would be in place while the same pitfalls we have would still be there as well.


		Having arbitrary data anywhere can be risky, but this is already the case for our messages, since there's no server side validation of message contents (besides size). The same protections we have now would be in place while the same pitfalls we have would still be there as well.

		## Copyright

Remote Attachment Content Type Proposal #18

Remote Attachment Content Type Proposal #18

Conversation

nakajima commented Feb 20, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

neekolas Feb 20, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

neekolas Feb 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

neekolas commented Feb 20, 2023

nakajima commented Feb 20, 2023

nakajima commented Feb 21, 2023

jazzz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nakajima commented Feb 21, 2023

nakajima commented Feb 21, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

snormore Feb 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

michaelx11 commented Feb 21, 2023

nakajima commented Feb 21, 2023

nakajima commented Feb 21, 2023

nakajima commented Feb 21, 2023

nakajima commented Feb 21, 2023

michaelx11 commented Feb 21, 2023 • edited Loading

nakajima commented Feb 21, 2023

michaelx11 commented Feb 21, 2023

nakajima commented Feb 21, 2023

nakajima commented Feb 22, 2023

nakajima commented Feb 22, 2023

neekolas commented Feb 23, 2023

jazzz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

nakajima commented Feb 23, 2023

neekolas Feb 20, 2023 •

edited

Loading

neekolas Feb 21, 2023 •

edited

Loading

snormore Feb 21, 2023 •

edited

Loading

michaelx11 commented Feb 21, 2023 •

edited

Loading