Support processing of large payloads in RAMF messages #14
Labels: pkijs-compromises (Compromises made due to the use of PKI.js in the reference NodeJS implementation), spec-ramf (RAMF)
Executive summary
Due to limitations in a third-party library, and to minimise complexity in the initial implementation of the protocol suite, message payloads are currently limited to 8 MiB; this prevents larger messages from exhausting the memory available in the couriers and gateways that process them. Until this is fixed, Relaynet service providers wishing to send larger messages will have to chunk the data and reassemble it at the receiving end.
The objective of this issue is to identify and implement a solution that makes it very easy to send and receive large messages, without requiring the service provider to do any data chunking -- it'd be done for them behind the scenes.
Description
This 8 MiB limit is partly arbitrary and could be slightly increased as a stopgap, but RAM and swap are limited resources, so no system can hold arbitrarily large values in memory. In addition to supporting payloads small enough to be held in memory, we should support large payloads by streaming them.
At the moment, the message payload is contained in a CMS EnvelopedData value, which is in turn contained in a CMS SignedData value. CMS values could be serialised and deserialised using streams -- BouncyCastle supports this, but PKI.js doesn't. Generally speaking, we'd have to either add streaming support to PKI.js (and to other CMS libraries in platforms to be supported in the future) or simply detach the payload from the CMS values.
Potential solutions
The underlying implementation will be irrelevant to service providers and couriers: They'll still get to produce, consume and transport potentially large messages. The options below explore how, at the network level, we could achieve that.
Option A: Process RAMF messages as streams
The payload size limit could be very large (like the original 4 GiB).
This would require the following changes to the RAMF spec:
1. Detach the ciphertext (`encryptedContentInfo.encryptedContent`) from the CMS EnvelopedData value.
2. Place the CMS EnvelopedData value before the ciphertext, so the symmetric key can be available before reading the ciphertext.
3. Reinstate the signature hashing algorithm as a RAMF message field, so that a consumer can start computing the digest as the message is being received. This will involve partially reverting d3bc4fa.
We should also consider the implications of supporting large payloads when producing and verifying the digital signature. Algorithms like Ed25519/Ed448 can't work with streams, so we'll have to use their pre-hash variant instead.
This is the ideal solution in my opinion because I think it'd be easier to implement, but I know it's frowned upon in the context of asynchronous messaging, where messages are supposed to be small.
Option B: Chunk the RAMF messages
Endpoints and gateways would be responsible for splitting large messages into small chunks, in a way that's seamless to the service provider.
This is "better" or "more idiomatic" than Option A from an asynchronous messaging perspective, but I think will make it harder to implement due to the complexity in putting the pieces together at the receiving end.
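To make the reassembly complexity concrete, here is a hypothetical chunking scheme; the field names (`index`, `total`, `data`) are illustrative and not part of the RAMF spec.

```javascript
// Split a payload into fixed-size chunks, each carrying enough metadata
// (index, total) for the receiving end to reassemble the whole.
function* splitIntoChunks(payload, chunkSize) {
  const total = Math.ceil(payload.length / chunkSize);
  for (let index = 0; index < total; index++) {
    yield {
      index,
      total,
      data: payload.subarray(index * chunkSize, (index + 1) * chunkSize),
    };
  }
}

// Reassembly must tolerate out-of-order delivery and detect missing chunks
// -- this is where the complexity at the receiving end comes from.
function reassemble(chunks) {
  const sorted = [...chunks].sort((a, b) => a.index - b.index);
  if (sorted.length === 0 || sorted.length !== sorted[0].total) {
    throw new Error('Missing chunks');
  }
  return Buffer.concat(sorted.map((chunk) => chunk.data));
}
```

A real implementation would also have to persist partially received chunk sets and expire them, which this sketch ignores.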
Option C: Optionally detach the payload
When the payload is too big, the RAMF payload would be just a reference to an external value and its (SHA-256) digest. If the payload is to be encrypted, it'd be encrypted with a symmetric key that the recipient could decrypt from their `RecipientInfo` (as is always the case with EnvelopedData values).

This is "the idiomatic approach" from an asynchronous messaging perspective, and it should be easier to implement than Option B because there are only two pieces to put together: the RAMF message and its detached payload.
Other considerations