Support processing of large payloads in RAMF messages #14
Labels: pkijs-compromises (Compromises made due to the use of PKI.js in the reference NodeJS implementation), spec-ramf (RAMF)
Executive summary
Due to limitations in a third-party library, and to minimise complexity in the initial implementation of the protocol suite, message payloads are currently limited to 8 MiB; this prevents larger messages from exhausting the memory available in the couriers and gateways that process them. Until this is fixed, Relaynet service providers wishing to send larger messages will have to chunk the data and reassemble it at the receiving end.
The objective of this issue is to identify and implement a solution that makes it very easy to send and receive large messages, without requiring the service provider to do any data chunking -- it'd be done for them behind the scenes.
Description
This 8 MiB limit is partly arbitrary and could be slightly increased as a stopgap, but RAM and swap are limited resources, so no system can hold arbitrarily large values in memory. In addition to supporting payloads small enough to be held in memory, we should support large payloads by streaming them.
At the moment, the message payload is contained in a CMS EnvelopedData value, which is in turn contained in a CMS SignedData value. CMS values could be serialised and deserialised using streams -- BouncyCastle supports this, but PKI.js doesn't. Generally speaking, we'd have to either add streaming support to PKI.js (and to other CMS libraries in platforms to be supported in the future) or simply detach the payload from the CMS values.
Potential solutions
The underlying implementation will be irrelevant to service providers and couriers: They'll still get to produce, consume and transport potentially large messages. The options below explore how, at the network level, we could achieve that.
Option A: Process RAMF messages as streams
The payload size limit could be very large (like the original 4 GiB).
This would require the following changes to the RAMF spec:
1. Detach the ciphertext (`encryptedContentInfo.encryptedContent`) from the CMS EnvelopedData value.
2. Place the CMS EnvelopedData value before the ciphertext, so the symmetric key can be available before reading the ciphertext.
3. Reinstate the signature hashing algorithm as a RAMF message field, so that a consumer can start computing the digest as the message is being received. This will involve partially reverting d3bc4fa.
We should also consider the implications of supporting large payloads when producing and verifying the digital signature. Algorithms like Ed25519/Ed448 can't work with streams, so we'll have to use their pre-hash variant instead.
This is the ideal solution in my opinion because I think it'd be easier to implement, but I know it's frowned upon in the context of asynchronous messaging, where messages are supposed to be small.
Option B: Chunk the RAMF messages
Endpoints and gateways would be responsible for splitting large messages into small chunks, in a way that's seamless to the service provider.
This is "better" or "more idiomatic" than Option A from an asynchronous messaging perspective, but I think will make it harder to implement due to the complexity in putting the pieces together at the receiving end.
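To make the reassembly complexity concrete, here is a hypothetical chunking scheme; the field names (`index`, `total`, `data`) are illustrative and not part of the RAMF spec.

```javascript
// Split a payload into fixed-size chunks, each carrying enough metadata
// (index, total) for the receiving end to reassemble the whole.
function* splitIntoChunks(payload, chunkSize) {
  const total = Math.ceil(payload.length / chunkSize);
  for (let index = 0; index < total; index++) {
    yield {
      index,
      total,
      data: payload.subarray(index * chunkSize, (index + 1) * chunkSize),
    };
  }
}

// Reassembly must tolerate out-of-order delivery and detect missing chunks
// -- this is where the complexity at the receiving end comes from.
function reassemble(chunks) {
  const sorted = [...chunks].sort((a, b) => a.index - b.index);
  if (sorted.length === 0 || sorted.length !== sorted[0].total) {
    throw new Error('Missing chunks');
  }
  return Buffer.concat(sorted.map((chunk) => chunk.data));
}
```

A real implementation would also have to persist partially received chunk sets and expire them, which this sketch ignores.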
Option C: Optionally detach the payload
When the payload is too big, the RAMF payload would be just a reference to an external value and its (SHA-256) digest. If the payload is to be encrypted, it'd be encrypted with a symmetric key that the recipient could decrypt from their `RecipientInfo` (as is always the case with EnvelopedData values).

This is "the idiomatic approach" from an asynchronous messaging perspective, and it should be easier to implement than Option B because there are only two pieces to put together: the RAMF message and its detached payload.
Other considerations