
Support processing of large payloads in RAMF messages #14

Open
gnarea opened this issue Jul 7, 2019 · 0 comments
Labels
pkijs-compromises Compromises made due to the use of PKI.js in the reference NodeJS implementation spec-ramf RAMF

Comments

gnarea commented Jul 7, 2019

Executive summary

Due to limitations in a 3rd-party library, and to minimise complexity in the initial implementation of the protocol suite, messages are limited to payloads of up to 8 MiB, which prevents larger messages from exhausting the memory available to couriers and gateways processing them. Until this is fixed, Relaynet service providers wishing to send larger messages will have to chunk the data and reassemble it at the receiving end.

The objective of this issue is to identify and implement a solution that makes it very easy to send and receive large messages, without requiring the service provider to do any data chunking -- it'd be done for them behind the scenes.

Description

This 8 MiB limit is partly arbitrary and could be slightly increased as a stopgap, but RAM and swap memory are limited resources, so no system can hold arbitrarily large values in memory. In addition to supporting payloads small enough to be held in memory, we should support large payloads by streaming them.
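As a rough illustration of the streaming approach, assuming Node.js (the platform of the reference implementation), a digest can be computed incrementally over chunks so the full payload never needs to be held in memory at once. The chunk source here is a plain array for brevity; in practice it would be a Readable stream:

```typescript
import { createHash } from 'crypto';

// Incrementally hash chunks so the full payload never resides in memory.
function digestChunks(chunks: Buffer[]): string {
  const hash = createHash('sha256');
  for (const chunk of chunks) {
    hash.update(chunk);
  }
  return hash.digest('hex');
}

const chunks = [Buffer.from('hello '), Buffer.from('world')];
// Same digest as hashing the whole payload in one go:
console.log(digestChunks(chunks));
// → b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
```

The same incremental pattern applies to encryption and signature verification, which is what the options below try to enable at the CMS level.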

At the moment, the message payload is contained in a CMS EnvelopedData value, which is in turn contained in a CMS SignedData value. We could serialise and deserialise CMS values using streams, which BouncyCastle supports but PKI.js doesn't. So, generally speaking, we'd have to either add streaming support to PKI.js (and to the CMS libraries of platforms to be supported in the future) or simply detach the payload from the CMS values.

Potential solutions

The underlying implementation will be irrelevant to service providers and couriers: They'll still get to produce, consume and transport potentially large messages. The options below explore how, at the network level, we could achieve that.

Option A: Process RAMF messages as streams

The payload size limit could be very large (like the original 4 GiB).

This would require the following changes to the RAMF spec:

  1. Detach the ciphertext (encryptedContentInfo.encryptedContent) from the CMS EnvelopedData value.

  2. Place the CMS EnvelopedData value before the ciphertext so the symmetric key can be available before reading the ciphertext.

  3. Reinstate the signature hashing algorithm as a RAMF message field so that a consumer can start computing the digest as the message is being received:

    The algorithm MUST be valid per RS-018. This value MUST be DER-encoded as an ASN.1 Object Identifier; for example, SHA-256 (OID 2.16.840.1.101.3.4.2.1) would be encoded as 06 09 60 86 48 01 65 03 04 02 01. It MUST also have a fixed length of 16 octets, right padded with 0x00.

    This will involve partially reverting d3bc4fa
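The fixed-width field in step 3 can be sketched as follows. This is an illustrative helper under the encoding rules quoted above (the function name is hypothetical, not part of the RAMF spec): the DER encoding of the SHA-256 OID is 11 octets, which get right-padded with 0x00 to the fixed 16 octets.

```typescript
// DER encoding of OID 2.16.840.1.101.3.4.2.1 (SHA-256), as quoted above.
const SHA256_OID_DER = Buffer.from([
  0x06, 0x09, 0x60, 0x86, 0x48, 0x01, 0x65, 0x03, 0x04, 0x02, 0x01,
]);

// Hypothetical helper: place the DER-encoded OID in a fixed 16-octet field,
// right-padded with 0x00 per the rule above.
function encodeHashAlgorithmField(oidDer: Buffer): Buffer {
  if (oidDer.length > 16) {
    throw new Error('DER-encoded OID exceeds the 16-octet field');
  }
  const field = Buffer.alloc(16); // zero-filled by default
  oidDer.copy(field);
  return field;
}

const field = encodeHashAlgorithmField(SHA256_OID_DER);
console.log(field.toString('hex'));
// → 06096086480165030402010000000000
```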

We should also consider the implications of supporting large payloads when producing and verifying the digital signature. Algorithms like Ed25519/Ed448 can't work with streams, so we'd have to use their pre-hash variants (Ed25519ph/Ed448ph) instead.

This is the ideal solution in my opinion because I think it'd be easier to implement, but I know it's frowned upon in the context of asynchronous messaging, where messages are supposed to be small.

Option B: Chunk the RAMF messages

Endpoints and gateways would be responsible for splitting large messages into small chunks, in a way that's seamless to the service provider.

This is "better" or "more idiomatic" than Option A from an asynchronous messaging perspective, but I think it will be harder to implement due to the complexity of putting the pieces together at the receiving end.
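A minimal sketch of what Option B entails, assuming a hypothetical chunk envelope (the field names are illustrative; this issue doesn't specify a wire format): the sender splits the serialised message into fixed-size chunks, and the receiver reorders and concatenates them.

```typescript
// Hypothetical chunk envelope; not part of the RAMF spec.
interface Chunk {
  readonly messageId: string; // correlates chunks of the same message
  readonly index: number; // position of this chunk
  readonly total: number; // total number of chunks
  readonly data: Buffer;
}

function splitIntoChunks(messageId: string, payload: Buffer, chunkSize: number): Chunk[] {
  const total = Math.ceil(payload.length / chunkSize);
  const chunks: Chunk[] = [];
  for (let index = 0; index < total; index++) {
    chunks.push({
      messageId,
      index,
      total,
      data: payload.subarray(index * chunkSize, (index + 1) * chunkSize),
    });
  }
  return chunks;
}

function reassemble(chunks: Chunk[]): Buffer {
  // Chunks may arrive out of order, so sort by index before concatenating.
  const sorted = [...chunks].sort((a, b) => a.index - b.index);
  return Buffer.concat(sorted.map((c) => c.data));
}
```

The hard part isn't the splitting itself but the receiving end: tracking partially received messages, persisting chunks until the set is complete, and expiring incomplete sets.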

Option C: Optionally detach the payload

When the payload is too big, the RAMF payload would be just a reference to an external value and its (SHA-256) digest. If the payload is to be encrypted, it'd be encrypted with a symmetric key that the recipient could decrypt from their RecipientInfo (as is always the case with EnvelopedData values).

This is "the idiomatic approach" from an asynchronous messaging perspective, and should be easier to implement than Option B because there are only two pieces to put together: The RAMF message and its detached payload.
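A sketch of the reference-plus-digest idea, with hypothetical field names (the actual structure for Option C isn't specified here): the RAMF payload carries a pointer to the external value and its SHA-256 digest, and the recipient re-hashes the retrieved value to check integrity.

```typescript
import { createHash } from 'crypto';

// Hypothetical shape of a detached-payload reference; illustrative only.
interface DetachedPayloadRef {
  readonly location: string; // e.g. an object-store key or URL
  readonly sha256Digest: string; // hex digest of the external payload
}

function makePayloadRef(location: string, payload: Buffer): DetachedPayloadRef {
  return {
    location,
    sha256Digest: createHash('sha256').update(payload).digest('hex'),
  };
}

// The recipient re-hashes the retrieved payload and compares digests.
function verifyDetachedPayload(ref: DetachedPayloadRef, retrieved: Buffer): boolean {
  const digest = createHash('sha256').update(retrieved).digest('hex');
  return digest === ref.sha256Digest;
}
```

Since the digest is inside the (signed and optionally encrypted) RAMF message, the external value needs no separate authentication.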

Other considerations

  • PKI.js (used by relaynet-core-js) doesn't support stream-based plaintexts (see: Support streams in CMS EnvelopedData and SignedData PeculiarVentures/PKI.js#237). Bouncy Castle (used by awala-jvm) does support streams.
  • Google PubSub, which we use extensively at Relaycorp for our own deployment of Awala-related infrastructure, has a message limit of 10 MB.
  • We may want to introduce an extension to the delivery authorization to specify the maximum size of the payload. Likewise, we could introduce another extension to control the bandwidth the sender is allowed to use in a given period of time (analogous to the rate limiting extension).
  • Bindings will generally have to do chunking to process large messages. That's definitely true with generic L7 protocols like plain old HTTP (PoHTTP), gRPC (CogRPC) and WebSockets (PoWebSockets), but shouldn't be a problem if we introduce purpose-built L7 protocols like PoSocket and CoSocket.
  • Gateways and couriers are also likely to have to do chunking behind the scenes. Especially gateways, as they use brokers like NATS to wrap/unwrap cargoes.
  • Whilst gateways must already distribute their payloads across as many cargoes as necessary, they may still benefit from this solution.
gnarea added a commit that referenced this issue Jul 7, 2019
@gnarea gnarea changed the title Support large payloads in RAMF messages Support encryption and decryption of very large payloads in RAMF messages Jul 7, 2019
@gnarea gnarea changed the title Support encryption and decryption of very large payloads in RAMF messages Support processing of very large payloads in RAMF messages Jul 11, 2019
@gnarea gnarea added the spec-ramf RAMF label Dec 4, 2019
@gnarea gnarea added the pkijs-compromises Compromises made due to the use of PKI.js in the reference NodeJS implementation label Dec 26, 2019
gnarea added a commit that referenced this issue Feb 14, 2020
So that it can easily fit in memory.

See: #14
@gnarea gnarea changed the title Support processing of very large payloads in RAMF messages Support processing of large payloads in RAMF messages Feb 16, 2020
@gnarea gnarea pinned this issue Feb 16, 2020
@gnarea gnarea unpinned this issue May 10, 2024