Protocol Specification

Todd L. Montgomery edited this page Feb 21, 2017 · 106 revisions

Introduction

Aeron tries to address the following:

  • high-throughput and low-latency communication for unicast and multicast
  • reliable multicast operation for modest receiver set size (< 100 receivers)
  • multiple transmission media support (UDP, InfiniBand, Shared Memory, etc.)
  • multiple streams that can provide different QoS
  • effective flow control for multicast and unicast
  • receiver application paced flow control on a per stream basis
  • easy monitoring of buffering on a per stream basis
  • flow control tied to message processing (as opposed to just delivery for processing later)

Aeron is influenced by several other modern protocols, including, but not limited to, SPDY, HTTP/2, IPv6, and WebSocket.

Terminology

  • Transmission Media: Generic term used to indicate the media over which the protocol runs. Can be UDP, InfiniBand, shared memory, etc.
  • Media Driver: Driver for reading/writing from/to transmission media for Aeron.
  • Publisher: The client application which sends messages.
  • Subscriber: The client application which receives messages.
  • Sender: The media driver which sends the messages produced by the client publisher.
  • Receiver: The media driver which receives messages sent by the Sender.
  • Driver Subscription: The media driver in charge of message receipt. These messages are passed on to client Subscriber applications.
  • Session: A unique invocation of Aeron that identifies a single Publication and all Subscriptions for that Publication.
  • Session ID: A unique identifier for a Session.
  • Channel: A transmission media needs to have a means of identifying a flow of data and the addressing model of the media. For Aeron, this is called a Channel. For different transmission media, the channel is defined differently. In general, a URI is used for specifying a channel.
  • Physical Source: Source of a Session.
  • Physical Receiver: Receiver of a Session.
  • Stream: A Session carries sub-sessions within it. Streams are these sub-sessions.
  • Stream ID: A unique identifier for a Stream. A value of 0 is reserved.
  • Term: A section of data within a Stream. Each Term is associated with a Media Driver send and receive buffer. The length of a Term must be a power of two and must be the same length on both ends.
  • Term ID: A unique identifier for a Term within a Stream. Starts randomly. Must increase monotonically. Can wrap around. Cannot go back to a wrapped value.
  • Term Offset: Identifier of a single byte within the Term. Always starts at 0. This is the number of the byte within a given Term, counted from the beginning.
  • Frame: The unit of data for Aeron. Measured in bytes. The transmission media may include multiple Frames into a single packet of data for batching.
  • Message: The unit of data for the application (hence also known as an APDU, or Application Data Unit). A single Message may be fragmented over multiple Frames. Alternatively, a single Message may fit into a single Frame. A Message (all of its Fragments) must fit into a single Term.
  • Fragment: The unit of data for a fragmented Message that fits into a single Frame.

Design Assumptions

The Aeron protocol is designed to be run directly over many different types of transmission media, including shared memory/IPC, InfiniBand/RDMA, UDP, TCP, Raw IP, HTTP, WebSocket, BLE, etc. This means that the following assumptions are made:

  • Transmission Media may be a stream media, such as TCP or RDMA without inherent frame boundaries.
  • Transmission Media may have only unicast modes of operation.
  • Byte ordering of fields of length 16-bits and larger use Little Endian. This is for pure efficiency on performance sensitive platforms. Sub-byte ordering is not a concern as the byte is treated as the atomic unit.

Aeron is a transport protocol and may operate over unreliable media. For this reason, some additional assumptions must be embraced, which Aeron will detect and correct, such as:

  • Duplication of packets may occur
  • Packets may be lost
  • Packets may arrive out of order

Aeron assumes some operational conditions, such as:

  • Low number of Streams. As each Term is a buffer for the Media Driver, and a Stream has a number of Terms, the number of Streams is assumed to be less than 1000, or even less than 100.

Aeron is designed to work hand-in-hand with the underlying concurrent data structures it is founded upon. In this regard, the header layout has a dual purpose in that it is also the data structure framing layout. This leads to a symbiotic relationship between the data structure and the base protocol operation, framing, etc.

Session ID and Term ID Generation

Session IDs and initial Term IDs need to be generated in a pseudo-random manner. Term IDs will progress monotonically after generation, but need to start out randomly. Applications may set Session IDs to a specific value. However, it is up to the application to use a temporally unique Session ID in such cases.
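
The generation rules above can be sketched as follows. This is a minimal illustration, assuming that IDs are handled as signed 32-bit integers; the helper names (`random_id`, `next_term_id`, `terms_since`) are hypothetical and not part of Aeron.

```python
import random

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def random_id(rng=random):
    """Pick a pseudo-random 32-bit value for a Session ID or initial Term ID."""
    return rng.randint(INT32_MIN, INT32_MAX)

def next_term_id(term_id):
    """Advance a Term ID by one, wrapping around at the 32-bit boundary."""
    return ((term_id + 1 - INT32_MIN) % 2**32) + INT32_MIN

def terms_since(initial_term_id, term_id):
    """Number of Terms elapsed since the initial Term ID, tolerating wrap-around."""
    return (term_id - initial_term_id) % 2**32
```

Computing the distance modulo 2^32 lets a receiver order Terms correctly even after the Term ID has wrapped.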

Operation

General Header Format

Aeron Frames begin with a header. The specifics of the header change based on operational type, but the general layout is given below.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|                       Frame Length                          |
    +---------------+---------------+-------------------------------+
    |    Version    |     Flags     |             Type              |
    +---------------+---------------+-------------------------------+
    |                       Depends on Type                        ...

  • Frame Length: (31 = max 2147483647 bytes) Length of Frame. Including header.
  • Version: (8) Current version is 0.
  • Flags: (8) Depend on Type.
  • Type: (16) Indicates the type of header and any format after Frame Length.
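
As an illustration of the layout above, the following sketch decodes the common 8-byte prefix with Python's `struct` module. It assumes the Frame Length is read as a signed little-endian 32-bit integer, so that the R bit is the sign bit; `parse_header` and `Header` are hypothetical names.

```python
import struct
from collections import namedtuple

# Frame Length (signed 32-bit), Version, Flags, Type.
# "<" selects little-endian, as stated under Design Assumptions.
GENERAL_HEADER = struct.Struct("<iBBH")
Header = namedtuple("Header", "frame_length version flags type")

def parse_header(buf, offset=0):
    """Decode the first 8 bytes of a Frame into a Header tuple."""
    header = Header(*GENERAL_HEADER.unpack_from(buf, offset))
    if header.frame_length < 0:
        raise ValueError("R bit set: frame length must be non-negative")
    return header

# Example: a 32-byte frame of Type 0x0001 with the two top flag bits set.
frame = GENERAL_HEADER.pack(32, 0, 0xC0, 0x0001)
print(parse_header(frame))
```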

Header Types

Type            Value   Description
HDR_TYPE_PAD    0x0000  Padding: Padding Frame.
HDR_TYPE_DATA   0x0001  Data: Data Fragment.
HDR_TYPE_NAK    0x0002  NAK: Request retransmission.
HDR_TYPE_SM     0x0003  Status Messages: Feedback from subscription on window and buffer status.
HDR_TYPE_ERR    0x0004  Error: Error.
HDR_TYPE_SETUP  0x0005  Setup: Setup.
HDR_TYPE_RTTM   0x0006  RTT Measurement: Measuring RTT.
HDR_TYPE_EXT    0xFFFF  Extension Header: Used to extend more options as well as extensions (TBD).
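
The type values above map naturally onto an enumeration. The sketch below is one possible representation; the class and function names are illustrative only.

```python
from enum import IntEnum

class HeaderType(IntEnum):
    PAD = 0x0000    # Padding Frame
    DATA = 0x0001   # Data Fragment
    NAK = 0x0002    # Request retransmission
    SM = 0x0003     # Status Message
    ERR = 0x0004    # Error
    SETUP = 0x0005  # Setup
    RTTM = 0x0006   # RTT Measurement
    EXT = 0xFFFF    # Extension Header

def describe(type_value):
    """Return the symbolic name for a Type value, or a placeholder if unknown."""
    try:
        return HeaderType(type_value).name
    except ValueError:
        return f"UNKNOWN(0x{type_value:04X})"
```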

Stream Setup

Data flow in Aeron is uni-directional. Bi-directional communication is accomplished by establishing sessions in both directions separately.

The stream setup sequence varies by unicast vs. multicast transmission media and is intimately tied to Status Messages (SM).

  • For unicast

    1. Receiver listens on a given unicast channel.
    2. Sender sends a SETUP Frame to the Receiver and waits for a Status Message to be sent back. If there is no response, the Sender can retransmit the SETUP Frame until it gets a Status Message back.
    3. Receiver sees a SETUP Frame and sends a unicast Status Message directly back to the Sender with initial buffer status and reception window.
    4. Sender sees a Status Message and can commence streaming data and honouring reception window.
    5. NOTES: NAKs and SMs are always sent unicast back to Publications.
  • For multicast, there is an endpoint for data and an endpoint for control (NAKs and SMs). See UDP Multicast Mode of Operation for how these endpoints map onto UDP multicast addresses.

    1. Receivers listen on a given multicast endpoint for data and MAY listen on another for control.
    2. Senders listen on a given multicast endpoint for control and send periodic Data Frames to the data endpoint. A Publication MUST NOT send Data Frames until it knows of at least one Subscription for the Stream ID via a Status Message.
    3. Receivers that see Data Frames for Stream IDs of interest send back Status Messages on the control endpoint with the SETUP flag set to elicit a SETUP Frame from the Sender. Receivers MUST NOT send a Status Message with the SETUP flag set more than a few times a second.
    4. Senders that receive a Status Message with the SETUP flag set should respond by sending a SETUP Frame to the control endpoint.
    5. Receiver sees a SETUP Frame and sends a Status Message unicast directly back to the Sender with initial buffer status and reception window.
    6. NOTES:
      1. Multicast setup is in general the same as unicast setup aside from also supporting joining existing data streams.
      2. NAKs and SMs are always sent to the control endpoint.
      3. Receivers may listen to the control frames and suppress NAK generation. But this is an implementation choice.
      4. Senders never need to listen to the data endpoint.
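
The sender side of the unicast sequence above (steps 2 and 4) can be sketched as a small state machine. This is a hypothetical illustration, not Aeron's implementation: the class name, the `send_frame` callback, and the 100 ms retry interval are all assumptions.

```python
from enum import Enum

class SenderState(Enum):
    AWAITING_STATUS = "awaiting status"
    STREAMING = "streaming"

class UnicastSender:
    """Hypothetical sender-side view of unicast stream setup."""

    def __init__(self, send_frame, setup_interval_ns=100_000_000):
        self.send_frame = send_frame                # callable taking a frame name
        self.setup_interval_ns = setup_interval_ns  # assumed retry interval
        self.state = SenderState.AWAITING_STATUS
        self.last_setup_ns = None
        self.window = 0

    def on_tick(self, now_ns):
        """Send or retransmit SETUP while no Status Message has arrived."""
        if self.state is not SenderState.AWAITING_STATUS:
            return
        due = (self.last_setup_ns is None
               or now_ns - self.last_setup_ns >= self.setup_interval_ns)
        if due:
            self.send_frame("SETUP")
            self.last_setup_ns = now_ns

    def on_status_message(self, receiver_window):
        """A Status Message opens the window and allows data to flow."""
        self.window = receiver_window
        self.state = SenderState.STREAMING

sent = []
sender = UnicastSender(sent.append)
for now_ns in (0, 50_000_000, 100_000_000):
    sender.on_tick(now_ns)
sender.on_status_message(4096)
sender.on_tick(300_000_000)  # no further SETUP once streaming
```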

Setup Frame

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|                    Frame Length (= header length)           |
    +---------------+---------------+-------------------------------+
    |    Version    |    Flags      |        Type (=0x05)           |
    +---------------+---------------+-------------------------------+
    |R|                        Term Offset                          |
    +---------------------------------------------------------------+
    |                          Session ID                           |
    +---------------------------------------------------------------+
    |                           Stream ID                           |
    +---------------------------------------------------------------+
    |                       Initial Term ID                         |
    +---------------------------------------------------------------+
    |                        Active Term ID                         |
    +---------------------------------------------------------------+
    |                          Term Length                          |
    +---------------------------------------------------------------+
    |                              MTU                              |
    +---------------------------------------------------------------+
    |                              TTL                              |
    +---------------------------------------------------------------+

  • Frame Length: (32) Value is length of SETUP Frame Header.
  • Version: (8) Current version is 0.
  • Flags: (8) Depend on Type.
  • Type: (16) HDR_TYPE_SETUP
  • Term Offset: (31) Offset of the first byte of the stream to start reception on.
  • Session ID: (32) Session ID.
  • Stream ID: (32) Stream ID.
  • Initial Term ID: (32) Term ID of first Term within the Stream that has been sent.
  • Active Term ID: (32) Term ID of latest Term within the Stream that has been sent.
  • Term Length: (32) Length of Term. Must be positive power of 2.
  • MTU: (32) Sender MTU length in bytes.
  • TTL: (32) Sender Multicast TTL.
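
A minimal sketch of encoding and decoding the SETUP Frame drawn above, assuming all fields are little-endian and signed 32-bit where the diagram shows 32 bits. The function and tuple names are illustrative.

```python
import struct
from collections import namedtuple

# Frame Length, Version, Flags, Type, then eight 32-bit fields in diagram order.
SETUP_FRAME = struct.Struct("<iBBH8i")
HDR_TYPE_SETUP = 0x0005

Setup = namedtuple(
    "Setup",
    "frame_length version flags type term_offset session_id stream_id "
    "initial_term_id active_term_id term_length mtu ttl",
)

def encode_setup(term_offset, session_id, stream_id, initial_term_id,
                 active_term_id, term_length, mtu, ttl):
    """Build a SETUP Frame; Frame Length equals the header length."""
    if term_length <= 0 or term_length & (term_length - 1):
        raise ValueError("Term Length must be a positive power of 2")
    return SETUP_FRAME.pack(SETUP_FRAME.size, 0, 0, HDR_TYPE_SETUP, term_offset,
                            session_id, stream_id, initial_term_id,
                            active_term_id, term_length, mtu, ttl)

def decode_setup(buf):
    """Decode a SETUP Frame into a Setup tuple."""
    return Setup(*SETUP_FRAME.unpack_from(buf))
```

With the eight 32-bit fields following the common 8-byte prefix, the frame as drawn is 40 bytes long.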

Data Frame

Aeron uses the Data Header to hold all data. The data may represent a single APDU or a fragment.

Fragmentation & reassembly information is carried in each fragment via the B & E flags and the term offset. The B bit indicates this fragment begins an application APDU (or Message). The E bit indicates this fragment ends an APDU. For a self-contained APDU, the B and E bits will always be set together.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|                    Frame Length (=data + header)            |
    +---------------+-+-+-----------+-------------------------------+
    |    Version    |B|E|    Flags  |        Type (=0x01)           |
    +---------------+-+-+-----------+-------------------------------+
    |R|                        Term Offset                          |
    +---------------------------------------------------------------+
    |                          Session ID                           |
    +---------------------------------------------------------------+
    |                           Stream ID                           |
    +---------------------------------------------------------------+
    |                            Term ID                            |
    +---------------------------------------------------------------+
    |                        Reserved Value                         |
    |                                                               |
    +---------------------------------------------------------------+
    |                             Data                             ...
   ...                                                              |
    +---------------------------------------------------------------+

  • Frame Length: (32) Value is length of data + length of Data Frame Header.
  • Version: (8) Current version is 0.
  • Flags: (8) Depend on Type.
    • (B)egin Message: Fragment begins an APDU.
    • (E)nd Message: Fragment ends an APDU.
  • Type: (16) HDR_TYPE_DATA
  • Term Offset: (31) Offset of the first byte of the frame header within the Term.
  • Session ID: (32) Session ID.
  • Stream ID: (32) Stream that message is for.
  • Term ID: (32) Term that message is for within the Stream.
  • Reserved Value: (64) Reserved value for application to use.
  • Data: (varies) Data Fragment or entire APDU.

NOTE: Multiple Senders sending redundant data is supported. This can be as simple as having each Sender use the same Session ID, Stream ID, Term ID, and Term Offset, so that normal duplicate elimination applies. The approach is akin to TCP multipath capability, so the same mechanism can be used to provide multipath support. It requires that applications be able to set the Session ID, Stream ID, and Term ID, and consistently send the same Term Offset and data length.

NOTE: The R-bit for Term Offset is to capture the effect that Term Offset must be a positive integer on languages not supporting unsigned 32-bit values naturally (such as Java) without resorting to 64-bit signed values.

NOTE: 0 Length Data Frame headers are Data Frames with 0 data bytes. These are to be used for heartbeat messages as well as for initial channel setup.

NOTE: Aeron aligns frames to a given frame boundary, currently 32 bytes. An individual frame may therefore have a Frame Length value smaller than the number of bytes transmitted on the wire. For example, a message of 19 data bytes occupies 64 bytes on the wire, but has a Frame Length of 51 (32 bytes header + 19 data bytes). The inclusion of this alignment padding on the wire is for efficiency of operation and reduced latency.
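
The alignment arithmetic in the note above can be checked with a short sketch. The function names are illustrative; the 32-byte alignment and 32-byte Data Frame header come from the text.

```python
FRAME_ALIGNMENT = 32
DATA_HEADER_LENGTH = 32

def aligned_length(frame_length, alignment=FRAME_ALIGNMENT):
    """Round frame_length up to the next multiple of alignment (a power of 2)."""
    return (frame_length + alignment - 1) & ~(alignment - 1)

def wire_footprint(payload_length):
    """Return (Frame Length field value, bytes occupied on the wire)."""
    frame_length = DATA_HEADER_LENGTH + payload_length
    return frame_length, aligned_length(frame_length)

print(wire_footprint(19))  # (51, 64), matching the example in the note
```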

Padding Frames

At the end of a term, there may be a padding frame. A padding frame is always the length of a Data Frame without any data. However, the frame length field value will be the length of the padding. A padding frame may exist at the end of a set of frames within a single transmission media packet or it might be at the front.

Data Recovery via Retransmit Request

Data recovery in Aeron is negative acknowledgement (NAK) based. It is the subscription's responsibility to request retransmission of missed data.

NOTE: For background on NAK processing dynamics, please see IETF RFC 5401 - Multicast Negative-Acknowledgment (NACK) Building Blocks. Aeron has adopted and adapted many of these aspects.

Aeron places very few requirements on NAK processing, but the following are general guidelines for how Aeron publication and subscription implementations should behave.

  1. When a Receiver notices missing data, it MUST send a NAK to the publication immediately (in the case of unicast) or after some delay (in the case of multicast).
  2. When a Sender receives a NAK, it MUST send the indicated Data Frame again immediately if possible. Retransmissions MAY be rate controlled based on the implementation.
  3. Senders MUST ignore NAKs for a particular Data Frame for a time after sending a retransmission.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|                    Frame Length (=header length)            |
    +---------------+---------------+-------------------------------+
    |   Version     |     Flags     |          Type (=0x02)         |
    +---------------+---------------+-------------------------------+
    |                          Session ID                           |
    +---------------------------------------------------------------+
    |                           Stream ID                           |
    +---------------------------------------------------------------+
    |                            Term ID                            |
    +---------------------------------------------------------------+
    |R|                        Term Offset                          |
    +---------------------------------------------------------------+
    |                            Length                             |
    +---------------------------------------------------------------+

  • Frame Length: (32) Value is 28.
  • Version: (8) Current version is 0.
  • Flags: (8) Reserved.
  • Type: (16) HDR_TYPE_NAK
  • Session ID: (32) Session for retransmission.
  • Stream ID: (32) Stream for retransmission.
  • Term ID: (32) Term ID for the Term to request retransmission for.
  • Term Offset: (31) Term Offset being requested.
  • Length: (32) Length of data being requested in bytes.
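
As a sketch of the receiver side, the code below finds the first missing byte range in a Term and encodes a NAK for it. The gap-detection strategy and the names `first_gap` and `encode_nak` are assumptions for illustration; only the 28-byte frame layout comes from the diagram above.

```python
import struct

# Frame Length, Version, Flags, Type, Session ID, Stream ID, Term ID,
# Term Offset, Length: 28 bytes in total.
NAK_FRAME = struct.Struct("<iBBH5i")
HDR_TYPE_NAK = 0x0002

def encode_nak(session_id, stream_id, term_id, term_offset, length):
    """Build a NAK Frame requesting `length` bytes starting at `term_offset`."""
    return NAK_FRAME.pack(NAK_FRAME.size, 0, 0, HDR_TYPE_NAK,
                          session_id, stream_id, term_id, term_offset, length)

def first_gap(received, high_water_mark):
    """Find the first missing (offset, length) below high_water_mark.

    `received` is a list of (term_offset, length) pairs seen for one Term.
    """
    expected = 0
    for offset, length in sorted(received):
        if offset > expected:
            return expected, offset - expected
        expected = max(expected, offset + length)
    if expected < high_water_mark:
        return expected, high_water_mark - expected
    return None
```

A heartbeat or later Data Frame supplies the high-water mark, which is how a gap at the tail of the received data becomes visible.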

Status Messages

Aeron's Status Messages are used for both flow and congestion control. These are used to control feedback from subscriptions to publications as well as monitoring and status indicators. Users can utilize different strategies for how subscriptions send SMs and how the Receiver Window is managed.

Flow control in Aeron is Stream specific. When used over UDP or non-rate controlled media, each Stream has, in effect, a different QoS.

Aeron publications are dumb in that they only send as much as the subscriptions will allow on each Stream at any one time.

For more information on flow control, please see here.

Receiver Window

Central to the design of Aeron's flow and congestion control is the receiver message window. This is the number of bytes that a subscription is willing to immediately receive. A value of 0 means no data. A value of 1000 means 1000 bytes. This number does NOT count any retransmissions. This window is essentially the same flow control window used in TCP and can be managed in a similar manner.

On Aeron stream setup, Receivers send initial SMs to set the initial window length. The suggested initial window length is no more than 1/4 of the Term length in bytes, but enough to fit a single Data Frame. A media driver MAY be configured to set the initial window to 0 to prevent a publication from sending immediately.
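
The window rules above can be sketched as follows. This is an illustration under stated assumptions: positions are treated as absolute byte counts within the Stream, the sender's limit is taken to be the consumed position plus the advertised window (as in TCP), and both function names are hypothetical.

```python
def initial_window(term_length, configured_max=None):
    """Suggested initial window: at most a quarter of the Term length."""
    limit = term_length // 4
    return limit if configured_max is None else min(limit, configured_max)

def bytes_sendable(sent_position, consumed_position, receiver_window):
    """How many new bytes the sender may transmit right now.

    Retransmissions are not counted against the window.
    """
    limit = consumed_position + receiver_window
    return max(0, limit - sent_position)
```

A window of 0 yields `bytes_sendable(...) == 0`, which is how a media driver can hold a publication back at setup.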

Status Message Header

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|                 Frame Length (=header + data)               |
    +---------------+-+-------------+-------------------------------+
    |   Version     |S|    Flags    |          Type (=0x03)         |
    +---------------+-+-------------+-------------------------------+
    |                          Session ID                           |
    +---------------------------------------------------------------+
    |                           Stream ID                           |
    +---------------------------------------------------------------+
    |                      Consumption Term ID                      |
    +---------------------------------------------------------------+
    |R|                  Consumption Term Offset                    |
    +---------------------------------------------------------------+
    |                        Receiver Window                        |
    +---------------------------------------------------------------+
    |                          Receiver ID                          |
    |                                                               |
    +---------------------------------------------------------------+
    |                  Application Specific Feedback               ...
   ...                                                              |
    +---------------------------------------------------------------+

  • Frame Length: (32) Value is 36 + length of application specific data (if present)
  • Version: (8) Current version is 0.
  • Flags: (8) Reserved.
    • (S)ETUP: SETUP Flag
  • Type: (16) HDR_TYPE_SM
  • Session ID: (32) Session that SM pertains to.
  • Stream ID: (32) Stream that SM pertains to.
  • Consumption Term ID: (32) Term ID of last byte of complete data consumed by subscribers.
  • Consumption Term Offset: (31) Term Offset of last byte of complete data consumed by subscribers.
  • Receiver Window: (32) Subscription advertised window.
  • Receiver ID: (64) ID for this Receiver. Should be as unique as possible across media drivers (UUID). A media driver may use a single ID.
  • Application Specific Feedback: (Varies) Application feedback piggybacked with response. Not interpreted by Aeron. MAY not be supported by implementation.
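
A minimal sketch of encoding and decoding the Status Message drawn above. Summing the fixed fields in the diagram, including the 64-bit Receiver ID, gives 36 bytes before any application specific feedback. The S flag is assumed to be the most significant bit of the Flags byte, and all names are illustrative.

```python
import struct
from collections import namedtuple

# Frame Length, Version, Flags, Type, Session ID, Stream ID, Consumption Term ID,
# Consumption Term Offset, Receiver Window, Receiver ID (64-bit).
SM_HEADER = struct.Struct("<iBBH5iq")
HDR_TYPE_SM = 0x0003
SETUP_FLAG = 0x80  # S: assumed to be the top bit of Flags, as drawn

StatusMessage = namedtuple(
    "StatusMessage",
    "flags session_id stream_id consumption_term_id consumption_term_offset "
    "receiver_window receiver_id feedback",
)

def encode_sm(session_id, stream_id, term_id, term_offset, window,
              receiver_id, feedback=b"", setup=False):
    """Build a Status Message, optionally with piggybacked feedback bytes."""
    flags = SETUP_FLAG if setup else 0
    frame_length = SM_HEADER.size + len(feedback)
    return SM_HEADER.pack(frame_length, 0, flags, HDR_TYPE_SM, session_id,
                          stream_id, term_id, term_offset, window,
                          receiver_id) + feedback

def decode_sm(buf):
    """Decode a Status Message that starts at the beginning of buf."""
    (frame_length, _version, flags, _type, session_id, stream_id, term_id,
     term_offset, window, receiver_id) = SM_HEADER.unpack_from(buf)
    feedback = bytes(buf[SM_HEADER.size:frame_length])
    return StatusMessage(flags, session_id, stream_id, term_id, term_offset,
                         window, receiver_id, feedback)
```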

WARNING: Applications may piggyback feedback to sources on SMs for application rate controls or other means. This data is unreliable. It will not be retransmitted if lost. This capability is implementation dependent and may not be supported.

RTT Measurement

Some transmission media, such as UDP, can have significant transmission delay as well as queuing delay. This delay may change rapidly during operation. Applications, as well as some handling of congestion control, may desire to measure RTT during operation and adjust behavior based on how RTT changes. A special frame is used for measuring RTT during operation.

RTT Measurement Header

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|               Frame Length (=header length)                 |
    +---------------+-+-------------+-------------------------------+
    |   Version     |R|    Flags    |          Type (=0x06)         |
    +---------------+-+-------------+-------------------------------+
    |                          Session ID                           |
    +---------------------------------------------------------------+
    |                           Stream ID                           |
    +---------------------------------------------------------------+
    |R|                      Echo Timestamp                         |
    |                                                               |
    +---------------------------------------------------------------+
    |R|                     Reception Delta                         |
    |                                                               |
    +---------------------------------------------------------------+
    |                          Receiver ID                          |
    |                                                               |
    +---------------------------------------------------------------+

  • Frame Length: (32) Value is 40.
  • Version: (8) Current version is 0.
  • Flags: (8) Reserved.
    • (R)eply: Generate reply Flag
  • Type: (16) HDR_TYPE_RTTM
  • Session ID: (32) Session that RTT Measurement pertains to.
  • Stream ID: (32) Stream that RTT Measurement pertains to.
  • Echo Timestamp: (64) Timestamp to echo in a reply or the timestamp in the original RTT Measurement.
  • Reception Delta: (64) Time in nanoseconds between receiving original RTT Measurement and sending Reply RTT Measurement.
  • Receiver ID: (64) ID for this Receiver. Should be as unique as possible across media drivers (UUID). A media driver may use a single ID. May be 0 to signify all receivers.
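
A sketch of the 40-byte RTT Measurement frame drawn above, together with a helper that builds a reply from a request. It assumes little-endian fields and that the R flag is the most significant bit of the Flags byte; all names are hypothetical.

```python
import struct
from collections import namedtuple

# Frame Length, Version, Flags, Type, Session ID, Stream ID,
# Echo Timestamp, Reception Delta, Receiver ID (three 64-bit fields).
RTTM_FRAME = struct.Struct("<iBBHiiqqq")
HDR_TYPE_RTTM = 0x0006
REPLY_FLAG = 0x80  # R: assumed to be the top bit of Flags, as drawn

Rttm = namedtuple(
    "Rttm",
    "frame_length version flags type session_id stream_id "
    "echo_timestamp_ns reception_delta_ns receiver_id",
)

def encode_rttm(session_id, stream_id, echo_timestamp_ns, reception_delta_ns,
                receiver_id, reply_requested):
    """Build an RTT Measurement frame, setting R when a reply is requested."""
    flags = REPLY_FLAG if reply_requested else 0
    return RTTM_FRAME.pack(RTTM_FRAME.size, 0, flags, HDR_TYPE_RTTM, session_id,
                           stream_id, echo_timestamp_ns, reception_delta_ns,
                           receiver_id)

def decode_rttm(buf):
    """Decode an RTT Measurement frame into an Rttm tuple."""
    return Rttm(*RTTM_FRAME.unpack_from(buf))

def make_reply(request, reception_delta_ns):
    """Echo the timestamp and Receiver ID of a request, with R cleared."""
    return encode_rttm(request.session_id, request.stream_id,
                       request.echo_timestamp_ns, reception_delta_ns,
                       request.receiver_id, reply_requested=False)
```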

Measuring RTT From Sender to Receiver

A Sender may measure RTT to a specific Receiver by sending an RTT Measurement frame to the Receiver via the Data channel with the R flag set and the Receiver ID set to the receiver that is to reply. The echo timestamp in the RTT Measurement should be the time of sending the RTT Measurement in nanoseconds from a useful epoch.

A Receiver upon receiving an RTT Measurement with the R flag set with a valid session Id, stream Id, and receiver Id, should send an RTT Measurement frame without the R flag set that echoes the Echo Timestamp field and Receiver ID field. The Reception Delta field should hold the time (in nanoseconds) between original RTT Measurement frame reception and generation and sending of reply RTT Measurement frame or 0 to indicate no significant time has elapsed.

Measuring RTT from Receiver to Sender

A Receiver may measure RTT to a Sender by sending an RTT Measurement frame to the Sender via the Control channel with the R flag set and the Receiver ID set to the receiver to reply to. The echo timestamp in the RTT Measurement should be the time of sending the RTT Measurement in nanoseconds from a useful epoch.

A Sender upon receiving an RTT Measurement with the R flag set with a valid session Id, and stream Id, should send an RTT Measurement frame without the R flag set that echoes the Echo Timestamp field and Receiver ID field. The Reception Delta field should hold the time (in nanoseconds) between original RTT Measurement frame reception and generation and sending of reply RTT Measurement frame or 0 to indicate no significant time has elapsed.

Rate Limiting RTT Measurement

To avoid possible reflection attacks, RTT Measurements with the R flag set should be ignored for a short time after being processed by a Sender or Receiver.
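
The exchange and rate limiting described above can be sketched as follows. The RTT formula (time of reply arrival, minus the echoed timestamp, minus the Reception Delta) is an inference from the field definitions rather than something the specification states outright, and the one-second ignore interval is an arbitrary assumption.

```python
def round_trip_time_ns(now_ns, echo_timestamp_ns, reception_delta_ns):
    """RTT seen by the originator when a reply arrives at now_ns.

    The time the peer held the request (Reception Delta) is subtracted so that
    only time spent in transit is counted.
    """
    rtt = now_ns - echo_timestamp_ns - reception_delta_ns
    if rtt < 0:
        raise ValueError("inconsistent timestamps")
    return rtt

class ReplyLimiter:
    """Ignore R-flagged measurements for a short time after one is processed."""

    def __init__(self, min_interval_ns=1_000_000_000):  # assumed interval
        self.min_interval_ns = min_interval_ns
        self.last_processed_ns = None

    def should_process(self, now_ns):
        recent = (self.last_processed_ns is not None
                  and now_ns - self.last_processed_ns < self.min_interval_ns)
        if recent:
            return False
        self.last_processed_ns = now_ns
        return True
```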

One Way Latency Measurement

The RTT Measurement Header above may be used for measuring one way latency. In this case, the Sender bursts out a set of RTT Measurement frames with a new measure in each one. The Receiver ID field is set to 0. A Receiver receiving this set can then estimate the clock drift of the Sender. After this burst, a periodic frame can be sent for measuring one way latency. In all cases, the use of Receiver ID 0 signifies the use of one way latency measurement.

The duration, size, frequency, and repetition of bursts for accurate measurement and tracking of Sender clocks is left up to the implementation. But it should be noted that for initial measurement, data frames should NOT be sent at high rate.

Error Header

Aeron uses a generic error handling method similar to ICMP errors like Destination Unreachable, etc. The first set of bytes from the offending frame is included in the error message.

The sending of Error Headers is implementation dependent.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |R|                 Frame Length (varies)                       |
    +---------------+---------------+-------------------------------+
    |   Version     |   Error Code  |         Type (=0x04)          |
    +---------------+---------------+-------------------------------+
    |R|             Frame Length of Offending Header (varies)       |
    +---------------------------------------------------------------+
    |                      Offending Header                        ...
    +---------------------------------------------------------------+
   ...                                                              |
    +---------------------------------------------------------------+
    |                         Error String                         ...
    +---------------------------------------------------------------+
   ...                                                              |
    +---------------------------------------------------------------+

  • Frame Length: (32) Value varies based on length of Offending Header, length of Error String, and length of Error Header itself.
  • Version: (8) Current version is 0.
  • Error Code: (8) Type of Error. May be specific to the Offending Header contents.
  • Type: (16) HDR_TYPE_ERR
  • Frame Length of Offending Header: (32) Length of the Offending Header
  • Offending Header: (Varies) Frame Header that generated error. Does not include any Data.
  • Error String: (Optional, Varies) Human-readable string for error. Length determined by Frame Length - Frame Length of Offending Header - Error Header length.
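
The length relationship in the Error String field can be sketched as below. The fixed part of the Error Header is 12 bytes as drawn (Frame Length, Version/Error Code/Type, Frame Length of Offending Header). UTF-8 encoding of the string is an assumption, as are the function names.

```python
import struct

# Frame Length, Version, Error Code, Type, Frame Length of Offending Header.
ERROR_HEADER = struct.Struct("<iBBHi")
HDR_TYPE_ERR = 0x0004

def encode_error(error_code, offending_header, error_string=""):
    """Build an Error frame carrying the offending header and optional text."""
    text = error_string.encode("utf-8")  # encoding is an assumption
    frame_length = ERROR_HEADER.size + len(offending_header) + len(text)
    return (ERROR_HEADER.pack(frame_length, 0, error_code, HDR_TYPE_ERR,
                              len(offending_header))
            + offending_header + text)

def decode_error(buf):
    """Return (error_code, offending_header, error_string) from an Error frame."""
    frame_length, _version, error_code, _type, offending_length = (
        ERROR_HEADER.unpack_from(buf))
    start = ERROR_HEADER.size
    offending = bytes(buf[start:start + offending_length])
    # Frame Length - Frame Length of Offending Header - Error Header length.
    string_length = frame_length - offending_length - ERROR_HEADER.size
    string_start = start + offending_length
    text = bytes(buf[string_start:string_start + string_length]).decode("utf-8")
    return error_code, offending, text
```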

Lifetime & Heartbeats

Streams must be reclaimed after a period of inactivity. Heartbeats are Frames sent with no data, but the highest sent Term ID and Term Offset. They keep the Stream alive, but do not contain new data and can leverage the existing duplicate detection logic. Heartbeat message:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |0|                    Frame Length (=0)                        |
    +---------------+-+-+-----------+-------------------------------+
    |    Version    |1|1| Flags(=0) |        Type (=0x01)           |
    +---------------+-+-+-----------+-------------------------------+
    |R|                        Term Offset                          |
    +---------------------------------------------------------------+
    |                          Session ID                           |
    +---------------------------------------------------------------+
    |                           Stream ID                           |
    +---------------------------------------------------------------+
    |                            Term ID                            |
    +---------------------------------------------------------------+

  • Frame Length: (32) Special value of 0.
  • Version: (8) Current version is 0.
  • Flags: (8) B=1, E=1, no other flags -> 0b11000000 = 0xC0 = 192.
  • Type: (16) HDR_TYPE_DATA
  • Term Offset: (31) Offset of the first byte of the frame header within the Term.
  • Session ID: (32) Session ID.
  • Stream ID: (32) Stream that message is for.
  • Term ID: (32) Term that message is for within the Stream.

Heartbeats also inform subscriptions of the highest sent Term ID and Term Offset for determining missing data and thus initiating loss handling.

Heartbeats are sent only in absence of application data to send.
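
A sketch of building and recognising the heartbeat drawn above: a Data Frame whose Frame Length is the special value 0 and whose B and E flags are both set. It follows the 24-byte layout of the heartbeat diagram; the function names are illustrative.

```python
import struct

# Frame Length, Version, Flags, Type, Term Offset, Session ID, Stream ID, Term ID.
HEARTBEAT = struct.Struct("<iBBHiiii")
HDR_TYPE_DATA = 0x0001
BEGIN_AND_END = 0b11000000  # B=1, E=1, i.e. 0xC0 or 192

def encode_heartbeat(term_offset, session_id, stream_id, term_id):
    """Build a heartbeat carrying the highest sent Term ID and Term Offset."""
    return HEARTBEAT.pack(0, 0, BEGIN_AND_END, HDR_TYPE_DATA,
                          term_offset, session_id, stream_id, term_id)

def is_heartbeat(buf):
    """A Data Frame with a Frame Length of 0 is treated as a heartbeat."""
    frame_length, _version, _flags, frame_type = struct.unpack_from("<iBBH", buf)
    return frame_type == HDR_TYPE_DATA and frame_length == 0
```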

Analysis

Network Utilisation

Assumptions: (1) UDP, and (2) ignore Ethernet. IP = 20 bytes, UDP = 8 bytes, Aeron = 32 bytes (Data Frame)

  • Payload of 256 bytes + 60 bytes overhead = 81.0% efficiency
  • Payload of 1024 bytes + 60 bytes overhead = 94.5% efficiency
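
The efficiency figures follow from payload / (payload + overhead) with the 60 bytes of overhead listed in the assumptions. A small sketch that reproduces the calculation, with values rounded to one decimal place:

```python
IP_HEADER = 20
UDP_HEADER = 8
AERON_DATA_HEADER = 32

def efficiency(payload_bytes):
    """Fraction of each datagram that is application payload."""
    overhead = IP_HEADER + UDP_HEADER + AERON_DATA_HEADER
    return payload_bytes / (payload_bytes + overhead)

for payload in (256, 1024):
    print(f"{payload} bytes: {efficiency(payload):.1%}")
```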

Epochs

The 31-bit unsigned (32-bit signed) data space per Term is exhausted roughly every 2 seconds at a constant rate of 1 GBps (8 Gbps). Thus Term IDs tick every 2 seconds. At that rate, the 32-bit Term ID space is exhausted after 2386092.9 hours, or about 272 years.
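
The arithmetic behind these figures can be reproduced with a short sketch. The quoted 2386092.9 hours corresponds to 2^32 Term IDs at 2 seconds each; the 31-bit case is printed alongside for comparison. The variable and function names are illustrative.

```python
TERM_BYTES = 2**31            # 31-bit Term Offset space
RATE_BYTES_PER_SECOND = 1e9   # 1 GBps

# Time to fill one Term at the constant rate, slightly over 2 seconds.
seconds_per_term = TERM_BYTES / RATE_BYTES_PER_SECOND

def hours_to_exhaust(term_id_bits, seconds_per_tick=2):
    """Hours until a Term ID space of the given width is used up."""
    return (2**term_id_bits * seconds_per_tick) / 3600

for bits in (31, 32):
    hours = hours_to_exhaust(bits)
    years = hours / (24 * 365)
    print(f"{bits}-bit Term IDs: {hours:.1f} hours, about {years:.0f} years")
```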

Aeron Over UDP

Aeron functions over UDP in one of three modes. The first is Unicast (or point-to-point) mode. The second is Multicast mode. The third is Multi-Destination-Cast mode. These modes pertain to the uni-directional flow of data. It is quite possible to set up a combined bi-directional connection that is mixed mode, with Unicast in one direction and Multicast in the other.

Aeron over UDP may pack more than one Frame into a single datagram as it sees fit, provided the Frames fit.

Channel designation for Aeron over UDP is based on the following URI scheme:

aeron:udp?[interface=local-interface[:local-port]]|endpoint=subscription-address:subscription-port[|control=explicit-control-address:control-port]

Subscription address and port mean slightly different things in unicast and multicast operation.
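
As a rough illustration of the URI scheme, the sketch below splits a channel URI into its parameters, treating `|` as the separator exactly as written in the scheme above. The function names are hypothetical and this is not the parser used by any Aeron implementation.

```python
PREFIX = "aeron:udp?"


def parse_channel(uri):
    """Return the key=value parameters of an aeron:udp channel URI as a dict."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not an aeron:udp channel: {uri!r}")
    params = {}
    for pair in uri[len(PREFIX):].split("|"):
        if not pair:
            continue  # tolerate an empty parameter list
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed parameter: {pair!r}")
        params[key] = value
    return params


def split_host_port(value):
    """Split 'address:port' into (address, port) for endpoint/control values."""
    host, sep, port = value.rpartition(":")
    if not sep:
        raise ValueError(f"missing port in {value!r}")
    return host, int(port)


channel = parse_channel(
    "aeron:udp?endpoint=192.168.0.4:4051|control=192.168.0.3:4050")
print(split_host_port(channel["endpoint"]))
print(split_host_port(channel["control"]))
```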

Unicast Mode of Operation

Aeron Receivers listen on a specific interface (IP address) and UDP port for Data Frames. Subscriptions send SMs and NAKs unicast back to the publication. Aeron Receivers must send SMs and NAKs back to the IP address of the source of the Data Frames.

Aeron Senders send Data Frames to the subscription IP address and UDP port. However, Senders may listen for SMs and NAKs on any port, including an ephemeral port.

Example URIs:

  1. aeron:udp?endpoint=192.168.0.3:4050: Subscription binds to 192.168.0.3 on port 4050. Publication sends data to 192.168.0.3 port 4050.
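
The unicast addressing rule (feedback goes back to wherever the data came from) can be illustrated with plain UDP sockets on the loopback interface. This is only a sketch of the addressing pattern: the payloads are placeholder bytes rather than real Aeron frames, and the function name is hypothetical.

```python
import socket


def unicast_round_trip(payload=b"data", reply=b"sm"):
    """Send one datagram to a bound receiver and return feedback sent to its source."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.settimeout(2.0)
        sender.settimeout(2.0)

        # The receiver binds to a known endpoint (any free loopback port here).
        receiver.bind(("127.0.0.1", 0))
        endpoint = receiver.getsockname()

        # The sender is never bound explicitly, so the OS picks an ephemeral port.
        sender.sendto(payload, endpoint)

        # The receiver learns the sender's address from the datagram itself
        # and addresses its feedback to that source.
        data, source = receiver.recvfrom(2048)
        receiver.sendto(reply, source)

        feedback, _ = sender.recvfrom(2048)
        same_port = source[1] == sender.getsockname()[1]
        return data, feedback, same_port
    finally:
        receiver.close()
        sender.close()
```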

Multicast Mode of Operation

Aeron senders and receivers send to and listen on a specific interface (IP address), a specific IP multicast address/group, and a destination UDP port for the data endpoint. Traffic on this endpoint is Data Frames. The IP multicast address must be odd.

Aeron senders and receivers send to and listen on a specific interface (IP address), a specific IP multicast address/group, and a destination UDP port for the control endpoint. Traffic on this endpoint is Status Messages and NAKs. The IP multicast address must be the next even address above the data endpoint multicast address.

For example, data on 224.10.9.7 pairs with control on 224.10.9.8.

Example URIs:

  1. aeron:udp?endpoint=224.10.9.7:4050: Receivers join IP multicast address 224.10.9.7 on destination port 4050 for data, and send SMs and NAKs to IP multicast address 224.10.9.8 destination port 4050. The Sender joins IP multicast address 224.10.9.8 on destination port 4050 for SMs and NAKs, and sends data to 224.10.9.7 destination port 4050.
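
The odd/even pairing of data and control addresses can be sketched with the standard ipaddress module. The function name is hypothetical, and only IPv4 is handled in this sketch.

```python
import ipaddress


def control_address(data_address):
    """Derive the control multicast address from an odd data multicast address."""
    data = ipaddress.IPv4Address(data_address)
    if not data.is_multicast:
        raise ValueError(f"{data} is not an IP multicast address")
    if int(data) % 2 == 0:
        raise ValueError(f"data multicast address {data} must be odd")
    control = data + 1  # the next even address above the data address
    if not control.is_multicast:
        raise ValueError(f"{control} falls outside the multicast range")
    return str(control)


print(control_address("224.10.9.7"))
```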

Multi-Destination-Cast Mode of Operation

In Multi-Destination-Cast (MDC) mode, a sender sends explicitly to a list of destinations and manages them as it would a multicast group, but uses unicast UDP instead of multicast addressing. This list of destinations may be controlled manually, with the publication adding and removing destinations, or dynamically, with destinations adding themselves and being removed due to inactivity.

Manual Mode

Aeron Receivers listen on a specific interface (IP address) and UDP port for Data Frames. Subscriptions send SMs and NAKs unicast back to the publication explicit control port. Aeron Receivers must send SMs and NAKs back to the IP address of the source of the Data Frames.

Aeron Senders send Data Frames to each added destination IP address and UDP port. Senders may listen for SMs and NAKs on the explicit control port.

Example URIs:

  1. aeron:udp?control=192.168.0.3:4050|control-mode=manual: Publication binds to 192.168.0.3 on port 4050 for control frames (SMs and NAKs).
  2. aeron:udp?endpoint=192.168.0.4:4051: Subscription binds to 192.168.0.4 on port 4051.
  3. aeron:udp?endpoint=192.168.0.5:4052: Subscription binds to 192.168.0.5 on port 4052.
  4. Publication API adds aeron:udp?endpoint=192.168.0.4:4051: Publication sends data to 192.168.0.4 port 4051.
  5. Publication API adds aeron:udp?endpoint=192.168.0.5:4052: Publication sends data to 192.168.0.4 port 4051 as well as 192.168.0.5 port 4052.

Dynamic Mode

Aeron Receivers listen on a specific interface (IP address) and UDP port for Data Frames. Subscriptions send SMs and NAKs unicast back to the publication explicit control port. Aeron Receivers must send SMs and NAKs back to the publication's explicit control IP address.

Aeron Receivers MUST periodically send specially crafted SMs to the control IP and port of the publication in the absence of SMs being generated by data traffic. These specific SMs MUST have session Id value of 0, stream Id value of 0, and the Elicit SETUP flag set.

Aeron Senders send Data Frames to IP address and UDP ports that send SMs eliciting SETUP frames. This is a destination. A destination is to be considered active as long as SMs are seen from the destination IP address and UDP port. If a timeout period elapses without receiving an SM, then that destination can be removed. Senders MUST listen for SMs and NAKs on the explicit control port.

Example URIs:

  1. aeron:udp?control=192.168.0.3:4050: Publication binds to 192.168.0.3 on port 4050 for control frames (SMs and NAKs).
  2. aeron:udp?endpoint=192.168.0.4:4051|control=192.168.0.3:4050: Subscription binds to 192.168.0.4 on port 4051. Publication sends data to 192.168.0.4 port 4051.
  3. aeron:udp?endpoint=192.168.0.5:4052|control=192.168.0.3:4050: Subscription binds to 192.168.0.5 on port 4052. Publication sends data to 192.168.0.5 port 4052.
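
The dynamic destination rules (a destination is active while SMs keep arriving, and can be removed after a timeout) can be sketched as a small liveness table. The class name, the 5 second timeout in the usage lines, and the explicit `now` argument are assumptions made for illustration; the specification does not fix a timeout value here.

```python
class DynamicDestinations:
    """Tracks destinations by the time their most recent SM was seen."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._last_sm = {}  # (ip, port) -> time of the most recent SM

    def on_status_message(self, source, now):
        """Record an SM from (ip, port); this adds or refreshes the destination."""
        self._last_sm[source] = now

    def active(self, now):
        """Drop destinations that have been silent too long, return the rest sorted."""
        expired = [dest for dest, seen in self._last_sm.items()
                   if now - seen > self.timeout_s]
        for dest in expired:
            del self._last_sm[dest]
        return sorted(self._last_sm)


destinations = DynamicDestinations(timeout_s=5.0)  # hypothetical timeout
destinations.on_status_message(("192.168.0.4", 4051), now=0.0)
destinations.on_status_message(("192.168.0.5", 4052), now=3.0)
print(destinations.active(now=4.0))  # both destinations are still within the timeout
print(destinations.active(now=7.0))  # the first has been silent for 7 s and is dropped
```

Passing the clock in as `now` keeps the sketch deterministic; a sender would normally supply its own monotonic time on each SM and each send.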

Aeron Over SHM (Shared Memory)

Aeron uses shared memory for passing messages from the publisher to the Sender for transfer over the network media to other machines. Messages are passed via the log buffers. This same mechanism is used for exchanging messages between publishers and subscribers on the same machine. Messages are exchanged as data frames written to the log. There is no need for control messages over SHM media.

TODO: Detail memory ordering semantics for the writing of messages.