CF should not require/hard-code use of the software bus for PDU transport #130

Open
jphickey opened this issue Dec 14, 2021 · 3 comments
@jphickey
Contributor

The CF application creates a stream of data PDUs during operation, which are intended to be (somehow) transported to the remote node. It is a point-to-point data flow.

Currently, CF assumes that the software bus will be used for this purpose. Although this is the existing/de-facto data transport mechanism provided by the framework, it is not an ideal fit at all. (The "when all you have is a hammer, everything looks like a nail" analogy applies here.)

  • The software bus is broadcast (1:N), whereas CF data flows are 1:1 in nature.
  • The software bus has relatively small buffers and is designed for low latency and memory efficiency, not for bulk throughput.
  • The software bus does not provide any back-pressure capability (e.g. when a sender like CF is producing PDUs faster than the receiver can process them). Nor is it really practical for it to do so, given its multicast (1:N) design - in a multicast, one would not stop sending just because one subscriber cannot keep up.
  • Similarly, forwarding/bridging data PDUs from the OS network buffers to software bus buffers effectively defeats any backpressure capability of the underlying network protocol. For example, if a TCP connection were used for node-to-node transport over the physical network, the protocol would throttle the sender to the rate at which the receiver actually accepts data, via ACKs and sliding windows. That rate is determined on the receive side by how deep the buffer inside the network stack is allowed to get; bridging the data onto SB keeps that buffer essentially empty, which gives the sender a green light to keep blasting data in. This makes it difficult, if not impossible, to tune the system for good throughput - the sender must be artificially held off without any real feedback.
  • The software bus is designed for commands and telemetry, and every message is assumed to be either a command or a telemetry message. Therefore CF must add a fake telemetry header on the PDUs it generates, and other entities must add a fake command header on the PDUs they generate, in order to maintain this pattern (or else software bus APIs will break). This extra header is just unnecessary baggage, because SB is not designed for bulk data - this is where the problem domain really gets contorted to look more like a nail so the SB hammer can work with it. See the sketch after this list.
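
To make that last point concrete, here is a minimal sketch of the wrapping the SB pattern forces. The struct, function, and message-ID names (`EXAMPLE_*`) are hypothetical, not the actual CF definitions; only the cFE calls themselves are real:

```c
/* Minimal sketch only: EXAMPLE_* names and the message ID are hypothetical. */
#include <string.h>
#include "cfe.h"

#define EXAMPLE_PDU_TLM_MID 0x08FF /* hypothetical MsgId for outgoing PDUs */

typedef struct
{
    CFE_MSG_TelemetryHeader_t TelemetryHeader; /* "fake" tlm header required by SB */
    uint8                     PduData[512];    /* raw CFDP PDU octets */
} EXAMPLE_PduTlm_t;

void EXAMPLE_SendPduOverSB(const uint8 *Pdu, size_t PduLen)
{
    CFE_SB_Buffer_t *BufPtr;
    size_t           MsgSize = sizeof(CFE_MSG_TelemetryHeader_t) + PduLen;

    /* Allocate an SB buffer big enough for the extra header plus the PDU */
    BufPtr = CFE_SB_AllocateMessageBuffer(MsgSize);
    if (BufPtr != NULL)
    {
        /* Initialize the telemetry header - baggage the PDU itself does not need */
        CFE_MSG_Init(&BufPtr->Msg, CFE_SB_ValueToMsgId(EXAMPLE_PDU_TLM_MID), MsgSize);

        /* Copy the PDU payload in after the header */
        memcpy(((EXAMPLE_PduTlm_t *)BufPtr)->PduData, Pdu, PduLen);

        /* Broadcast on the software bus - 1:N, no backpressure */
        CFE_SB_TransmitBuffer(BufPtr, true);
    }
}
```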

While there may be valid reasons to use the software bus as a backhaul, it is certainly less than ideal and shouldn't be the only (hard-coded, forced) option. There should be a mode that uses an I/O layer and goes directly to the network, which would solve many of the throughput and performance-tuning issues, as well as simply being a far cleaner design.
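
A rough sketch of what such a pluggable backhaul could look like (all names are illustrative and do not exist in CF today): PDU input/output goes through a small interface that either an SB-based or a socket-based implementation can fill in, so the SB path becomes just one option rather than the hard-coded one.

```c
/* Illustrative only - a hypothetical transport interface, not current CF code. */
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    /* Send one outgoing PDU toward the remote entity. A socket-based
     * implementation can block or report "busy", giving CF real backpressure. */
    int32_t (*Send)(const void *Pdu, size_t PduLen);

    /* Poll for one received PDU; returns the number of octets copied, 0 if none. */
    size_t (*Receive)(void *Buffer, size_t BufferSize);
} EXAMPLE_PduTransport_t;

/* CF would be configured against one implementation per channel, e.g.: */
extern const EXAMPLE_PduTransport_t EXAMPLE_SbTransport;  /* today's SB behavior */
extern const EXAMPLE_PduTransport_t EXAMPLE_UdpTransport; /* direct socket I/O   */
```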

If anything, using SB for bulk data backhaul should be the undesirable fallback option (if nothing better exists) rather than the primary/only option.

@jphickey
Contributor Author

As far as I can tell, the CF requirements currently only call for use of SB as its backhaul (e.g. CF2001: "The CF application shall extract uplinked CFDP PDUs from SB messages.").

This is fine for slow/lazy/background data transfers, but for all the reasons listed above, this requirement may have to change if performance and maximizing throughput become important to users. SB simply isn't designed for bulk data transfer or maximum throughput.

@skliper
Contributor

skliper commented Dec 15, 2021

Agree completely, although currently the CF app is, by design, the slow/lazy/background sort of data transfer. There may be something entirely different for a system intended to do high-performance data transfers (hardware/firmware acceleration, etc.). Data recorder dumps and similar aren't the ideal use case for the current design. It'd be nice if CF had a clean break between the SB layer and PDU processing such that it could be replaced, but really I'd think much of that wouldn't end up in software if it's truly a throughput issue.

@skliper skliper added this to the Draco milestone Jul 5, 2022
@dmknutsen dmknutsen assigned ghost Jul 19, 2022
@skliper
Contributor

skliper commented Aug 9, 2022

@acudmore - I think for this one, as long as the software bus use is well contained such that it could fairly easily be replaced, that should be sufficient this round. It would be good to remove any SB message artifacts relating to PDU transfer from the global state that would preclude replacement or make things more complex.
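
For instance (purely hypothetical types, not a proposal for actual structure names), the channel/global state could hold a transport-neutral PDU buffer rather than an SB-specific CFE_SB_Buffer_t, so nothing in the PDU-processing path depends on SB message headers:

```c
/* Illustrative only: a transport-neutral PDU buffer as it might appear in
 * CF channel/global state, with no CCSDS command or telemetry header. */
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_PDU_SIZE 512 /* hypothetical size limit */

typedef struct
{
    size_t  Length;                     /* number of valid octets in Data */
    uint8_t Data[EXAMPLE_MAX_PDU_SIZE]; /* raw CFDP PDU */
} EXAMPLE_PduBuffer_t;
```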
