Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PSA: Mention absence of, and/or future consideration of, congestion-based drop mechanisms #497

Open
jafingerhut opened this issue Nov 19, 2017 · 8 comments

Comments

@jafingerhut
Copy link
Collaborator

jafingerhut commented Nov 19, 2017

By this I mean that PSA currently has nothing (or very little) that would allow someone to implement something like Approximate Fair Drop (AFD), Random Early Detection (RED, WRED), etc. It specifies nothing that allows one to observe how deep queues are before making a drop or forward decision during ingress processing.

I am not saying that you couldn't take elements defined in PSA and create something like that out of it, e.g. maybe you could use a meter to roughly approximate the current depth of a queue for a constant bit rate output port. But that would not handle things like multiple class of service queues for that output port, or delays in sending traffic to the port due to features like Ethernet pause flow control. However, all the ways I am aware of would be on the order of work-arounds to the lack of visibility in such state of the underlying system.

Such capabilities could certainly be defined in PSA in the future, of course.

@samar-abdi
Copy link

There are use cases that require setting the ECN bits if a packet experienced congestion at the switch. A naive option might be to define a congestion bit in the egress input metadata. This bit is set if the packet experienced congestion in the PRE. The congestion parameters can be set by the PRE API. This metadata can be used in the egress to mark the ECN bits.

@jnfoster
Copy link
Collaborator

My preference would be to work out a general solution (e.g., read queue depths) than to bake in a congestion bit with target-specific semantics into PSA, even if the practical uses for the latter are clear and immediate. But as Andy's note indicates, this is not trivial...

@samar-abdi
Copy link

Yes, 1 bit is naive, but I would also be cautious against doing too much analysis in the dataplane: both for the sake of portability and performance. We can do very little in the egress: drop the packet or set some header field (ECN usually, but these might be user-defined in the future) to notify the destination host.

Another strawman is for the congestion metadata type to be a bit vector with reserved PSA-defined values (and room for growth). The problem is that today we cannot write a PSA-targeted P4 program that can do congestion notification.

We can have a lot more flexibility in the PRE API using which the controller can inspect the state of the queues and potentially take some congestion control steps. Similarly, we can also make the configuration parameters in PRE API more sophisticated so that the congestion metadata is set on different thresholds for different class of service, for example.

@jafingerhut
Copy link
Collaborator Author

PSA does already define ingress and egress timestamps, so if congestion experienced can reasonably be derived from (egress_timestamp - ingress_timestamp) for a packet, that is available. I have nothing against adding visibility to queue depths that a packet went through -- just that there has not been strong enough interest yet in adding it to PSA. We can always bring it up again in the meeting to see if that may have changed, with the P4 Apps working group getting further in defining INT.

@cc10512
Copy link
Contributor

cc10512 commented Jan 17, 2018

Is (egress_timestamp - ingress_timestamp) platform specific? I should think it depends on the speed at which the ports are configured, which we have managed to avoid until now. Even the timestamp resolution is an issue that we have not yet resolved.

I agree with @jafingerhut that this should be a topic of collaboration with the P4 Apps WG and we should reach out to them to see what are they thinking, for both of these issues.

@jnfoster
Copy link
Collaborator

jnfoster commented Jan 17, 2018 via email

@jafingerhut
Copy link
Collaborator Author

Even though (egress_timestamp - ingress_timestamp) has platform-dependent units (which is what PSA defines now), if there are congestion-detection algorithms that can be implemented solely in terms of latency that a packet experiences through a device, then the control plane can configure time thresholds in devices in their own custom units, if needed. That configuration could also be done independently per port, if desired, e.g. via a table with egress_port as a search key.

Using a Register instance or two, someone could even implement time-based averaging over many packets in the same 'flow', instead of only using the current packet's latency, which might be useful. (e.g. RED is based on time-averaged queue lengths rather than instantaneous queue lengths, to smooth out the effect of traffic bursts)

I am not saying that queue lengths are useless. I just wanted to point out that even though timestamp units are platform-dependent, that does not automatically make them useless for this purpose.

@jafingerhut jafingerhut self-assigned this Jan 21, 2018
@jafingerhut
Copy link
Collaborator Author

FYI the latest draft of the PSA specification has a few sentences in the "Open Issues" section describing possible future additions to PSA for enabling ECN to be implemented. I plan to leave this issue open as a reminder for future discussion on this topic, until we actually have a future PSA spec that has a way to implement these kinds of things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants