Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The definition of UNSAFE options should not be limited to options that modify user data #23

Closed
Mike-Heard opened this issue Nov 26, 2023 · 49 comments

Comments

@Mike-Heard
Copy link

Mike-Heard commented Nov 26, 2023

Tom Herbert has asserted that "an UNSAFE option should be any option that MUST be processed for the packet to processed correctly." Joe Touch disagrees, and contends that only those options that modify the user data should be considered UNSAFE.

I agree with Tom, and I propose that the text in draft-ietf-tsvwg-udp-options-28 Section 6 and Section 11 be modified as follows:

s/modify user data/modify user data (or the semantics thereof)/

See also https://mailarchive.ietf.org/arch/msg/tsvwg/JPoH-JsrObcQbdWfRDNmIThyOoY/ and other discussion in that mailing list thread.

@gorryfair
Copy link

I think this is helpful insight for me, and I would support a slight refocus to describe UNSAFE as based on the content OR /AND the processing of the surplus area, since such a datagram is processed correctly ONLY when the UDP Option is processed.

@jtouch
Copy link

jtouch commented Dec 21, 2023

I disagree; it is a fundamental design rule that options are optional for legacy receivers except where that's not possible - the only place where it's not possible is where user data has been altered by the options.

So far, the only rationale to relax that definition and use UNSAFE to "force" option processing seems based on AUTH and ACS; in both cases, AUTH or ACS could fail even when the data is valid, as when UDP content were changed en-route (similar to how some NATs translated content to account for in-band addresses). The key is that there is NO reason, IMO, that use of AUTH or ACS should not be backward compatible with legacy receivers - which is what happens if they're considered UNSAFE.

All option processing is always receiver opt-in; keeping these as SAFE allows that to be consistent across all options.

@Mike-Heard
Copy link
Author

There was another use case for UNSAFE with options that do not modify data but which must be known for the receiver to properly process that data that was in the reverenced mailing list message:

Consider, for instance, the proposal in draft-daiya-tsvwg-udp-options-protocol-number to define a UDP option that identifies an upper layer protocol implemented on top of UDP. It would be used when the UDP port number does not suffice for that purpose.. If that option is transmitted in the surplus area to a legacy UDP receiver, it will be ignored, with the possible result of dispatching the received packets to the wrong upper layer protocol. This is every bit as unsafe as dispatching UDP fragments or encrypted UDP packets to a legacy endpoint that does not understand them.

If that does not prove the need for the proposed change the skeptics, I believe that we are at an impasse on this issue, and I suggest that the chairs need to make a WG consensus call to settle the matter.

@jtouch
Copy link

jtouch commented Dec 26, 2023 via email

@jtouch
Copy link

jtouch commented Dec 26, 2023 via email

@tompandadev
Copy link

Riterating my points on the list why options such as ACS and Auth should be UNSAFE even though then don't explictily modify data.

  1. Applying the dictionary definitions of "unsafe" and "safe", a SAFE options wold be one that can be ignored without harmful effects and an UNSAFE option is one that cannot be ignored lest ignoring the option has the potential to bring harm. Harmful effects are not limited to just options that explicitly modify data.

  2. If an ACS is present in a packet is ignored by receiver then that is potentially harmful. The harm in this case is silently accepting corrupted data that woud have otherwise been detected had the receiver checked the CRC. Silent data corruption is harmful and cost the user time and money. Ignoring the Auth option carries a similar risk.

  3. Ignoring CRC and Auth the at a receiver is also hearmful to the sender. The computations for CRC or secure hash require significant comoute resources, when a receiver ignores these then the sender is wasting these resources with no benefit other than providing a false sense of safety.

  4. An established convention of IETF protcols is that intergrity checks and authentication may be optional for senders, but are not optional for receivers to check (this is established to avoid the harmful effects mentioned above and the fact that the sender is in the best position to evaluate whether optional checksums or CRCs are needed in packet);. If a checksum, CRC, or authentication is present in a packet is expected that a receiver validates it before accepting a packet. This is true for nearly all IETF protocols that define such constructs; UDP options draft is inconsistent with other IETF protocols in this regard. In particular, it's true for the UDP checksum. From RFC1122: "If a UDP datagram is received with a checksum that is non-zero and invalid, UDP MUST silently discard the datagram.", In the UDP options draft it's mentioned that the ACS is an strongly alternative to the UDP checksum, if a receiver may arbitraily ignore it then it is not a valid alterntive to UDP checksum and is not stronger than a UDP checksum.

  5. This has other effects, for instance RFC6936 might allow a the UDPv6 to be zero if the inner packet is covered by it's own integrity check. The ACS would be insufficient to meet this requirement since it can be ignored. So if a sender sets ACS in a UDPv6 tunneled packet they still need to set the UDP checksum. If the ACS is an UNSAFE option then it would be sufficient for meeting this requirement from RFC6936.

@jtouch
Copy link

jtouch commented Dec 28, 2023

The only possibly new point here is the definition of UNSAFE. The doc already makes it clear that this is relative to a legacy receiver, thus it is correct as defined by dictionaries.

I've already addressed other points, including the reason why that context (relative to legacy) is the appropriate driving factor.

@tompandadev
Copy link

tompandadev commented Dec 28, 2023 via email

@jtouch
Copy link

jtouch commented Dec 28, 2023

The issue was raised when the original design principles were presented. AT a minimum, it is directly addressed in the draft in principle 6:

  1. The UDP option mechanism and UDP options themselves should
    default to the same behavior experienced by a legacy receiver.

That principle was presented before the draft became a WG doc, notably in IETF 94 in 2015:

(UDP options) MUST be silently ignorable by legacy receivers

This thread also has several places where I have already addressed this.

In summary:

  • UNSAFE was never intended to force receivers to process an option whose content could be used by legacy endpoints
  • Forcing more options on receivers only serves to break legacy endpoints unnecessarily (while they can detect on-path errors for option-aware endpoints, they ALWAYS break ALL legacy endpoints)
  • those checksums CAN detect path errors, but they also can be falsely triggered by on-path rewriting

Joe

@tompandadev
Copy link

tompandadev commented Dec 28, 2023 via email

@jtouch
Copy link

jtouch commented Dec 29, 2023 via email

@tompandadev
Copy link

tompandadev commented Dec 29, 2023 via email

@jtouch
Copy link

jtouch commented Mar 3, 2024

Updated based on discusson on Feb 19 with Gorry Fairhurst and Mike Heard:
ACS and AUTH remain SAFE because they do not modify UDP user data and could be safely used with legacy endpoints. The default is to report errors but otherwise ignore because legacy apps and endpoints cannot do otherwise.
The default can be overridden, though (it is a default).
Addressed in -29.

@Mike-Heard
Copy link
Author

I think that the substantive issues are addressed and that the definition in -31 Section 12 is fine:

UNSAFE options are not safe to ignore [...]

However, I still see in -31 Section 10:

Options indicated by Kind values in the
range 192..255 are known as UNSAFE options because they do alter the
UDP data payload and thus would interfere with legacy endpoints.

I would still like to see the following change to the text in Section 10:

s/do alter the user data/modify the UDP data payload (or the semantics thereof)/

@tompandadev
Copy link

tompandadev commented Mar 5, 2024 via email

@jtouch
Copy link

jtouch commented Mar 5, 2024 via email

@jtouch
Copy link

jtouch commented Mar 5, 2024

Tom,

Your 5 concerns raised on 12/28 were considered but did not raise new issues.

We did add the explanation of legacy applications that rely on NAT-like rewriting as a case where the current default creates the intended effect. Protecting default UDP behavior has always been the primary and most significant consideration.

@tompandadev
Copy link

tompandadev commented Mar 5, 2024 via email

@jtouch
Copy link

jtouch commented Mar 5, 2024 via email

@tompandadev
Copy link

tompandadev commented Mar 5, 2024 via email

@jtouch
Copy link

jtouch commented Mar 5, 2024

Tom,

I agree that documentation needs to highlight these issues. As I noted, such documentation (RFC or otherwise) would be something we should expect option users to read, but cannot expect legacy designers to have read.

We tell people to read the spec every day and twice on Tuesday, so to speak. Implementers get it wrong all the time, and some do so deliberately/defiantly. There's no utility in optimizing "laws for the lawless", though.

This default is designed to protect legacy users. That comes at an expense to new users, but that's the trade that is knowingy being made here.

Joe

@tompandadev
Copy link

tompandadev commented Mar 5, 2024 via email

@jtouch
Copy link

jtouch commented Mar 5, 2024

Tom,

No, not all users of TCP read RFC793, but they really ought to read the man pages about TCP socket opts and maybe a book before the start fiddling around with options. We can't fix ignorance except with education. Protocols are not child-safe.

Yes, I'm saying that we cannot have silent, default, desired behavior for both legacy and new users - they want different things. And legacy users prevail for legacy protocols.

Every option is unsafe by your definition - any time a sender insists that the receiver accept it. That's not how protocols work - that's not how TCP works. Senders don't force receivers to accept options - they offer, and the receiver decides. We're doing on a packet level what TCP does at the connection establishment level - let the receiver decide.

But we're rehashing old arguments; it's clear we disagree. But a few others discussed this and this is what rough consensus looks like. It's not unanimous.

Joe

@tompandadev
Copy link

tompandadev commented Mar 5, 2024 via email

@jtouch
Copy link

jtouch commented Mar 5, 2024

I did not declare rough consensus; I note this is an example of what it looks like.

I didn't decide how to move forward; see my initial note indicating how this update came about:

Updated based on discusson on Feb 19 with Gorry Fairhurst and Mike Heard.

Gorry is one of the chairs.

Consensus call can be done on the list, where we all can participate, not just those who attend in person.

@Mike-Heard
Copy link
Author

With some reluctance, I am reopening this issue owing to the lengthy discussion that has taken place after closed it. That being said, most of the recent comments actually are not really addressed to this issue at all, but rather to Issue Issue #21.

I'll respond to this comment, which is directly relevant to this issue, in due course.

Thank you.

@Mike-Heard Mike-Heard reopened this Mar 5, 2024
@tompandadev
Copy link

tompandadev commented Mar 5, 2024 via email

@Mike-Heard
Copy link
Author

[ ... ]
Updated based on discusson on Feb 19 with Gorry Fairhurst and Mike Heard.

I can't find this discussion in the tsvwg archives nor in this bug report.
Where did it happen?

Skype call between the co-chair and the document editor, which I was invited to join.

Mike Heard

@moeller0
Copy link

Does a private skype call allow the desired public scrutiny of the sausage making inherent in an IETF RFC?

@moeller0
Copy link

I think that moving the discussion off list to github issues does mainly reduce the number of folks contributing which IMHO might speed up the process, but does it improve the quality of the final product? And does it foster the generation and assessment of (qualified) consensus?

@jtouch
Copy link

jtouch commented Mar 10, 2024

RFC sausage making happens off-list and off-meeting all the time, unless you're prepared to record and post every hallway chat and lunch discussion at the IETF itself, not to mention other subsets, such as with document shepherds, chairs, and authors that happen all the time.

Github is linked on the TSVWG main pages and open to the public. A weekly digest is posted to the list as well, as just occurred this morning.

I believe github has improved the process, notably when there are many open issues for a document. It encourages separate threads of discussion that are much easier for everyone to track and participate in. Any one of us can open and close a github issue, but the actual changes are visible in the posted draft versions, even if github isn't being tracked.

Consensus remains determined by the WG chair, as usual.

@Mike-Heard
Copy link
Author

Does a private skype call allow the desired public scrutiny of the sausage making inherent in an IETF RFC?

It is no different than the occasional use of off-list emails or -- as mentioned above -- hallway discussions. Sometimes such discussions can bring technical issues that are blocking consensus into sharper relief than mailing list exchanges, where subtleties have a way of getting buried. The focus of the skype call was to close as many open github issues as possible. Please look at the diff from -28 to -31 and the github archive to judge the success of that effort and to frame your comments if you don't agree with the outcome.

I think that moving the discussion off list to github issues does mainly reduce the number of folks contributing which IMHO might speed up the process, but does it improve the quality of the final product? And does it foster the generation and assessment of (qualified) consensus?

It's probably true that fewer folks see the discussion when it takes place on github, and it takes extra effort to track it. On the other had, keeping the threads is focused is very useful. Unfortunately, we are violating that objective now by discussing something other that the subject of this issue, which is "The definition of UNSAFE options should not be limited to options that modify user data." I shall return to that issue momentarily, and I implore folks to stay on topic going forward. Issues specific to APC and AUTH should be discussed under Issue #21. Thank you.

@gorryfair
Copy link

We're heading to a WGLC - people implementing the spec or designing protocols that will use this spec will have a chance to comment on the definition of UNSAFE, during that stage. I'd recommend closing this issue, since we do now have updated text:
"UNSAFE options are not safe to ignore and can be used unidirectionally or without soft-state confirmation of UDP option capability."

@tompandadev
Copy link

tompandadev commented Mar 10, 2024 via email

@Mike-Heard
Copy link
Author

Mike Heard wrote:

I think that the substantive issues are addressed and that the definition in -31 Section 12 is fine:

UNSAFE options are not safe to ignore [...]

However, I still see in -31 Section 10:

Options indicated by Kind values in the
range 192..255 are known as UNSAFE options because they do alter the
UDP data payload and thus would interfere with legacy endpoints.

I would still like to see the following change to the text in Section 10:

s/do alter the user data/modify the UDP data payload (or the semantics thereof)/

Joe Touch replied:

Can you give an example of what how semantics would vary?

Sure, draft-daiya-tsvwg-udp-options-next-header is a live example, as I mentioned previously in this thread. The intent of the draft is to use a UDP option to discriminate between transport protocols on an arbitrary port. Here is one of the possible use cases that it mentions:

For instance, when a new transport protocol other than QUIC is
developed and used that is based on UDP and works as a transport for
HTTP, the server will not be able to instantly identify whether QUIC
is used as the transport protocol or the new one is used.

If the proposed Next Header option is transmitted as a safe option in the surplus area, it will be ignored by a legacy UDP implementation, with the possible result of dispatching received packets to the wrong upper layer protocol. To me, at least, that seems is every bit as unsafe as dispatching UDP fragments or encrypted UDP packets to a legacy endpoint that does not understand them.

See also https://mailarchive.ietf.org/arch/msg/tsvwg/JPoH-JsrObcQbdWfRDNmIThyOoY/ and other discussion in that mailing list thread.

Gorry Fairhust writes:

We're heading to a WGLC - people implementing the spec or designing protocols that will use this spec will have a chance to comment on the definition of UNSAFE, during that stage. I'd recommend closing this issue, since we do now have updated text: "UNSAFE options are not safe to ignore and can be used unidirectionally or without soft-state confirmation of UDP option capability."

That's the text in Section 12, and as I said above, I agree that it satisfactorily addresses the substantive issues that have been discussed. However, I still believe that the explanatory text in Sections 10 and 11 is too narrow (thanks to Tom Herbert for pointing out the latter). So, once again, I will close this issue, but with the following comment (revised from the version above):

I would like to see the following change to the text in Section 10:

s/do alter the user data/do alter the user data (or the semantics thereof)/

and the following change to the text in Section 11:

s/which modify UDP user data/which modify UDP user data (or the semantics thereof)/

The above changes would address usages such as that in draft-daiya-tsvwg-udp-options-next-header without changing what is said in Section 12.

This can be discussed further during WGLC if necessary.

Thanks,

Mike Heard

@jtouch
Copy link

jtouch commented Mar 10, 2024

I disagree with "alter the semantics" as too vague and implying what could amount to arbitrary interpretation variations.

E.g., next-header might alter semantics, but it also might not - it might simply change which upper layer service handles the data. I have issues with that as being a useful UDP option at all.

The case under key consideration here is ACS, which neither alters the data nor its semantics, however.

So what we have is an argument entirely based on, AFAICT, an option that might not be approved and another case to which it does not apply.

@tompandadev
Copy link

tompandadev commented Mar 10, 2024 via email

@jtouch
Copy link

jtouch commented Mar 10, 2024

Everything risks "not doing what the sender asked" - otherwise the sender would not have included the option. By your logic, there would be no SAFE options.

I do agree that APC might be renamed "additional" rather than "alternate", as it is not intended to be replacement for UDP CS (because legacy endpoints can't know it's there and know that the packet has a safety check and because it fails to cover the IP pseudo header and UDP header). It would thus be appropriate to use only with UDP CS. That can be made more clear in -32 if so.

@tompandadev
Copy link

tompandadev commented Mar 10, 2024 via email

@jtouch
Copy link

jtouch commented Mar 10, 2024

We agree, but UDP options are not intended to optimize for a kind of statefulness that UDP doesn't support. TCP does what you're expecting, though.

We probably ought to rename "safe" as "legacy-compatible"; the word "safe" seems to carry baggage that the definition in the doc (from nearly the beginning of the origin of these options) is not exactly consistent with.

Joe

@Mike-Heard
Copy link
Author

We probably ought to rename "safe" as "legacy-compatible"; the word "safe" seems to carry baggage that the definition in the doc (from nearly the beginning of the origin of these options) is not exactly consistent with.

It's not clear to me that such a change would help in any way to resolve the disagreements lately aired here. I, for one, have always seen "legacy-compatible" and "safe" as meaning the same thing (I would, for instance, see the protocol number option in draft-daiya-tsvwg-udp-options-next-header as NOT being legacy compatible, as least as defined in that draft). The UDP options spec is likely to suffer some destabilization from such a move, and while that would be temporary, I would see that as another unwelcome delay at this late date.

Understood that one's mileage may vary on this point.

@gorryfair
Copy link

I'd personally agree that APC could be renamed "additional" rather than "alternate".

I'm less sure we need to change UNSAFE - a clear definition ought to be enough to say refers to whether it is compatible with legacy use.

@gorryfair gorryfair reopened this Mar 11, 2024
@gorryfair
Copy link

Editor to consider if the change to definition of APC ought to be made.

@tompandadev
Copy link

tompandadev commented Mar 11, 2024 via email

@jtouch
Copy link

jtouch commented Mar 13, 2024

"best effort" seems redundant given UDP is an unreliable, stateless protocol. Additional, alternate, supplemental, etc. - any of those seem appropriate.

@tompandadev
Copy link

tompandadev commented Mar 13, 2024 via email

@jtouch
Copy link

jtouch commented Mar 13, 2024

All SAFE options are "receiver may ignore" and still get the user data. There's no reason to call this one out.

Additionally, there's no reason to recommend a CRC that could just as easily be ignored by a receiver. Any data check that needs to be reliable needs to, well, BE RELIABLE - which involves coordinating state between the endpoints. That state can easily be used to override the default for APC. We make that point clear in many places - that stateful behavior requires the user create and manage that state, as this is a user protocol.

@tompandadev
Copy link

tompandadev commented Mar 13, 2024 via email

@jtouch
Copy link

jtouch commented Mar 13, 2024

There are plenty of examples of extensions that are only ever silently ignored - new IPv4 options come to mind in particular, or new TCP options that would affect the SYN itself. Although those examples can be added to the general discussion, it's no more useful to call out APC as "hey this will not work as CRITICALLY if you don't override the default" than any other option whose use might be considered "critical" - which includes ALL options, SAFE ones too. I could create a protocol that won't work unless TIME is supported - if I do, it's my obligation to coordinate that with the receiver.

APC does not weaken a UDP packet, other than by reducing its payload size (as all our options do). Yes, using APC and not UDP CS would be a mistake (and that can be emphasized in the text) - not just because it can be silently ignored, but because coverage is different and because legacy path devices could misinterpret CS==0 as having no payload protection at all.

But this does NOT diverge from the semantics of any protocol being extended using a mechanism that was not anticipated with defined behavior for unknown codepoints when the core protocol was released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants