Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Appendix on RFC6570-derived behavior + allowReserved (3.0.4) #3818

Merged
merged 8 commits into from
Jun 10, 2024

Conversation

handrews
Copy link
Member

@handrews handrews commented May 17, 2024

Rendering with all currently open 3.0.4 changes (as of the timestamp of this edit - use the dropdown on the word "edited" in the bar for this comment) can be found here.

NOTE: The rendering will have mismatched table of contents lettering, which will be fixed when I merge the various appendix PRs and resolve conflicts along the way.

Addresses:

This is the first of several appendixes to be added to clarify the most obscure details of Parameter Object and Encoding Object serialization. Future PRs will add more on serializing headers and cookies, and on the incredibly convoluted topic of percent-encoding.

Why an appendix? Because these are really fiddly details, and technically they all follow from the existing requirements. There will be other clarifications in the main text. I just need to get this really fiddly stuff out of the way so it's easier to write the major parts.

Why not put this on the Learn site? Because this is part of a larger set of serialization issues that took me several weeks to sort out, despite being very comfortable researching and reconciling RFCs in general, and being very familiar with the relevant RFCs in particular. It's too high of a bar to relegate to secondary documentation. It needs clarity in the spec, IMNSHO. [EDIT: However, the directly RFC6570-analogus parts are probably the most straightforward part of this- the complexity is in the exceptions and in related areas that I have not yet posted as PRs.]

This clarifies the correspondence between OAS fields and RFC6570 operators, and acknowledges that some field values and combinations do not have analogues. It provides further guidance for how to use RFC6570 implementations to support these configurations.

This includes a SHOULD directive regarding using RFC6570 expansion with the non-RFC6570 styles, as the use of "explode" and "allowReserved" does not otherwise make any sense. It perhaps could be a MUST.

@handrews handrews added clarification requests to clarify, but not change, part of the spec param serialization Issues related to parameter and/or header serialization labels May 17, 2024
@handrews handrews added this to the v3.0.4 milestone May 17, 2024
@handrews handrews requested a review from a team May 17, 2024 15:46
@handrews handrews changed the title Appendix on RFC6570-derived behavior Appendix on RFC6570-derived behavior (3.0.4) May 17, 2024
versions/3.0.4.md Outdated Show resolved Hide resolved
Copy link
Contributor

@ralfhandl ralfhandl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, one phrasing suggestion

ralfhandl
ralfhandl previously approved these changes May 23, 2024
@handrews
Copy link
Member Author

@darrelmiller after thinking about your comment in OAI/learn.openapis.org#100 I'm wondering if maybe I'm not giving the right guidance on the combination of style: form, allowReserved: true.

In this PR, I basically suggest using single-variable templates like "{+foo}" and assembling the query string with the results. Alternatively, you could assemble a template using a mixture of RFC6570's ?, &, and + operators, and pass that whole thing through to an RFC6570s process. Which you could also do for pipeDelimited, spaceDelimited, and deepObject.

I'm going to move this to a draft PR because I think this needs some consideration.

style | form | `?`
allowReserved | `false` | _n/a_
allowReserved | `true` | `+`
explode | `false` | _n/a_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC 6570 refers to explode as a modifier because it is a suffix added to the variable as opposed to an operator which appears as a prefix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. I was just going by the characters, not the placement, but you're right that it should be more clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the headings to make things more clear in the latest reworking of this PR.

type: string
```

This example is equivalent to RFC6570's `{?foo*,bar}`, and ***NOT*** `{?foo*}{&bar}`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{?foo*}{&bar}
Is this even a valid URI Template? If foo is empty/undefined then the resulting href would be invalid.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's kind of my point. I mean, it's a valid URI Tempate in the sense that it's syntactically legal (I think?) it just might not produce a valid URI. The URI Template spec is pretty intentionally blind to the context outside of the variables.

But perhaps I could note that that's why you need to do it the {?foo*,bar} way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've addressed this in the latest, much-expanded version.

This aligns allowReserved with style by similarly correlating it
with RFC6570 operators.  This will make it easier to write a more
in-depth explanation of the process in an appendix.
@handrews handrews force-pushed the rfc6570-304 branch 3 times, most recently from 68b4fa2 to 8a73f7c Compare May 25, 2024 21:49
This is the one of several appendixes to be added to clarify
the most obscure details of Parameter Object and Encoding Object
serialization.

This clarifies the correspondence between OAS fields and RFC6570
operators, and acknowledges that some field values and combinations
do not have analogues.  It provides further guidance for how to
use RFC6570 implementations to support these configurations.

This includes a SHOULD directive regarding using RFC6570 expansion
with the non-RFC6570 styles, as the use of "explode" and
"allowReserved" does not otherwise make any sense.  It perhaps
could be a MUST.

Examples are included to show both typical usage, and how to
work around the lack of exact RFC6570 equivalences for certain
configurations.
@handrews
Copy link
Member Author

Based on @darrelmiller 's comments and trying to think this through further, I have significantly expanded this and changed the recommendation regarding how to construct query strings with configurations that don't quite match RFC6570.

I've also rewritten the allowReserved section and incorporated that aspect in the appendix. I had planned to address that in another appendix on percent-encoding, but this allows readers to find all query string-related RFC6570 content in one place, and will let the percent-encoding appendix focus on the confusing interoperability landscape there.

Finally, the appendix now has several examples showing both easy and difficult cases.

@handrews handrews changed the title Appendix on RFC6570-derived behavior (3.0.4) Appendix on RFC6570-derived behavior + allowReserved (3.0.4) May 25, 2024
@@ -1068,11 +1068,13 @@ Field Name | Type | Description
---|:---:|---
<a name="parameterStyle"></a>style | `string` | Describes how the parameter value will be serialized depending on the type of the parameter value. Default values (based on value of `in`): for `query` - `form`; for `path` - `simple`; for `header` - `simple`; for `cookie` - `form`.
<a name="parameterExplode"></a>explode | `boolean` | When this is true, parameter values of type `array` or `object` generate separate parameters for each value of the array or key-value pair of the map. For other types of parameters this property has no effect. When [`style`](#parameterStyle) is `form`, the default value is `true`. For all other styles, the default value is `false`.
<a name="parameterAllowReserved"></a>allowReserved | `boolean` | Determines whether the parameter value SHOULD allow reserved characters, as defined by [RFC3986](https://tools.ietf.org/html/rfc3986#section-2.2) `:/?#[]@!$&'()*+,;=` to be included without percent-encoding. This property only applies to parameters with an `in` value of `query`. The default value is `false`.
<a name="parameterAllowReserved"></a>allowReserved | `boolean` | When this is true, parameter values are serialized using reserved expansion, as defined by [RFC6570](https://datatracker.ietf.org/doc/html/rfc6570#autoid-20), which allows [RFC3986's reserved character set](https://datatracker.ietf.org/doc/html/rfc3986#autoid-13), as well as percent-encoded triples, to pass through unchanged, while still percent-encoding all other disallowed characters (including `%` outside of percent-encoded triples). Applications are still responsible for percent-encoding reserved characters that are not allowed in the query string, or have a meaning other than their literal value (e.g. `+` as an escape for the space character in `application/x-www-form-urlencoded` query strings). This property only applies to parameters with an `in` value of `query`. The default value is `false`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applications are still responsible for percent-encoding reserved characters that are not allowed in the query string

I'm not sure how to interpret this. Is it saying applications are responsible for characters that only reserved in the query string? Or is it talking about parts of the query string that are are not parameter values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upshot is that if you use allowReserved: true, you still need to manually percent-encode the following reserved characters:

  • # because it is the delimiter between the query and fragment
  • [ and ] because while they are reserved, they are forbidden in the query by RFC3986 (yes, deepObject examples are wrong, see Appendix for percent-encoding concerns (3.0.4) #3859 for more on this)
  • + because form-urlencoded treats it as a space
  • & and = because form-urlencoded uses them as delimiters

You can leave any of the others unencoded, and you MUST leave a + that represents a space unencoded. Which maybe I did not show in the example and should have... hm....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@darrelmiller I'm open to suggestion on how to better state this. At one point I had a short paragraph after thist table going into a bit more detail, and then moved it all to the appendix. Perhaps a short paragraph is warranted, though.

* any parameter name that is not a legal RFC6570 variable name (alphanumerics, `_`, and percent-encoded characters)

The Parameter Object's `name` field has a much more permissive syntax than [RFC6570 variable name syntax](https://www.rfc-editor.org/rfc/rfc6570#section-2.3).
A parameter name that includes characters outside of the allowed RFC6570 variable character set MUST be percent-encoded before it can be used in a URI Template, and MUST set `allowReserved: true` to avoid double percent-encoding.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is necessary to set allowReserved to true.

            var template = new UriTemplate("http://example.org/{?foo%2Fid}");
            template.SetParameter("foo%2Fid", "bar/");
            var uriString = template.Resolve();
            Assert.Equal("http://example.org/?foo%2Fid=bar%2F", uriString);

I tried this using my Tavis.UriTemplates library and it passed without having to use allowReserved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tried two Python implementations and they agree, sse also this comment about the larger context

versions/3.0.4.md Outdated Show resolved Hide resolved
The exact variable names are unimportant, but in this example "f" and "w" are meant to suggest "formulas" and "words", while "n" and "v" suggest "name" and "value":

```urlencoded
?{+fn0}={+fv0}&{+fn1}={+fv1}&{+fn2}={+fv2}&{wn}={wv0} {wv1} {wv2}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't clear to me why we need the + for the formula names. Is this intended to cover the deep object scenario where someone uses [ ] characters in the names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not, it was about names that aren't valid template variable names. Which would include the deepObject-style names and, yeah, that could be interesting, but... UGH. deepObject. idek know what to do about that. See #1322 - I can't entirely figure out how it's even supposed to work. I think I'm going to ignore it for now as I'll need to fix several deepObject things as it is. I've been putting it off to the last thing.

Manually constructed URI Template:

```urlencoded
?{wn}={wv0} {wv1}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the advantage of that manually constructed URI template over

?words={wv0} {wv1}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question and relates to your other two comment about not needing to set allowReserved for names with percent encoding and not needing the + on the formula names.

My original thought was "simulate the normal behavior of parameter names in form by putting them through the same process as the values" which led to "if you need pre-percent encoded values in names then you'll have to set allowReserved: true to get them.

But since that seems not to be true (two Python URI Template implementations both agree with you on that), then there's no need to use allowReserved and therefore no need to make the names template variables since they have to be pre-encoded anyway. It's simpler to just copy them as part of the literal portion of the template. I'll make an update taking all of this into account.

```
The `/` and the pre-percent-encoded `%2B` have been left alone, but the disallowed `^` character (inside a value) and space characters (in the template but outside of the expanded variables) were percent-encoded.

#### Undefined Values and Manual URI Template Construction
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really interesting exploration of doing "late generation" of URI templates based on the parameter values to work around the features in OpenAPI that are not supported by 6570.

Co-authored-by: Darrel <darrmi@microsoft.com>
* Paramter names don't need to be modeled as template variables.
* Include an example of a parameter name that needs encoding.
* Add more details to the fixed field entries for allowReserved
@handrews
Copy link
Member Author

@darrelmiller I've added another commit that hopefully addresses all of your concerns.

versions/3.0.4.md Outdated Show resolved Hide resolved
Co-authored-by: Ralf Handl <ralf.handl@sap.com>
@handrews
Copy link
Member Author

@ralfhandl I've started a pre-release checklist and put the RFC links on it:

I'm not going to change them in this PR because it would just be added noise as we don't know the target yet (and I do not want to hold this PR up for that decision- we need to do that all at once, like fixing the section links).

@ralfhandl ralfhandl self-requested a review May 29, 2024 09:32
ralfhandl
ralfhandl previously approved these changes May 29, 2024
@handrews
Copy link
Member Author

This now has 2 tsc approvals, one of which was cleared by the merge conflict resolution which was trivial. So I'm going to merge this.

@handrews handrews merged commit 39448ef into OAI:v3.0.4-dev Jun 10, 2024
1 check passed
@handrews handrews deleted the rfc6570-304 branch June 10, 2024 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification requests to clarify, but not change, part of the spec param serialization Issues related to parameter and/or header serialization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants