Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve RFC 3986 conformance in QueryStringFormattingOptions #68

Closed
2 tasks done
matt-holden opened this issue Oct 20, 2017 · 7 comments
Closed
2 tasks done

Improve RFC 3986 conformance in QueryStringFormattingOptions #68

matt-holden opened this issue Oct 20, 2017 · 7 comments

Comments

@matt-holden
Copy link
Contributor

Summary

I'd like to propose making it easier to follow RFC 3986 URI-Encoding rules "to the letter."

Section 3.2 identifies the following general delimiters and subdelimiters:

  • General: :#[]@?/
  • Subdelimiters: !$&'()*+,;=

(Section 3.4 says that "?" and "/" should NOT be escaped in query strings, however.)

To get Conduit to percent-encode all these characters, you would need to (some) of these characters need to the percentEncodedReservedCharacterSet property of QueryStringFormattingOptions, which is non-obvious.

It should be noted that Alamofire always percent-encodes these 16 characters (the 18 delimiters, minus ? and /). While "matching Alamofire" isn't a goal of Conduit, I'd argue that its percent-encoding behavior is, to a degree, battle-tested and expected in iOS apps.

Notes/thoughts/ramblings

URI percent-encoding behavior seems to vary from place to place across platforms/frameworks/languages. Obviously, it's a bit of a rat's nest. A few examples:

function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16);
  });
}

Suggestions/thoughts

  1. Define static constants of CharacterSet representing the most common patterns that can be passed to QueryStringFormattingOptions

  2. Make QueryStringFormattingOptions default behavior match a stringent implementation of RFC 3986.

  3. Implement CustomDebugStringConvertible on QueryStringFormattingOptions. The debug output can include the list of characters that will be percent-encoded in a query string.

  4. The Conduit documentation for percentEncodedReservedCharacterSet references section 6.2.2, which, as best I can tell, is not pertinent to the topic of percent-encoding query strings. I believe this may be a typo.


If any action items spring from this issue & its discussion, I'd love to tackle the implementation.

@matt-holden matt-holden changed the title Improve RFC 3986 conformance in QueryStringFormattingOptions Improve RFC 3986 conformance in QueryStringFormattingOptions Oct 20, 2017
@matt-holden
Copy link
Contributor Author

ping ;)

@matt-holden
Copy link
Contributor Author

ping ping? :) @JoshHenderson @johnhammerlund

I can close this if it's not wanted/needed, or no longer relevant

@eneko
Copy link
Contributor

eneko commented Jan 23, 2019

Hi @matt-holden ! Sorry, we haven't followed back with you on this.

Personally, I would like for Conduit to be as flexible as possible, so this seems like a great addition.

Have you done any work at all on this area? I haven't had a chance to use Conduit with other APIs, might be neat to find some examples we could use on unit/integration tests.

@matt-holden
Copy link
Contributor Author

@eneko I hacked on this a bit back when I opened the issue, but as I recall I didn't come up with a clean solution right away.

I might have some time this morning to tinker with this again. You might see a PR -- or you might not :P

@johnhammerlund
Copy link
Contributor

johnhammerlund commented Jan 23, 2019

@matt-holden After circling back, I've been trying to grasp the stated issue -- RCF 3986 section 3.4 suggests that by not percent-encoding "?" and "/" in the query component, it could provide additional usability for deserialization of query components that may include a URL. However, the downside is that certain deserializers may also fail to recognize the delimiter hierarchy. The tradeoff is only offered as a suggestion and not a requirement, which may be the source of confusion among the various implementations:

The characters slash ("/") and question mark ("?") may represent data within the query component. Beware that some older, erroneous implementations may not handle such data correctly when it is used as the base URI for relative references (Section 5.1), apparently because they fail to distinguish query data from path data when looking for hierarchical separators. However, as query components are often used to carry identifying information in the form of "key=value" pairs and one frequently used value is a reference to another URI, it is sometimes better for usability to avoid percent-encoding those characters.

As for the reference to section 6.2.2, that indeed does look like a typo (should refer to section 2)

@matt-holden
Copy link
Contributor Author

@johnhammerlund I remember feeling like I did a really good job documenting the issue at the time -- but I'm having a hard time remembering exactly what I was driving at as well :)

@johnhammerlund
Copy link
Contributor

@matt-holden honestly, some of these RFC's really tend to leave edges of implementation open for interpretation. Now that I better understand this (after over a year 🤦‍♂️ ) I believe Conduit indeed conforms to RFC 3986, but the ambiguity remains nonetheless. I do like suggestions 1, 3, and 4, and would happy to review a PR!

@eneko eneko closed this as completed Jun 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants