Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Percent encode unexpectedly encodes numbers #3

Closed
DanTup opened this issue Jul 23, 2016 · 3 comments
Closed

Percent encode unexpectedly encodes numbers #3

DanTup opened this issue Jul 23, 2016 · 3 comments

Comments

@DanTup
Copy link
Contributor

@DanTup DanTup commented Jul 23, 2016

I'm integrating with the twitter API which describes that data should be "percent encoded". In C# I found that the method that matched Twitters docs was Uri.EscapeDataString. I'm trying to do the same thing in Dart and discovered that none of them methods on the Uri class behaved correctly, but someone pointed me at this package.

This seems to work great for all the characters I've tried, except numbers:

// Dart:
print(percent.encode("1".codeUnits)); // outputs %31
// C#:
Console.WriteLine(Uri.EscapeDataString("1")); // outputs 1

The C# version seems to match twitter definition of "percent encoding" but I'm struggling to find detailed docs on whether this is a standard and if so, which is correct. I'm raising it here in the hope that either a) it's a bug and can be fixed or b) there's another method I want that behaves as C# does/twitter expects.

@DanTup DanTup changed the title Percent encode (unexpectedly?) encodes numbers Percent encode unexpectedly encodes numbers Jul 23, 2016
@DanTup
Copy link
Contributor Author

@DanTup DanTup commented Jul 23, 2016

On Google+ someone pointed out that the Dart docs say this is compatible with RFC 3986, which contains:

2.3. Unreserved Characters

Characters that are allowed in a URI but do not have a reserved purpose are called unreserved. These include uppercase and lowercase letters, decimal digits, hyphen, period, underscore, and tilde.

 unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

[...]

For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers

This leads us to believe this is a bug in the implementation and they should not be encoded.

@lrhn
Copy link
Member

@lrhn lrhn commented Jul 25, 2016

Definitely a bug. The encoder contains the following code to detect characters that should not be encoded:

if ((letter >= $a && letter <= $z) ||
        byte == $dash ||
        byte == $dot ||
        byte == $underscore ||
        byte == $tilde) {

It clearly fails to catch digits. Shame on whoever reviewed this!

@DanTup
Copy link
Contributor Author

@DanTup DanTup commented Jul 26, 2016

@lrhn I've sent #4 to extend tests and fix this but not sure if I need to get it reviewed on the Chromium code review site?

@nex3 nex3 closed this in #4 Aug 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

2 participants
You can’t perform that action at this time.