Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

potPourriB.dhallb: Fix encoding of query string #655

Merged
merged 2 commits into from Jul 21, 2019

Conversation

sjakobi
Copy link
Collaborator

@sjakobi sjakobi commented Jul 19, 2019

Original URL (potPourriA.dhall):

https://-._~%2C!$&'*+;=:@0abc1--12a------a-a--a-0/foo?/-._~%2C!$&'*+;=:@/?

JSON representation of the original binary:

[
    24,
    null,
    0,
    1,
    null,
    "-._~%2C!$&'*+;=:@0abc1--12a------a-a--a-0",
    "foo",
    "/-._~%2C!$&%27*+;=:@/?"
]

JSON representation of the new binary:

[
    24,
    null,
    0,
    1,
    null,
    "-._~%2C!$&'*+;=:@0abc1--12a------a-a--a-0",
    "foo",
    "/-._~%2C!$&'*+;=:@/?"
]

Diff:

9c9
<     "/-._~%2C!$&%27*+;=:@/?"
---
>     "/-._~%2C!$&'*+;=:@/?"

The query string was incorrectly percent-encoded.

Original URL (potPourriA.dhall):

    https://-._~%2C!$&'*+;=:@0abc1--12a------a-a--a-0/foo?/-._~%2C!$&'*+;=:@/?

JSON representation of the original binary:

    [
        24,
        null,
        0,
        1,
        null,
        "-._~%2C!$&'*+;=:@0abc1--12a------a-a--a-0",
        "foo",
        "/-._~%2C!$&%27*+;=:@/?"
    ]

JSON representation of the new binary:

    [
        24,
        null,
        0,
        1,
        null,
        "-._~%2C!$&'*+;=:@0abc1--12a------a-a--a-0",
        "foo",
        "/-._~%2C!$&'*+;=:@/?"
    ]

Diff:

    9c9
    <     "/-._~%2C!$&%27*+;=:@/?"
    ---
    >     "/-._~%2C!$&'*+;=:@/?"

The query string was incorrectly percent-encoded.
@singpolyma
Copy link
Collaborator

singpolyma commented Jul 19, 2019 via email

@sjakobi
Copy link
Collaborator Author

sjakobi commented Jul 19, 2019

+ should be percent-encoded because some servers interpret it as a space‎, following the CGI spec

That's an unrelated potential standard change, right?

Currently + is allowed as a sub-delim:

pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
query = *( pchar / "/" / "?" )
pct-encoded = "%" HEXDIG HEXDIG
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
; this is the RFC3986 sub-delims rule, without "(", ")" or ","
; see comments above the `http-raw` rule above
sub-delims = "!" / "$" / "&" / "'" / "*" / "+" / ";" / "="

@philandstuff
Copy link
Collaborator

Query strings should be treated as opaque strings, and not decoded or encoded. We have tests/parser/success/unit/import/urls/escapedQueryA.dhall to verify this behaviour.

It makes sense for us to decode and encode path segments, because we convert local paths to URL paths, which is an operation which involves manipulating the contents of a path. But we don't ever do any manipulation of the contents of a query string, so we should just pass whatever the user gave us around in unmodified form. Introducing percent-decoding or -encoding here just feels unnecessary and a source of subtle bugs; especially since (as you point out) the rules for percent-encoding and -decoding query strings are not the same as for paths.

sjakobi added a commit to dhall-lang/dhall-haskell that referenced this pull request Jul 19, 2019
Corresponding change to the standard:
dhall-lang/dhall-lang#627

The potPourri parser test remains disabled due to
dhall-lang/dhall-lang#655.

Fixes #1110.
mergify bot pushed a commit to dhall-lang/dhall-haskell that referenced this pull request Jul 20, 2019
Corresponding change to the standard:
dhall-lang/dhall-lang#627

The potPourri parser test remains disabled due to
dhall-lang/dhall-lang#655.

Fixes #1110.
@sjakobi
Copy link
Collaborator Author

sjakobi commented Jul 20, 2019

Note that, before #627, the ' in the query string was not percent-encoded. So I really think that this patch is a fix for a small regression – I really don't want to argue for a standard change here.

@Gabriella439
Copy link
Contributor

@sjakobi: My understanding is that what is currently in master is what follows the standard since the standard does not specify to percent-encode query fragments. So without a change to the standard then I believe we should not change the test result.

Also, in general the current standardization trend is towards preserving the original URL as much as possible (i.e. not percent-encoding the query fragment). See #581 for the roadmap of URL-related changes

@sjakobi
Copy link
Collaborator Author

sjakobi commented Jul 20, 2019

My understanding is that what is currently in master is what follows the standard since the standard does not specify to percent-encode query fragments.

No. As I tried to point out in the PR description, potPourriB.dhallb in master incorrectly uses percent encoding in the query string. The incorrect percent encoding was introduced in #627.

This PR will fix this.

@sjakobi sjakobi mentioned this pull request Jul 20, 2019
@Gabriella439
Copy link
Contributor

@sjakobi: Oh, I misread. Yeah, then this looks good to me

@sjakobi sjakobi merged commit f4ee1fb into master Jul 21, 2019
@sjakobi sjakobi deleted the sjakobi/fix-potPourri-dhallb branch July 21, 2019 10:55
sjakobi added a commit to dhall-lang/dhall-haskell that referenced this pull request Jul 21, 2019
sjakobi added a commit to dhall-lang/dhall-haskell that referenced this pull request Jul 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants