Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate aligning application/x-www-form-urlencoded with URL's query #18

Closed
annevk opened this issue May 9, 2015 · 14 comments · Fixed by #495
Closed

Investigate aligning application/x-www-form-urlencoded with URL's query #18

annevk opened this issue May 9, 2015 · 14 comments · Fixed by #495

Comments

@annevk
Copy link
Member

annevk commented May 9, 2015

There's already some differences with respect to escaping and #17 will enlarge that gap. Ideally they are the same I think.

@annevk
Copy link
Member Author

annevk commented Aug 14, 2015

@ziyunfei
Copy link

@annevk Both Firefox and Chrome escape \r and \n to "%0D%0A" when form submitting, should this spec align with them?
screenshot:

test script:

for (let i = 0; i < 128; i++) {
  let iframe = document.createElement("iframe")
  iframe.src = navigator.vendor === "Google Inc." ? "/" : "data:text/html,bar"
  iframe.onload = function() {
    iframe.contentDocument.body.innerHTML = `
      <form id=form>
        <textarea id=textarea name=name></textarea>
      </form>
    `
    let char = String.fromCodePoint(i)
    iframe.contentWindow.textarea.value = char
    iframe.onload = function() {
      let htmlFormEncoded = iframe.contentWindow.location.href.split("name=")[1]
      let paramsObject = new URLSearchParams()
      paramsObject.set("", char)
      let URLSearchParamsSerialized = paramsObject.toString().slice(1)
      if (htmlFormEncoded !== URLSearchParamsSerialized) {
        console.error(`${URLSearchParamsSerialized} vs ${htmlFormEncoded}`)
      }
    }
    iframe.contentWindow.form.submit()
  }
  document.body.appendChild(iframe)
}

@annevk
Copy link
Member Author

annevk commented May 12, 2016

@annevk
Copy link
Member Author

annevk commented May 12, 2016

Oh sorry, you mean that they turn a linebreak into CR LF. That is interesting and yes, maybe we should define that. Could you open a new issue for that?

@annevk
Copy link
Member Author

annevk commented Dec 29, 2016

@ziyunfei I had another look and the linebreak behavior appears to be defined by step 4 (labeled "End") at https://html.spec.whatwg.org/multipage/forms.html#constructing-the-form-data-set.

@annevk
Copy link
Member Author

annevk commented Dec 29, 2016

Query state encodes: 0x00-0x20, 0x22, 0x23, 0x3C, 0x3E, and 0x7E-Inf.

urlencoded encodes: 0x00-0x19, 0x20 (as +, not %20), 0x21-0x29, 0x2B-0x2C, 0x2F, 0x40, 0x5B-0x5E, 0x60, 0x7B-Inf.

Getting these aligned seems hard, though it would be good for URLSearchParams...

@domenic
Copy link
Member

domenic commented Mar 2, 2018

The fact that URLSearchParams uses the application/x-www-form-urlencoded parser whereas the URL parser uses a different parser has unintuitive results when comparing url.search and url.searchParams:

let a = 'http://localhost:9999/segment?foo=bar/baz? boo';
let b = 'http://localhost:9999/segment?foo=bar%2Fbaz%3F%20boo';

let urlA = new URL(a);
let urlB = new URL(b);

// Not equal:
console.log(urlA.search); // "?foo=bar/baz?%20boo"
console.log(urlB.search); // "?foo=bar%2Fbaz%3F%20boo"

// Equal:
console.log(urlA.searchParams.get("foo")); // "bar/baz? boo"
console.log(urlB.searchParams.get("foo")); // "bar/baz? boo"

// Equal, but both different from search:
console.log(urlA.searchParams.toString()); // "foo=bar%2Fbaz%3F+boo"
console.log(urlB.searchParams.toString()); // "foo=bar%2Fbaz%3F+boo"

Perhaps, if we can't get full alignment, at least URLSearchParams should use the URL parser/serializer, since it's so closely tied to URLs and their parsed representations?

@annevk
Copy link
Member Author

annevk commented Mar 2, 2018

I think having URLSearchParams do the same thing as <form> is rather useful though, e.g., for request bodies.

@domenic
Copy link
Member

domenic commented Mar 2, 2018

I'd rather leave that to a separate class, e.g. FormData, or I guess since that name is taken, URLEncodedFormData.

Maybe just use FormData but add FormData.fromURLEncoded() and formData.toURLEncoded() methods methods.

@annevk
Copy link
Member Author

annevk commented Mar 2, 2018

It seems a little late given that fetch() et al already support this?

@domenic
Copy link
Member

domenic commented Mar 2, 2018

I doubt people are using URLSearchParams with fetch that much on data that encodes differently and against servers that don't decode the differently-encoded data to the same result.

After all, if you're making a compatibility argument, then we have no hope of fixing the OP, or changing URL encoding in general.

@annevk
Copy link
Member Author

annevk commented Mar 2, 2018

Yeah, I haven't been hopeful about this issue because of that. Changing application/x-www-form-urlencoded seems tricky. Changing the URL parser part seems somehow more doable as that has changed so much already and is somewhat inconsistent across implementations.

@annevk
Copy link
Member Author

annevk commented May 9, 2018

Note that they cannot ever be identical, as searchParams has to escape & and = (due to it manipulating individual pairs) and search does not.

I think I reached the point where I think we should close this, but it's probably worth ensuring there's sufficient test coverage first and having another look at the results in browsers.

@domenic
Copy link
Member

domenic commented May 9, 2018

That's not what we mean by identical. We mean that url.search === url.searchParams.toString() after parsing a URL.

I'd strongly object to closing this; I'd like to make a concerted effort to fix this in browsers. If it's not fixable, then we need something new like url.realSearchParams that's actually fit for its purpose.

annevk added a commit that referenced this issue May 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging a pull request may close this issue.

3 participants