Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URL parameters containing russian/chinese language characters are not escaped properly #8321

Closed
czardoz opened this issue Aug 29, 2016 · 10 comments
Labels
http Issues or PRs related to the http subsystem. querystring Issues and PRs related to the built-in querystring module.

Comments

@czardoz
Copy link

czardoz commented Aug 29, 2016

  • Version: All versions, tested on v6.3.1
  • Platform: All platforms, tested on Ubuntu x86_64 and Windows x86_64
  • Subsystem: http, querystring

Node is not able to properly send query string parameters with russian language characters.
Example:

var http = require('http');

http.get('http://httpbin.org/get?search=обязательный&a=b c', function (response) {
    // Continuously update stream with data
    var body = '';
    response.on('data', function (d) {
        body += d;
    });
    response.on('end', function () {
        console.log(body);
    });
});

Results in the output:

{
  "args": {
    "a": "b c", 
    "search": ">1O70B5;L=K9"
  }, 
  "headers": {
    "Host": "httpbin.org"
  }, 
  "origin": "106.51.38.69", 
  "url": "http://httpbin.org/get?search=>1O70B5%3BL=K9&a=b c"
}

Note that the space between b and c was correctly escaped by Node, but the first parameter ("search") sent garbage. Ideally, that should be escaped as well. Chrome XMLHttpRequest handles the encoding correctly.

@addaleax addaleax added the http Issues or PRs related to the http subsystem. label Aug 29, 2016
@Fishrock123
Copy link
Member

Fishrock123 commented Aug 29, 2016

Oh I remember this, there's an older issue for it. IIRC it's not allowed in the spec and is a potential security issue of we do.

Edit: hmmm, maybe I'm remembering #1693 ... this could be different.

cc @nodejs/http probably

@jasnell
Copy link
Member

jasnell commented Aug 29, 2016

This is one of the several issues we have in the current url.parse implementation relating to extended character support. The new WHATWG URL parser should handle this without problem but it's going to take a bit too work that in still. I'll investigate to see how difficult it would be to fix this in the current impl.

@czardoz
Copy link
Author

czardoz commented Aug 29, 2016

I think the problem lies here:

QueryString.escape = function(str) {
, specifically in the multi-byte handling.

In the meantime, is there any workaround for this?

@vkurchatkin
Copy link
Contributor

In the meantime, is there any workaround for this?

Well, you can alway manually encode query parameters

@czardoz
Copy link
Author

czardoz commented Aug 29, 2016

@vkurchatkin That results in double-encoding of parameters. Unfortunately, there does not seem to be a way to turn off encoding in the http.request() API.

@bnoordhuis
Copy link
Member

@czardoz I'm 98% sure you can work around that by using a { host: '...', path: '...' } options object.

@jasnell
Copy link
Member

jasnell commented Aug 29, 2016

Something like this:

var http = require('http');

var options = {
  host: 'httpbin.org',
  path: `/get?search=${encodeURIComponent('обязательный')}&a=b%20c`
};

http.get(options, function (response) {
    // Continuously update stream with data
    var body = '';
    response.on('data', function (d) {
        body += d;
    });
    response.on('end', function () {
        console.log(body);
    });
});

@TimothyGu
Copy link
Member

Just to confirm, the new WHATWG URL API does handle this correctly:

> process.versions
{ http_parser: '2.7.0',
  node: '7.0.0',
  v8: '5.4.500.36',
  uv: '1.9.1',
  zlib: '1.2.8',
  ares: '1.10.1-DEV',
  icu: '57.1',
  modules: '51',
  openssl: '1.0.2j' }
> new url.URL('http://httpbin.org/get?search=обязательный&a=b c')
URL {
  href: 'http://httpbin.org/get?search=%D0%BE%D0%B1%D1%8F%D0%B7%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D1%8B%D0%B9&a=b%20c',
  protocol: 'http:',
  hostname: 'httpbin.org',
  pathname: '/get',
  search: '?search=%D0%BE%D0%B1%D1%8F%D0%B7%D0%B0%D1%82%D0%B5%D0%BB%D1%8C%D0%BD%D1%8B%D0%B9&a=b%20c'
}

@dougwilson
Copy link
Member

Until the big is fixed in node.js core, you may be able to blindly toss the URL into the https://www.npmjs.com/package/encodeurl module to get it encoded (and without encountering double-encoding).

@ChALkeR ChALkeR added the querystring Issues and PRs related to the built-in querystring module. label Nov 7, 2016
@Trott
Copy link
Member

Trott commented Jul 15, 2017

This issue has been inactive for sufficiently long that it seems like perhaps it should be closed. Feel free to re-open (or leave a comment requesting that it be re-opened) if you disagree. I'm just tidying up and not acting on a super-strong opinion or anything like that.

@Trott Trott closed this as completed Jul 15, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
http Issues or PRs related to the http subsystem. querystring Issues and PRs related to the built-in querystring module.
Projects
None yet
Development

No branches or pull requests

10 participants