Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uri module doesn't properly handle escape sequences in JSON #7586

Closed
benhoyt opened this issue May 28, 2014 · 4 comments
Closed

uri module doesn't properly handle escape sequences in JSON #7586

benhoyt opened this issue May 28, 2014 · 4 comments
Labels
bug This issue/PR relates to a bug.

Comments

@benhoyt
Copy link

benhoyt commented May 28, 2014

Issue Type:

Bug Report

Ansible Version:

ansible 1.5.4

Environment:

Ubuntu 12.04

Summary:

uri module calls content.decode('unicode_escape') on response, which means escape sequences in JSON responses aren't handled properly

Steps To Reproduce:

You can test this by running this ansible command line against a test URL I uploaded:

ansible localhost -m uri -a "url=https://s3.amazonaws.com/oresource/python_version.json return_content=true"
Expected Results:

Output should have a "json" key with the PythonVersion string decoded properly.

Actual Results:
localhost | success >> {
    "accept_ranges": "bytes",
    "changed": false,
    "content": "{\"PythonVersion\": \"2.7.6 (default, May 16 2014, 16:48:38) \n[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)]\"}",
    "content_length": "100",
    "content_location": "https://s3.amazonaws.com/oresource/python_version.json",
    "content_type": "application/json",
    "date": "Wed, 28 May 2014 21:00:47 GMT",
    "etag": "\"2ad455693e539d42df070a367a8f16a3\"",
    "last_modified": "Wed, 28 May 2014 20:52:45 GMT",
    "redirected": false,
    "server": "AmazonS3",
    "status": 200,
    "x_amz_id_2": "Li22a6y8Vb+mYo8AZTN04QolNK+jlJRD8O3nS/rsmUxTpQMAT0ie8K6rbXOf6sDE",
    "x_amz_request_id": "E077ABA55DC7E255"
}
Further comments

Note that the output does not include the "json" key due to the above bug.

I've been trying to use the "uri" module and hitting an HTTP endpoint which returns some JSON with header Content-Type: application/json. However, one of the strings in my JSON has a line feed (\n control character) in it. This is correctly escaped, and is valid JSON. However, the "uri" module does the following on the response content (see here):

unicode(content.decode('unicode_escape'))

This decodes the \n string to a literal line feed, and causes the json.loads() further down to fail with an exception like:

ValueError: Invalid control character at: line 1 column 58 (char 58)

There's a catch-all exception around the json.loads() that silently ignores this error without setting the .json key.

So two things:

  1. Why does the uri module do this extra (un)escaping, effectively breaking the content in this case?

  2. This issue was hardish to debug as the json.loads() has a catch-all except around it. (Catch-all except clauses are bad practice for this reason.) I recommend changing it to a "except ValueError, error" and then call module.fail_json() with a string representation of the error in the response.

@jimi-c jimi-c added P3 labels May 29, 2014
@mpdehaan mpdehaan added P4 and removed P3 labels Jun 1, 2014
@benhoyt
Copy link
Author

benhoyt commented Jun 20, 2014

Any suggestions for a fix for this, apart from removing \n's from our JSON?

@ianba
Copy link

ianba commented Jul 2, 2014

This bug seems bad as it:

  1. Prevents very simple valid json such as { "foo": "\"bar\"" } from being able to be parsed.
  2. Simply silently fails to provide the parsed json dictionary in such cases, causing processes that might later use that data to fail with a undefined variable error, even though the response appears to contain perfectly valid json. This is very likely to baffle the user.
  3. If one does not control the webservice, there appears to be no workaround, except to use an alternative to the uri module.

Seems like the right thing to do here is either to not try to decode the content and let the user do so as appropriate, or to at least look at the charset in the Content-Type header and use that to decode.

darkk added a commit to darkk/ansible that referenced this issue Jul 3, 2014
…nsible#7586

Regression potential:
 - `raw_content` is written to `dest` file instead of decoded `content`
 - `raw_content` doubles module reply
@mpdehaan
Copy link
Contributor

Hi!

Thanks very much for your interest in Ansible. It sincerely means a lot to us.

On September 26, 2014, due to enormous levels of contribution to the project Ansible decided to reorganize module repos, making it easier
for developers to work on the project and for us to more easily manage new contributions and tickets.

We split modules from the main project off into two repos, http://github.com/ansible/ansible-modules-core and http://github.com/ansible/ansible-modules-extras

If you would still like this ticket attended to, and believe this problem or idea is still present in the latest version of Ansible (1.7.2) or the development branch, we will need your help in having it reopened in one of the two new repos, and instructions are provided below.

We apologize that we are not able to make this transition happen seamlessly, though this is a one-time change and your help is greatly appreciated --
this will greatly improve velocity going forward.

Both sets of modules will ship with Ansible, though they'll receive slightly different ticket handling.

To locate where a module lives between 'core' and 'extras'

Additionally, should you need more help with this, you can ask questions on:

Thank you very much!

@abadger
Copy link
Contributor

abadger commented May 14, 2015

This should have been fixed in ansible/ansible-modules-core#537 however there are some other issues with that code. I've just submitted ansible/ansible-modules-core#1331 to try to fix those issues. If anyone is still interested in this, please test that PR to be sure it doesn't cause a regression in this fix.

abadger pushed a commit that referenced this issue May 14, 2015
…7586

Regression potential:
 - `raw_content` is written to `dest` file instead of decoded `content`
 - `raw_content` doubles module reply
@ansibot ansibot added bug This issue/PR relates to a bug. and removed bug_report labels Mar 6, 2018
@ansible ansible locked and limited conversation to collaborators Apr 25, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug This issue/PR relates to a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants