Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using a json.JSONDecoder fails when simplejson is present #4842

Open
ricellis opened this issue Oct 22, 2018 · 15 comments
Open

Using a json.JSONDecoder fails when simplejson is present #4842

ricellis opened this issue Oct 22, 2018 · 15 comments

Comments

@ricellis
Copy link

ricellis commented Oct 22, 2018

It appears that when simplejson is in the environment requests preferentially imports it. However, the arguments accepted by JSONDecoder are inconsistent between the standard library json and simplejson leading to errors when using the standard json.JSONDecoder when simplejson is in the environment.

Your documentation for response.json() says:

Parameters: | **kwargs – Optional arguments that json.loads takes.

The documentation for json.loads says:

json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

and

To use a custom JSONDecoder subclass, specify it with the cls kwarg; otherwise JSONDecoder is used. Additional keyword arguments will be passed to the constructor of the class.

I expected to be able to use a custom json.JSONDecoder without issue and indeed this works on Python 2.7, but in Python 3.7 it fails.

I can see that the issue of preferentially importing simplejson has been raised a few times e.g.
#2516
#3052

and that it is slated for removal in Requests 3, which I guess will resolve this issue.

If it is not possible to resolve this issue some other way in Requests 2.x then it would be nice if the documentation around response.json() was updated to make it clear that the arguments to json.loads could be either simplejson.loads or the standard lib json.loads depending on the environment since the preferential import is effectively undocumented and non-obvious.

Expected Result

A successful JSON load when using r.json(cls=json.JSONDecoder).

Actual Result

    print(r.json(cls=json.JSONDecoder))
  File ".../python3.7/site-packages/requests/models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File ".../python3.7/site-packages/simplejson/__init__.py", line 535, in loads
    return cls(encoding=encoding, **kw).decode(s)
TypeError: __init__() got an unexpected keyword argument 'encoding'

Reproduction Steps

pip install requests
pip install simplejson
import json
import requests

r = requests.get('http://localhost:5984')
print(r.json(cls=json.JSONDecoder))

System Information

Note this issue does not occur in Python 2.7.x.

$ python -m requests.help
{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.7"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.7.0"
  },
  "platform": {
    "release": "18.0.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.20.0"
  },
  "system_ssl": {
    "version": "1000210f"
  },
  "urllib3": {
    "version": "1.24"
  },
  "using_pyopenssl": false
}
@ricellis ricellis changed the title Using encoding and a json.JSONDecoder fails when simplejson is present Using a json.JSONDecoder fails when simplejson is present Oct 22, 2018
@nateprewitt
Copy link
Member

Hi @ricellis, thanks for revisiting this. I don't believe we'll be seeing this removed in the near future until 3.0.0 is ready. If you'd like to make a small addendum to the documentation about this, we'd be happy to review it.

@ricellis
Copy link
Author

Thanks @nateprewitt I can update the docs, but would you be willing to accept a patch that protects people from the error if they are expecting to be using json, but simplejson happens to be present in the env (probably unbeknownst to them).

For example adding a check to Response.json() that uses the built-in json package if cls extends the built-in json.decoder.JSONDecoder. Something like this https://gist.github.com/ricellis/0690ebfc7f1a56df64df87c7a5251e77

@sigmavirus24
Copy link
Contributor

Your proposed change breaks the deterministic behaviour of that function which is not ideal in the slightest. I don't think we can accept that.

@ricellis
Copy link
Author

breaks the deterministic behaviour of that function

It's annoying that the json() method could end up using a different module to what is used elsewhere, but that seems better than an error to me. It is deterministic though, the outputs are governed by the inputs and I think it only impacts the case that is currently broken:

simplejson cls current proposal
yes extends json.JSONDecoder Error Decoded by json using cls
yes None Decoded by simplejson Decoded by simplejson
no extends json.JSONDecoder Decoded by json using cls Decoded by json using cls
no None Decoded by json Decoded by json

not ideal in the slightest

It was just a first stab at something that might work and I was trying to contain the change to just the specific area of the problem.

I agree it is not the cleanest thing to do, but neither is swapping the JSON module with an API incompatible one. That's where this issue is different from the others that have been reported previously around the preferential import - this has identified a place where simplejson and json are not API compatible yet requests exposes that API to the end user through the Response.json() kwargs.

That in itself might not be a problem when the requests caller is also in control of the module environment. As you can see from our linked issue though the problem for us comes when a user of our library includes another module that itself has a dependency on simplejson. We are using the requests API in good faith, but it is broken by a transitive dependency over which we have no control. I think the discussion around simplejson that has preceded this issue didn't have to account for that.

Perhaps someone can think of a better option than my proposal or maybe an option to disable the simplejson preferential import could be revisited in light of this particular scenario?

Or if you're not interested in fixing 2.x at all then just say so and I'll submit a docs patch and spend my time implementing a workaround in our library instead.

@ricellis
Copy link
Author

ricellis commented Nov 7, 2018

@nateprewitt @sigmavirus24 any thoughts on this ^?

swhmirror pushed a commit to SoftwareHeritage/swh-core that referenced this issue Oct 25, 2019
When the simplejson module is present in the Phython environment, requests will
use it to decode json but a call to response.json(cls=...) will raise an exception
as the arguments accepted by JSONDecoder are inconsistent between the standard library
json and simplejson (more details here: psf/requests#4842).

So ensure to use standard json module for decoding response texts.
@SharpEdgeMarshall
Copy link

Simplejson is slower than stdlib json implementation in python 3, why don't you want to allow to override the default behavior at least?

@ruuda
Copy link

ruuda commented Jul 29, 2020

I ran into this problem when we added a third-party library that had a transitive dependency on simplejson. That consequently broke all of our error handling, because requests now started throwing simplejson.errors.JSONDecodeError instead of json.JSONDecodeError that we had catches for. Judging from the referenced issues above, more users are affected.

@verata-veritatis
Copy link

Having the same issue as the user above. requests should just be using one or the another. Using either or conditionally only creates confusion.

@BarryThrill
Copy link

Any update :)?

@rwst
Copy link

rwst commented Dec 12, 2021

I came here because berserk uses request, and this issue comes up. Can you plz give a sttatus update?

@rwst
Copy link

rwst commented Dec 12, 2021

See also #3052

@krystofernewman
Copy link

Our team has run into a similar issue, where we have simplejson installed but we want to use a custom json encoder class in the prepare_body method which now fails

@nateprewitt
Copy link
Member

Hi @krystofernewman, it doesn't sound like your issue is related to this one directly. We're specifically discussing simplejson's impact on Response.json(). There are likely similar issues on the json Request argument but that's a topic that would be handled separately.

simplejson support was added in a very different time (almost a decade ago) but has been the default for the majority of the library's life. We agreed some time ago that a documentation update noting this would be accepted but cannot change the behavior until a major version bump.

@krystofernewman
Copy link

krystofernewman commented Jan 31, 2022

@nateprewitt gotcha thanks for the response, I can open a separate issue for the impact of simplejson on the encoding of the request's body preparation if that works

eli-schwartz added a commit to eli-schwartz/mythtv that referenced this issue Jan 8, 2024
This uses the requests module and converts requests responses to json
using requests' own `.json()` method on responses. For incomprehensible
reasons, requests has spent about a decade using either simplejson or
the standard library's json module more or less at will, and returning
either one or the other exception types. They don't know why they use
simplejson, we don't know why they use simplejson. In requests 3 (which
will be released in the Year Of The Linux Desktop or when pigs fly,
whichever one comes later) simplejson is dropped entirely.

There are innumerable issues discussing the problem on the requests
bugtracker, with the general consensus being that it's better to
randomly return either one of two different libraries and two different
library return types in errors -- because it was historically done that
way and people might be depending on it. ??????

Bugs:

psf/requests#710
psf/requests#2516
psf/requests#3052
psf/requests#4169
psf/requests#4842
psf/requests#5794
psf/requests#6084

The awkward workaround is to guarantee that requests' silent behavior of
using simplejson *if it is installed* is forcibly triggered by forcibly
depending on simplejson, and then catching the simplejson exception.

The better solution here is pretty simple: do not rely on the requests
module's automatic json conversion, this is as simple as using the
already-imported json module and calling json.loads() on the retrieved
content.

Fixes: 1df343e
Reimplements: bb154a8
rcrdnalor pushed a commit to MythTV/mythtv that referenced this issue Jan 8, 2024
This uses the requests module and converts requests responses to json
using requests' own `.json()` method on responses. For incomprehensible
reasons, requests has spent about a decade using either simplejson or
the standard library's json module more or less at will, and returning
either one or the other exception types. They don't know why they use
simplejson, we don't know why they use simplejson. In requests 3 (which
will be released in the Year Of The Linux Desktop or when pigs fly,
whichever one comes later) simplejson is dropped entirely.

There are innumerable issues discussing the problem on the requests
bugtracker, with the general consensus being that it's better to
randomly return either one of two different libraries and two different
library return types in errors -- because it was historically done that
way and people might be depending on it. ??????

Bugs:

psf/requests#710
psf/requests#2516
psf/requests#3052
psf/requests#4169
psf/requests#4842
psf/requests#5794
psf/requests#6084

The awkward workaround is to guarantee that requests' silent behavior of
using simplejson *if it is installed* is forcibly triggered by forcibly
depending on simplejson, and then catching the simplejson exception.

The better solution here is pretty simple: do not rely on the requests
module's automatic json conversion, this is as simple as using the
already-imported json module and calling json.loads() on the retrieved
content.

Fixes: 1df343e
Reimplements: bb154a8
@olejorgenb
Copy link

olejorgenb commented Aug 27, 2024

Another incompatible change in simplejson: simplejson/simplejson#324 (eg.: by adding a allow_nan parameter and defaulting it to False in the decoder)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants